Browse datasets

The EGA archives a large number of datasets, a few of which are publicly available. To gain access to a dataset, contact the Data Access Committee (DAC) whose details are listed on the dataset page under the section "Who controls access to this dataset". Once the appropriate DAC has granted you permission, you will be able to download the data.
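For readers who want to work with the listing below programmatically, here is a minimal Python sketch that parses it into records. It assumes the rows are tab-separated with the four columns shown in the header (Dataset ID, Description, Technology, Samples), as they are in this copy of the page. The names `DatasetRow` and `parse_listing` are hypothetical helpers for illustration and are not part of any EGA tool or API.

```python
import csv
import io
from typing import List, NamedTuple, Optional


class DatasetRow(NamedTuple):
    dataset_id: str
    description: str
    technology: str          # empty string when the listing gives no technology
    samples: Optional[int]   # None when the listing shows "-" instead of a count


def parse_listing(text: str) -> List[DatasetRow]:
    """Parse tab-separated listing text into DatasetRow records."""
    # QUOTE_NONE keeps descriptions that contain double quotes intact.
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for fields in reader:
        # Skip the header line and anything that is not a four-column dataset row.
        if len(fields) != 4 or not fields[0].startswith("EGAD"):
            continue
        dataset_id, description, technology, samples = fields
        count = int(samples) if samples.strip().isdigit() else None
        rows.append(DatasetRow(dataset_id, description, technology.strip(), count))
    return rows


sample = (
    "Dataset ID\tDescription\tTechnology\tSamples\n"
    "EGAD00000000001\tWTCCC1 project samples from 1958 British Birth Cohort\tAffymetrix 500K\t1504\n"
    "EGAD00000000017\tCord blood control samples from Gambia\t\t-\n"
)
rows = parse_listing(sample)
```

Representing a missing sample count as `None` rather than 0 keeps "not reported" distinct from an actual count when filtering or summing.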

Dataset ID	Description	Technology	Samples
EGAD00000000001	WTCCC1 project samples from 1958 British Birth Cohort	Affymetrix 500K	1504
EGAD00000000002	WTCCC1 project samples from UK National Blood Service	Affymetrix 500K	1500
EGAD00000000003	WTCCC1 project Bipolar Disorder (BD) samples		1
EGAD00000000004	WTCCC1 project Coronary Artery Disease (CAD) samples		1
EGAD00000000005	WTCCC1 project Inflammatory Bowel Disease (IBD) samples		1
EGAD00000000006	WTCCC1 project Hypertension (HT) samples		1
EGAD00000000007	WTCCC1 project Rheumatoid arthritis (RA) samples		1
EGAD00000000008	WTCCC1 project Type 1 Diabetes (T1D) samples		1
EGAD00000000009	WTCCC1 project Type 2 Diabetes (T2D) samples		1
EGAD00000000010	WTCCC1 project Ankylosing Spondylitis (AS) samples	Illumina 15K	957
EGAD00000000011	WTCCC1 project Autoimmune Thyroid Disease (ATD) samples	Illumina 15K	900
EGAD00000000012	WTCCC1 project Multiple Sclerosis (MS) samples		975
EGAD00000000013	WTCCC1 project Breast cancer (BC) samples	Illumina 15K	1004
EGAD00000000014	WTCCC1 project samples from 1958 British Birth Cohort		1504
EGAD00000000015	WTCCC project African control samples	Affymetrix 500K	1496
EGAD00000000016	WTCCC project Tuberculosis (TB) samples	Affymetrix 500K	1498
EGAD00000000017	Cord blood control samples from Gambia		-
EGAD00000000018	Severe malaria cases from Gambia		-
EGAD00000000019	840 families where both parents have been genotyped together with the child with severe malaria		1
EGAD00000000020	685 families where both parents have been genotyped together with the child with severe malaria		-
EGAD00000000021	WTCCC2 project samples from 1958 British Birth Cohort		3000
EGAD00000000022	WTCCC2 project samples from 1958 British Birth Cohort		3000
EGAD00000000023	WTCCC2 project samples from National Blood Donors (NBS) Cohort		1
EGAD00000000024	WTCCC2 project samples from National Blood Donors (NBS) Cohort		1
EGAD00000000025	WTCCC2 project Ulcerative Colitis (UC) samples	Affymetrix 6.0	2869
EGAD00000000026	Randomly-selected, unrelated individuals	Illumina 610-Quad	518
EGAD00000000027	eQTL data for European newborns	Illumina HumanHap550-2v3_B-Beadstudio	176
EGAD00000000028	Aggregate results from a GWAS study on 3352 cases and 3145 controls		6497
EGAD00000000029	Aggregate results from a case-control study on stroke and ischemic stroke.		19602
EGAD00000000030	T1DGC project 1958 British Birth Cohort samples		2604
EGAD00000000031	HLA genotyping of 1958 British Birth Cohort samples		1
EGAD00000000032	NcOEDG Helsinki 1 samples		1
EGAD00000000033	NcOEDG Helsinki 2 samples		1
EGAD00000000034	NcOEDG Helsinki 3 samples		1
EGAD00000000035	NcOEDG Helsinki 4 samples		1
EGAD00000000036	NcOEDG Stockholm 1 samples		1
EGAD00000000037	NcOEDG Stockholm 2 samples		1
EGAD00000000038	NcOEDG Stockholm 3 samples		1
EGAD00000000039	NcOEDG Malmo - Lund samples		1
EGAD00000000040	GenomEUtwin Danish (DK) samples		1
EGAD00000000041	GenomEUtwin Swedish (SWE) samples		1
EGAD00000000042	GenomEUtwin Finnish (FIN) samples		1
EGAD00000000043	GenomEUtwin control samples	Illumina HumanHap300-Duo Illumina HumanHap 550K	2099
EGAD00000000044	Northern Finland Birth Cohort 1966 samples	Illumina HumanHap370	5844
EGAD00000000045	Genomic sequencing and transcriptome shotgun sequencing of a metastatic tumour and its recurrence after drug therapy in a single patient	Illumina Genome Analyzer II	1
EGAD00000000046	RNA-SEQ data from 3 recurrent and 1 ovarian primary Granulosa Cell Tumour samples		4
EGAD00000000047	Signal data from 3 recurrent and 1 ovarian primary Granulosa Cell Tumour samples		4
EGAD00000000048	Sequencing data from oestrogen-receptor-alpha-positive metastatic lobular breast cancer sample	Illumina Genome Analyzer II	1
EGAD00000000049	RNA-SEQ data from oestrogen-receptor-alpha-positive metastatic lobular breast cancer sample	Illumina Genome Analyzer II	1
EGAD00000000051	Sequencing data from matching Renal Carcinoma samples	Illumina Genome Analyzer II	25
EGAD00000000052	Sequencing data from matching Pancreatic Carcinoma samples	Illumina Genome Analyzer II	25
EGAD00000000053	Sequencing data from Breast Cancer samples	Illumina Genome Analyzer II	1
EGAD00000000054	NCI-H209 is an immortal cell line derived from a bone marrow metastasis of a patient with small cell lung cancer, taken before chemotherapy. The specimen showed histologically typical small cells with classic neuroendocrine features. NCI-BL209 is an EBV-transformed B-cell line derived from the same patient as the small cell lung cancer cell line, NCI-H209	Life Tech - Solid	1
EGAD00000000055	COLO-829 is a publicly available immortal cancer cell line and COLO-829BL is a lymphoblastoid cell line derived from the same patient	Illumina Genome Analyzer II	2
EGAD00000000056	WTCCC project samples from the primary biliary cirrhosis cohort	Illumina 610K Quad	1705
EGAD00000000057	WTCCC project samples from the Parkinson's disease cohort	Illumina 610K Quad	1705
EGAD00000000058	Aggregate results from 22 Carbamazepine-induced hypersensitivity syndrome patients and 2691 UK National Blood Service (NBS) control samples		2713
EGAD00000000059	Aggregate results from 43 Carbamazepine-induced hypersensitivity syndrome patients and 1296 1958 British Birth Cohort control samples		1
EGAD00000000060	Samples from the UK Glomerulonephritis DNA bank		-
EGAD00000000073	Gabriel samples from the 1958 British Birth Cohort		1
EGAD00000000074	Gabriel samples from the Swedish BAMSE Cohort		1
EGAD00000000075	Gabriel samples from the Swedish BAMSE Cohort		1
EGAD00000000076	Gabriel samples from the Australian Busselton Cohort		1
EGAD00000000077	Gabriel samples from the Australian Busselton Cohort		1
EGAD00000000082	Gabriel samples from the French EGEA Cohort		1
EGAD00000000083	Gabriel samples from the French EGEA Cohort		1
EGAD00000000084	Gabriel samples from the German Gabriel Advanced Survey		1
EGAD00000000085	Gabriel samples from the German Gabriel Advanced Survey		1
EGAD00000000086	Gabriel samples from the multicenter GAIN cohort		1
EGAD00000000087	Gabriel samples from the multicenter GAIN cohort		1
EGAD00000000088	Gabriel samples from the Karelia Allergy Study		1
EGAD00000000089	Gabriel samples from the Karelia Allergy Study		1
EGAD00000000090	Gabriel samples from the Russian KMSU cohort		1
EGAD00000000091	Gabriel samples from the Russian KMSU cohort		1
EGAD00000000092	Gabriel samples from the German MAGIS cohort		1
EGAD00000000093	Gabriel samples from the German MAGIS cohort		1
EGAD00000000094	Gabriel samples from the UK MRCA cohort		1
EGAD00000000101	Gabriel samples from the Russian TOMSK cohort		1
EGAD00000000102	Gabriel samples from the Russian TOMSK cohort		1
EGAD00000000103	Gabriel samples from the Russian UFA cohort		1
EGAD00000000104	Gabriel samples from the Russian UFA cohort		1
EGAD00000000105	Gabriel samples from the multicenter occupational cohort		1
EGAD00000000106	Gabriel samples from the multicenter occupational cohort		1
EGAD00000000107	Gabriel samples from the multicenter occupational cohort		1
EGAD00000000108	Gabriel samples from the UK AUGOSA cohort		1
EGAD00000000109	Gabriel samples from the UK SEVERE cohort		1
EGAD00000000114	Whole transcriptome sequence data from 18 ovarian clear-cell carcinoma samples and one TOV21G ovarian clear-cell carcinoma cell line	Illumina Genome Analyzer II	1
EGAD00000000115	Summary data from GWAS analysis on 856 cases and 2836 controls		3719
EGAD00000000119	Genotypes from cell lines derived from breast carcinoma tissue	Affymetrix 6.0	51
EGAD00000000120	WTCCC2 project Multiple Sclerosis (MS) samples	Human670-QuadCustom v1	11375
EGAD00000000121	Genotypes at MITF E318K variant	Taqman and sequencing	2488
EGAD00000000122	Genotypes at MITF E318K variant	Illumina Human660W-Quad Illumina HumanCNV370 Illumina HumanHap 300 v2 Duo	1925
EGAD00001000001	Exome sequencing identifies frequent mutation of the SWI/SNF complex gene PBRM1 in renal carcinoma	Illumina Genome Analyzer II	18
EGAD00001000002	Massive genomic rearrangement acquired in a single catastrophic event during cancer development	Illumina Genome Analyzer Illumina Genome Analyzer II	1
EGAD00001000003	Gencode Exome Pilot	Illumina Genome Analyzer II	7
EGAD00001000004	CLL cancer Sample Sequencing	Illumina Genome Analyzer Illumina Genome Analyzer II	5
EGAD00001000005	Various Cancer Fusion Gene Sequencing	Illumina Genome Analyzer II	14
EGAD00001000007	Osteosarcoma Sequencing	Illumina Genome Analyzer II	43
EGAD00001000013	CLL Cancer Whole Genome Sequencing	Illumina Genome Analyzer II	19
EGAD00001000014	Agilent whole exome hybridisation capture will be performed on genomic DNA derived from 25 renal cancers and matched normal DNA from the same patients. Three lanes of Illumina GA sequencing will be performed on the resulting 50 exome libraries and mapped to build 37 of the human reference genome to facilitate the identification of novel cancer genes.	Illumina Genome Analyzer II	54
EGAD00001000015	Exome sequencing of hyperplastic polyposis patients.	Illumina Genome Analyzer II Illumina HiSeq 2000	84
EGAD00001000016	Familial Melanoma Sequencing	Illumina Genome Analyzer II Illumina HiSeq 2000	89
EGAD00001000017	PAS Pedigrees: Identification of novel genetic variants contributing to cardiovascular disease in pedigrees with premature atherosclerosis.	Illumina Genome Analyzer II Illumina HiSeq 2000	18
EGAD00001000018	Identifying causative mutations for Thrombocytopenia with Absent Radii	Illumina Genome Analyzer II	5
EGAD00001000019	Lethal malformation syndrome	Illumina Genome Analyzer II	6
EGAD00001000021	Paroxysmal neurological disorders	Illumina Genome Analyzer II Illumina HiSeq 2000	97
EGAD00001000022	Exome sequencing in patients with cardiac arrhythmias	Illumina Genome Analyzer II	20
EGAD00001000023	Recurrent Somatic Mutations in CLL	Illumina Genome Analyzer IIx	11
EGAD00001000024	Whole Exome Sequencing for Characterization of Disease Causing Mutations in two Pakistani Families Suffering from Autosomal Recessive Ocular Disorders.	Illumina Genome Analyzer II	4
EGAD00001000025	Determination of the molecular nature of the Vel blood group by exome sequencing	Illumina Genome Analyzer II	4
EGAD00001000026	Investigation of the genetic basis of the rare syndrome Post-Transfusion Purpura (PTP)	Illumina Genome Analyzer II	5
EGAD00001000027	ICGC Germany PedBrain Medulloblastoma Pilot_2_LM	Illumina Genome Analyzer IIx Illumina HiSeq 2000	8
EGAD00001000029	Grey Platelet Syndrome (GPS)	Illumina Genome Analyzer II	5
EGAD00001000030	Analysis of genomic integrity of disease-corrected human induced pluripotent stem cells by exome sequencing	Illumina HiSeq 2000	4
EGAD00001000031	Human Colorectal Cancer Exome Sequencing	Illumina Genome Analyzer II	16
EGAD00001000032	Hepatitis C IL28B pooled resequencing study with 100 responders and 100 non-responders	Illumina Genome Analyzer IIx	4
EGAD00001000033	"SNV detection from formalin fixed paraffin embedded (FFPE) samples"	Illumina Genome Analyzer II	6
EGAD00001000034	"Usage of small amounts of DNA for Illumina sequencing"	Illumina Genome Analyzer II	3
EGAD00001000035	"Single nucleotide variant detection in multiple foci of three prostate cancer tumors"	Illumina Genome Analyzer II	9
EGAD00001000036	"Copy number variant detection in multiple foci of three prostate cancer tumors"	Illumina Genome Analyzer II	9
EGAD00001000037	An evaluation of different strategies for large-scale pooled sequencing study design.	Illumina Genome Analyzer II	7
EGAD00001000038	Hyperfibrinolysis	Illumina Genome Analyzer II	5
EGAD00001000039	Platelet collagen defect	Illumina Genome Analyzer II Illumina HiSeq 2000	11
EGAD00001000040	Bleeding	Illumina Genome Analyzer II	6
EGAD00001000041	Various Platelet Disorders	Illumina Genome Analyzer II	7
EGAD00001000042	Whole-Exome-Seq-Dataset	Illumina Genome Analyzer IIx	30
EGAD00001000043	RNA-Seq-Dataset	Illumina Genome Analyzer IIx	16
EGAD00001000044	Recurrent Somatic Mutations in CLL	Illumina Genome Analyzer IIx	212
EGAD00001000045	Somatic mutation of SF3B1 in myelodysplasia with ring sideroblasts and other cancers	Illumina Genome Analyzer II Illumina HiSeq 2000	33
EGAD00001000046	Gastric Cancer Exome Sequencing	Illumina Genome Analyzer IIx Illumina HiSeq 2000	43
EGAD00001000047	exome sequence data for 49 HIV elite long term non-progressors and rapid progressors. Partial dataset (overlap with EGAD00001000087) of raw BAMs mapped to GRCh37_53.	Illumina HiSeq 2000	49
EGAD00001000048	monozygotic twin discordant for schizophrenia	Complete Genomics	2
EGAD00001000049	Pancreatic adenocarcinoma QCMG 20110901	AB SOLiD 4 System AB SOLiD System 3.0	26
EGAD00001000050	Tandem duplication of chromosomal segments is common in ovarian and breast cancer genomes	Illumina Genome Analyzer II	13
EGAD00001000052	UK10K_NEURO_MUIR REL-2011-01-28	Illumina Genome Analyzer II	104
EGAD00001000053	Exome sequencing in patients with Calcific Aortic Valve Stenosis	Illumina HiSeq 2000	20
EGAD00001000054	Mutational Screening of Human Acute Myeloid Leukaemia Samples	Illumina HiSeq 2000	10
EGAD00001000055	Genetic variation in Kuusamo	Illumina HiSeq 2000	434
EGAD00001000057	RNA-Seq analysis	Illumina Genome Analyzer II	15
EGAD00001000058	Exome Sequencing analysis	Illumina Genome Analyzer II	21
EGAD00001000059	Screening for human epigenetic variation at CpG islands	Illumina Genome Analyzer II	116
EGAD00001000060	Acral melanoma study whole genomes	Complete Genomics	3
EGAD00001000061	Acral melanoma study whole exomes	Illumina Genome Analyzer IIx	3
EGAD00001000062	ADCC Rearrangement Screen	Illumina Genome Analyzer II Illumina HiSeq 2000	14
EGAD00001000063	Triple Negative Breast Cancer sequencing	Illumina Genome Analyzer II	6
EGAD00001000064	Cell Line Sub Clone Rearrangement Screen	Illumina Genome Analyzer II	6
EGAD00001000065	Mixed Leukemia Rearrangement Screen	Illumina Genome Analyzer II	5
EGAD00001000066	Breast Cancer Follow Up Series	Illumina Genome Analyzer II	288
EGAD00001000067	Cancer Single Cell Sequencing	Illumina HiSeq 2000	16
EGAD00001000068	Multifocal Breast Project	Illumina Genome Analyzer II Illumina HiSeq 2000	22
EGAD00001000069	Lung Rearrangement Study	Illumina HiSeq 2000	48
EGAD00001000070	TMD_AMLK Exome Study	Illumina HiSeq 2000	50
EGAD00001000071	Kaposi sarcoma exome	Illumina HiSeq 2000	20
EGAD00001000072	Fanconi Anemia transformation to AML	Illumina HiSeq 2000	6
EGAD00001000073	MDSMPN Rearrangement Screen	Illumina HiSeq 2000	11
EGAD00001000074	Integrative Oncogenomics of Multiple Myeloma	Illumina Genome Analyzer II Illumina HiSeq 2000	174
EGAD00001000075	Gastric and Esophageal tumour rearrangement screen	Illumina HiSeq 2000	32
EGAD00001000076	CRLF2 sequencing project	Illumina HiSeq 2000	13
EGAD00001000077	CRLF2 sequencing project Exomes	Illumina HiSeq 2000	26
EGAD00001000078	ALK inhibitors in the context of ALK-dependent cancer cell lines	Illumina HiSeq 2000	16
EGAD00001000079	PREDICT	Illumina HiSeq 2000	186
EGAD00001000080	Genomics of Colorectal Cancer Metastases - Massively Parallel Sequencing of Matched Primary and Metastatic tumours to Identify a Metastatic Signature of Somatic Mutations (MOSAIC)	Illumina HiSeq 2000	351
EGAD00001000081	Splenic Marginal Zone Lymphoma with villous lymphocytes exome sequencing	Illumina HiSeq 2000	1
EGAD00001000082	20 Matched Pair Breast Cancer Genomes	Illumina Genome Analyzer II Illumina HiSeq 2000	42
EGAD00001000083	Recurrent Somatic Mutations in CLL	Illumina Genome Analyzer II Illumina Genome Analyzer IIx	61
EGAD00001000084	Matched Ovarian Cancer Sequencing	Illumina Genome Analyzer II	23
EGAD00001000085	Somatic Histone H3 mutations	Illumina HiSeq 2000	14
EGAD00001000086	Analysis of genomic integrity of disease-corrected human induced pluripotent stem cells by exome sequencing	Illumina HiSeq 2000	16
EGAD00001000087	exome sequence data for 25 HIV elite long term non-progressors and rapid progressors. Partial dataset (overlap with EGAD00001000047) of raw BAMs mapped to GRCh37_53.	Illumina HiSeq 2000	25
EGAD00001000088	ER-, HER2-, PR- breast Cancer genome sequencing	Illumina Genome Analyzer II	6
EGAD00001000089	Acute Lymphoblastic Leukemia Exome sequencing	Illumina Genome Analyzer II	20
EGAD00001000090	Glioma cell lines rearrangement screen	Illumina Genome Analyzer II	3
EGAD00001000091	Non Tumour Renal Cell Line Sequencing	Illumina Genome Analyzer II	1
EGAD00001000092	Cancer Exome Resequencing	Illumina Genome Analyzer II	58
EGAD00001000093	Breast Cancer Exome Resequencing	Illumina Genome Analyzer II	21
EGAD00001000094	Cancer Genome Libraries Tests	Illumina Genome Analyzer II	16
EGAD00001000095	Acute Myeloid Leukemia Sequencing	Illumina Genome Analyzer II Illumina HiSeq 2000	9
EGAD00001000096	Pancreatic adenocarcinoma QCMG 20120201	AB SOLiD 4 System	166
EGAD00001000097	Matched breast cancer fusion gene study	Illumina Genome Analyzer II Illumina HiSeq 2000	46
EGAD00001000098	FRCC Exome sequencing	Illumina Genome Analyzer II	16
EGAD00001000099	Meningioma Exome	Illumina Genome Analyzer II	26
EGAD00001000100	Renal Matched Pair Cell Line Exome Sequencing	Illumina Genome Analyzer II	10
EGAD00001000101	ADCC Exome Sequencing	Illumina Genome Analyzer II Illumina HiSeq 2000	125
EGAD00001000102	Myeloproliferative Disorder Sequencing	Illumina Genome Analyzer II	6
EGAD00001000103	Myeloproliferative Disorder Sequencing	Illumina Genome Analyzer II	4
EGAD00001000104	Acute Lymphoblastic Leukemia Exome sequencing 2	Illumina Genome Analyzer II	97
EGAD00001000105	MuTHER adipose tissue small RNA expression	Illumina Genome Analyzer II	130
EGAD00001000106	Primary Myelofibrosis Myeloproliferative Disease exome sequencing	Illumina Genome Analyzer II Illumina HiSeq 2000	67
EGAD00001000107	SCAT osteosarcoma sequencing	Illumina Genome Analyzer II Illumina HiSeq 2000	114
EGAD00001000108	Paroxysmal neurological disorders	Illumina Genome Analyzer II Illumina HiSeq 2000	327
EGAD00001000109	Unraveling the genetic basis of a collagen migration defect in patients with a combined platelet dysfunction and reduced bone density	Illumina HiSeq 2000	29
EGAD00001000110	Breast Cancer Exome Sequencing	Illumina Genome Analyzer II Illumina HiSeq 2000	179
EGAD00001000111	CML Discovery Project	Illumina Genome Analyzer II	6
EGAD00001000112	Identifying Novel Fusion Genes in Myeloma	Illumina Genome Analyzer II	6
EGAD00001000113	Mutational landscapes of primary triple negative breast cancers - Exomes	Illumina Genome Analyzer IIx	108
EGAD00001000115	Mutational landscapes of primary triple negative breast cancers - WGS	ABI_SOLID	32
EGAD00001000116	Acute Lymphoblastic Leukemia Sequencing	Illumina Genome Analyzer II Illumina HiSeq 2000	61
EGAD00001000117	Myelodysplastic Syndrome Exome Sequencing	Illumina Genome Analyzer II Illumina HiSeq 2000	152
EGAD00001000118	Osteosarcoma Exome Sequencing	Illumina Genome Analyzer II Illumina HiSeq 2000	102
EGAD00001000119	Chordoma Exome Sequencing	Illumina Genome Analyzer II Illumina HiSeq 2000	50
EGAD00001000121	Breast Cancer Whole Genome Sequencing	Illumina HiSeq 2000	6
EGAD00001000122	DATA_SET_ICGC_PedBrainTumor_Medulloblastoma	Illumina Genome Analyzer IIx Illumina HiSeq 2000	206
EGAD00001000123	Polycythemia Vera Myeloproliferative Disease exome sequencing	Illumina Genome Analyzer II Illumina HiSeq 2000	119
EGAD00001000124	Sequencing Acute Myeloid Leukaemia	Illumina HiSeq 2000	4
EGAD00001000125	Chondrosarcoma Exome	Illumina HiSeq 2000	104
EGAD00001000126	HER2 positive Breast Cancer	Illumina HiSeq 2000	101
EGAD00001000127	Burden of Disease in Sarcoma	Illumina HiSeq 2000	220
EGAD00001000128	Familial Thrombocytosis germline exome sequencing	Illumina HiSeq 2000	4
EGAD00001000129	Essential Thrombocythemia Myeloproliferative Disease exome sequencing	Illumina HiSeq 2000	189
EGAD00001000130	Breast Cancer Matched Pair Cell Line Whole Genomes	Illumina HiSeq 2000	22
EGAD00001000131	Genetic landscape of hepatocellular carcinoma	Illumina HiSeq 2000	48
EGAD00001000132	Mutational landscapes of primary triple negative breast cancers - RNA seq	Illumina Genome Analyzer IIx	80
EGAD00001000133	The landscape of cancer genes and mutational processes in breast cancer	Illumina Genome Analyzer II Illumina HiSeq 2000	199
EGAD00001000134	Sequence reads for pediatric GBM samples for manuscript: Driver mutations in histone H3.3 and chromatin remodelling genes in paediatric glioblastoma	Illumina HiSeq 2000	54
EGAD00001000135	Neuroblastoma whole genome sequencing	Illumina HiSeq 2000	80
EGAD00001000136	CML blast phase rearrangement screen	Illumina HiSeq 2000	6
EGAD00001000138	The expression data for this study can be found here: http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-1088/ and its SNP6 data can be found here: http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-1087/	Illumina Genome Analyzer II Illumina HiSeq 2000	58
EGAD00001000139	Tumor sample of a serous ovarian carcinoma	Complete Genomics	1
EGAD00001000140	Blood sample of serous ovarian carcinoma patient	Complete Genomics	1
EGAD00001000141	Triple Negative Breast Cancer Whole Genomes	Illumina Genome Analyzer II Illumina HiSeq 2000	243
EGAD00001000142	Renal Follow Up Series	Illumina HiSeq 2000	637
EGAD00001000143	Xenograft Sequencing	Illumina HiSeq 2000	16
EGAD00001000144	Lung Cancer Whole Genomes	Illumina HiSeq 2000	18
EGAD00001000145	Matched Pair Cancer Cell line Whole Genomes	Illumina HiSeq 2000	58
EGAD00001000147	Osteosarcoma Whole Genome	Illumina HiSeq 2000	108
EGAD00001000149	A Comprehensive Catalogue of Somatic Mutations from a Human Cancer Genome	Illumina HiSeq 2000	2
EGAD00001000150	Targeted re-sequencing of 97 genes in T-ALL	454 GS FLX Titanium	33
EGAD00001000151	UK10K OBESITY REL-2011-07-14	Illumina HiSeq 2000	88
EGAD00001000152	UK10K_RARE_THYROID REL-2012-01-13	Illumina Genome Analyzer II Illumina HiSeq 2000	27
EGAD00001000153	UK10K_RARE_SIR REL-2012-01-13	Illumina Genome Analyzer II Illumina HiSeq 2000	38
EGAD00001000154	Single-cell genome sequencing reveals DNA-mutation per cell cycle	Illumina Genome Analyzer II Illumina HiSeq 2000	12
EGAD00001000158	Subgroup-specific structural variation across 1,000 medulloblastoma genomes		23
EGAD00001000159	DATA FILES FOR SJOS	Illumina HiSeq 2000	37
EGAD00001000160	DATA FILES FOR SJACT	Illumina HiSeq 2000	16
EGAD00001000161	DATA FILES FOR SJLGG	Illumina HiSeq 2000	33
EGAD00001000162	DATA FILES FOR SJEPD	Illumina HiSeq 2000	44
EGAD00001000163	DATA FILES FOR SJPHALL	Illumina HiSeq 2000	18
EGAD00001000164	Whole Genome Sequencing accompanying Genetic landscape of pediatric Rhabdomyosarcoma.	Illumina HiSeq 2000	29
EGAD00001000165	DATA FILES FOR SJINF	Illumina HiSeq 2000	46
EGAD00001000167	UK10K_RARE_HYPERCHOL REL-2012-01-13	Illumina Genome Analyzer II Illumina HiSeq 2000	48
EGAD00001000168	UK10K_RARE_CILIOPATHIES REL-2012-01-13	Illumina Genome Analyzer II Illumina HiSeq 2000	50
EGAD00001000170	UK10K_NEURO_MUIR REL-2012-01-13	Illumina Genome Analyzer II Illumina HiSeq 2000	167
EGAD00001000171	UK10K_RARE_FIND REL-2012-01-13	Illumina Genome Analyzer II Illumina HiSeq 2000	44
EGAD00001000173	UK10K_NEURO_ASD_FI REL-2012-01-13	Illumina HiSeq 2000	85
EGAD00001000174	DATA_SET_Coverage_bias_sensitivity_of_variant_calling_for_4_WG_seq_tech	AB SOLiD 4 System Complete Genomics Illumina HiSeq 2000 unspecified	4
EGAD00001000175	Identification of SPEN as a novel cancer gene and FGFR2 as a potential therapeutic target in adenoid cystic carcinoma	Illumina Genome Analyzer II	48
EGAD00001000176	DATA_SET_Comparing_sequencing_four_proto-typical_Burkitt_lymphomas_BL_IG-MYC_translocation	Illumina Genome Analyzer IIx Illumina HiSeq 2000	8
EGAD00001000177	Whole Genome Methylation in CLL	Illumina Genome Analyzer IIx	6
EGAD00001000178	UK10K_RARE_CHD REL-2012-01-13	Illumina Genome Analyzer II Illumina HiSeq 2000	46
EGAD00001000179	UK10K_RARE_COLOBOMA REL-2012-01-13	Illumina Genome Analyzer II Illumina HiSeq 2000	75
EGAD00001000180	UK10K_RARE_NEUROMUSCULAR REL-2012-01-13	Illumina HiSeq 2000	47
EGAD00001000181	UK10K_OBESITY_SCOOP REL-2012-01-13	Illumina HiSeq 2000	212
EGAD00001000182	UK10K_NEURO_UKSCZ REL-2012-01-13	Illumina HiSeq 2000	95
EGAD00001000183	UK10K_NEURO_FSZNK REL-2012-01-13	Illumina HiSeq 2000	273
EGAD00001000184	UK10K_NEURO_FSZ_REL_2012_01_13	Illumina HiSeq 2000	120
EGAD00001000185	UK10K_RARE_COLOBOMA REL-2012-02-22	Illumina Genome Analyzer II Illumina HiSeq 2000	98
EGAD00001000186	UK10K_RARE_HYPERCHOL REL-2012-02-22	Illumina Genome Analyzer II Illumina HiSeq 2000	71
EGAD00001000187	UK10K_RARE_THYROID REL-2012-02-22	Illumina Genome Analyzer II Illumina HiSeq 2000	65
EGAD00001000188	UK10K_RARE_SIR REL-2012-02-22	Illumina Genome Analyzer II Illumina HiSeq 2000	63
EGAD00001000189	UK10K_RARE_NEUROMUSCULAR REL-2012-02-22	Illumina HiSeq 2000	86
EGAD00001000190	UK10K_RARE_FIND REL-2012-02-22	Illumina Genome Analyzer II Illumina HiSeq 2000	90
EGAD00001000191	UK10K_RARE_CILIOPATHIES REL-2012-02-22	Illumina Genome Analyzer II Illumina HiSeq 2000	128
EGAD00001000192	UK10K_RARE_CHD REL-2012-02-22	Illumina Genome Analyzer II Illumina HiSeq 2000	46
EGAD00001000193	UK10K_OBESITY_SCOOP REL-2012-02-22	Illumina HiSeq 2000	573
EGAD00001000194	UK10K_COHORT_TWINS REL-2011-12-01	Illumina Genome Analyzer II Illumina HiSeq 2000	1713
EGAD00001000195	For information about this sample set, please contact the sample custodian Nic Timpson: N.J.Timpson@bristol.ac.uk	Illumina HiSeq 2000	740
EGAD00001000196	Neuroblastoma samples	Complete Genomics	203
EGAD00001000197	Progressive Hearing Loss	Illumina Genome Analyzer II	8
EGAD00001000198	Gene Discovery in Age-Related Hearing Loss	Illumina Genome Analyzer II Illumina HiSeq 2000	20
EGAD00001000199	ORCADES_WGA	Illumina HiSeq 2000	400
EGAD00001000200	Dilgom Exome	Illumina HiSeq 2000	130
EGAD00001000201	MDACC-endo	AB SOLiD System 3.0	28
EGAD00001000202	Neuroblastoma samples (Analyses_vcf files)		204
EGAD00001000203	Otosclerosis gene discovery	Illumina HiSeq 2000	10
EGAD00001000204	Hearing loss in adults from South Carolina	Illumina HiSeq 2000	10
EGAD00001000205	BRAF and MEK resistant cell line clones	Illumina HiSeq 2000	3
EGAD00001000206	UK10K_RARE_COLOBOMA REL-2012-07-05	Illumina Genome Analyzer II Illumina HiSeq 2000	123
EGAD00001000207	UK10K_RARE_HYPERCHOL REL-2012-07-05	Illumina Genome Analyzer II Illumina HiSeq 2000	88
EGAD00001000208	UK10K_RARE_THYROID REL-2012-07-05	Illumina Genome Analyzer II Illumina HiSeq 2000	65
EGAD00001000209	UK10K_RARE_FIND REL-2012-07-05	Illumina Genome Analyzer II Illumina HiSeq 2000	121
EGAD00001000210	UK10K_RARE_CHD REL-2012-07-05	Illumina Genome Analyzer II Illumina HiSeq 2000	124
EGAD00001000212	Functional characterisation of CpG islands in human tissues	Illumina Genome Analyzer II	26
EGAD00001000213	Screening for abnormal CGI methylation in primary colorectal tumours	Illumina Genome Analyzer II	21
EGAD00001000214	Whole genome sequencing of colon samples	Illumina HiSeq 2000	11
EGAD00001000215	RNA sequencing of colon tumor/normal sample pairs	Illumina HiSeq 2000	139
EGAD00001000216	Exome capture sequencing of colon tumor/normal pairs	Illumina HiSeq 2000	144
EGAD00001000217	UK10K_RARE_CILIOPATHIES REL-2012-07-05	Illumina Genome Analyzer II Illumina HiSeq 2000	150
EGAD00001000218	UK10K_RARE_SIR REL-2012-07-05	Illumina Genome Analyzer II Illumina HiSeq 2000	81
EGAD00001000219	UK10K_RARE_NEUROMUSCULAR REL-2012-07-05	Illumina HiSeq 2000	117
EGAD00001000220	Deep sequencing of CTCs	454 GS FLX Titanium Illumina MiSeq	3
EGAD00001000221	Whole genome sequencing of SCLC tumor/normal samples	Illumina HiSeq 2000	4
EGAD00001000222	Exome capture sequencing of SCLC tumor/normal pairs and cell lines	Illumina HiSeq 2000	103
EGAD00001000223	RNA sequencing of SCLC tumor/normal sample pairs and cell lines	Illumina HiSeq 2000	79
EGAD00001000224	Enrichment of CRC	454 GS FLX Titanium	2
EGAD00001000225	Deep sequencing of KRAS	454 GS FLX Titanium	8
EGAD00001000226	Chordoma is a rare malignant bone tumor that expresses the transcription factor T. We conducted an association study of 40 patients with chordoma and 358 ancestry-matched, unaffected individuals with replication in an independent cohort. Whole-exome and Sanger sequencing of T exons reveals a strong risk association (allelic odds ratio (OR) = 4.9, P = 3.3x10^-11, CI = 2.9-8.1) with the common (minor allelic frequency >5%) non-synonymous SNP rs2305089 in chordoma, which is exceptional in cancer genetics.	Illumina Genome Analyzer II Illumina HiSeq 2000	18
EGAD00001000227	EGAD00001000227_UK10K_NEURO_ABERDEEN_REL_2012_07_05	Illumina HiSeq 2000	347
EGAD00001000228	EGAD00001000228_UK10K_NEURO_ASD_BIONED_REL_2012_07_05	Illumina HiSeq 2000	59
EGAD00001000229	EGAD00001000229_UK10K_NEURO_ASD_FI_REL_2012_07_05	Illumina HiSeq 2000	85
EGAD00001000230	EGAD00001000230_UK10K_NEURO_ASD_GALLAGHER_REL_2012_07_05	Illumina HiSeq 2000	72
EGAD00001000231	EGAD00001000231_UK10K_NEURO_ASD_SKUSE_REL_2012_07_05	Illumina HiSeq 2000	320
EGAD00001000232	EGAD00001000232_UK10K_NEURO_ASD_TAMPERE_REL_2012_07_05	Illumina HiSeq 2000	54
EGAD00001000233	EGAD00001000233_UK10K_NEURO_EDINBURGH_REL_2012_07_05	Illumina HiSeq 2000	219
EGAD00001000234	EGAD00001000234_UK10K_NEURO_FSZNK_REL_2012_07_05	Illumina HiSeq 2000	281
EGAD00001000235	EGAD00001000235_UK10K_NEURO_IOP_COLLIER_REL_2012_07_05	Illumina HiSeq 2000	170
EGAD00001000236	EGAD00001000236_UK10K_NEURO_MUIR_REL_2012_07_05	Illumina Genome Analyzer II Illumina HiSeq 2000	167
EGAD00001000237	EGAD00001000237_UK10K_NEURO_GURLING_REL_2012_07_05	Illumina HiSeq 2000	43
EGAD00001000239	EGAD00001000239_UK10K_NEURO_IMGSAC_REL_2012_07_05	Illumina HiSeq 2000	114
EGAD00001000240	UK10K_NEURO_FSZ_REL_2012_07_05	Illumina HiSeq 2000	120
EGAD00001000241	EGAD00001000241_UK10K_OBESITY_SCOOP_REL_2012_07_05	Illumina HiSeq 2000	674
EGAD00001000242	EGAD00001000242_UK10K_NEURO_ASD_MGAS_REL_2012_07_05	Illumina HiSeq 2000	60
EGAD00001000243	Melanoma-TIL Study Exomes	Illumina HiSeq 2000	43
EGAD00001000245	Pulldown cytosine deaminases	Illumina HiSeq 2000	20
EGAD00001000246	Integrative Oncogenomics of multiple myeloma	Illumina HiSeq 2000	106
EGAD00001000247	Integrative Oncogenomics of multiple myeloma	Illumina HiSeq 2000	51
EGAD00001000248	RNAseq Pulldown	Illumina HiSeq 2000	6
EGAD00001000249	This is the bam file generated after alignment using BWA program for the SAIF genome	Illumina HiSeq 2000	1
EGAD00001000251	De novo mutations in schizophrenia	Illumina HiSeq 2000	611
EGAD00001000252	Evaluation of PCR library method on whole genome samples	Illumina HiSeq 2000	12
EGAD00001000253	AML targeted resequencing study	Illumina HiSeq 2000	-
EGAD00001000254	This dataset contains the raw files generated for SAIF genome project	Illumina HiSeq 2000	1
EGAD00001000255	Testing the feasibility of genome scale sequencing in routinely collected FFPE cancer specimens versus matched fresh frozen samples	Illumina HiSeq 2000	32
EGAD00001000256	UK10K_NEURO_UKSCZ REL-2012-07-05	Illumina HiSeq 2000	595
EGAD00001000258	Deep RNA sequencing in CLL	Illumina Genome Analyzer II	107
EGAD00001000259	DATA FILES FOR SJAMLM7	Illumina HiSeq 2000	8
EGAD00001000260	Hypodiploid acute lymphoblastic leukemia whole genome sequencing	Illumina HiSeq 2000	40
EGAD00001000261	Retinoblastoma whole genome sequencing	Illumina HiSeq 2000	8
EGAD00001000262	OICR PANCREATIC CANCER DATASET		4
EGAD00001000263	A small subsample of EGAD00001000689. Please do not use.	Illumina HiSeq 2000	18
EGAD00001000264	Resistance towards chemotherapy is one of the main causes of treatment failure and death among breast cancer patients. The main objective of this project is to identify genetic mechanisms causing some breast cancer patients not to respond to a particular type of chemotherapy (epirubicin) while other patients respond very well to the same treatment. In the project we will perform genome / exome sequencing of a selection of breast cancer patients (n=30). These patients are drawn from a cohort where all patients have received treatment with epirubicin monotherapy before surgical removal of a locally advanced breast tumour, and where all patients have been subjected to objective evaluation of the response to the therapy. Subsequent to sequencing, we will analyse the data and compare with the clinical data for each patient (objective response to therapy). The main aim is to identify mutations that are associated with resistance to epirubicin. Identification of mutations with strong predictive value may have a direct impact on cancer treatment, since it opens the possibility of genetic testing of a tumour, and a decision on which drug is likely to work best, prior to treatment start.	Illumina HiSeq 2000	29
EGAD00001000265	This Study uses a focused bespoke bait pull down library method to target findings of Chondrosarcoma whole genome and whole exome sequencing studies in order to validate findings. This method will also be used on a larger set of tumour only samples in order to find precedence of these findings in a larger set of patient samples.	Illumina HiSeq 2000	-
EGAD00001000266	This Study uses a focused bespoke bait pull down library method to target findings of Osteosarcoma whole genome and whole exome sequencing studies in order to validate findings. This method will also be used on a larger set of tumour only samples in order to find precedence of these findings in a larger set of patient samples.	Illumina HiSeq 2000	110
EGAD00001000267	This Study uses a focused bespoke bait pull down library method to target findings of Chordoma whole genome and whole exome sequencing studies in order to validate findings. This method will also be used on a larger set of tumour only samples in order to find precedence of these findings in a larger set of patient samples.	Illumina HiSeq 2000	46
EGAD00001000268	DATA FILES FOR SJCBF	Illumina HiSeq 2000	34
EGAD00001000269	OLD DATA FILES FOR SJMB - Superseded by EGAD00001001864	Illumina HiSeq 2000	68
EGAD00001000270	DATA_SET_EOP-PCA-LargeAndSmallTumors1	Illumina HiSeq 2000	18
EGAD00001000271	Pilot study Pilocytic Astrocytoma ICGC PedBrain, whole genome sequencing of 5 tumors and matched blood	Illumina HiSeq 2000	10
EGAD00001000272	Genomic Alterations in Gingivo-buccal Cancer: ICGC-India Project_YR01	454 GS FLX Titanium Illumina HiSeq 2000	200
EGAD00001000273	This Study uses a focused bespoke bait pull down library method to target findings of Meningioma whole genome and whole exome sequencing studies in order to validate findings. This method will also be used on a larger set of tumour only samples in order to find precedence of these findings in a larger set of patient samples.	Illumina HiSeq 2000	147
EGAD00001000274	DATA_SET_TRANSCIPTOME_Comparing_sequencing_four_proto-typical_Burkitt_lymphomas_BL_IG-MYC_translocation	Illumina HiSeq 2000	4
EGAD00001000275	Data set for Whole-genome-Sequencing of adult medulloblastoma	Illumina HiSeq 2000	10
EGAD00001000276	OICR PANCREATIC CANCER DATASET 2		10
EGAD00001000277	High Quality Variant Call files, generated by bioscope, converted to vcf format. Complete dataset for all 300 samples.		202
EGAD00001000278	ICGC MMML-seq Data Freeze November 2012 whole genome sequencing	Illumina HiSeq 2000	12
EGAD00001000279	ICGC MMML-seq Data Freeze November 2012 whole exome sequencing	Illumina Genome Analyzer IIx	4
EGAD00001000280	This experiment is to validate putative somatic substitutions and indels identified in an exome screen of ~50 osteosarcoma tumour/normal pairs. It is the first stage in our ICGC commitment to study osteosarcoma. The validation process is an important component of our analysis to clarify the data prior to looking for evidence of new cancer genes, or subverted pathways important in the development of cancer. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000	112
EGAD00001000281	ICGC MMML-seq Data Freeze November 2012 transcriptome sequencing	Illumina HiSeq 2000	6
EGAD00001000282	Neuroblastomas are tumors of peripheral sympathetic neurons and are the most common solid tumor in children. To determine the genetic basis for neuroblastoma we performed whole-genome sequencing (6 cases), exome sequencing (16 cases), genome-wide rearrangement analyses (32 cases), and targeted analyses of specific genomic loci (40 cases) using massively parallel sequencing. On average each tumor had 19 somatic alterations in coding genes (range, 3-70). Among genes not previously known to be involved in neuroblastoma, chromosomal deletions and sequence alterations of chromatin remodeling genes, ARID1A and ARID1B, were identified in 8 of 71 tumors (11%) and were associated with early treatment failure and decreased survival. Using tumor-specific structural alterations, we developed an approach to identify rearranged DNA fragments in sera, providing personalized biomarkers for minimal residual disease detection and monitoring. These results highlight dysregulation of chromatin remodeling in pediatric tumorigenesis and provide new approaches for the management of neuroblastoma patients.	Illumina Genome Analyzer IIx Illumina HiSeq 2000	114
EGAD00001000283	Agilent whole exome hybridisation capture was performed on genomic DNA derived from MDS and matched normal DNA from the same patients. Next Generation sequencing was performed on the resulting exome libraries and mapped to build 37 of the human reference genome to facilitate the identification of novel cancer genes. Now we aim to discover the prevalence of our findings using bespoke pulldown methods and sequencing the products from a larger set of patient DNA.	Illumina HiSeq 2000	764
EGAD00001000284	Cancer Genome Scanning in Plasma: Detection of Tumor-Associated Copy Number Aberrations, Single-Nucleotide Variants, and Tumoral Heterogeneity by Massively Parallel Sequencing	Illumina Genome Analyzer IIx	1
EGAD00001000285	We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses.	Illumina Genome Analyzer II Illumina HiSeq 2000	55
EGAD00001000286	Whole-exome study of congenital macrothrombocytopenia	Illumina HiSeq 2000	21
EGAD00001000287	Agilent whole exome hybridisation capture will be performed on genomic DNA derived from 25 renal cancers and matched normal DNA from the same patients. Three lanes of Illumina GA sequencing will be performed on the resulting 50 exome libraries and mapped to build 37 of the human reference genome to facilitate the identification of novel cancer genes.	Illumina Genome Analyzer II	54
EGAD00001000288	Invasive lobular carcinoma (ILC) is the second most common histological subtype of breast cancer, accounting for 10-15% of cases. ILC differs from invasive ductal carcinoma (IDC) with respect to epidemiology, histology, and clinical presentation. Moreover, ILC is less sensitive to chemotherapy, more frequently bilateral, and more prone to form gastrointestinal, peritoneal, and ovarian metastases than IDCs. In contrast to IDC, the prognostic value of histological grade (HG) in ILC is controversial. One of the three major components of histological grading (tubule formation) is missing in ILC, which hinders the process of grading in this histological subtype and results in the classification of approximately two thirds of ILC as HG 2. Over the last decade, a number of gene expression signatures have shed light onto breast cancer classification, allowing breast cancer care to become more personalized. With respect to the management of estrogen receptor (ER)-positive breast cancer, several gene expression signatures provide prognostic and/or predictive information beyond what is possible with current classical clinico-pathological parameters alone. Nevertheless, most studies using gene expression signatures have not considered different histologic subtypes separately. Recently, a comprehensive research program has elucidated some of the biological underpinnings of invasive lobular carcinoma. Genetic material extracted from 200 ILC tumor samples was studied using gene expression profiling, which identified ILC molecular subtypes. These proliferation-driven gene signatures of ILC appear to have prognostic significance. In particular, the Genomic Grade (GG) gene signature improved upon HG in ILC and added prognostic value to classic clinico-pathologic factors. In addition, this study demonstrated that most ILC are molecularly characterized as luminal-A (~75%), followed by luminal-B (~20%) and HER2-positive tumors (~5%). Moreover, we investigated the prognostic value of known gene signatures/gene modules in the same cohort of ILC. As a second step within the scope of this project, we aim to investigate the interactions between somatic ILC tumor mutations and observed transcriptome findings. To this end, we aim to perform somatic mutation analysis for the ILC tumors for which Affymetrix gene expression profiling is available. For this, we will use a gene screen assay, which specifically interrogates the mutational status of a few hundred cancer genes. We believe that this pioneering effort will be fundamental for a tailored treatment of ILC with improvement in patients' outcome.	Illumina HiSeq 2000	1130
EGAD00001000289	Agilent whole exome hybridisation capture was performed on genomic DNA derived from cancer and matched normal DNA from the same patients. Next Generation sequencing was performed on the resulting exome libraries and mapped to build 37 of the human reference genome to facilitate the identification of novel cancer genes. Now we aim to re-find and validate the findings of those exome libraries using bespoke pulldown methods and sequencing the products.	Illumina HiSeq 2000	12
EGAD00001000290	Cancer Genome Scanning in Plasma: Detection of Tumor-Associated Copy Number Aberrations, Single-Nucleotide Variants, and Tumoral Heterogeneity by Massively Parallel Sequencing	Illumina Genome Analyzer IIx	1
EGAD00001000291	Exome sequencing identifies mutation of the ribosome in T-cell acute lymphoblastic leukemia	Illumina HiSeq 2000	128
EGAD00001000292	Whole genome sequencing analysis was performed on 6 patients with matched germline, follicular lymphoma and transformed follicular lymphoma.	Illumina HiSeq 2000	20
EGAD00001000293	Sequencing data for Australian Ovarian Cancer study submitted 20121116	AB SOLiD 4 System	72
EGAD00001000294	UK10K_RARE_CHD REL-2012-11-27	Illumina Genome Analyzer II Illumina HiSeq 2000	124
EGAD00001000295	UK10K_RARE_HYPERCHOL REL-2012-11-27	Illumina Genome Analyzer II Illumina HiSeq 2000	120
EGAD00001000296	UK10K_RARE_CILIOPATHIES REL-2012-11-27	Illumina Genome Analyzer II Illumina HiSeq 2000	108
EGAD00001000297	UK10K_RARE_FIND REL-2012-11-27	Illumina Genome Analyzer II Illumina HiSeq 2000	124
EGAD00001000298	UK10K_RARE_NEUROMUSCULAR REL-2012-11-27	Illumina HiSeq 2000	130
EGAD00001000299	Whole exome sequencing of samples selected from the Finrisk sample collection. The samples sequenced in this study have all been collected in Kuusamo, Finland.	Illumina HiSeq 2000	24
EGAD00001000300	UK10K_OBESITY_GS_REL_2012_07_05	Illumina HiSeq 2000	430
EGAD00001000301	A couple of previously characterized and sequenced libraries will be repeated using a couple of differing size selection criteria and skim sequenced using an Illumina HiSeq. The resulting sequence will be analyzed to determine the optimal DNA library size for our specific downstream analysis.	Illumina HiSeq 2000	1
EGAD00001000302	This experiment is looking at the mutational signatures generated by engineered HRAS mutations by using whole genome sequence generated on massively parallel next generation sequencers.	Illumina HiSeq 2000	6
EGAD00001000303	ICGC prostate cancer whole genome mate-pair sequencing	Illumina Genome Analyzer IIx	22
EGAD00001000304	ICGC prostate cancer miRNA sequencing	Illumina HiSeq 2000	8
EGAD00001000305	ICGC prostate cancer RNA sequencing	Illumina HiSeq 2000	12
EGAD00001000306	ICGC prostate cancer whole genome sequencing	Illumina HiSeq 2000	22
EGAD00001000307	UK10K_RARE_COLOBOMA REL-2012-11-27	Illumina Genome Analyzer II Illumina HiSeq 2000	117
EGAD00001000308	Cancer Genome Scanning in Plasma: Detection of Tumor-Associated Copy Number Aberrations, Single-Nucleotide Variants, and Tumoral Heterogeneity by Massively Parallel Sequencing		1
EGAD00001000309	UK10K_OBESITY_GS REL-2012-11-27	Illumina HiSeq 2000	424
EGAD00001000310	UK10K_NEURO_ASD_BIONED REL-2012-11-27	Illumina HiSeq 2000	76
EGAD00001000311	UK10K_NEURO_ASD_FI REL-2012-11-27	Illumina HiSeq 2000	84
EGAD00001000312	UK10K_NEURO_ASD_MGAS REL-2012-11-27	Illumina HiSeq 2000	96
EGAD00001000313	UK10K_NEURO_ASD_SKUSE REL-2012-11-27	Illumina HiSeq 2000	305
EGAD00001000314	UK10K_NEURO_ASD_TAMPERE REL-2012-11-27	Illumina HiSeq 2000	48
EGAD00001000315	UK10K_NEURO_ABERDEEN REL-2012-11-27	Illumina HiSeq 2000	313
EGAD00001000316	UK10K_NEURO_ASD_GALLAGHER REL-2012-11-27	Illumina HiSeq 2000	75
EGAD00001000317	UK10K_NEURO_EDINBURGH REL-2012-11-27	Illumina HiSeq 2000	214
EGAD00001000318	UK10K_NEURO_FSZ REL-2012-11-27	Illumina HiSeq 2000	119
EGAD00001000319	UK10K_NEURO_GURLING REL-2012-11-27	Illumina HiSeq 2000	48
EGAD00001000320	UK10K_NEURO_IMGSAC REL-2012-11-27	Illumina HiSeq 2000	111
EGAD00001000321	UK10K_NEURO_IOP_COLLIER REL-2012-11-27	Illumina HiSeq 2000	158
EGAD00001000322	UK10K_NEURO_MUIR REL-2012-11-27	Illumina Genome Analyzer II Illumina HiSeq 2000	166
EGAD00001000323	Sequencing data for Australian Pancreatic Cancer study submitted 20130102	AB SOLiD 4 System Illumina HiSeq 2000	200
EGAD00001000324	We will sequence the RNA of lymphoblast samples, transformed with EBV, which have poikiloderma syndrome with mutations in c16orf57. The aim of the experiment is to characterise RNA structural effects in this disease.	Illumina HiSeq 2000	4
EGAD00001000325	In this study, mutations present in a series of human melanomas (stage IV disease) will be determined, using autologous blood cells to obtain a reference genome. From each of the samples that are analyzed, tumour-infiltrating T lymphocytes have also been isolated. This offers a unique opportunity to determine which (fraction of) mutations in human cancer leads to epitopes that are recognized by T cells. The resulting information is likely to be of value to understand how T cell activating drugs exert their action.	Illumina HiSeq 2000	22
EGAD00001000327	release_2: ICGC PedBrain: whole genome mate-pair sequencing	Illumina Genome Analyzer IIx Illumina HiSeq 2000	70
EGAD00001000328	ICGC PedBrain: RNA sequencing	Illumina HiSeq 2000	28
EGAD00001000329	UK10K_RARE_THYROID REL-2012-11-27	Illumina Genome Analyzer II Illumina HiSeq 2000	113
EGAD00001000332	UK10K_NEURO_FSZNK REL-2012-11-27	Illumina HiSeq 2000	258
EGAD00001000333	Cancer is driven by mutations in the genome. We will uncover the mutations that give rise to Ewing's sarcoma, a bone tumour that largely affects children. We will use second generation Illumina massively parallel sequencing, and bespoke software, to characterise the genomes and transcriptomes of Ewing's sarcoma tumours.	Illumina HiSeq 2000	58
EGAD00001000334	UK10K_RARE_SIR REL-2012-11-27	Illumina Genome Analyzer II Illumina HiSeq 2000	111
EGAD00001000335	UK10K_NEURO_UKSCZ REL-2012-11-27	Illumina HiSeq 2000	527
EGAD00001000336	UK10K_OBESITY_SCOOP REL-2012-11-27	Illumina HiSeq 2000	784
EGAD00001000337	Illumina RNA-Seq will be performed on four Ewing's sarcoma cell lines and two control cell lines. RNA was extracted from all the lines using a basic Trizol extraction protocol.	Illumina HiSeq 2000	12
EGAD00001000338	We propose to definitively characterise the somatic genetics of ER+ve, HER2-ve breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses.	Illumina HiSeq 2000	3
EGAD00001000339	Multiple myeloma is an incurable plasma cell malignancy whose molecular pathogenesis is incompletely understood. We used whole exome sequencing, copy number profiling and cytogenetics to analyse 84 samples from 67 patients with myeloma. In addition to known myeloma genes, we identify new candidate genes, including truncations of SP140, ROBO1 and FAT3 and clustered missense mutations in EGR1. We find oncogenic mutations in cancer genes not previously implicated in myeloma, including SF3B1, PI3KCA and PTEN. We define diverse processes contributing to the mutational repertoire, including kataegis and somatic hypermutation. Most cases have at least one cluster of subclonal variants, including subclonal driver mutations, implying on-going tumor evolution. Serial samples revealed diverse patterns of clonal evolution, including linear evolution, differential clonal response and branching evolution. Our findings reveal the myeloma genome to be heterogeneous across patients and, within individual patients, to exhibit diversity in clonal admixture and dynamics in response to therapy.	Illumina Genome Analyzer II Illumina HiSeq 2000	154
EGAD00001000340	The objective of this study is to resequence targeted intervals containing autosomal recessive variants causing neurological disorders in consanguineous pedigrees. Using homozygosity mapping, three intervals of very different sizes have previously been unambiguously mapped for three different neurological diseases: 2.4Mb, 8Mb and 14.3Mb in size, for Microlissencephaly, Severe Mental Retardation and Complicated hereditary spastic paraplegia respectively. This study is a pilot to assess how well custom targeted resequencing performs across a broad size range of intervals. The study design is to use a different custom capture probe set for each interval, pulldown from a single patient from each family, and sequence 1 lane using Illumina paired-reads for each sample. Candidate variants will be followed up in the families themselves, and in patients with similar phenotypes from outbred populations.	Illumina Genome Analyzer II	3
EGAD00001000341	This pilot study aims to generate pilot data to inform future study designs in consanguineous families or inbred populations by resequencing the exome of six individuals from five families with neurodevelopmental diseases. For all of these families a single mapping interval containing the causal variant has previously been identified.	Illumina HiSeq 2000	6
EGAD00001000342	This project aims to find causal variants in 50 patients diagnosed with Microcephalic Osteodysplastic Primordial Dwarfism (MOPD), of presumed recessive inheritance, performing whole exome sequencing to ~50x mean depth. This is a collaboration with Prof A. Jackson, MRC Human Genetics Unit, Edinburgh.	Illumina Genome Analyzer II Illumina HiSeq 2000	66
EGAD00001000343	This project aims to identify highly penetrant coding variants increasing the risk of Congenital Heart Disease (CHD) by performing whole exome sequencing on DNA samples from 23 affected individuals, selected from 10 families with presumed Autosomal Recessive Inheritance. This is a collaboration with Prof. Eamonn Maher and Dr. Chirag Patel from the Department of Medical and Molecular Genetics, University of Birmingham. The plan is to sequence 23 indexed Agilent whole exome pulldown libraries on 75bp PE HiSeq (Illumina).	Illumina HiSeq 2000	24
EGAD00001000344	Exome sequencing of 30 parent-offspring trios to >50X mean depth, where the offspring has sporadic TOF, to identify potential causal de novo mutations. We will use the exome plus design for pulldown that incorporates ~6.8Mb of additional regulatory sequences in addition to the ~50Mb GENCODE exome.	Illumina HiSeq 2000	90
EGAD00001000345	Exome sequencing of 12 DNA samples obtained from patients with structural brain malformations.	Illumina HiSeq 2000	9
EGAD00001000346	Exome sequencing of patients and their families with diverse rare neurological disorders. Some families have prior linkage data identifying a specific chromosomal interval of interest, other families do not have linkage data available. Many of these families come from special populations whose demography or preference for consanguineous marriages make them particularly tractable for genetic studies.	Illumina HiSeq 2000	30
EGAD00001000347	These samples include exome sequences of family members of Finnish origin with dyslipidemias.	Illumina HiSeq 2000	95
EGAD00001000348	This pilot study aims to generate pilot data to inform future study designs by resequencing the whole exomes of 10 unrelated individuals diagnosed with Bilateral Anophthalmia.	Illumina Genome Analyzer II	16
EGAD00001000349	These samples are from locally advanced breast cancers that have been treated with epirubicin monotherapy before surgery. We will sequence some samples from patients with good response to the therapy and some with poor response to the therapy.	Illumina HiSeq 2000	33
EGAD00001000350	We propose to definitively characterise the somatic genetics of a number of pediatric malignant tumours including ependymoma, high grade glioma and central nervous system primitive neurectodermal tumours through generation of comprehensive catalogues of somatic mutations by high coverage genome sequencing.	Illumina HiSeq 2000	17
EGAD00001000351	This pilot study aims to generate pilot data to inform future study designs by resequencing the whole exomes of 10 unrelated individuals diagnosed with Congenital Heart Disease (CHD).	Illumina Genome Analyzer II	16
EGAD00001000352	DATA FILES FOR SJLGG	Illumina HiSeq 2000	7
EGAD00001000353	DATA FILES FOR SJLGG	Illumina HiSeq 2000	45
EGAD00001000354	Testing the feasibility of genome-scale sequencing in routinely collected formalin-fixed paraffin-embedded (FFPE) cancer specimens versus matched fresh-frozen samples using targeted pulldown capture prior to Illumina sequencing.	Illumina HiSeq 2000	81
EGAD00001000355	ICGC MMML-seq Data Freeze March 2013 whole genome sequencing	Illumina HiSeq 2000	46
EGAD00001000356	ICGC MMML-seq Data Freeze March 2013 transcriptome sequencing	Illumina HiSeq 2000	23
EGAD00001000357	PCR products were obtained from each target locus using genomic DNA from human iPS cells. Subsequently, PCR products are pooled and subjected to Illumina library preparation. The library will be sequenced by MiSeq. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina MiSeq	4
EGAD00001000358	Chondrosarcoma (CHS) is a heterogeneous collection of malignant bone tumours and is the second most common primary malignancy of bone after osteosarcoma. Recent work has identified frequent, recurrent mutations in IDH1/2 in nearly half of central CHS. However, there has been little systematic genomic analysis of this tumour type and thus the contribution of other genes is unclear. Here we report comprehensive genomic analyses of 49 cases of CHS. We identified hypermutability of the major cartilage collagen COL2A1 with insertions, deletions and rearrangements identified in 37% of cases. The patterns of mutation were consistent with selection for variants likely to impair normal collagen biosynthesis. In addition we identified mutations in IDH1/2 (59%), TP53 (20%), the RB1 pathway (27%) and hedgehog signaling (22%).	Illumina HiSeq 2000	17
EGAD00001000359	In this study we will sequence the transcriptome of Verified Cancer Cell lines. This will be married up to whole exome and whole genome sequencing data to establish a full catalog of the variations and mutations found.	Illumina HiSeq 2000	2
EGAD00001000360	The genome-wide landscape of somatically acquired mutations in mesothelioma has not been deeply characterised to date, but advances in DNA sequencing technology now allow this to be addressed comprehensively. Harnessing massively parallel DNA sequencing platforms, we will identify somatically acquired point mutations in all coding regions of the genome from patients with mesothelioma. In addition, using paired-end sequencing, we will map copy number changes and genomic rearrangements from the same patients.	Illumina HiSeq 2000	232
EGAD00001000361	This is a small pilot data set to test the feasibility of cDNA exomes across a 1200 cancer cell line panel. cDNA exomes, or Fus-seq, are further explained in this study's abstract.	Illumina HiSeq 2000	3
EGAD00001000362	Human induced pluripotent stem (hiPS) cells hold great promise for regenerative medicine. However, safety issues around the use of hiPS cells remain to be addressed. One such issue is mutations derived from somatic donor cells and introduced during genome manipulation. We sequenced whole genomes of hiPS cells and analyzed mutations. Our study brings hiPS cell technology one step closer to application in regenerative medicine.	Illumina HiSeq 2000	7
EGAD00001000363	Common variable immunodeficiency (CVID) is the most common form of primary immunodeficiency with an estimated incidence of 1:10,000. It has been apparent for many years that CVID has a genetic component, occurs frequently in families and can have either a recessive or dominant mode of inheritance. In recent years, 4 genes underlying CVID have been identified; however, mutations within them are estimated to account for no more than 10% of all cases of CVID. We have identified a multi-generational family with autosomal dominant CVID. Genome-wide linkage analysis has mapped the locus underlying CVID in this family to an approximately 9.2 Mb interval on chromosome 3q27.3-q29, between the markers D3S3570 and D3S1265. This locus is distinct from any of the previously mapped susceptibility loci, suggesting a novel genetic variant is responsible for disease in this family. The aim of this study is to use exome sequencing of affected (n = 4) and unaffected (n = 4) individuals, in tandem with the available genetic mapping data, to identify the causal variant underlying CVID in this family.	Illumina HiSeq 2000	8
EGAD00001000364	We performed low coverage whole genome sequencing of plasma DNA from prostate cancer patients to establish copy number profiles on both a genome-wide and a gene-specific level. The data include plasma samples from prostate cancer patients (n=13), non-malignant controls (males, n=10 and females, n=9), plasma samples from pregnancies with aneuploid and euploid fetuses (n=4). Furthermore, we sequenced different tumor samples (n=6) of one patient and a serial dilution of HT29 in a background of normal DNA (n=9).	Illumina MiSeq	50
EGAD00001000365	In this study we analysed patients with metastatic prostate cancer to scan their tumor genomes noninvasively in plasma DNA. We enriched 1.3 Mbp of seven plasma DNAs (4 CRPC cases: CRPC1-3 and CRPC5; 3 CSPC cases: CSPC1-2 and CSPC4) including exonic sequences of 55 cancer genes and 38 introns of 18 genes, where fusion breakpoints have been described using Sure Select Custom DNA Kit.	Illumina MiSeq	7
EGAD00001000366	WGBS data of whole blood samples from smoking and non-smoking mothers and their children at gestation/birth and follow-up years.	Illumina HiSeq 2000	52
EGAD00001000367	Genomic libraries (500 bps) will be generated from total genomic DNA derived from lung cancer patients and subjected to short paired end sequencing on the Illumina platform. Paired reads will be mapped to build 37 of the human reference genome to facilitate the generation of genome wide copy number information, and the identification of novel rearranged cancer genes and gene fusions.	Illumina HiSeq 2000	5
EGAD00001000368	Genomic libraries (500 bps) will be generated from total genomic DNA derived from Osteosarcoma cancer patients and subjected to short paired end sequencing on the Illumina platform. Paired reads will be mapped to build 37 of the human reference genome to facilitate the generation of genome wide copy number information, and the identification of novel rearranged cancer genes and gene fusions.	Illumina HiSeq 2000	3
EGAD00001000369	We propose to definitively characterise the somatic genetics of a number of pediatric malignant tumours including ependymoma, high grade glioma and central nervous system primitive neurectodermal tumours through generation of comprehensive catalogues of somatic mutations by high coverage genome sequencing.	Illumina HiSeq 2000	3
EGAD00001000370	This dataset comprises 5 sequencing experiments from a single patient with sporadic and recurring parathyroid carcinoma. The samples include whole genome sequence of the primary tumor, the first recurrent tumor and peripheral blood. Whole transcriptome sequence of the first and second recurrent tumors is also included.	Illumina HiSeq 2000	5
EGAD00001000371	Sequencing data for PDAC cell lines generated by QCMG	Illumina HiSeq 2000 Illumina HiSeq 2500	54
EGAD00001000372	We conducted whole genome sequencing and DNA SNP array of 12 uveal melanoma genomes and their matched DNA from blood. We also conducted RNA-seq of the 12 tumour samples.	Illumina HiSeq 2000	24
EGAD00001000380	Illumina paired-end sequencing of whole-exome pulldown DNA from Severe Insulin Resistant patients.	Illumina Genome Analyzer II Illumina HiSeq 2000	64
EGAD00001000381	Illumina paired-end sequencing of whole-exome pulldown DNA from Severe Insulin Resistant patients.	Illumina HiSeq 2000	3
EGAD00001000382	Whole Exome Sequencing of Permanent Neonatal Diabetes Patients	Illumina HiSeq 2000	25
EGAD00001000383	In collaboration with Dr Robert Semple we have identified a family harbouring an autosomal dominant variant, which leads to severe insulin resistance (SIR), short stature and facial dysmorphism. This family is unique within the SIR cohort in having normal lipid profiles, preserved adiponectin and normal INSR expression and phosphorylation. DNA is available for 7 affected and 7 unaffected family members across 3 generations. All 14 samples have been genotyped using microsatellites and the Affymetrix 6.0 SNP chip. Linkage analysis identified an 18.8Mb haplotype on chromosome 19 as a possible location of the causative variant. However, exome sequencing of 3 affected and 1 unaffected family members has not identified the causative variant, suggesting the possibility of an intronic or intergenic variant in this region or elsewhere in the genome. We propose to conduct whole genome sequencing of 5 members of the pedigree at a depth of 20X. The chosen samples are two sets of parents plus one member of an unaffected branch of the pedigree who shares the risk haplotype on chromosome 19. Sequencing of the two sets of parents will be used along with the genome-wide SNP data to impute 4 affected children, giving an effective sample size of 6 affected individuals.	Illumina HiSeq 2000	7
EGAD00001000384	In order to progress human induced pluripotent stem cells (hiPSCs) towards the clinic, several outstanding questions must be addressed. It is possible to reprogram different somatic cell types into hiPSCs but it is unclear whether some cell types carry fewer mutations through reprogramming (either due to mutations present in the primary cells, or mutations accumulated during reprogramming). Through in-depth analysis of hiPSCs generated from different somatic cells, it will be possible to assess the variation in genetic stability of different cell types.	Illumina HiSeq 2000	35
EGAD00001000385	Whole genome libraries will be prepared from at least two serial samples reflecting different stages of disease progression and matched constitutional DNA for 30 Myeloproliferative Disease samples. Five lanes of Illumina HiSeq sequencing will be performed on each of the tumour samples and four lanes for each of the constitutional DNA. Sequencing data will be mapped to build 37 of the human reference genome and analysis will be performed to characterize the spectrum of somatic variation present in these samples including single base pair mutations, insertions, deletions as well as larger structural variants and genomic rearrangements.	Illumina HiSeq 2000	108
EGAD00001000386	Whole genome libraries will be prepared from at least two serial samples reflecting different stages of disease progression and matched constitutional DNA for 30 Myelodysplastic syndrome patient samples. Five lanes of Illumina HiSeq sequencing will be performed on each of the tumour samples and four lanes for each of the constitutional DNA. Sequencing data will be mapped to build 37 of the human reference genome and analysis will be performed to characterize the spectrum of somatic variation present in these samples including single base pair mutations, insertions, deletions as well as larger structural variants and genomic rearrangements.	Illumina HiSeq 2000	83
EGAD00001000387	This study aims to whole genome sequence DNA derived from breast cancer patients who received neo-adjuvant chemotherapy. All patients had multiple biopsies performed before chemotherapy. Patients who had residual disease after the course of treatment underwent a further biopsy. We aim to characterise the mutations involved.	Illumina HiSeq 2000	35
EGAD00001000388	Genomic libraries (500 bps) will be generated from total genomic DNA derived from lung cancer patients and subjected to short paired end sequencing on the Illumina platform. Paired reads will be mapped to build 37 of the human reference genome to facilitate the generation of genome wide copy number information, and the identification of novel rearranged cancer genes and gene fusions.	Illumina HiSeq 2000	15
EGAD00001000389	Cancer is driven by mutations in the genome. We will uncover the mutations that give rise to Ewing's sarcoma, a bone tumour that largely affects children. We will use second generation Illumina massively parallel sequencing, and bespoke software, to characterise the genomes and transcriptomes of Ewing's sarcoma tumours.	Illumina HiSeq 2000	20
EGAD00001000390	We propose to definitively characterise the somatic genetics of triple negative breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses.	Illumina HiSeq 2000	101
EGAD00001000392	Agilent whole exome hybridisation capture was performed on genomic DNA derived from Chondrosarcoma cancer and matched normal DNA from the same patients. Next Generation sequencing was performed on the resulting exome libraries and the reads were mapped to build 37 of the human reference genome to facilitate the identification of novel cancer genes. Now we aim to re-find and validate the findings of those exome libraries using bespoke pulldown methods and sequencing the products.	Illumina MiSeq	60
EGAD00001000393		Illumina HiSeq 2000	30
EGAD00001000394	DNA methylation has been shown to play a major role in determining cellular phenotype by regulating gene expression. Moreover, dysregulation of differentially methylated genes has been implicated in disease pathogenesis of various conditions including cancer development as well as autoimmune diseases such as systemic lupus erythematosus and rheumatoid arthritis. Evidence is rapidly accumulating for a role of DNA methylation in regulating immune responses in health and disease. However, the exact mechanisms remain unknown. The overall aim of the project is to investigate the role of epigenetic mechanisms in regulating immunity and their impact on autoimmune disease pathogenesis. The aim of this pilot study is to perform whole genome methylation analysis in peripheral blood mononuclear cells (PBMCs) and cell subsets (CD4, CD8, CD14, CD19, CD16 and whole PBMCs) obtained from 6 healthy volunteers. Whole genome methylation analysis will be performed using two methodological approaches, the Infinium Methylation Bead Array K450 (Illumina) and MeDIP-seq. mRNA expression arrays will also be performed in order to correlate DNA methylation with gene expression, as well as genotyping on the Illumina OmniExpress chip.	Illumina Genome Analyzer II	6
EGAD00001000395	Noninvasive Prenatal Molecular Karyotyping from Maternal Plasma		1
EGAD00001000396	We performed serial plasma-Seq analyses on a male who progressed from castration-sensitive to castration-resistant prostate cancer within 10 months following treatment with androgen-deprivation therapy.	Illumina MiSeq	2
EGAD00001000397	The Cardiogenics re-sequencing study will consist of three parts: Eight pools of 25 individuals will be sequenced using a Nimblegen hybrid-capture solution specific to miRNA sequences, 80 pools of 25 individuals will be sequenced using a custom Agilent SureSelect array covering genes associated with coronary artery disease (CAD) and myocardial infarction (MI), 10 individuals from families with a history of CAD/MI will be exome sequenced using the Sanger exome array. The experiment will use the early onset patients from the German MI cohort and the UK BHF CAD/MI cohort both of which have strong family history. For controls we will consider individuals from the UKBS and KORA cohorts.	Illumina HiSeq 2000	47
EGAD00001000398	The Cardiogenics re-sequencing study will consist of three parts: Eight pools of 25 individuals will be sequenced using a Nimblegen hybrid-capture solution specific to miRNA sequences, 80 pools of 25 individuals will be sequenced using a custom Agilent SureSelect array covering genes associated with coronary artery disease (CAD) and myocardial infarction (MI), 10 individuals from families with a history of CAD/MI will be exome sequenced using the Sanger exome array. The experiment will use the early onset patients from the German MI cohort and the UK BHF CAD/MI cohort both of which have strong family history. For controls we will consider individuals from the UKBS and KORA cohorts.	Illumina Genome Analyzer II	8
EGAD00001000399	In 2009 we identified a four-generation family with over 700 members and 41 affected with Crohn's disease (CD). At the time we sequenced the exome of 6 affected individuals but did not identify any coding variants which appear to explain the high prevalence of disease. Since then we have collected DNA from a large number of additional family members, genotyped linkage arrays on the entire family to refine genomic regions shared by identity by descent and genotyped affected and unaffected members at known CD risk loci identified by Genome Wide Association Studies (GWAS). These analyses have confirmed that a significant unexplained excess of disease remains after accounting for all known genetic factors, and that several regions of the genome are shared by a large fraction of affected individuals. We therefore perform whole genome sequencing of 8 individuals, which will allow us to impute the complete sequence of nearly all the members of the two largest and most severely affected branches of the family.	Illumina HiSeq 2000	8
EGAD00001000400	The Cardiogenics re-sequencing study will consist of three parts: Eight pools of 25 individuals will be sequenced using a Nimblegen hybrid-capture solution specific to miRNA sequences, 80 pools of 25 individuals will be sequenced using a custom Agilent SureSelect array covering genes associated with coronary artery disease (CAD) and myocardial infarction (MI), 10 individuals from families with a history of CAD/MI will be exome sequenced using the Sanger exome array. The experiment will use the early onset patients from the German MI cohort and the UK BHF CAD/MI cohort both of which have strong family history. For controls we will consider individuals from the UKBS and KORA cohorts.	Illumina HiSeq 2000	12
EGAD00001000401	Population based sequencing of whole genomes of Crohn's disease patients.	Illumina HiSeq 2000	2926
EGAD00001000402	The study will analyse by exome sequencing 42 Greek patients with premature MI and no vessel disease to identify genetic factors underlying this condition.	Illumina HiSeq 2000	46
EGAD00001000403	The ENGAGE project is a FP7 funded EU project aiming to combine genetic and phenotype information from European population based cohorts. In this sub-project we aim to do whole exome sequencing of individuals selected from Health 2000 and FINRISK cohorts. Individuals have been selected based on their metabolic trait phenotypes	Illumina HiSeq 2000	394
EGAD00001000404	Acute myeloid leukaemia (AML) is an aggressive and molecularly diverse disease with a poor overall survival of 20-25%. With an annual incidence of 2.9 per 100,000, AML is currently the commonest myeloid malignancy in Europe, yet the two main therapeutic options for this disease, anthracyclines and purine analogues, have remained unchanged for over 20 years. Currently patients are stratified at diagnosis according to a series of clinicopathological parameters (e.g. age, white cell count and presence/absence of previous clonal haematological disease) and molecular markers (e.g. chromosomal translocations/deletions, aneuploidy and mutations in genes such as FLT3 and NPM1). Patients with adverse prognostic features, whose prognosis is particularly poor (e.g. <15% long-term survival) are offered treatment with allogeneic bone marrow transplantation (allo-BMT) if a sibling or unrelated donor is available. This can significantly improve survival (e.g. up to 40% long-term survival in some contexts), albeit at the expense of significant toxicity and transplant-related mortality (TRM). Allo-BMT is thought to work in part by allowing the delivery of large doses of chemotherapy followed by haemopoietic "rescue" with donor haemopoietic stem cells (haemopoietic failure would otherwise ensue). However, potentially the most potent effect of allo-BMT is the cytotoxic effect of donor lymphocytes against AML blasts, a phenomenon known as graft-vs-leukaemia (GVL) effect. Increasingly, transplants using reduced chemotherapy intensity (mini-allografts) are being used that partially circumvent the toxicity from chemotherapy and rely on GVL to effect cure. Nevertheless, AML relapse after allo-BMT still occurs at a significant rate of up to 80% depending on the type of transplant. There is accumulating evidence that genetic events in residual leukaemic cells enable them to evade immunodetection and therefore survive the GVL effect and expand to cause relapse. 
The most striking example of this is the loss of HLA antigens after transplants in which donor and recipient are not fully HLA-matched. In these cases, the leukaemia "deletes" the genomic region containing the disparate HLA antigen which was preferentially targeted as "foreign" by the GVL effect. However, the genetic basis of immune evasion in the majority of transplants, which are fully HLA matched, is not known. One possibility is that loss of genes coding for antigens that lie outside the HLA locus but are also targets of GVL may operate; alternatively, genetic events that affect processes downstream of immunological cytotoxicity may be responsible. The identification of genetic events that mediate immune evasion would not only facilitate the understanding of this process but could also help plan therapeutic interventions that improve the outcomes of allogeneic transplantation for AML and other disorders. We intend to study this by conducting exome sequencing on 6 cases of AML from patients that attend my clinic at Addenbrooke's hospital and have relapsed after allogeneic transplantation. Samples from AML diagnosis, remission/normal and AML relapse (total n=18) will be studied to identify somatic mutations in the primary AML and those acquired by the relapsed clone. The 18 samples will also be studied by array CGH to detect regions of genomic amplification or deletion.	Illumina HiSeq 2000	25
EGAD00001000405	In this project we will sequence the exomes of 250 patients with Parkinson's disease	Illumina HiSeq 2000	247
EGAD00001000406	Blastic plasmacytoid dendritic cell neoplasm (BPDCN) is a rare and aggressive haematological malignancy derived from precursors of plasmacytoid dendritic cells. Due to the rarity of BPDCNs our knowledge of their molecular pathogenesis was until recently confined to observations describing recurrent chromosomal deletions involving chromosomes 5q, 12p, 13q, 6q, 15q and 9. A recent publication went on to delineate the common deleted regions using aCGH and demonstrated that these centred around known tumour suppressor genes including CDKN2A/B (9p21.3), RB1 (12p13.2-14.3), CDKN1B (13q11-q12) and IKZF1 (7p12.2). These mutations are found recurrently in several different cancers and in most cases are thought to be involved in tumour progression rather than initiation. However, the well-defined nature and cellular ontogeny of these neoplasms suggests strongly that they share one or a few characteristic mutations as has been demonstrated for other uncommon but well-defined neoplasms such as Hairy Cell Leukemia (BRAF) and ovarian Granulosa Cell tumours (FOXL2).	Illumina HiSeq 2000	14
EGAD00001000407	We are sequencing the exomes of patients with paroxysmal neurological disorders mainly focusing on migraine and epilepsy. Cases are collected from performance sites of members of the International Headache Genetics consortium and EuroEPINOMICS. Most cases have a strong family history. The study sample will include both cases and controls.	Illumina HiSeq 2000	327
EGAD00001000408	We aim to whole-exome sequence DNA samples from 75 individuals with severe forms of Inflammatory Bowel Disease and related autoimmune diseases to identify the rare, highly penetrant, variants that we believe underlie these phenotypes. Case samples will be obtained from both new and existing (UK IBD Genetics Consortium) collaborators to ensure only the most extreme cases are sequenced.	Illumina HiSeq 2000	4
EGAD00001000409	2000 ulcerative colitis cases drawn from the UKIBD Genetics Consortium cohort and whole-genome sequenced at 2X depth. A case control association study using control samples whole-genome sequenced by UK10K will be undertaken to identify common, low-frequency and rare variants associated with ulcerative colitis. Data will be combined with similar data across 3000 Crohn's disease cases from the same cohort to identify inflammatory bowel disease (IBD) loci and better understand the genetic differences and similarities of the two common forms of IBD.	Illumina HiSeq 2000	1992
EGAD00001000410	We will perform exome sequencing on selected cases of splenic marginal zone lymphoma (SMZL) and diffuse large B-cell lymphoma (DLBCL) in order to characterise their genetic makeup and identify biomarkers for prognosis and prediction of treatment response.	Illumina HiSeq 2000	78
EGAD00001000411	These samples include exome sequences of family members with dyslipidemias from northern Finnish origin.	Illumina HiSeq 2000	68
EGAD00001000412	We are sequencing the exomes of patients with paroxysmal neurological disorders mainly focusing on migraine and epilepsy. Cases are collected from performance sites of members of the International Headache Genetics consortium and EuroEPINOMICS. Most cases have a strong family history. The study sample will include both cases and controls.	Illumina HiSeq 2000	477
EGAD00001000413	UK10K_RARE_CHD REL-2013-04-20	Illumina Genome Analyzer II Illumina HiSeq 2000	125
EGAD00001000414	UK10K_RARE_CILIOPATHIES REL-2013-04-20	Illumina Genome Analyzer II Illumina HiSeq 2000	122
EGAD00001000415	UK10K_RARE_COLOBOMA REL-2013-04-20	Illumina Genome Analyzer II Illumina HiSeq 2000	123
EGAD00001000416	UK10K_RARE_FIND REL-2013-04-20	Illumina Genome Analyzer II Illumina HiSeq 2000	124
EGAD00001000417	UK10K_RARE_HYPERCHOL REL-2013-04-20	Illumina Genome Analyzer II Illumina HiSeq 2000	125
EGAD00001000418	UK10K_RARE_NEUROMUSCULAR REL-2013-04-20	Illumina HiSeq 2000	140
EGAD00001000419	UK10K_RARE_SIR REL-2013-04-20	Illumina Genome Analyzer II Illumina HiSeq 2000	121
EGAD00001000420	UK10K_RARE_THYROID REL-2013-04-20	Illumina Genome Analyzer II Illumina HiSeq 2000	124
EGAD00001000421	The aim of this project is to identify rare variants in the 1q region associated with type 2 diabetes. To this end 651 case samples and 651 control samples from six populations have been pooled (pool sizes range from 27-33 individuals), and are being sequenced. The hybridization solution being used captures the exons and UTRs of genes in the 1q region.	Illumina HiSeq 2000	48
EGAD00001000422	We perform whole exome sequencing on samples from a large IBD pedigree. The selected samples are from more distantly related family members (healthy and with IBD) and a set of matched population (Ashkenazi Jewish ancestry) samples.	Illumina HiSeq 2000	86
EGAD00001000423	The aim is to find rare variants of intermediate penetrance in those at risk of Crohn's disease	Illumina Genome Analyzer II	10
EGAD00001000424	The aim of this project is to identify rare variants in the 1q region associated with type 2 diabetes. To this end 651 case samples and 651 control samples from six populations have been pooled (pool sizes range from 27-33 individuals), and are being sequenced. The hybridization solution being used captures the exons and UTRs of genes in the 1q region.	Illumina Genome Analyzer II Illumina HiSeq 2000	23
EGAD00001000425	GENCORD2 RNA-seq BAM files using BWA	Illumina Genome Analyzer II Illumina HiSeq 2000	568
EGAD00001000427		Illumina HiSeq 2000	30
EGAD00001000428	204 individuals were genotyped with the Illumina 2.5M Omni chip. Filtered genotypes were imputed into the 1000 genomes project European panel SNPs. Beagle R2 is indicated in VCF files for further filtering. See Materials and Methods in publication for details.		204
EGAD00001000429	UK10K_OBESITY_TWINSUK REL-2013-04-20	Illumina HiSeq 2000	68
EGAD00001000430	UK10K_NEURO_UKSCZ REL-2013-04-20	Illumina HiSeq 2000	554
EGAD00001000431	UK10K_OBESITY_GS REL-2013-04-20	Illumina HiSeq 2000	428
EGAD00001000432	UK10K_OBESITY_SCOOP REL-2013-04-20	Illumina HiSeq 2000	985
EGAD00001000433	UK10K_NEURO_ABERDEEN REL-2013-04-20	Illumina HiSeq 2000	392
EGAD00001000434	UK10K_NEURO_ASD_BIONED REL-2013-04-20	Illumina HiSeq 2000	77
EGAD00001000435	UK10K_NEURO_ASD_FI REL-2013-04-20	Illumina HiSeq 2000	84
EGAD00001000436	UK10K_NEURO_ASD_GALLAGHER REL-2013-04-20	Illumina HiSeq 2000	77
EGAD00001000437	UK10K_NEURO_ASD_TAMPERE REL-2013-04-20	Illumina HiSeq 2000	55
EGAD00001000438	UK10K_NEURO_EDINBURGH REL-2013-04-20	Illumina HiSeq 2000	234
EGAD00001000439	UK10K_NEURO_FSZNK REL-2013-04-20	Illumina HiSeq 2000	285
EGAD00001000440	UK10K_NEURO_GURLING REL-2013-04-20	Illumina HiSeq 2000	46
EGAD00001000441	UK10K_NEURO_IMGSAC REL-2013-04-20	Illumina HiSeq 2000	113
EGAD00001000442	UK10K_NEURO_IOP_COLLIER REL-2013-04-20	Illumina HiSeq 2000	172
EGAD00001000443	UK10K_NEURO_MUIR REL-2013-04-20	Illumina Genome Analyzer II Illumina HiSeq 2000	175
EGAD00001000444	Cancer is driven by mutations in the genome. We will uncover the mutations that give rise to Ewing's sarcoma, a bone tumour that largely affects children. We will use second generation Illumina massively parallel sequencing, and bespoke software, to characterise the genomes and transcriptomes of Ewing's sarcoma tumours.	Illumina HiSeq 2000	3
EGAD00001000445	We recently worked-up a pulldown protocol for studying 21 genes recurrently mutated in AML (Study1770). Our manuscript is currently under revision and to address the reviewers' comments we need to validate some mutations by re-sequencing. In this add-on study we will be using PCR followed by MiSeq for this purpose.	Illumina MiSeq	9
EGAD00001000446	Fastq files of 213 samples of hepatocellular carcinoma (NCCRI)	Illumina HiSeq 2000	213
EGAD00001000596	This project is to develop and validate a method to detect de novo mutations in a foetal genome through deep sequencing of cell-free DNA from the plasma of pregnant women. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000	5
EGAD00001000597		Illumina HiSeq 2000	212
EGAD00001000598	The Ethiopian area stands among the most ancient ones ever occupied by human populations and their ancestors. In particular, according to archaeological evidence, it is possible to trace back the presence of Hominids up to at least 3 million years ago. Furthermore, the present day human populations show a great cultural, linguistic and historic diversity which makes them essential candidates to investigate a considerable part of the African variability. Following the typing of 300 Ethiopian samples on Illumina Omni 1M (see Human Variability in Ethiopia project, previously approved by the Genotyping committee) we now have a clearer idea of which populations living in the area include most of the diversity. This project therefore aims to sequence the whole genome of 300 individuals at low (4-8x) depth belonging to the six most representative populations of the Ethiopian area to produce a unique catalogue of variants peculiar to North East Africa. Furthermore 6 samples (one from each population) will also be sequenced at high (30x) depth to ensure full coverage of the diversity spectrum. The retrieved variants will be of great help in evaluating the demographic dynamics of those populations as well as shedding light on the migrations out of Africa.	Illumina HiSeq 2000	120
EGAD00001000599	We have collected material from a patient who had BrafV600E mutant melanoma that was treated with PLX4032. We have germline DNA from the patient and DNA and RNA from distinct lesions before and after treatment with PLX4032. We have transcriptome sequenced these samples to obtain a snapshot of the mechanisms of resistance that are operative.	Illumina HiSeq 2000	6
EGAD00001000601		Illumina HiSeq 2000	1
EGAD00001000602		Illumina HiSeq 2000	1
EGAD00001000603	We recently used the Agilent SureSelect platform to re-sequence a set of genes known to be mutated in human AML. The results from 10 AML DNA samples were very satisfactory, but the effort required was significant. Thus, we decided to re-sequence the same genes using the HaloPlex system for target enrichment in 48 AML samples. We planned to do this using MiSeq and have data from a pilot of 3 samples. The data is promising but coverage appears patchy so far. However, in order to get a better understanding of the data we will need deeper sequencing. We will need two lanes of HiSeq to get the same degree of coverage as SureSelect. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000 Illumina MiSeq	54
EGAD00001000604	In order to progress human induced pluripotent stem cells (hiPSCs) towards the clinic, several outstanding questions must be addressed. It is possible to reprogram different somatic cell types into hiPSCs and from studies in the mouse, it appears that an epigenetic memory of the starting cell type is carried over to hiPSCs. However a comprehensive comparative study of the characteristics of these hiPSCs has been missing from the literature. Importantly studies which aimed to address these aspects of hiPSCs have used cells from different patients. In order to avoid this important confounding variable and to keep the genetic background constant, tissue samples were procured from the patients and reprogrammed to iPS cells. The transcriptomes of these iPS cells will be compared. Protocol: primary cultures of cells were reprogrammed to iPS cells. RNA was extracted using a standard column extraction kit.	Illumina HiSeq 2000	47
EGAD00001000605	PCR products were obtained from each target loci using genomic DNA from human iPS cells. Subsequently, PCR products are pooled and subjected to Illumina library preparation. The library will be sequenced either by HiSeq or MiSeq. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000 Illumina MiSeq	10
EGAD00001000606	Background Massively parallel sequencing technology has transformed cancer genomics. It is now feasible, in a clinically relevant time-frame, for a clinically manageable cost, to screen DNA from patient tumours for mutations essentially genome-wide. The challenge for personalised medicine will be to increase the sample size to thousands or tens of thousands of well-characterised cases in order to attain sufficient statistical power to stratify patients accurately across the complexity and genomic heterogeneity expected for most of the common tumour types. Currently, whole genome sequencing on this scale is not feasible, and targeted sequencing of relevant portions of the genome will be required. Pilot data We have developed protocols for large-scale, multiplexed sequencing of 100-200 genes in thousands of samples. Essentially, using robotic technology, genomic DNA from the cancer specimen is processed into sequencing libraries with unique DNA barcodes, thereby allowing sequencing reads to be attributed to the sample they derive from. Currently, these sequencing libraries can be generated in a 96-well format using fully automated protocols, and we are exploring methods to expand this to a 384-well format. The sequencing libraries are pooled and hybridized to custom sets of RNA baits representing the genomic regions of interest. Sequencing of the pulled-down libraries is done in pools of 48-96 samples per lane of an Illumina Hi-Seq. This protocol is already implemented at the Sanger Institute. We have published proof that somatic mutations in novel cancer genes can be identified from exome-wide sequencing. In unpublished pilot data, we have established the feasibility of robotic library production, custom pull-down, and multiplexed sequencing of barcoded libraries for 100 known myeloid cancer genes across 760 myelodysplasia samples. 
Highlights of the data thus far analysed reveal that the coverage is remarkably even between samples; when 96 samples are run, average coverage per lane of sequencing is ~250, with 90-95% of targeted exons covered by >25 reads; known mutations can be discovered in the data set; and the protocol is amenable to whole genome amplified DNA. The bioinformatic algorithms for identification of substitutions and indels in pull-down data are well-established; we have pilot data proving that copy number changes, LOH and genomic rearrangements in specific regions of interest can also be identified by tiling of baits across the relevant loci. Proposal We propose to apply this methodology to 10000 samples from patients with AML enrolled in clinical trials over the last 10-20 years. Oncogenic point mutations and potentially genomic rearrangements will be identified, and linked to clinical outcome data, with a view to undertaking the following sorts of analyses: identification of co-occurrence, mutual exclusivity and clusters of driver mutations; correlation of prognosis with driver mutations and potentially gene-gene interactions; and exploration of genomic markers of drug response. Ultimately, we would like to be in a position to release the mutation data together with matched clinical outcome data to genuine medical researchers via a controlled access approach, possibly within the COSMIC framework (www.sanger.ac.uk/genetics/CGP/cosmic/). The vision here is to generate a portal whereby a clinician faced with an AML patient and his / her mutational profile can obtain a "personalised" prediction of outcome, together with a fair assessment of the uncertainty of the estimate. With a sufficient sample size, there would also be the potential to develop decision support algorithms for therapeutic choices based on such data.	Illumina MiSeq	38
EGAD00001000607	PCR products were obtained from each target loci using genomic DNA from human iPS cells. Subsequently, PCR products are pooled and subjected to Illumina library preparation. The library will be sequenced either by HiSeq or MiSeq. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina MiSeq	2
EGAD00001000608	PCR products were obtained from each target loci using genomic DNA from human iPS cells. Subsequently, PCR products are pooled and subjected to Illumina library preparation. The library will be sequenced either by HiSeq or MiSeq. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina MiSeq	60
EGAD00001000609	Whole transcriptome sequencing of 28 untreated prostate cancers, 13 castration resistant prostate cancers, and 12 benign prostatic hyperplasias.	Illumina HiSeq 2000	53
EGAD00001000610	Methylated DNA immunoprecipitation sequencing of 28 untreated prostate cancers, 11 castration resistant prostate cancers, and 12 benign prostatic hyperplasias.	Illumina HiSeq 2000	51
EGAD00001000611	Small RNA sequencing of 28 untreated prostate cancers, 12 castration resistant prostate cancers, and 3 benign prostatic hyperplasias.	Illumina HiSeq 2000	43
EGAD00001000612	Low coverage whole genome sequencing of 27 untreated prostate cancers, 9 castration resistant prostate cancers, and 4 benign prostatic hyperplasias.	Illumina HiSeq 2000	40
EGAD00001000613	UK10K_NEURO_ASD_MGAS REL-2013-04-20	Illumina HiSeq 2000	97
EGAD00001000614	UK10K_NEURO_ASD_SKUSE REL-2013-04-20	Illumina HiSeq 2000	341
EGAD00001000615	UK10K_NEURO_FSZ REL-2013-04-20	Illumina HiSeq 2000	128
EGAD00001000616	Pilocytic Astrocytoma ICGC PedBrain whole genome sequencing	Illumina HiSeq 2000	192
EGAD00001000617	Pilocytic Astrocytoma ICGC PedBrain RNA sequencing	Illumina HiSeq 2000	73
EGAD00001000618	1204 Sardinian males		1195
EGAD00001000619	Experiments using targeted pulldown methods will be sequenced to validate findings in the exomes of patients with Myeloproliferative Neoplasms (MPN).	Illumina HiSeq 2000	360
EGAD00001000620	A bespoke targeted pulldown experiment will be performed on patients with Angiosarcoma. The resulting products will be sequenced to determine the prevalence of previously found mutations in these patients.	Illumina HiSeq 2000	14
EGAD00001000621	We propose to definitively characterise the somatic genetics of Prostate cancer through generation of comprehensive catalogues of somatic mutations by high coverage genome sequencing. This study will aim to validate the findings of the whole genome study by re-sequencing regions of interest using a bespoke pulldown bait. See ICGC website for more information: http://icgc.org/icgc/cgp/70/508/71331	Illumina MiSeq	18
EGAD00001000623	This VCF contains the full sequence data post QC. This consists of 41,911 individuals. All polymorphic sites are present in this VCF.		41911
EGAD00001000624	Multifocality or multicentricity in breast cancer may be defined as the presence of two or more tumor foci within a single quadrant of the breast or within different quadrants of the same breast, respectively. This original classification of the breast cancer as multicentric or multifocal was based on the assumption that cancers arising in the same quadrant were more likely to arise from the same ductal structures than those occurring in separate areas of the breast. The problem with these definitions is that the "quadrants" of the breast are arbitrary external designations, as no internal boundaries exist. This project will therefore focus both on synchronous multifocal and multicentric tumors. The incidence of multifocal and multicentric breast cancers was reported to be between 13 and 75% depending on the definition used, the extent of the pathologic sampling of the breast and whether in situ disease is considered evidence of multicentricity (1). Although this incidence is variable, those figures show that it is a frequent phenomenon. Multiple (multifocal/multicentric) breast carcinomas, especially when occurring in the same breast, represent a real challenge for both pathologists and clinicians in terms of identifying the cellular origin and the best therapeutic management of the cancer. Multifocality or multicentricity has been associated with a number of more aggressive features including an increased rate of regional lymph node metastases and adverse patient outcome when compared with unifocal tumors (2-3), and a possible increased risk of local recurrence following breast conserving surgery (4). For the moment, the literature is divided on whether there is a corresponding impact on survival outcomes.
Today, the current convention to stage and to treat multifocal and multicentric tumors is the classical tumor-node-metastasis (TNM) staging guidelines with which tumor size is assessed by the largest tumor focus without taking other foci of disease into consideration. Whereas some papers, such as the recent one from Lynch and colleagues, support the current staging convention (3), others, such as Boyages et al., suggest that aggregate size and not the size of the largest lesion should be considered in order to refine the prognostic assessment of those tumors (5). On top of that, the question of whether multifocal/multicentric carcinomas are due to the spread of a single carcinoma throughout the breast or to multiple carcinomas arising simultaneously has been a matter of debate. Some studies suggested that multifocal breast cancer may result from either intramammary spread from a single primary tumor or multiple synchronous primary tumors, whereas others suggest that multiple breast carcinomas always arise from the same clone (6-8). Recently, Pietri and colleagues analyzed the biological characteristics of a series of 113 multifocal/multicentric breast cancers (8) which were diagnosed over a 5-year period. The expression of estrogen (ER) and progesterone (PgR) receptors, Ki-67 proliferative index, expression of HER2 and tumor grading were prospectively determined in each tumor focus, and mismatches among foci were recorded. Mismatches in ER status were present in 5 (4.4%) cases and PgR in 18 (15.9%) cases. Mismatches in tumor grading were present in 21 cases (18.6%), proliferative index (Ki-67) in 17 (15%) cases and HER2 status in 11 (9.7%) cases. Interestingly, this heterogeneity among foci has led to 14 (12.4%) patients receiving different adjuvant treatments compared with what would have been indicated if we had only taken into account the biologic status of the primary tumor. 
This study therefore showed that differences in biological characteristics of multifocal/multicentric lesions play a crucial role in the adjuvant treatment decision making process. In this study, we will concentrate on a larger series of patients with multifocal invasive ductal breast cancer lesions. We aim at: 1. Evaluating the incidence of multifocality according to the different breast cancer molecular subtypes (ER-/HER2-, HER2+, ER+/HER2-). 2. Evaluating the incidence of multifocality in patients with hereditary breast cancer disease (presence of germline BRCA1 or BRCA2 mutations). Moreover, we would like to investigate if multifocal lesions with BRCA1 or BRCA2 mutations exhibit a characteristic combination of substitution mutation signatures and a distinctive profile of deletions as demonstrated recently by Nik-Zainal and colleagues (9). 3. Correlating multifocality with clinical information in order to define its influence on patients' survival (DFS and OS). 4. Carrying out high coverage targeted gene sequencing of driver cancer genes and genes whose mutation is of therapeutic importance in order to compare clinically-relevant genetic differences between several multifocal breast cancer lesions. 5. Evaluating the impact of the distance between the different lesions on the clinical outcome but also on the genetic differences. 6. Comparing gene expression patterns between several multifocal breast cancer lesions and correlating them with the results of the targeted genes screen. 7. Characterizing the genomic and transcriptomic status of cancer related genes in metastatic lesions (local recurrence, positive lymph node or distant metastatic sites) from the same multifocal invasive ductal breast cancer patients in order to evaluate the consequence of genomic and transcriptomic heterogeneity of multifocal lesions on metastatic lesions. 
Multiple (multifocal/multicentric) breast carcinomas, especially when occurring in the same breast, represent a real challenge for both pathologists and clinicians in terms of identifying the cellular origin and the best therapeutic choice. This project has the potential to identify genetic/transcriptomic differences existing between several lesions constituting multifocal breast cancers, which in the routine clinical practice are usually considered to be homogeneous among them. We foresee validating significant results in a larger series of patients and this, in turn, could have a remarkable impact on the treatment and clinical management of multifocal breast cancers. Indeed, we hope to provide some evidence whether or not each focus matters in multifocal and multicentric breast cancer to define the adequate therapeutic approach, especially in the context of targeted therapies. The work to be done at Sanger will be target gene screen pooling of 1400 samples.	Illumina HiSeq 2000	908
EGAD00001000625	The main objective of this benchmark is the comparison of the full sequencing pipeline of different ICGC partners, including procedures, methods and performance of library preparation and whole-genome deep-sequencing. A secondary objective will be a follow-up comparison of data analysis pipelines for identification of germline and somatic variants subsequent to the results of the ICGC Somatic Variant Calling Pipeline Benchmark.	Illumina HiSeq 2000	2
EGAD00001000626	Exome sequencing data for tumor and matched normal samples of the EGAS00001000495 project.	Illumina HiSeq 2000	114
EGAD00001000627	Transcriptome sequencing data of tumor and 10 matched normal samples of the EGAS00001000495 project	Illumina HiSeq 2000	68
EGAD00001000628		Illumina Genome Analyzer IIx Illumina HiSeq 2000	66
EGAD00001000630	In this study we will sequence the transcriptome of Verified Matched Pair Cancer Cell line tumour samples. This will be married up to whole exome and whole genome sequencing data to establish a full catalog of the variations and mutations found.	Illumina HiSeq 2000	7
EGAD00001000631	PCR products were obtained from each target locus using genomic DNA from human iPS cells. Subsequently, PCR products are pooled and subjected to Illumina library preparation. The library will be sequenced either by HiSeq or MiSeq. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina MiSeq	4
EGAD00001000632		AB SOLiD 4 System	12
EGAD00001000634	The ETV6-RUNX1 fusion gene, found in 25% of childhood acute lymphoblastic leukemia (ALL), is acquired in utero but requires additional somatic mutations for overt leukemia. We used exome and low-coverage whole-genome sequencing to characterize the critical secondary events associated with leukemic transformation. RAG-mediated deletions emerge as the dominant mutational process, accounting for at least 43% of genomic rearrangements and characterized by the presence of recombination signal sequence motifs near the breakpoints; incorporation of non-templated sequence at the junction and a ten-fold enrichment at promoters and enhancers of genes actively transcribed in early B-lineage development. Single-cell tracking shows that this mechanism is not restricted to one founder cell but is rather active throughout leukemic evolution. Integration of point mutation and rearrangement data identifies recurrent inactivation of ATF7IP and MGA as two new tumor suppressor genes. Thus, a remarkably parsimonious mutational process transforms ETV6-RUNX1 lymphoblasts, striking promoters and enhancers of the genes that normally control B-cell differentiation.	Illumina HiSeq 2000	2
EGAD00001000635	The ETV6-RUNX1 fusion gene, found in 25% of childhood acute lymphoblastic leukemia (ALL), is acquired in utero but requires additional somatic mutations for overt leukemia. We used exome and low-coverage whole-genome sequencing to characterize the critical secondary events associated with leukemic transformation. RAG-mediated deletions emerge as the dominant mutational process, accounting for at least 43% of genomic rearrangements and characterized by the presence of recombination signal sequence motifs near the breakpoints; incorporation of non-templated sequence at the junction and a ten-fold enrichment at promoters and enhancers of genes actively transcribed in early B-lineage development. Single-cell tracking shows that this mechanism is not restricted to one founder cell but is rather active throughout leukemic evolution. Integration of point mutation and rearrangement data identifies recurrent inactivation of ATF7IP and MGA as two new tumor suppressor genes. Thus, a remarkably parsimonious mutational process transforms ETV6-RUNX1 lymphoblasts, striking promoters and enhancers of the genes that normally control B-cell differentiation.	Illumina Genome Analyzer II Illumina HiSeq 2000	50
EGAD00001000636	The ETV6-RUNX1 fusion gene, found in 25% of childhood acute lymphoblastic leukemia (ALL), is acquired in utero but requires additional somatic mutations for overt leukemia. We used exome and low-coverage whole-genome sequencing to characterize the critical secondary events associated with leukemic transformation. RAG-mediated deletions emerge as the dominant mutational process, accounting for at least 43% of genomic rearrangements and characterized by the presence of recombination signal sequence motifs near the breakpoints; incorporation of non-templated sequence at the junction and a ten-fold enrichment at promoters and enhancers of genes actively transcribed in early B-lineage development. Single-cell tracking shows that this mechanism is not restricted to one founder cell but is rather active throughout leukemic evolution. Integration of point mutation and rearrangement data identifies recurrent inactivation of ATF7IP and MGA as two new tumor suppressor genes. Thus, a remarkably parsimonious mutational process transforms ETV6-RUNX1 lymphoblasts, striking promoters and enhancers of the genes that normally control B-cell differentiation.	Illumina Genome Analyzer II	117
EGAD00001000637	Insertion of processed pseudogenes is known to occur in the germline but has not previously been observed in somatic cells. Formation of pseudogenes could represent a new class of mutation in cancers and a new source of potential driver events.	Illumina Genome Analyzer II Illumina HiSeq 2000	4
EGAD00001000638	Insertion of processed pseudogenes is known to occur in the germline but has not previously been observed in somatic cells. Formation of pseudogenes could represent a new class of mutation in cancers and a new source of potential driver events.	Illumina HiSeq 2000	20
EGAD00001000639	Insertion of processed pseudogenes is known to occur in the germline but has not previously been observed in somatic cells. Formation of pseudogenes could represent a new class of mutation in cancers and a new source of potential driver events.	Illumina HiSeq 2000	3
EGAD00001000640	Transcriptome studies in patients with rare genetic diseases can potentially aid in the interpretation of likely causal genetic variation through identification of altered transcript abundance and/or structure. RNA-Seq is the most sensitive assay for investigating both transcript structure and abundance. The primary aim of this pilot project is to investigate to what degree integrating exome-Seq and RNA-Seq data on the same individual can accelerate the identification of causal alleles for rare genetic diseases. There are two main strands to this: (i) identifying which variants discovered in exome-seq appear to be having a functional impact on transcripts, and (ii) identifying transcript outliers, especially among known causal genes, that may not necessarily have a causal variant identified from exome sequencing. The latter may identify the presence of causal variants that lie far from coding regions (e.g. the formation of cryptic splice sites deep within introns, or loss of long range regulatory elements), which can be confirmed with further targeted genetic assays. Just over 50% of all disease-causing variants recorded in the Human Gene Mutation Database (HGMD) affect transcript structure and abundance (e.g. nonsense SNVs, essential splice site SNVs, frameshifting indels, CNVs). This pilot project will study RNA from lymphoblastoid cell-lines from 12 patients with primordial dwarfism syndromes; for 10 of these samples we have previously generated exome data as part of our collaboration with the group of Prof Andrew Jackson. The two remaining samples are positive controls where the causal mutation is known, and is known to affect transcript structure and/or abundance. Primordial dwarfism is a prime candidate for these RNA-seq studies because all known causal mutations to date have key roles in DNA replication and thus, unsurprisingly, the products of the causal genes are typically ubiquitously expressed. 
Each RNA will be sequenced, with two technical replicates (independent RT-PCR and libraries) per sample, and each replicate run in 1/2 of a HiSeq lane using 100bp paired reads. Sample preparation was as follows: the cells were grown to confluency, then pellets frozen at -80. RNA samples were prepared using the Qiagen RNeasy kit, then nanodropped and analyzed using the bioanalyzer to determine concentration and purity. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000	24
EGAD00001000641	DNA replication errors occurring in mismatch repair (MMR) deficient cells persist as mismatch mutations and predispose to a range of tumors. Here, we sequenced the first whole-genomes from MMR-deficient endometrial tumors.	Complete Genomics Illumina HiSeq 2000	44
EGAD00001000642		Illumina HiScanSQ	2
EGAD00001000643		Illumina HiScanSQ	2
EGAD00001000644	ICGC PedBrain DNA Methylation project	Illumina HiSeq 2000	42
EGAD00001000645	ICGC MMML-seq Data Freeze July 2013 whole genome sequencing		42
EGAD00001000646	A selection of human cancers harbours somatic driver mutations in genes encoding histones, most notably childhood brain tumours with K27M substitutions of the histone 3.3 gene, H3F3A. We performed whole genome sequencing of the benign cartilage tumour, chondroblastoma, and targeted sequencing of histone 3.3 genes, H3F3A and H3F3B, in seven further skeletal tumour types. We identified an exceptionally high prevalence of novel histone 3.3 driver mutations at glycine 34 and at lysine 36. Histone 3.3 gene mutations were found in 91% of giant cell tumours of bone (48/53), mainly H3F3A G34W variants, and in 92% of chondroblastoma (73/79), predominantly K36M mutations in H3F3B. H3F3B is paralogous to the cancer gene H3F3A. However, H3F3B driver variants have not previously been reported in human cancer. Our observations demonstrate remarkable tumour-specificity of mutations, with respect to which histone 3.3 gene and residue is mutated, indicating that the advantage these mutations confer is tumour dependent. Moreover, tumour-specific mutation of H3F3A and H3F3B suggests that, although both genes encode identical proteins, they are likely non-redundant and employed differentially during skeletal development.	Illumina HiSeq 2000	14
EGAD00001000647	We are sequencing the exomes of patients with paroxysmal neurological disorders mainly focusing on migraine and epilepsy. Cases are collected from performance sites of members of EuroEPINOMICS. Most cases have a strong family history. The study sample will include both cases and controls.	Illumina HiSeq 2000	110
EGAD00001000648	ICGC MMML-seq Data Freeze July 2013 transcriptome sequencing		31
EGAD00001000650	ICGC MMML-seq Data Freeze July 2013 miRNA sequencing		52
EGAD00001000652	Pulldown experiments will be performed on a number of patients with Myeloproliferative Neoplasms (MPN). The pulldown will be a bespoke design targeting known mutations; it will be sequenced and analysed to inform the prevalence of mutations and the possibility of use as a diagnostic tool.	Illumina HiSeq 2000	1036
EGAD00001000653	This is a continuation of the Chordoma Sequencing Project. All cancers arise due to somatically acquired abnormalities in DNA sequence. Systematic sequencing of cancer genomes allows acquisition of complete catalogues of all classes of somatic mutation present in cancer. These mutation catalogues will allow identification of the somatically mutated cancer genes that are operative and characterise patterns of somatic mutation that may reflect previous exogenous and endogenous mutagenic exposures. In this application, we aim to perform whole genome sequencing on 10 chordoma matched genome pairs. RNA Sequencing/Methylation and SNP6 and an additional sequencing of three cancer cell lines will be added to this work.	Illumina HiSeq 2000	10
EGAD00001000654	DATA FILES FOR BALL-PAX5	Illumina HiSeq 2000	153
EGAD00001000655	DATA FILES FOR Histone-NSD2_RNASeq	Illumina HiSeq 2000	8
EGAD00001000656	FACS phenotype of 1629 Sardinian samples		1629
EGAD00001000657	DATA FILES FOR Histone Capture bams	Illumina HiSeq 2000	962
EGAD00001000658	Changes in gene dosage are a major driver of cancer1, engineered from a finite, but increasingly well annotated, repertoire of mutational mechanisms2-6. These processes operate over levels ranging from individual exons to whole chromosomes, often generating correlated copy number alterations across hundreds of linked genes. An example of the latter is the 2% of childhood acute lymphoblastic leukemia (ALL) characterized by recurrent intrachromosomal amplification of megabase regions of chromosome 21 (iAMP21)7,8. To dissect the interplay between mutational processes and selection on this scale, we used genomic, cytogenetic and transcriptional analysis, coupled with novel bioinformatic approaches, to reconstruct the evolution of iAMP21 ALL. We find that individuals born with the rare constitutional Robertsonian translocation between chromosomes 15 and 21, rob(15;21)(q10;q10)c, have ~2700-fold increased risk of developing iAMP21 ALL compared to the general population. In such cases, amplification is initiated by chromothripsis involving both sister chromatids of the dicentric Robertsonian chromosome. In contrast, sporadic iAMP21 is typically initiated by breakage-fusion-bridge (BFB) events, often followed by chromothripsis or other rearrangements. In both sporadic iAMP21 and iAMP21 in rob(15;21)c individuals, the final stages of amplification frequently involve large-scale duplications of the abnormal chromosome. The end-product is a derivative chromosome 21 or a derivative originating from the rob(15;21)c chromosome, der(15;21), respectively, with gene dosage optimised for leukemic potential, showing constrained copy number levels over multiple linked genes. In summary, the constitutional translocation, rob(15;21)c, predisposes to leukemia through a novel mechanism, namely a propensity to undergo chromothripsis, likely related to its dicentric nature. 
More generally, our data illustrate that several cancer-specific mutational processes, applied sequentially, can co-ordinate to fashion copy number profiles over large genomic scales, incrementally refining the fitness benefits of aggregated gene dosage changes.	Illumina Genome Analyzer II Illumina HiSeq 2000	9
EGAD00001000659		Illumina HiSeq 2000	12
EGAD00001000660	Analysis .bam files from HiSeq sequencing of Australian ICGC PDAC study samples, submitted 20130826		353
EGAD00001000661	Bespoke validation experiments will be performed on ER+ Breast Cancer cases to confirm the presence of mutations found in whole genome sequencing.	Illumina HiSeq 2000	46
EGAD00001000662	We propose to definitively characterise the somatic genetics of Triple negative breast cancer through generation of comprehensive catalogues of somatic mutations in 500 cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses. This study will use a bespoke bait set to pulldown regions of interest found in whole genome sequencing to validate mutations found.	Illumina HiSeq 2000	46
EGAD00001000663	This study aims to re-sequence findings from whole genome studies using a bespoke pulldown method to validate mutations in those genomes sequenced.	Illumina HiSeq 2000	47
EGAD00001000664	Whole Genome Seq: Illumina HiSeq sequence data (with >30x coverage) were aligned to the hg19 human reference genome assembly using BWA (Li and Durbin, 2009); duplicate reads were removed from the final BAM file. No realignment or recalibration was performed. Paired-end RNA sequencing reads were mapped to the hg19 assembly of the human reference genome using BWA. Each ChIP-seq library was sequenced with two complete lanes on the Illumina HiSeq 2500 in the 101-base paired-end rapid mode and aligned to hg19 using BWA. This resulted in the following coverage values (genome-wide, after deduplication, including all uniquely mapping reads): GBM103 macroH2A1: 17x, H3K36me3: 20x; MB59 macroH2A1: 11x, H3K36me3: 11x.		7
EGAD00001000665	Illumina HiSeq sequence data (with >30x coverage) were aligned to the hg19 human reference genome assembly using BWA (Li and Durbin, 2009); duplicate reads were removed from the final BAM file. No realignment or recalibration was performed. Sample derived from secondary myelodysplastic syndrome (MDS), arising after treatment for medulloblastoma in an 11-year old female Li-Fraumeni syndrome case (LFS-MB1; Rausch et al., 2012; matching WGS data available under EGAS00001000085).		1
EGAD00001000666	HSC73_clone: Bone marrow mononuclear cells from the healthy 73-year-old female were thawed and labeled with Alexa-Fluor 488-conjugated anti-CD34 (581, Biolegend), Alexa-Fluor 700-conjugated anti-CD38 (HIT2, eBioscience), a cocktail of APC-conjugated lineage antibodies consisting of anti-CD4 (RPA-T4), anti-CD8 (RPA-T8), anti-CD11b (ICRF44), anti-CD20 (2H7), anti-CD56 (B159, all BD Biosciences), anti-CD14 (61D3), anti-CD19 (HIB19) and anti-CD235a (HIR2, all eBioscience) and 1 μg/ml propidium iodide (Sigma). Using a BD FACSAria cell sorter, single Lin-CD34+CD38-PI- cells were individually sorted into low-adhesion 96-well tissue culture plates (Corning) containing 100 μl of StemSpan Serum-Free Expansion Medium (Stemcell technologies) supplemented with 100ng/ml of human SCF and FLT-3L, 50ng/ml of human TPO, 20ng/ml of human IL-3, IL-6 and G-CSF (all cytokines from Peprotech) and 50U/ml of penicillin and 50μg/ml of streptomycin (Sigma). Cells were incubated at 37 degrees C in a humidified atmosphere with 5% CO2 in air. After 5 days in culture, another 100 μl of cytokine-containing medium were added. 13 days after seeding, clones B6 and G2 had expanded to approx. 10^5 cells and were selected for whole genome sequencing (2x101bp, paired-end, Illumina HiSeq2500) after tagmentation-based library preparation (see Extended Experimental Procedures) for clone B6 and standard library preparation for clone G2. For germline control, ~10^6 unsorted bone marrow mononuclear cells from the same donor were used for sequencing. An average of 30-fold sequence coverage was obtained for each of the clones and the matching control. L4clone: A progenitor cell clone was raised from a peripheral blood sample of the 39-year-old healthy female. Frozen peripheral blood mononuclear cells (PBMCs) were isolated from 2 ml heparinised peripheral blood via Ficoll Paque density centrifugation. 
A methylcellulose assay was performed as described earlier (Weisse et al., 2012). In brief, non-adherent mononuclear cells were incubated in the presence of the recombinant human cytokines IL-3, IL-5 and GM-CSF (R&D systems) over 14 days to induce colony formation. Colonies were detected under an inverted light microscope, and plucked by a pipette when colonies had approximately 10,000 cells/CFU. Each colony was washed three times in PBS and finally frozen as a cell pellet at -80 degrees C. Genomic DNA was isolated using the QIAamp DNA micro kit according to the instructions of the manufacturer (Qiagen, Hilden, Germany). Whole genome sequencing (2x101bp, paired-end, Illumina HiSeq2500) was performed for colony 4 after tagmentation-based library preparation and resulted in 15-fold sequence coverage for each of the colony and the matching whole blood.		5
EGAD00001000667		Illumina HiSeq 2000	72
EGAD00001000669	High-grade serous ovarian cancer (HGSC) is characterized by poor outcome, often attributed to the emergence of treatment-resistant subclones. We sought to measure the degree of genomic diversity within primary, untreated HGSCs to examine the natural state of tumour evolution prior to therapy. We performed exome sequencing, copy number analysis, targeted amplicon deep sequencing and gene expression profiling on 31 spatially and temporally separated HGSC tumour specimens (six patients), including ovarian masses, distant metastases and fallopian tube lesions. We found widespread intratumoural variation in mutation, copy number and gene expression profiles, with key driver alterations in genes present in only a subset of samples (eg PIK3CA, CTNNB1, NF1). On average, only 51.5% of mutations were present in every sample of a given case (range 10.2 to 91.4%), with TP53 as the only somatic mutation consistently present in all samples. Complex segmental aneuploidies, such as whole-genome doubling, were present in a subset of samples from the same individual, with divergent copy number changes segregating independently of point mutation acquisition. Reconstruction of evolutionary histories showed one patient with mixed HGSC and endometrioid histology, with common aetiologic origin in the fallopian tube and subsequent selection of different driver mutations in the histologically distinct samples. In this patient, we observed mixed cell populations in the early fallopian tube lesion, indicating that diversity arises at early stages of tumourigenesis. Our results revealed that HGSCs exhibit highly individual evolutionary trajectories and diverse genomic tapestries prior to therapy, exposing an essential biological characteristic to inform future design of personalized therapeutic solutions and investigation of drug-resistance mechanisms	Illumina Genome Analyzer	25
EGAD00001000670	A potential and very serious side effect of treating IBD with antiTNFa therapies (the current gold standard) is the development of systemic lupus erythematosus (SLE). This side effect is rare and unpredictable. Out of several thousand cases having received treatment, the University of Calgary have accumulated 12 individuals with full phenotyping and novel serological antibody discovery panel data. We propose to exome sequence these samples in an effort to identify rare highly-penetrant variants that could be underlying this severe phenotype.	Illumina HiSeq 2000	15
EGAD00001000671	Primary sclerosing cholangitis is a rare autoimmune disease of the liver (prevalence = 10/100,000) with a mean age of onset of 40 years. We are currently undertaking GWAS and immunochip experiments to identify loci underlying PSC susceptibility. Through our collaborators at the University of Calgary we have access to DNA from three parent-offspring trios where the children required liver transplants due to PSC before the age of 9. These are extremely rare cases indeed and we believe that exome-sequencing represents a powerful means of identifying the causal mutation underlying this severe phenotype.	Illumina HiSeq 2000	5
EGAD00001000672	Whole-genome Bisulfite sequencing of two multiple myeloma samples and one pooled sample of plasma cells.	Illumina HiSeq 2000	3
EGAD00001000673	WGBS-seq for monocytes and neutrophils	Illumina HiSeq 2000	12
EGAD00001000674	DNaseI-seq for monocytes	Illumina HiSeq 2000	4
EGAD00001000675	RNA-seq for monocytes and neutrophils	Illumina HiSeq 2000	12
EGAD00001000676	ChIP-seq for monocytes and neutrophils	Illumina HiSeq 2000	14
EGAD00001000677	Genome-wide analysis of H3K27me3 occupancy and DNA methylation in K27M-mutant and H3.3-WT primary pediatric high-grade gliomas (pHGGs) as well as pediatric pHGG cell lines. The study aims to elucidate the connection between K27M-induced H3K27me3 reduction and changes in DNA methylation as well as gene expression.	Illumina HiSeq 2000	19
EGAD00001000678	FFPE CPA accreditation of genome-scale sequencing in routinely collected formalin-fixed paraffin-embedded (FFPE) cancer specimens versus matched fresh-frozen samples using targeted pulldown capture prior to Illumina sequencing.	Illumina HiSeq 2000	341
EGAD00001000679	A bespoke targeted pulldown experiment will be performed on patients with Angiosarcoma. The resulting products will be sequenced to determine the prevalence of previously found mutations in these patients.	Illumina HiSeq 2000	107
EGAD00001000680	Single end short-read (50 bp) SOLiD 4 sequencing data for 300 individuals, constituting 100 patient-parent trios. For more details please see: http://www.nejm.org/doi/full/10.1056/NEJMoa1206524	AB SOLiD 4 System	202
EGAD00001000688	In this study we performed ultra deep sequencing of genes associated with anti-EGFR resistance, such as KRAS, BRAF, PIK3CA, and EGFR in 17 plasma-DNA samples from a total of 10 patients treated with anti-EGFR therapy.	Illumina MiSeq	25
EGAD00001000689	Whole genome DNA sequencing was used to decrypt the phylogeny of multiple samples from distinct areas of cancer and morphologically normal tissue taken from the prostates of 3 men. For each of three different prostates, multiple tumour samples (4, 5, and 3 depending on the case) and one normal tissue sample were whole genome sequenced with a matched blood sample using the Illumina HiSeq platform. Tumour samples were sequenced to a target depth of 50X and normals and blood to a target depth of 30X. As of September 2020, some of the studies using these data include: Cooper et al, Nature Genetics 2015 (PMID: 25730763); Wedge et al, Nature Genetics 2018 (PMID: 29662167); Pan-Cancer Analysis of Whole Genomes, Nature 2020 (PMID: 32025007).	Illumina HiSeq 2000	-
EGAD00001000691	Dataset for "Genome-wide analysis of HPV integration in human cancers reveals recurrent, focal genomic instability"		12
EGAD00001000692	Files associated with the dataset: HS1626.bam, HS1484.bam, HS1483.bam, HS1482.bam, HS1481.bam, HS1480.bam, HS1479.bam, HS1478.bam, A13805.bam, A13800.bam, A13799.bam, A05253.bam, A05252.bam, A13806.bam	Illumina Genome Analyzer Illumina Genome Analyzer II Illumina HiSeq 2000	12
EGAD00001000693	The genetic consequences of cellular transformation by Epstein-Barr-Virus were assessed by comparing whole genome sequences of the original genome (before transformation) and the genome after transformation.		2
EGAD00001000694	This is an ongoing project and a continuation of the sequencing we have been doing over the last few years. We have some additional families and probands with syndromes of insulin resistance not previously sequenced within UK10K or other core funded projects. We would like to complete the sequencing in all of the good quality families and probands we have; this would require another ~50 samples to be WES sequenced. This cohort has already proven to be a rich source of interesting findings, with papers in Science and Nature Genetics.	Illumina HiSeq 2000	68
EGAD00001000695	DATA FILES FOR SJLGG	Illumina HiSeq 2000	46
EGAD00001000696	The Ethiopian area stands among the most ancient ones ever occupied by human populations and their ancestors. In particular, according to archaeological evidence, it is possible to trace back the presence of Hominids to at least 3 million years ago. Furthermore, the present day human populations show a great cultural, linguistic and historic diversity, which makes them essential candidates for investigating a considerable part of African variability. Following the typing of 300 Ethiopian samples on Illumina Omni 1M (see Human Variability in Ethiopia project, previously approved by the Genotyping committee) we now have a clearer idea of which populations living in the area include most of the diversity. This project therefore aims to sequence the whole genome of 300 individuals at low (4-8x) depth belonging to the six most representative populations of the Ethiopian area to produce a unique catalogue of variants peculiar to North East Africa. Furthermore, 6 samples (one from each population) will also be sequenced at high (30x) depth to ensure full coverage of the diversity spectrum. The retrieved variants will be of great help in evaluating the demographic dynamics of those populations as well as shedding light on the migrations out of Africa.	Illumina HiSeq 2000	5
EGAD00001000697	Illumina HiSeq sequence data (with >30x coverage) were aligned to the hg19 human reference genome assembly using BWA (Li and Durbin, 2009); duplicate reads were removed from the final BAM file. No realignment or recalibration was performed.	Illumina Genome Analyzer IIx Illumina HiSeq 2000	90
EGAD00001000698	Illumina HiSeq sequence data (with >80x coverage) were aligned to the hg19 human reference genome assembly using BWA (Li and Durbin, 2009); duplicate reads were removed from the final BAM file. No realignment or recalibration was performed. The whole exome sequencing data of 20 SHH medulloblastomas from phs000504.v1.p1 dataset has been used in our study on SHH medulloblastomas: http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000504.v1.p1		4
EGAD00001000699	Illumina HiSeq sequence data (with >80x coverage) were aligned to the hg19 human reference genome assembly using BWA (Li and Durbin, 2009); duplicate reads were removed from the final BAM file. No realignment or recalibration was performed.	Illumina HiSeq 2000	78
EGAD00001000702	Complete set of bam files associated with study EGAS00001000622		190
EGAD00001000703	SCLC - Whole genome sequencing data Publication Peifer et al., 2012, Nature Genetics	Illumina Genome Analyzer IIx	29
EGAD00001000704		Illumina HiSeq 2000	-
EGAD00001000705	Whole genome sequencing of 20 tumour and normal pairs of diffuse intrinsic pontine glioma (DIPG)	Illumina HiSeq 2000	40
EGAD00001000706	Whole exome sequencing of 6 tumour and normal pairs of diffuse intrinsic pontine glioma (DIPG)	Illumina HiSeq 2000	12
EGAD00001000707	Discovery of resistance mechanisms to the BRAF inhibitor vemurafenib in metastatic BRAF mutant melanoma by massively-parallel sequencing of tumour samples. Comparison of genomic characteristics of pretreatment 'sensitive' to recurrence 'resistant' tumours to identify the genetics of drug resistance.	Illumina HiSeq 2000	57
EGAD00001000708	AZIN1 amplicon sequencing data of the EGAS00001000495 project.	454 GS FLX Titanium	69
EGAD00001000709	Dataset of CageKid Blood DNA samples		95
EGAD00001000710	Whole Genome Bisulfite-seq of four B cell samples	Illumina HiSeq 2000	4
EGAD00001000711		Illumina HiSeq 2000	42
EGAD00001000712		Illumina HiSeq 2000	72
EGAD00001000713		Illumina HiSeq 2000	12
EGAD00001000714			102
EGAD00001000715	Exome sequencing was performed for paired tumor/normal samples from patients with corticotropin-independent Cushing's syndrome. Tumor DNA was extracted from adrenocortical adenomas and normal DNA was extracted from adjacent adrenal tissues or peripheral blood.	Illumina HiSeq 2000	16
EGAD00001000716	RNAseq data, Publication Fernandez-Cuesta et al., 2014, CD74-NRG1 fusions in lung adenocarcinoma	Illumina HiSeq 2000	25
EGAD00001000717	Dataset of CageKid Tumor DNA samples		95
EGAD00001000718	Dataset of CageKid Tumor RNA samples		91
EGAD00001000719	Dataset of CageKid Normal RNA samples		45
EGAD00001000720	Dataset of CageKid tumor-normal paired RNA samples		90
EGAD00001000721	This is a continuation of the Chordoma Sequencing Project. All cancers arise due to somatically acquired abnormalities in DNA sequence. Systematic sequencing of cancer genomes allows acquisition of complete catalogues of all classes of somatic mutation present in cancer. These mutation catalogues will allow identification of the somatically mutated cancer genes that are operative and characterise patterns of somatic mutation that may reflect previous exogenous and endogenous mutagenic exposures. In this application, we aim to perform whole genome sequencing on 10 chordoma matched genome pairs. RNA Sequencing/Methylation and SNP6 and an additional sequencing of three cancer cell lines will be added to this work.	Illumina HiSeq 2000	20
EGAD00001000722	Extension of angiosarcoma whole genome sequencing study	Illumina HiSeq 2000	8
EGAD00001000723	Relative Spatial Homogeneity of Embryonal Brain Tumors of Childhood		42
EGAD00001000724		Illumina HiSeq 2000	68
EGAD00001000725	This dataset contains RNA sequencing data for 675 cancer cell lines. RNA libraries were made with the TruSeq RNA Sample Preparation kit (Illumina) according to the manufacturer's protocol. The libraries were sequenced on an Illumina HiSeq 2000.	Illumina HiSeq 2000	675
EGAD00001000726	In total 30 Acute Myeloid Leukemias with an acquired inv(3)(q21q26) or t(3;3)(q21;q26) have been characterized by whole transcriptome sequencing (RNA-Seq). The 3q-aberration leads to overexpression of the proto-oncogene EVI1, but the mechanism of overexpression has thus far been elusive. The RNA-Seq was integral in determining the precise enhancer inducing the overexpression and led to other key discoveries.	Illumina HiSeq 2500	30
EGAD00001000727	Targeted resequencing on the specific regions chr3:126036241-130672290 and chr3:157712147-175694147 in hg19 centered on the chromosomal regions 3q21 and 3q26 respectively. The focus lies on the detection of the exact breakpoints in Acute Myeloid Leukemia (AML) patients having acquired an inv(3)(q21q26) or t(3;3)(q21;q26). This dataset contains all information to detect all structural variants contained within these regions, including the 3q-aberrations inducing the overexpression of the proto-oncogene EVI1.	Illumina HiSeq 2500	38
EGAD00001000728	Low coverage whole genome sequencing of samples from individuals from Friuli Venezia Giulia, an Italian genetic isolate population.	Illumina HiSeq 2000	199
EGAD00001000729	The Val Borbera is a region characterized by low iodine and high prevalence of thyroid disorders, the commonest endocrine disorders in the general population. About 30% of the participants of the Val Borbera Project were affected by such disorders and were characterized by several parameters: TSH level, anti-TPO antibodies, echography, family origin. Individuals with extreme phenotypes were identified and could be clustered based on family origin and genotype. We propose to exome sequence 6 of them, affected with true goiter, at high depth (40-60x) to obtain information on exonic rare variants. Due to the family structure and to the availability of whole genome sequence information on 110 individuals from the isolated population we expect to be able to identify putative causative variants for thyroid disorders that may be studied in the remaining affected individuals.	Illumina HiSeq 2000	8
EGAD00001000730	The VBSEQ project aims to combine available extensive genetic and phenotypic data to the latest high-throughput genome sequencing technology and ad hoc statistical analysis to identify new rare genetic variants underlying complex traits. Up to 100 Val Borbera samples will be sequenced to a 6x depth.	Illumina HiSeq 2000	110
EGAD00001000731	This study includes Phase 2 whole-genome sequencing data (at 4x depth) of 100 individuals from an Italian genetic isolate population (Val Borbera, abbreviated VBI) of the Italian Network of Genetic Isolates (INGI). The INGI-VBI_SEQ2 project aims to combine available extensive genetic and phenotypic data to the latest high-throughput genome sequencing technology and ad hoc statistical analysis to identify new rare genetic variants underlying complex traits.	Illumina HiSeq 2000	100
EGAD00001000732	RNA sequencing to validate findings of somatic pseudogenes acquired during cancer development	Illumina HiSeq 2000	3
EGAD00001000733	The dataset entails 48 RRBS libraries of 24 sibling pairs. 24 individuals were conceived during the Dutch Famine, a severe 6-month famine at the end of World War 2. A same-sex sibling was added as a control, allowing partial matching for (early) familial environment and genetics.	Illumina Genome Analyzer IIx	48
EGAD00001000734	Paired end Illumina sequencing of whole exomes of multiple tumour regions.	Illumina Genome Analyzer IIx Illumina HiSeq 2000 Illumina HiSeq 2500	88
EGAD00001000735	Here we present the genomes of three secondary angiosarcomas	Illumina HiSeq 2000	7
EGAD00001000737	Whole exome sequencing data from 30 donors (46 tumors and 30 non-tumoral whole exome sequencing, paired-end, HiSeq 2000, Illumina) collected by the Inserm U674, PI Jessica Zucman-Rossi - Institut National du Cancer (INCa), PI Fabien Calvo, France.	Illumina HiSeq 2000	76
EGAD00001000738	Extension of angiosarcoma whole genome sequencing study	Illumina HiSeq 2000	4
EGAD00001000740	UK10K_COHORT_ALSPAC REL-2012-06-02: Low-coverage whole genome sequencing; variant calling, genotype calling and phasing	Illumina HiSeq 2000	2307
EGAD00001000741	UK10K_COHORT_TWINSUK REL-2012-06-02: Low-coverage whole genome sequencing; variant calling, genotype calling and phasing	Illumina Genome Analyzer II Illumina HiSeq 2000	1854
EGAD00001000743	These files contain a total of 20.4M SNVs and the complete information output by the GATK UnifiedGenotyper v1.4 on all 767 GoNL samples. These calls are not trio-aware and all genotypes were reported regardless of their quality. Both filtered and passing calls are reported in these files. Filtered calls include (1) calls failing our VQSR threshold and (2) calls in the GoNL inaccessible genome.		-
EGAD00001000744	The samples in this panel come from 250 families: 248 parents-child trios and 2 parent-child duos. As the children do not provide additional haplotypes or population information, they were excluded from the panel. The samples present in the release are composed of 248 couples, 2 single individuals and 1 sample composed from the 2 haplotypes from the duo's children transmitted by their missing parent. The composed sample is named gonl-220c_223c. The files contain a total of 18.9M SNVs and 1.1M INDELs in autosomal chromosomes. They were generated by phasing/imputing the SNVs (a) and INDELs (b) using MVNCall. Only sites passing filters are reported. Sites filtered as part of the GoNL inaccessible genome were kept (but flagged as filtered) and still may contain true positive calls but should be used with care as they are located in parts of the genome that are less well captured (systematic under or over-covered or low-mapping quality).		-
EGAD00001000745	Data supporting the paper Transcriptional diversity during lineage commitment of human blood progenitors	Illumina HiSeq 2000 PacBio RS	26
EGAD00001000746	Fernandez-Cuesta et al., RNAseq data Pipeline	Illumina HiSeq 2000	25
EGAD00001000747	Genomic libraries will be generated from total genomic DNA derived from 4000 samples with Acute Myeloid Leukaemia. Libraries will be enriched for a selected panel of genes using a bespoke pulldown protocol. 64 Samples will be individually barcoded and subjected to up to one lane of Illumina HiSeq. Paired reads will be mapped to build 37 of the human reference genome to facilitate the characterisation of known gene mutations in cancer as well as the validation of potentially novel variants identified by prior exome sequencing.	Illumina HiSeq 2000	2734
EGAD00001000748	In this study we performed whole genome sequencing of plasma DNA (plasma-Seq) of 19 plasma-DNA samples from a total of 10 patients treated with anti-EGFR therapy. We demonstrated that development of resistance to anti-EGFR therapies is frequently associated with focal amplifications of KRAS, MET, and ERBB2. We also showed that focal KRAS amplifications can be acquired in tumor genomes of patients under cytotoxic chemotherapy. Furthermore, we provide evidence that specific chromosomal polysomies, such as overrepresentations of 12p and 7p, harboring KRAS and EGFR, respectively, determine responsiveness to anti-EGFR therapy.	Illumina MiSeq	19
EGAD00001000749		Illumina HiSeq 2000	12
EGAD00001000750	UK10K_RARE_FIND REL-2013-10-31 variant calling	Illumina HiSeq 2000	1151
EGAD00001000752	UK10K_RARE_CILWG REL-2013-09-09	Illumina HiSeq 2000	4
EGAD00001000753	UK10K_RARE_FINDWG REL-2013-09-09	Illumina HiSeq 2000	4
EGAD00001000754	UK10K_RARE_NMWG REL-2013-09-09	Illumina HiSeq 2000	5
EGAD00001000755	UK10K_OBESITY_GS UK10K_EXOME_EXTRAS	Illumina HiSeq 2000	5
EGAD00001000756	UK10K_OBESITY_SCOOP UK10K_EXOME_EXTRAS	Illumina HiSeq 2000	1
EGAD00001000757	UK10K_RARE_SIR UK10K_EXOME_EXTRAS	Illumina HiSeq 2000	2
EGAD00001000758	dataset for BGI bladder cancer project	Illumina Genome Analyzer II	198
EGAD00001000759		Illumina HiSeq 2000	86
EGAD00001000760	dataset for esophageal cancer, 17 pairs for whole-genome sequencing and 71 pairs for whole-exome sequencing	Illumina HiSeq 2000	176
EGAD00001000761	In order to establish copy number profiles from the various samples we prepared libraries and subjected them to whole-genome sequencing at a shallow sequencing depth (0.1x)	Illumina MiSeq	14
EGAD00001000762	We utilized exome sequencing for DNA obtained from saliva (germline DNA) and the four spatially separated tumor foci and 3 corresponding lymph node metastases	Illumina HiSeq 2000	8
EGAD00001000763	We used targeted deep sequencing to accurately establish the allele frequencies of the mutations identified by exome sequencing	Illumina MiSeq	23
EGAD00001000764	Adrenocortical carcinomas (ACC) are aggressive cancers originating in the cortex of the adrenal glands. Despite the overall poor prognosis, ACC outcome is heterogeneous. CTNNB1 and TP53 mutations are frequent in these tumors, but the complete spectrum of genetic changes remains undefined. Exome sequencing and SNP array analysis of 45 ACC revealed recurrent alterations in known drivers (CTNNB1, TP53, CDKN2A, RB1, MEN1) and genes not previously reported to be altered in ACC (ZNRF3, DAXX, TERT and MED12), which were validated in an independent cohort of 77 ACC. The cell-surface transmembrane E3 ubiquitin ligase ZNRF3 was the most frequently altered gene (21%), and appears as a potential novel tumor suppressor gene related to the β-catenin pathway. Our integrated genomic analyses led to the identification of two distinct molecular subgroups with opposite outcome. The C1A group of poor outcome ACC was characterized by numerous mutations and DNA methylation alterations, whereas the C1B group with good prognosis displayed a specific deregulation of two miRNA clusters. Thus, aggressive and indolent ACC correspond to two distinct molecular entities, driven by different oncogenic alterations.	Illumina HiSeq 2000	45
EGAD00001000774	This study includes whole-genome sequencing data (at 4x depth) of 100 individuals from an Italian genetic isolate population (Carlantino, abbreviated CARL) of the Italian Network of Genetic Isolates (INGI). The INGI-CARL_SEQ project aims to combine available extensive genetic and phenotypic data to the latest high-throughput genome sequencing technology and ad hoc statistical analysis to identify new rare genetic variants underlying complex traits.	Illumina HiSeq 2000	106
EGAD00001000775	Whole exome sequencing of 41 melanomas and normal DNA from Braf mutant mice: 15 tumours from UV exposed mice, 15 tumours from non-exposed mice and 11 from UV exposed, sunscreen-protected mice.	Illumina HiSeq 2000	80
EGAD00001000776	UK10K_COHORT_IMPUTATION REL-2012-06-02: imputation reference panel (20140306); Merged UK10K+1000Genomes Phase 3 imputation reference panel added (20160420)	Illumina Genome Analyzer II Illumina HiSeq 2000	3781
EGAD00001000777	Dataset contains MeDIP-Seq, MRE-Seq and H3K4me3 ChIP-Seq data on 5 GBM patients.		16
EGAD00001000779		AB SOLiD 4 System	2
EGAD00001000780		Illumina HiSeq 2000	18
EGAD00001000781	Whole genome, high coverage, sequencing of 128 Ashkenazi Jewish controls		128
EGAD00001000782	Whole-genome sequencing was performed by Illumina Inc (San Diego, CA). Libraries were constructed with ~300bp insert length and paired-end 100bp reads were sequenced on Illumina HiSeq2000.	Illumina HiSeq 2000	190
EGAD00001000783	Genomic libraries will be generated from total genomic DNA derived from 200+ patients with childhood Transient Myeloproliferative Disorder (TMD) and/or Acute Megakaryocytic Leukemia (AMKL) as well as some matched constitutional samples (n < 50). Libraries will be enriched for a selected panel of genes using a bespoke pulldown protocol. 96 Samples will be individually barcoded and subjected to up to two lanes of Illumina HiSeq. Paired reads will be mapped to build 37 of the human reference genome to facilitate the characterisation of known gene mutations in cancer as well as the validation of potentially novel variants identified by prior exome sequencing.	Illumina HiSeq 2000 Illumina HiSeq 2500	400
EGAD00001000784	This study aims to target capture sequence regions of interest from DNA derived from breast cancer patients who received neo-adjuvant chemotherapy. All patients had multiple biopsies performed before chemotherapy. Patients who had residual disease after the course of treatment underwent a further biopsy. We aim to characterise the mutations involved.	Illumina HiSeq 2000	242
EGAD00001000785	We propose to definitively characterise the somatic genetics of a selection of rare bone cancers through generation of comprehensive catalogues of somatic mutations by high coverage genome sequencing.	Illumina HiSeq 2000	33
EGAD00001000786	We are interested in the contribution mutations in the Shelterin complex protein POT1 may have to the development of melanoma. We have identified a patient who carries a splice site mutation in POT1 and as part of our analysis of this gene we aim to sequence the transcriptome of this patient to see how this mutation influences splicing. RNA has been obtained from lymphocytes collected from the patient.	Illumina MiSeq	1
EGAD00001000789	UK10K_COHORT_ALSPAC REL-2012-06-02: Phenotype data		1927
EGAD00001000790	UK10K_COHORT_TWINSUK REL-2012-06-02: Phenotype data		1854
EGAD00001000791	Exome sequencing of familial and sporadic small cell cancer of ovary cases.	Illumina HiSeq 2000 Illumina HiSeq 2500	16
EGAD00001000792	Whole exome sequencing of paediatric glioblastoma with mutations reported in the manuscript: Mutations in ACVR1, FGFR1 and TP53 associate with tumor location in histone H3 K27M pediatric midline high-grade astrocytoma	Illumina HiSeq 2000 Illumina HiSeq 2500	38
EGAD00001000794	Small cell carcinoma of the ovary of hypercalcemic type (SCCOHT) is an extremely rare, aggressive cancer affecting children and young women. We identified germline and somatic inactivating mutations in the SWI/SNF chromatin-remodeling gene SMARCA4 in 75% (9/12) of SCCOHT patients in addition to SMARCA4 protein loss in 82% (14/17) of SCCOHT tumors, but in only 0.4% (2/485) of other primary ovarian tumors. These data implicate SMARCA4 in SCCOHT oncogenesis.	Illumina HiSeq 2000	11
EGAD00001000795	Fernandez-Cuesta et al, 2014, Nature Communication, RNA Sequencing data set	Illumina HiSeq 2000	69
EGAD00001000796	This project aims to study at least 90 exomes from families with congenital heart disease. The samples have been selected in Leuven in collaboration with Koen Devriendt. Ethical approval has been sought in Leuven, Belgium and a HDMMC agreement for submitting these samples is in place at the WTSI. The phenotypes on which we will primarily focus our analysis are severe Left Ventricular Outflow Tract Obstructions (LVOTO) and Atrioventricular Septal Defect (AVSD). The indexed Agilent whole exome pulldown libraries will be sequenced on 75bp PE HiSeq (Illumina).	Illumina HiSeq 2000	167
EGAD00001000797	This project aims to study at least 90 exomes from families with congenital heart disease. The samples have been selected at the Royal Brompton Hospital in collaboration with Stuart Cook and Piers Daubeney. Ethical approval has been sought in the UK and a HDMMC agreement for submitting these samples is in place at the WTSI. The phenotypes on which we will primarily focus our analysis are severe Left Ventricular Outflow Tract Obstructions (LVOTO) and Atrioventricular Septal Defect (AVSD). The indexed Agilent whole exome pulldown libraries will be sequenced on 75bp PE HiSeq (Illumina).	Illumina HiSeq 2000	48
EGAD00001000798	In order to progress human induced pluripotent stem cells (hiPSCs) towards the clinic, several outstanding questions must be addressed. It is possible to reprogram different somatic cell types into hiPSCs but it is unclear whether some cell types carry through fewer mutations through reprogramming (either due to mutations present in the primary cells, or mutations accumulated during reprogramming). Through in depth analysis of hiPSCs generated from different somatic cells, it will be possible to assess the variation in genetic stability of different cell types.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina MiSeq	28
EGAD00001000799	The exome sequencing is performed using Agilent SureSelect 50Mb exome v3 and HiSeq 75bp paired reads with a mean sequencing coverage target of 50X.	Illumina HiSeq 2000	95
EGAD00001000800	This project aims to study exomes from families and trios with congenital heart disease (CHD). The samples have been collected under the Competence Network - Congenital Heart Defects in Berlin, Germany. The phenotypes are mainly left ventricular outflow obstruction (aortic stenosis, bicuspid aortic valve disease, coarctation and hypoplastic left heart), but will also include samples with hypoplastic right heart and atrioventricular septal defects. We will perform whole exome sequencing using Agilent sequence capture and Illumina HiSeq sequencing.	Illumina HiSeq 2000	406
EGAD00001000802	UK10K_RARE_CILWG REL-2013-03-06	Illumina HiSeq 2000	2
EGAD00001000803	UK10K_RARE_FINDWG REL-2013-03-06	Illumina HiSeq 2000	2
EGAD00001000804	UK10K_RARE_NMWG REL-2013-03-06	Illumina HiSeq 2000	1
EGAD00001000805	UK10K_RARE_THYWG REL-2013-03-06	Illumina HiSeq 2000	2
EGAD00001000806	Whole Genome Sequencing (WGS) for St. Jude High Grade Glioma (HGG) study	Illumina HiSeq 2000	63
EGAD00001000807	Whole Exome Sequencing (WES) for St. Jude High Grade Glioma (HGG) study	Illumina HiSeq 2000	148
EGAD00001000808	RIKEN collection WGS reads for 321 HCC and blood matched samples from 158 donors submitted to ICGC for release 15	Illumina Genome Analyzer IIx Illumina HiSeq 2000	321
EGAD00001000809	RIKEN collection WGS reads for 61 liver cancer and matched blood samples from 30 donors displaying biliary phenotype	Illumina HiSeq 2000	61
EGAD00001000810	Dataset for whole exome sequencing of 49 tumor-blood pairs and transcriptome sequencing of 44 tumors for adrenocortical tumors	Illumina HiSeq 2000	106
EGAD00001000811	Whole exome sequencing of 6 HCCs and matched background liver in children with bile salt export pump deficiency.	Illumina HiSeq 2000	12
EGAD00001000812	Sequencing of 350 cancer genes in BC samples from patients treated with either Epirubicin or Paclitaxel monotherapy in the neoadjuvant setting.	Illumina HiSeq 2000	364
EGAD00001000813	Fernandez-Cuesta et al., 2014, Nature Communication, Whole genome sequencing was performed using a read length of 2x100 bp for all samples. On average, 110 Gb of sequence were produced per sample, aiming for a mean coverage of 30x for both tumour and matched normal.	Illumina HiSeq 2000	29
EGAD00001000814	Whole genome alignments of DIPG patients		40
EGAD00001000815	Exome-seq, RNA-Seq, SNP array profiling of gastric tumor samples.	Illumina HiSeq 2000	102
EGAD00001000816	ICGC medulloblastoma whole genome sequencing data, ICGC release 16		44
EGAD00001000817	Alternative splicing plays critical roles in differentiation, development, and cancer (Pettigrew et al., 2008; Chen and Manley, 2009). The recent identification of specific spliceosome inhibitors has generated interest in the therapeutic potential of targeting this cellular process (van Alphen et al., 2009). Using an integrated genomic approach, we have identified PRPF6, an RNA binding component of the pre-mRNA spliceosome, as an essential driver of oncogenesis in colon cancer. Importantly, PRPF6 is both amplified and overexpressed in colon cancer, and only colon cancer cells with high PRPF6 levels are sensitive to its loss. Our data clearly point to an important role for PRPF6 in colon cancer growth and suggest that a better understanding of its role in alternative splicing in colon cancer is warranted. To determine the specific alternative splice forms that PRPF6 regulates in colon cancer, we plan three experiments: 1. The first involves knocking down expression of PRPF6 in two different cancer cell lines with 3 different siRNAs, and then completing RNA-seq to determine the gene expression changes that occur relative to a non-targeting control siRNA. Because of the role for PRPF6 in pre-mRNA splicing, we especially want to quantify the changes in splice-specific forms of all genes genome-wide to identify genes whose splicing is altered upon PRPF6 knockdown. 2. The second involves immunoprecipitating PRPF6 from two different cancer cell lines and isolating any RNA that is bound to PRPF6, since PRPF6 is an RNA-binding protein. We then want to carry out RNA-seq to identify which RNA molecules co-immunoprecipitated with PRPF6. This will help us determine possible functions for PRPF6 in regulating colon cancer growth. 3. The third involves overexpressing PRPF6 in cell lines and then carrying out RNA-seq to identify any changes in splice-specific gene expression. This will allow us to determine whether increased PRPF6 expression is sufficient to drive alternative splicing changes.	Illumina HiSeq 2000	34
EGAD00001000818	Quiescent Sox2+ cells drive hierarchical growth and relapse in Sonic hedgehog subgroup medulloblastoma		4
EGAD00001000819	We are aiming to investigate repair of a double strand break (DSB) within the genome in the presence and absence of the BLOOM protein. Zinc Finger Nucleases introduce DSBs at specified loci within the genome. Using sequencing we will assess the size of the deletion following repair. Protocol 1. Transfect normal and BLOOM deficient human iPS cells with ZFNs, using AMXA 2. Harvest cells after 5 days 3. Perform column extraction of DNA 4. PCR-amplify the ZFN region 5. Sequence and analyse repair of the DSB This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina MiSeq	6
EGAD00001000820	Fernandez-Cuesta et al, 2014, Nature Communication, Whole exome sequencing data set	Illumina HiSeq 2000	15
EGAD00001000821	Raw sequencing data for all samples in fastq format.	Illumina HiSeq 2000	767
EGAD00001000822	Whole exome sequencing and miRNA-seq data of PPB.	Illumina HiSeq 2000 Illumina MiSeq	18
EGAD00001000824	RNA sequencing will be undertaken to reconstruct rearrangements at the level of transcription to determine pathognomonic genomic events in chondromyxoid fibroma.	Illumina HiSeq 2000	1
EGAD00001000825	This study aims to define the landscape of somatic mutations in sun exposed human skin by deep sequencing, analyse their frequency and use the data to infer the effect of mutations on proliferating cell behaviour. The frequency of each mutation will reflect the size of the clone of cells in the tissue sample. By analyzing small samples, clones with as few as 100 cells will be detectable. Allele frequency distributions for each mutation will be used to infer cell fate using published methods (Klein et al. 2010). This study will shed unprecedented light on the early clonal events that lead to the emergence of cancer.	Illumina HiSeq 2000 Illumina HiSeq 2500	454
EGAD00001000826	We propose to definitively characterise the somatic genetics of Osteosarcoma cancer through generation of comprehensive catalogues of somatic mutations by high coverage genome and transcriptome sequencing.	Illumina HiSeq 2000	10
EGAD00001000827	In order to progress human induced pluripotent stem cells (hiPSCs) towards the clinic, several outstanding questions must be addressed. It is possible to reprogram different somatic cell types into hiPSCs and from studies in the mouse, it appears that an epigenetic memory of the starting cell type is carried over to hiPSCs. However a comprehensive comparative study of the characteristics of these hiPSCs has been missing from the literature. Importantly studies which aimed to address these aspects of hiPSCs have used cells from different patients. In order to avoid this important confounding variable and to keep the genetic background constant, tissue samples were procured from the patients and reprogrammed to iPS cells. The methylation status of these iPS cells will be compared. Protocol: Primary cell cultures were generated and reprogrammed to iPS cells. DNA was extracted and immunoprecipitated using anti-methyl cytosine and anti-hydroxymethyl cytosine antibodies. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina MiSeq	4
EGAD00001000828	Fibroblasts have been shown to re-program into induced pluripotent stem (hiPS) cells, through over-expression of pluripotency genes. These hiPS cells show similar characteristics to embryonic stem cells including cell surface markers, epigenetic changes and ability to differentiate into the three germ layers. However it is unclear as to the extent of changes in gene expression through the re-programming process. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000	6
EGAD00001000829		Illumina HiSeq 2000	16
EGAD00001000830		Illumina HiSeq 2000	14
EGAD00001000831		Illumina HiSeq 2000	30
EGAD00001000832		Illumina HiSeq 2000	16
EGAD00001000833		Illumina HiSeq 2000	10
EGAD00001000834		Illumina HiSeq 2000	20
EGAD00001000835		Illumina HiSeq 2000	8
EGAD00001000836		Illumina HiSeq 2000	49
EGAD00001000842	RIKEN collection WGS reads for 100 HCC and matched blood samples from 50 donors submitted to ICGC for release 16	Illumina HiSeq 2000	100
EGAD00001000843		Illumina HiSeq 2000	12
EGAD00001000844		Illumina HiSeq 2000	22
EGAD00001000845			44
EGAD00001000847	Shwachman-Diamond syndrome (SDS) is a rare autosomal recessive disorder characterized by exocrine pancreatic insufficiency, bone marrow dysfunction, leukemia predisposition, and skeletal abnormalities. We aim to characterise the structural effects of SDS in patients with this disorder by exome sequencing.	Illumina HiSeq 2000	2
EGAD00001000848	To evaluate the presence of mutations in frequently mutated genes in MPN by performing targeted resequencing of a selected gene panel comprising 111 genes across 40 samples with MPN.	Illumina MiSeq	48
EGAD00001000849		Illumina HiSeq 2000	50
EGAD00001000850	Small cell carcinoma of the ovary of hypercalcemic type (SCCOHT) is an extremely rare, aggressive cancer affecting children and young women. We identified germline and somatic inactivating mutations in the SWI/SNF chromatin-remodeling gene SMARCA4 in 75% (9/12) of SCCOHT patients in addition to SMARCA4 protein loss in 82% (14/17) of SCCOHT tumors, but in only 0.4% (2/485) of other primary ovarian tumors. These data implicate SMARCA4 in SCCOHT oncogenesis.	Illumina HiSeq 2000	19
EGAD00001000853	DATA FILES FOR SJEPD	Illumina HiSeq 2000	37
EGAD00001000854	DATA FILES FOR SJEPD	Illumina HiSeq 2000	77
EGAD00001000856		Illumina HiSeq 2000	1
EGAD00001000865	WGS of 14 paired samples of Bladder Cancer patient	Illumina HiSeq 2000	28
EGAD00001000868	FFPE CPA accreditation of genome-scale sequencing in routinely collected formalin-fixed paraffin-embedded (FFPE) cancer specimens versus matched fresh-frozen samples using targeted pulldown capture prior to Illumina sequencing.	Illumina HiSeq 2000 Illumina HiSeq 2500	60
EGAD00001000869	It is the ambition of the team formed by members of the Netherlands Cancer Institute (NKI) and the Cancer Genome Project at the Wellcome Trust Sanger Institute (WTSI) to unravel the genomic and phenotypic complexity of human cancers in order to identify optimal drug combinations for personalized cancer therapy. Our integrated approach will entail (i) deep sequencing of human tumours and cognate mouse tumours; (ii) drug screens in a 1000+ fully characterized tumour cell line panel; (iii) high-throughput in vitro and in vivo shRNA and cDNA drug resistance and enhancement screens; (iv) computational analysis of the acquired data, leading to significant response predictions; (v) rigorous validation of these predictions in genetically engineered mouse models and patient-derived xenografts. This integrated effort is expected to yield a number of combination therapies and companion-diagnostics biomarkers that will be further explored in our existing clinical trial networks.	Illumina HiSeq 2000	62
EGAD00001000870	Testing logistics and infrastructure of molecular screening program. Core biopsies taken from invasive recurrent or metastatic breast cancer to evaluate and identify molecular traits rendering them suitable for clinical trials	Illumina HiSeq 2500	52
EGAD00001000871	The purpose of this study is to sequence 500 known cancer genes in 960 newly diagnosed high risk breast cancer patients treated with current standard of care therapies and trastuzumab, for somatic alteration and copy number changes. We will be using next gen sequencing technology to determine the prognostic relevance of these somatic genetic alterations and of the low frequency events to determine if they are associated with trastuzumab benefit or HER2 positive breast cancer, i.e. treatment interaction. The samples will be analysed and correlated with clinical variables including outcome.	Illumina HiSeq 2000	993
EGAD00001000872	These samples are to be analysed with the CGP Developed cancer panel and the results will be compared with WGS data from 4 different commercial providers.	Illumina HiSeq 2500	8
EGAD00001000873	Fastq files of 10 samples of chondrosarcoma	Illumina Genome Analyzer IIx Illumina HiSeq 2000	10
EGAD00001000874	Indel/point mutation of chondrosarcoma		10
EGAD00001000875	The CRO7 clinical trial recruited patients with clinically operable rectal adenocarcinoma. Patients were randomized to either pre-operative short course radiotherapy or surgery followed by chemo-radiotherapy only in those patients at high risk of local relapse. Patients in both arms then received standard 5-FU based adjuvant chemotherapy as per local policy. We intend to use FFPE derived DNA from the primary tumours to identify patterns of mutations or copy number alterations that are predictive of local or distant relapse.	Illumina HiSeq 2000	330
EGAD00001000876		Illumina HiSeq 2000	98
EGAD00001000877	Complete WGS and RNA-Seq dataset for Australian ICGC ovarian cancer sequencing project 2014-07-07, representing 93 donors. Sequencing was performed on Illumina HiSeq. Alignment of the lane-level fastq data was performed with bwa (WGS data) and RSEM (transcriptome data). For this dataset lane-level .bam files have been merged and de-duplicated to create a single bam file for each sample type (tumour/normal) for each donor. This dataset supersedes all previous datasets for this study. 2016-08-08: updated with 14 outstanding RNA-seq samples & corresponding RSEM bams. 2016-12-07: updated with 7 outstanding RNA-seq controls and corresponding RSEM bams.		331
EGAD00001000878	RNA-Seq files accompanying Genetic landscape of pediatric Rhabdomyosarcoma	Illumina HiSeq 2000	42
EGAD00001000879	Genomic libraries will be generated from total genomic DNA derived from 200+ patients with childhood Transient Myeloproliferative Disorder (TMD) and/or Acute Megakaryocytic Leukemia (AMKL) as well as some matched constitutional samples (n < 50). Libraries will be enriched for a selected panel of genes using a bespoke pulldown protocol. 96 Samples will be individually barcoded and subjected to up to two lanes of Illumina HiSeq. Paired reads will be mapped to build 37 of the human reference genome to facilitate the characterisation of known gene mutations in cancer as well as the validation of potentially novel variants identified by prior exome sequencing.	Illumina HiSeq 2500	335
EGAD00001000880	Genotyping by array and Transcriptome profiling by high-throughput sequencing		233
EGAD00001000881	RNA sequencing of Resistant BCC samples.	Illumina HiSeq 2000	11
EGAD00001000882	Targeted genome sequences of the human X chromosome in 4 colorectal adenomas and 4 matched normal tissues from male patients	Illumina Genome Analyzer IIx Illumina HiSeq 2000	8
EGAD00001000883	Illumina HiSeq paired-end exome sequencing of a trio and singleton.	Illumina HiSeq 2000	4
EGAD00001000884	In order to elucidate whether newly acquired genetic alterations during serial transplantation of patient derived primary pancreatic cancer cultures contribute to the observed clonal dynamics in vivo, all coding genes of two patient derived primary cultures and derived genetically marked serial xenografts (1°/2°/3°) were sequenced.	Illumina HiSeq 2000	10
EGAD00001000885	Exome read sequences for 30 tumor-normal pairs for the study "Diverse modes of genomic alterations in Hepatocellular Carcinoma".	Illumina HiSeq 2000	60
EGAD00001000886	RNA-Sequencing data (raw read sequences) for 23 samples, from 12 patients, for the study "Diverse modes of genomic alterations in Hepatocellular Carcinoma"	Illumina HiSeq 2000	23
EGAD00001000887	Exome sequencing of Resistant BCC samples.	Illumina HiSeq 2000	23
EGAD00001000888	NSCLC WGS.	AB 5500 Genetic Analyzer	4
EGAD00001000889	NSCLC targeted.	Ion Torrent PGM	4
EGAD00001000891	To characterize the subclonal genomic architecture of androgen-deprived metastatic prostate cancer, we performed whole-genome sequencing (WGS) of 51 tumours from 10 patients to an average sequencing depth of 55x, including multiple metastases from different anatomic sites in each patient and, in five cases, the prostate tumour. Noncancerous DNA from blood or other tissue is used as reference comparison for each patient. The patients are part of PELICAN (Project to ELIminate Lethal Cancer) rapid autopsy study led by G. Steven Bova at Johns Hopkins University (USA) and Tampere University (Finland). As of September 2020, some of the studies using these data include: Gundem et al, Nature 2015 (PMID: 25830880). Additional EGAD00001000891 sample metadata is contained in Supplementary Information in this report. Tubio et al, Science 2014 (PMID: 25082706); Behjati et al, Nature Comm 2015 (PMID: 27615322); Wedge et al, Nature Genetics 2018 (PMID: 29662167); Pan-Cancer Analysis of Whole Genomes, Nature 2020 (PMID: 32025007); Rodriguez-Martin et al, Nature Genetics 2020 (PMID: 32024998); Woodcock et al, Nature Comm 2020 (In Press)	Illumina HiSeq 2000	62
EGAD00001000892	Whole Genome Sequencing Illumina HiSeq data from 20 men with prostate cancer. 20 samples were taken from primary tissue obtained at prostatectomy (target sequencing depth 50X) with matched blood control (target sequencing depth 30X). These were submitted for use in the ICGC Pan-Cancer Analysis of Whole Genomes project. Same raw data submitted in EGAD00001001116. As of September 2020, some of the studies using these data include: Wedge et al, Nature Genetics 2018 (PMID: 29662167) Pan-Cancer Analysis of Whole Genomes, Nature 2020 (PMID: 32025007)	Illumina HiSeq 2000	40
EGAD00001000893	HipSci - Healthy Normals - Exome Sequencing - May 2014	Illumina HiSeq 2000	15
EGAD00001000894	SPECTA comprises a network of participating European clinical sites and NGS screening platforms that can screen individual patients for multiple molecular targets and potentially allow the design of trials that will match the specific biology of the diseases affecting specific patients with cancer.	Illumina HiSeq 2000 Illumina HiSeq 2500	64
EGAD00001000896		Illumina HiSeq 2000	12
EGAD00001000897	HipSci - Healthy Normals - RNA Sequencing - May 2014	Illumina HiSeq 2000	22
EGAD00001000898	Cancers are ecosystems of genetically related clones, competing across space and time for limited resources. To understand the clonal structure of primary breast cancer, we applied genome and targeted sequencing to 295 samples from 49 patients’ tumors. The extent of subclonal diversification varied considerably among patients and encompassed many spatial patterns, including local growth, intraductal dissemination and clonal intermixture. Landmarks of disease progression, such as acquiring invasive or metastatic potential, arose within detectable subclones of antecedent lesions, suggesting that subclonal mutations could be relevant if actionable. No defined temporal order of mutation was evident, with the commonest genes, including PIK3CA, TP53, BRCA2, PTEN and MYC, mutated early in some, late in others, often exhibiting parallel evolution across subclones. Signatures of homologous recombination deficiency correlated with response to neoadjuvant chemotherapy. Thus, the interplay of mutation, growth and competition drives clonal structures of breast cancer that are complex, variable across patients and clinically relevant.	Illumina HiSeq 2000	42
EGAD00001000899	We propose to definitively characterise the somatic genetics of Metastatic breast cancer through generation of comprehensive catalogues of somatic mutations in Metastatic breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses.	Illumina HiSeq 2000	41
EGAD00001000900	Multi-region Illumina whole-exome and/or whole-genome sequencing on tumor regions collected from early-stage NSCLC patients who underwent definitive surgical resection prior to receiving adjuvant therapy. Detected variants were validated on Ion AmpliSeq™ Custom Panel and/or Comprehensive Cancer Gene Panels. Patients covered by this dataset: L001, L002, L003, L004, L008 and L011.	Illumina Genome Analyzer IIx Illumina HiSeq 2000 Illumina HiSeq 2500 Ion Torrent PGM	28
EGAD00001000901	The dataset includes the whole exome sequencing data from 32 pairs of gallbladder cancer tissues and patient-matched normal tissues.	Illumina HiSeq 2500	64
EGAD00001000902	The dataset includes the targeted gene sequencing data from 51 pairs of gallbladder cancer tissues and patient-matched normal tissues.	Illumina HiSeq 2500	102
EGAD00001000903	RNA-Seq data for 4 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 22 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	4
EGAD00001000904	RNA-Seq data for 7 mature neutrophil sample(s). 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	7
EGAD00001000905	DNase-Hypersensitivity data for 5 CD14-positive, CD16-negative classical monocyte sample(s). 5 run(s), 5 experiment(s), 5 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811	Illumina HiSeq 2000	5
EGAD00001000906	ChIP-Seq data for 1 mature eosinophil sample(s). 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	1
EGAD00001000907	RNA-Seq data for 3 common myeloid progenitor sample(s). 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	3
EGAD00001000908	RNA-Seq data for 3 inflammatory macrophage sample(s). 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	3
EGAD00001000909	Bisulfite-Seq data for 1 erythroblast sample(s). 14 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001000910	Bisulfite-Seq data for 1 precursor lymphocyte of B lineage sample(s). 8 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001000911	RNA-Seq data for 4 erythroblast sample(s). 22 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	4
EGAD00001000912	RNA-Seq data for 1 CD8-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001000913	ChIP-Seq data for 9 CD14-positive, CD16-negative classical monocyte sample(s). 59 run(s), 55 experiment(s), 55 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	9
EGAD00001000914	Bisulfite-Seq data for 3 inflammatory macrophage sample(s). 38 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	3
EGAD00001000915	RNA-Seq data for 4 megakaryocyte-erythroid progenitor cell sample(s). 4 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	4
EGAD00001000916	ChIP-Seq data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	1
EGAD00001000917	Bisulfite-Seq data for 1 hematopoietic multipotent progenitor cell sample(s). 8 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001000918	RNA-Seq data for 3 common lymphoid progenitor sample(s). 15 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	3
EGAD00001000919	RNA-Seq data for 3 hematopoietic multipotent progenitor cell sample(s). 9 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	3
EGAD00001000920	Bisulfite-Seq data for 1 alternatively activated macrophage sample(s). 10 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001000921	Bisulfite-Seq data for 1 CD8-positive, alpha-beta T cell sample(s). 14 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001000922	RNA-Seq data for 3 granulocyte monocyte progenitor cell sample(s). 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	3
EGAD00001000923	Bisulfite-Seq data for 1 macrophage sample(s). 14 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001000924	ChIP-Seq data for 2 erythroblast sample(s). 14 run(s), 14 experiment(s), 14 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	2
EGAD00001000925	ChIP-Seq data for 3 CD4-positive, alpha-beta T cell sample(s). 21 run(s), 21 experiment(s), 21 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	3
EGAD00001000926	DNase-Hypersensitivity data for 2 inflammatory macrophage sample(s). 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811	Illumina HiSeq 2000	2
EGAD00001000927	Bisulfite-Seq data for 1 Plasma cell sample(s). 11 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001000928	RNA-Seq data for 7 CD14-positive, CD16-negative classical monocyte sample(s). 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	7
EGAD00001000929	ChIP-Seq data for 1 macrophage sample(s). 6 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	1
EGAD00001000930	ChIP-Seq data for 7 mature neutrophil sample(s). 68 run(s), 50 experiment(s), 50 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	7
EGAD00001000931	DNase-Hypersensitivity data for 1 macrophage sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811	Illumina HiSeq 2000	1
EGAD00001000932	Bisulfite-Seq data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 14 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001000933	RNA-Seq data for 1 macrophage sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001000934	Bisulfite-Seq data for 2 Multiple myeloma sample(s). 16 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	2
EGAD00001000935	Bisulfite-Seq data for 6 mature neutrophil sample(s). 79 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	6
EGAD00001000936	ChIP-Seq data for 2 CD8-positive, alpha-beta T cell sample(s). 13 run(s), 13 experiment(s), 13 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	2
EGAD00001000937	RNA-Seq data for 1 alternatively activated macrophage sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001000938	ChIP-Seq data for 4 alternatively activated macrophage sample(s). 29 run(s), 28 experiment(s), 28 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	4
EGAD00001000939	RNA-Seq data for 3 hematopoietic stem cell sample(s). 8 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	3
EGAD00001000940	ChIP-Seq data for 3 inflammatory macrophage sample(s). 21 run(s), 21 experiment(s), 21 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	3
EGAD00001000941	Bisulfite-Seq data for 6 CD14-positive, CD16-negative classical monocyte sample(s). 86 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	6
EGAD00001000942	DNase-Hypersensitivity data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811	Illumina HiSeq 2000	1
EGAD00001000943	Bisulfite-Seq data for 1 germinal center B cell sample(s). 8 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release August 2014. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001000944	Whole Genome Sequencing of 5 acral melanomas and matched normal samples	Illumina HiSeq 2000	10
EGAD00001000945	NGS of 10 mucosal melanomas: whole genome sequencing of 5 mucosal melanomas and matched normal DNA; whole exome sequencing of 5 mucosal melanomas and matched normal DNA	Illumina HiSeq 2000	20
EGAD00001000946	Divergent clonal selection dominates medulloblastoma at recurrence		125
EGAD00001000947	Genomic libraries (500 bps) will be generated from total genomic DNA derived from Colorectal cancer patients and subjected to short paired end sequencing on the Illumina platform. Paired reads will be mapped to build 37 of the human reference genome to facilitate the generation of genome wide copy number information, and the identification of novel rearranged cancer genes and gene fusions.	Illumina HiSeq 2000	45
EGAD00001000948	A comparison of the somatic variation present in a primary colorectal tumour and three different liver metastases from the same patient.	Illumina HiSeq 2000	6
EGAD00001000949	Validations of variants identified by exome sequencing in sequential samples derived after treatment cycle with AZA.	Illumina HiSeq 2000	170
EGAD00001000950	Whole genome sequencing data for ependymomas (5 tumor-control pairs). See Mack, Witt et al. Nature 506(7489):445-50, 2014 (PMID: 24553142).		10
EGAD00001000951	Whole exome sequencing data for ependymomas (42 tumor-control pairs). See Mack, Witt et al. Nature 506(7489):445-50, 2014 (PMID: 24553142).		84
EGAD00001000952	DNA methylation profiling of 8 control samples from adult (4) and fetal brain (4)	Illumina HiSeq 2000	8
EGAD00001000963	Exome sequencing of sporadic schwannomatosis patients		16
EGAD00001000964	Low-coverage whole genome sequencing of sporadic schwannomatosis patients		16
EGAD00001000965	Cancers are ecosystems of genetically related clones, competing across space and time for limited resources. To understand the clonal structure of primary breast cancer, we applied genome and targeted sequencing to 295 samples from 49 patients’ tumors. The extent of subclonal diversification varied considerably among patients and encompassed many spatial patterns, including local growth, intraductal dissemination and clonal intermixture. Landmarks of disease progression, such as acquiring invasive or metastatic potential, arose within detectable subclones of antecedent lesions, suggesting that subclonal mutations could be relevant if actionable. No defined temporal order of mutation was evident, with the commonest genes, including PIK3CA, TP53, BRCA2, PTEN and MYC, mutated early in some, late in others, often exhibiting parallel evolution across subclones. Signatures of homologous recombination deficiency correlated with response to neoadjuvant chemotherapy. Thus, the interplay of mutation, growth and competition drives clonal structures of breast cancer that are complex, variable across patients and clinically relevant.	Illumina HiSeq 2000	331
EGAD00001000966	Whole genome bisulfite sequencing data for 6 ependymomas plus 3 fetal controls (f1, f2, f4) and 3 adult controls (a2, a3, a4). See Mack, Witt et al. Nature 506(7489):445-50, 2014 (PMID: 24553142).	Illumina HiSeq 2000	14
EGAD00001000967	This dataset contains the fastq sequencing data collected from bone marrow DNA of a chronic myeloid leukaemia patient at time of diagnosis.	Illumina HiSeq 2000	4
EGAD00001000972	Whole Genome Sequencing to track subclonal heterogeneity in 18 samples from 3 Chronic Lymphocytic Leukemia patients subjected to repeated cycles of therapy. NOTE: There are only 12 BAM files available to download. The other 6 files are missing.	Illumina HiSeq 2500	18
EGAD00001000973	Von Hippel-Lindau syndrome multi-region exome sequencing of two patients	Illumina HiSeq 2000	21
EGAD00001000974	High-grade serous ovarian cancer (HGSC) is characterized by poor outcome, often attributed to emergence of treatment-resistant sub-clones. We sought to measure the degree of genomic diversity within primary, untreated HGSC to examine the natural state of tumor evolution prior to therapy. We performed exome sequencing, copy number analysis, targeted amplicon deep sequencing and gene expression profiling on thirty-one spatially and temporally separated HGSC tumor specimens (six patients) including ovarian masses, distant metastases, and fallopian tube lesions. We found widespread intra-tumoral variation in mutation, copy number, and gene expression profiles, with key driver alterations in genes present in only a subset of samples (e.g. PIK3CA, CTNNB1, NF1). On average, only 51.5% of mutations were present in every sample of a given case (range: 10.2% to 91.4%), with TP53 as the only somatic mutation consistently present in all samples. Complex segmental aneuploidies, such as whole genome doubling, were present in a subset of samples from the same individual, with divergent copy number changes segregating independently of point mutation acquisition. Reconstruction of evolutionary histories showed one patient with mixed HGSC and endometrioid histology with common etiologic origin in the fallopian tube and subsequent selection of different driver mutations in the histologically distinct samples. In this patient, we observed mixed cell populations in the early fallopian tube lesion, indicating diversity arises at early stages of tumorigenesis. Our results reveal that HGSC exhibit highly individual evolutionary trajectories and diverse genomic tapestries prior to therapy, exposing an essential biological characteristic to inform future design of personalized therapeutic solutions and investigation of drug resistance mechanisms.	Illumina HiSeq 2000 Illumina MiSeq	131
EGAD00001000975	65 prostate cancer cases transcriptome sequencing	Illumina HiSeq 2000	130
EGAD00001000976	WGS DATA FILES FOR SJPhLike	Illumina HiSeq 2000	80
EGAD00001000977	WGS dataset LCNEC study	Illumina HiSeq 2000	11
EGAD00001000978	Multi-region whole genome sequencing of a high grade serous ovarian carcinoma sample for characterization of genomic intra-tumoural heterogeneity.	Illumina HiSeq 2000	48
EGAD00001000979	We are developing a protocol to differentiate mouse and human induced pluripotent stem (IPS) and embryonic stem (ES) cells towards the haematopoietic pathway to generate erythrocytes in vitro. This system has many applications such as the study of the role of specific genes and human polymorphisms in infectious diseases such as malaria, as well as haematological diseases such as myelodysplastic syndrome. The nature of the in vitro differentiation process means that a heterogeneous population of cells is generated. In order to understand the types of cells produced with our protocol, we have performed a single cell analysis, which has the power to reveal the different populations of cells and their characteristics. For this, a cDNA library has been made that needs to be sequenced to obtain the gene expression profiles of the different cells. With this information we will be able to assess the quality of the differentiation protocol and improve it in order to produce better cells for the downstream applications. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2500	192
EGAD00001000980	This study involves a forward genetic screen to identify common insertion sites in drug resistant clones. We will be utilising piggybac transposon systems in order to generate multiple drug resistant clones in a range of human cancer cell lines.	Illumina MiSeq	144
EGAD00001000983	65 prostate cancer cases wgs sequencing	Illumina HiSeq 2000	10
EGAD00001000984	This is the Whole Exome Sequencing (WES) data from 59 samples from 11 patients with lung adenocarcinomas including 48 tumor samples and 11 peripheral white blood cell samples	Illumina HiSeq 2000	59
EGAD00001000985	This is the targeted capture deep sequencing (TCS) data for validation of the mutations discovered in the WES step. There are 58 bam files of TCS data including 48 tumor samples and 10 peripheral blood WBC samples.	Illumina HiSeq 2000	58
EGAD00001000986	Pheochromocytomas and paragangliomas (PCC/PGL) are neural crest derived tumors with a very strong genetic component. We report the first integrated genomic portrayal of a large collection of PCC/PGL. SNP array analysis revealed distinct copy-number patterns associated with genetic background. Whole-exome sequencing showed a low mutation rate of 0.3 mutations per megabase, with few recurrent somatic mutations in genes not previously associated with PCC/PGL. DNA methylation arrays and miRNA sequencing identified DNA methylation changes and miRNA expression clusters strongly associated with mRNA expression profiling. Overexpression of the miRNA cluster 182/96/183 was specific to SDHB-mutated tumors and induced invasive traits, whereas silencing of the imprinted DLK1-MEG3 miRNA cluster appeared as a potential driver in a subgroup of sporadic tumors. Altogether, the complete genomic landscape of PCC/PGL is mainly driven by distinct germline and/or somatic mutations in susceptibility genes and reveals different molecular entities, characterized by a set of unique genomic alterations.	Illumina HiSeq 2000	60
EGAD00001000987	Whole exome sequencing data from tumor and normal samples from carcinosarcoma (malignant mixed mullerian tumor) patients	Illumina HiSeq 2000	44
EGAD00001000988	Validation/deeper sequencing for metastatic prostate cancer samples	Illumina HiSeq 2500 Illumina MiSeq	94
EGAD00001000989	Validation/deeper sequencing for metastatic prostate cancer samples	Illumina HiSeq 2500	26
EGAD00001000990	mRNA-Seq on total RNA from primary osteoblastomas and phosphaturic mesenchymal tumours, focussing on fusion transcript expression	Illumina HiSeq 2000	11
EGAD00001000992	HIPO blastemal Wilms (nephroblastoma) characterisation of tumor driving events caused by differential SIX1 binding of the SIX1 Q177R mutants	Illumina HiSeq 2500	3
EGAD00001000993	HIPO blastemal Wilms (nephroblastoma) characterisation of tumor driving gene expression events	Illumina HiSeq 2000	40
EGAD00001000994	HIPO blastemal Wilms (nephroblastoma) characterisation of tumor driving chromosomal aberrations	Illumina HiSeq 2000 Illumina HiSeq 2500	56
EGAD00001000995	HIPO blastemal Wilms (nephroblastoma) characterisation of tumor driving DNA alterations	Illumina HiSeq 2000	112
EGAD00001000996	Whole exome sequencing data for AML and matched normal samples	Illumina HiSeq 2500	16
EGAD00001000997	Whole-exome sequencing of a chronic lymphocytic leukemia (CLL) developed during vemurafenib treatment of a patient with malignant melanoma. Peripheral blood mononuclear cells were separated by Ficoll gradient centrifugation. DNA was extracted from highly purified (>97%) CD19+CD5+ cells obtained from the patient while being under BRAF inhibition versus CD14+ germline control cells (>90% purity). No alterations that could be linked to aberrant RAS activity or paradoxical RAF/MEK/ERK signaling could be identified in the CLL, which shows characteristic copy number alterations.	Illumina HiSeq 2500	2
EGAD00001000998	Targeted capture of exonic and intronic regions of interest for the study of genomic alterations in multiple myeloma.	Illumina HiSeq 2000	24
EGAD00001001000	Background: The disease course of patients with diffuse low-grade glioma is notoriously unpredictable. Temporal and spatially distinct samples may provide insight into the evolution of clinically relevant copy number aberrations (CNAs). The purpose of this study is to identify CNAs that are indicative of aggressive tumor behaviour and can thereby complement the prognostically favorable 1p/19q co-deletion. Results: Genome-wide, 50 base pair single-end, sequencing was performed to detect CNAs in a clinically well-characterized cohort of 98 formalin-fixed paraffin-embedded low-grade gliomas. CNAs are correlated with overall survival as an endpoint. Seventy-five additional samples from spatially distinct regions and paired recurrent tumors of the discovery cohort were analysed to interrogate the intratumoral heterogeneity and spatial evolution. Loss of 10q25.2-qter is a frequent subclonal event and significantly correlates with an unfavorable prognosis. A significant correlation is furthermore observed in a validation set of 126 and confirmation set of 184 patients. Loss of 10q25.2-qter arises in a longitudinal manner in paired recurrent tumor specimens, whereas the prognostically favorable 1p/19q co-deletion is the only CNA that is stable across spatial regions and recurrent tumors. Conclusions: CNAs in low-grade gliomas display extensive intratumoral heterogeneity. Distal loss of 10q is a late onset event and a marker for reduced overall survival in low-grade glioma patients. Intratumoral heterogeneity and higher frequencies of distal 10q loss in recurrences suggest this event is involved in outgrowth to the recurrent tumor.	Illumina HiSeq 2000	175
EGAD00001001001			2
EGAD00001001002	Exome sequencing data for 8 pairs of seminomas and matched normal	Illumina Genome Analyzer IIx Illumina HiSeq 2000	16
EGAD00001001003	Exome sequencing of lymphocyte DNA from 12 affected individuals from six unrelated, non-syndromic Wilms tumor families.	Illumina HiSeq 2000	12
EGAD00001001004	65 prostate cancer cases wgs sequencing	Illumina HiSeq 2000	130
EGAD00001001006	Dataset for whole exome sequencing of 113 pairs of tumor and normal DNA samples along with 8 cell lines.	Illumina HiSeq 2000	234
EGAD00001001007	Low depth (4x) Illumina HiSeq raw sequence data for 100 unrelated Zulu from Durban area, South Africa.	Illumina HiSeq 2000	100
EGAD00001001008	Low depth (4x) Illumina HiSeq raw sequence data for 100 unrelated Baganda from rural Uganda.	Illumina HiSeq 2000	100
EGAD00001001009	Exome sequencing of peripheral blood from 4 individuals of a family with familial colorectal cancer type X	Illumina HiSeq 2000	4
EGAD00001001010	Sequencing of colorectal tumors and normal tissue using Ion AmpliSeq Cancer Hotspot Panel V2	Ion Torrent Proton	8
EGAD00001001011	Monocyte differentiation into macrophages represents a cornerstone process for host defense. Concomitantly, immunological imprinting of either tolerance or trained immunity determines the functional fate of macrophages and susceptibility to secondary infections. Transcriptomes (RNA-Seq) and epigenomes (ChIP-Seq H3K4me1, H3K4me3, H3K27ac) in four primary cell types: monocytes, in vitro differentiated naive, tolerized and trained macrophages were characterized. Inflammatory and metabolic pathways were modulated in macrophages, including decreased inflammasome activation, and pathways functionally implicated in trained immunity were identified. Strikingly, β-glucan training elicits an exclusive epigenetic signature, revealing a complex network of enhancers and promoters. Analysis of transcription factor motifs in DNase I hypersensitive sites at cell-type specific epigenetic loci unveiled differentiation and treatment specific repertoires. Altogether, this study provides a resource to understand the epigenetic changes that underlie innate immunity in humans.	Illumina HiSeq 2000 NextSeq 500	57
EGAD00001001012	The need for a detailed catalogue of local variability for the study of rare diseases within the context of the Medical Genome Project motivated the whole exome sequencing of 267 unrelated individuals, representative of the healthy Spanish population.	AB 5500xl Genetic Analyzer	267
EGAD00001001013	RNAseq and exome sequencing data of gastric cancer cell lines.	Illumina HiSeq 2000	30
EGAD00001001014		Illumina HiSeq 2000	2597
EGAD00001001015		Illumina HiSeq 2000	76
EGAD00001001016	DATA FILES FOR SJPhLike-RNASeq	Illumina HiSeq 2000	125
EGAD00001001017	DNA extracted from multiple biopsies taken from different areas of primary lung tumours will be subjected to targeted re-sequencing and analysed in order to assess intra-tumour heterogeneity with respect to mutations in a selection of cancer related genes.	Illumina HiSeq 2000	31
EGAD00001001018	The samples will be sequenced for a targeted panel of cancer relevant genes (n ~ 370) and analysed for somatic mutations. This dataset contains all the data available for this study on 2014-09-24	Illumina HiSeq 2000	374
EGAD00001001019	RNA-seq dataset used for the validation of CDK6 cis-regulatory mutation annotated by OncoCis. NB bam files for manuscript A_Proteomic_Chronology_of_Gene_Expression_through_the_Cell_Cycle_in_Human_Myeloid_Leukemia_Cells are now available at the following link: http://www.ebi.ac.uk/ena/data/view/ERP008483	Illumina HiSeq 2000	1
EGAD00001001020	DATA FILES FOR SJEWS-WGS	Illumina HiSeq 2000	38
EGAD00001001021	Exome sequencing of 1000 samples from the UK 1958 Birth Cohort. DNA library preps prepared with Illumina TruSeq sample preparation kit. The captured DNA libraries were PCR amplified using the supplied paired-end PCR primers. Sequencing was performed with an Illumina HiSeq2000 (SBS Kit v3, one pool per lane) generating 2x101-bp reads.	Illumina HiSeq 2500	1000
EGAD00001001022	nccRCC RNA-Seq data of consented samples	Illumina HiSeq 2500	139
EGAD00001001023	nccRCC Whole Exome sequencing data (consented samples only)	Illumina HiSeq 2500	137
EGAD00001001024	Fastq files of 52 samples of hepatocellular carcinoma (RCAST, THCC)	Illumina HiSeq 2000	104
EGAD00001001025	The offspring of first cousin marriages have ~6% of their genome autozygous, i.e. homozygous identical by descent, or even more if there was further consanguinity in their ancestry. In the UK there are large populations with very high first cousin marriage rates of 20-50%. Sequencing the exomes of a sample of these individuals has the potential both to support genetic health programmes in these populations, and to provide genetic research information about rare loss of function mutations. This pilot study based on existing British-Pakistani cohort samples from Birmingham will identify homozygous individuals for almost all variants down to an allele frequency around 1%, plus individuals carrying hundreds of new homozygous rare loss-of-function variants, and will support development of community relations and ethics for a wider study currently being designed. The data deposited in the EGA consist of low coverage whole exome sequencing on these samples.	Illumina HiSeq 2000	1156
EGAD00001001026	The offspring of first cousin marriages have ~6% of their genome autozygous, i.e. homozygous identical by descent, or even more if there was further consanguinity in their ancestry. In the UK there are large populations with very high first cousin marriage rates of 20-50%. Sequencing the exomes of a sample of these individuals has the potential both to support genetic health programmes in these populations, and to provide genetic research information about rare loss of function mutations. This pilot study based on existing British-Pakistani cohort samples from Birmingham will identify homozygous individuals for almost all variants down to an allele frequency around 1%, plus individuals carrying hundreds of new homozygous rare loss-of-function variants, and will support development of community relations and ethics for a wider study currently being designed. The data deposited in the EGA consists of low coverage whole exome sequencing on these samples.	Illumina HiSeq 2000	452
EGAD00001001027	The offspring of first cousin marriages have ~6% of their genome autozygous, i.e. homozygous identical by descent, or even more if there was further consanguinity in their ancestry. In the UK there are large populations with very high first cousin marriage rates of 20-50%. Sequencing the exomes of a sample of these individuals has the potential both to support genetic health programmes in these populations, and to provide genetic research information about rare loss of function mutations. This pilot study based on existing British-Pakistani cohort samples will identify homozygous individuals for almost all variants down to an allele frequency around 1%, plus individuals carrying hundreds of new homozygous rare loss-of-function variants, and will support development of community relations and ethics for a wider study currently being designed. The data deposited in the EGA consists of low coverage whole exome sequencing on these samples.	Illumina HiSeq 2000	130
EGAD00001001028	DNA belonging to 16 tumour/normal samples was treated with bisulfite, then up to 5 different bisulfite PCRs were performed in each one of the samples. Amplicons from the same sample were pooled and submitted to sequencing on a MiSeq platform.	Illumina MiSeq	18
EGAD00001001029	This dataset comprises the sequencing of coding and putative regulatory sequences of 38 genes associated with either sporadic or Mendelian forms of Parkinson's disease	Illumina HiSeq 2000	394
EGAD00001001031	These are only the whole exome sequences	Illumina HiSeq 2500	6
EGAD00001001032	DATA FILES FOR SJMEL-WGS	Illumina HiSeq 2000	12
EGAD00001001033	Whole exome sequencing (WES) was performed on genomic DNA derived from two patients with Sotos Syndrome Features. Sequencing (100 base pair paired-end) was performed on an Illumina HiSeq 2000 sequencer after enrichment of 62Mb of exonic and adjacent intronic sequences with TruSeq Exome Enrichment Kit (Illumina, San Diego, CA, USA).	Illumina HiSeq 2000	2
EGAD00001001034	Whole genome data (Complete genomics platform) for the study EGAS00001000824		24
EGAD00001001035	RIKEN collection WGS and RNA-seq reads for 66 HBV-associated HCC and matched blood or liver samples from 22 donors.	Illumina Genome Analyzer IIx Illumina HiSeq 2000	66
EGAD00001001036		Illumina HiSeq 2000	26
EGAD00001001037	A total of 395 couples were subjected to IVF-PGD treatment, including 129 couples with an NGS-based test and 266 couples with a SNP array based test for the detection of embryonic chromosomal abnormalities. The NGS test was performed using low coverage whole genome sequencing on the HiSeq 2000 platform. The SNP array test used the Affymetrix Gene Chip Mapping Nsp I 262K. The average age of patients was 32.1 years (age range 20-44 years).	Illumina HiSeq 2000	188
EGAD00001001038	We mapped the data to the UCSC human reference genome build 37 using BWA 0.5.9-r16. We first mapped each read pair separately using bwa aln. Then we used bwa sampe to map the paired reads together to a BAM file. The BAM file was then sorted by genomic position and indexed using PicardTools-1.32 SortSam. To prevent PCR artifacts from influencing the downstream analysis of our data, we used Picard to mark the duplicate reads, which were ignored in downstream analysis. We used GATK IndelRealigner on our data around known indels (from 1KG Pilot). The IndelRealigner creates all possible read alignments using the source and computes the likelihood of the data containing the indel based on the read pileup. Whenever the maximum likelihood contains an indel, the reads are realigned accordingly. Each base is associated with a phred-scaled base quality score. Calibration of Phred scores is crucial as they are used in some of the downstream analysis models. We used GATK to recalibrate the base qualities with respect to (i) the base cycle, (ii) original quality score, and (iii) dinucleotide context. To minimize issues stemming from mapping problems around indels, we decided to perform a second round of indel realignment using the GATK IndelRealigner by family rather than by individual. For this second round, we considered two sources of possible indels: 1KG Phase 1 indels and indels aligned by BWA in the GoNL data.		-
EGAD00001001039	Genomic characterisation of a large series of cancer cell lines.	Illumina HiSeq 2000	1072
EGAD00001001040	This is the complete dataset (exome and genome) for the EGAS00001000974 study.	Illumina HiSeq 2500	16
EGAD00001001041	Comparison of genomic rearrangements and DNA methylation patterns between different foci of multiple synchronous (multifocal and multicentric) invasive breast cancers.	Illumina Genome Analyzer II Illumina HiSeq 2000	305
EGAD00001001042	In this work, using exome sequencing, we identified biallelic PNPLA6 mutations in patients with childhood blindness due to severe photoreceptor death and clinical features of Leber congenital amaurosis (LCA) and, interestingly, also of the rare Oliver McFarlane Syndrome.	AB SOLiD 4 System Illumina HiSeq 2000	7
EGAD00001001043		Illumina HiSeq 2000	8
EGAD00001001044		Ion Torrent PGM	2
EGAD00001001045	DATA FILES FOR SJRB	Illumina HiSeq 2000	20
EGAD00001001046	We propose to biopsy 20 consented BRAF mutant melanoma patients at Addenbrooke's Hospital pre-treatment with vemurafenib and also upon the development of resistant disease, with the aim of using exome sequence and SNP6 data to identify novel sequence variants and copy number alterations that can be used to validate observed resistance mechanisms in our cell line models and also to use these models to inform as to likely candidate small molecule inhibitors to overcome resistance and that could be tested in the clinical trial setting.	Illumina HiSeq 2000	33
EGAD00001001047	Targeted exome sequencing of 375 genes	Illumina HiSeq 2500	31
EGAD00001001048	Samples from Edwards et al 2015 - doi:10.1186/s12864-015-1685-z	Illumina HiSeq 2000	1
EGAD00001001050	We propose to biopsy 20 consented BRAF mutant melanoma patients at Addenbrooke's Hospital pre-treatment with vemurafenib and also upon the development of resistant disease, with the aim of using exome sequence and SNP6 data to identify novel sequence variants and copy number alterations that can be used to validate observed resistance mechanisms in our cell line models and also to use these models to inform as to likely candidate small molecule inhibitors to overcome resistance and that could be tested in the clinical trial setting.	Illumina HiSeq 2000	8
EGAD00001001051		Illumina HiSeq 2000	200
EGAD00001001052	DATA FILES FOR SJTALL	Illumina HiSeq 2000	24
EGAD00001001053	DATA FILES FOR SJOS-WGS-2ndBatch	Illumina HiSeq 2000	27
EGAD00001001054	DATA FILES FOR Ph-likeALL WES	Illumina HiSeq 2000	23
EGAD00001001055	Bam files for the whole exome sequencing from the study on Spatial homogeneity in pediatric brain tumors.	Illumina HiSeq 2000	53
EGAD00001001056		Illumina HiSeq 2000	7
EGAD00001001057	RNA-seq from normal human tissues (2 x 75 bp)	Illumina HiSeq 2000	3
EGAD00001001058	Cancer exome reads consisting of FASTQ paired end reads from bone marrow samples	Illumina HiSeq 2000	42
EGAD00001001059	Whole Exome Sequencing files accompanying Genetic landscape of pediatric Rhabdomyosarcoma	Illumina HiSeq 2000	56
EGAD00001001060		Illumina HiSeq 2000	112
EGAD00001001061	This experiment is to inform us of the validity of using pre-made library material to perform a bespoke pulldown experiment to validate the mutations found between the whole genome sequencing of the DNA from the same individual's cancer and normal material. This is to identify the valid and informative mutations in cancer genomes.	Illumina MiSeq	4
EGAD00001001062	Patient (who has had multiple malignancies) has previously been found to harbour a pathogenic p53 variant which is probably mosaic. This finding is based on exome sequencing performed elsewhere. In this study we will resequence the locus in question to ascertain whether the variant is indeed mosaic.	Illumina MiSeq	4
EGAD00001001063	Chondromyxoid fibroma is a benign tumour of bone with unknown underlying pathogenesis. To determine the pathognomonic genomic event in chondromyxoid fibroma, whole genome sequencing will be undertaken to reconstruct rearrangements and find underlying mutations.	Illumina HiSeq 2000	2
EGAD00001001064	Extension of angiosarcoma whole genome sequencing study	Illumina MiSeq	4
EGAD00001001065	DATA FILES FOR SJCPC-WGS	Illumina HiSeq 2000	8
EGAD00001001066	Dynamics of genomic clones in breast cancer patient xenografts at single cell resolution	Illumina HiSeq 2000 Illumina MiSeq	188
EGAD00001001071	Samples from the "100" project that are in the ICGC PanCancer project.	Illumina HiSeq 2000	10
EGAD00001001072	(ShallowSeq CopyNumber)	Illumina MiSeq	5
EGAD00001001073	miRNA-seq Cohort of 140 Formalin Fixed Paraffin Embedded Diffuse Large B-cell Lymphoma Patient Samples		140
EGAD00001001074	miRNA-seq Cohort of 92 Fresh Frozen Diffuse Large B-cell Lymphoma Patient Samples		92
EGAD00001001075	miRNA-seq Cohort of 15 Benign Centroblasts		15
EGAD00001001076	Fastq files of 239 samples of biliary tract cancer	Illumina HiSeq 2000	239
EGAD00001001079	The offspring of first cousin marriages have ~6% of their genome autozygous, i.e. homozygous identical by descent, or even more if there was further consanguinity in their ancestry. In the UK there are large populations with very high first cousin marriage rates of 20-50%. Sequencing the exomes of a sample of these individuals has the potential both to support genetic health programmes in these populations, and to provide genetic research information about rare loss of function mutations. This pilot study based on existing cohort samples from the Born In Bradford study will identify homozygous individuals for almost all variants down to an allele frequency around 1%, plus individuals carrying hundreds of new homozygous rare loss-of-function variants, and will support development of community relations and ethics for a wider study currently being designed. The data deposited in the EGA consist of low coverage whole exome sequencing on these samples. Data Access is controlled by the Wellcome Trust Sanger Institute DAC and the Born In Bradford Executive Group. This dataset contains all the data available for this study on 2014-11-20.	Illumina HiSeq 2000	2702
EGAD00001001080	MDS patients		5
EGAD00001001081	Healthy reference samples		3
EGAD00001001083		Illumina HiSeq 2000	2
EGAD00001001084		Illumina HiSeq 2000	209
EGAD00001001085	This dataset includes 2 pairs of tumour/normal whole genome sequence data as well as MEN1 gene targeted sequencing of an additional 87 specimens.	Illumina HiSeq 2500 Illumina MiSeq	91
EGAD00001001086	These analyses are the BAM files for the LCL samples of the EUROBATS project.		765
EGAD00001001087	RNAseq BAM files for the Skin samples of the EUROBATS project.		672
EGAD00001001088	RNAseq BAM files for the blood samples of the EUROBATS project		391
EGAD00001001089	RNAseq BAM files for the Fat samples of the EUROBATS project		685
EGAD00001001090	This study aims to define the landscape of somatic mutations in sun exposed human skin by deep sequencing, analyse their frequency and use the data to infer the effect of mutations on proliferating cell behaviour. The frequency of each mutation will reflect the size of the clone of cells in the tissue sample. By analyzing small samples, clones with as few as 100 cells will be detectable. Allele frequency distributions for each mutation will be used to infer cell fate using published methods (Klein et al. 2010). This study will shed unprecedented light on the early clonal events that lead to the emergence of cancer.	Illumina HiSeq 2000	166
EGAD00001001091	We established and validated a sequence capture based NGS testing approach for PKD1. The presence of six PKD1 pseudogenes and tremendous allelic heterogeneity make molecular genetic testing of PKD1 variants challenging. In the publication accompanying this dataset (An efficient and comprehensive strategy for genetic diagnostics of polycystic kidney disease, Eisenberger et al., PLoS ONE), we demonstrate that the applied standard mapping algorithm specifically aligns reads to the PKD1 locus and overcomes the complication of unspecific capture of pseudogenes. This dataset contains the raw PKD1 reads of all patients from the publication.	Illumina HiSeq 1500	55
EGAD00001001092	Approximately 80% of patients with a clear clinical diagnosis of primary ciliary dyskinesia (PCD) cannot be assigned to a specific gene defect. Despite extensive research on PCD and despite the increasing number of PCD genes and knowledge about their sites of action (e.g. as structural components or cytoplasmic pre-assembly factors), the biology of motile cilia and the pathomechanism leading to PCD are largely unknown. The aim of this study is to identify novel PCD related genes and processes relevant for motile cilia function. We will perform exome sequencing, focusing on the analysis of family trios. In these families, the diagnosis of PCD is secured, but the underlying gene defect has so far not been identified.	Illumina HiSeq 2000	150
EGAD00001001093	Noninvasive detection of cancer-associated genome-wide hypomethylation and copy number aberrations by plasma DNA bisulfite sequencing	Illumina HiSeq 2000	2
EGAD00001001094	200PG : WGS Raw Sequence (fastq) : Raw WG sequence data (fastq) in this dataset are from the 124 CPCGene Tumour/Normal Pairs used in the 200PG Study. https://www.ncbi.nlm.nih.gov/pubmed/28068672	Illumina HiSeq 2500	247
EGAD00001001095	Supporting data for ICGC PACA-CA Release 18	Illumina HiSeq 2000 Illumina HiSeq 2500	506
EGAD00001001096		Illumina HiSeq 2000	419
EGAD00001001098	DATA FILES FOR SJINF RNASeq	Illumina HiSeq 2000	63
EGAD00001001100	DCC Project Code: SKCA-BR Skin Adenocarcinoma - BR Brazil	AB 5500 Genetic Analyzer Illumina HiSeq 2500	200
EGAD00001001104	MMP-seq tumor samples, UDG treated (FASTQ)	Illumina MiSeq	16
EGAD00001001105	Whole-exome sequencing in 16 RMS cases; whole-transcriptome sequencing in 8 RMS cases	Illumina HiSeq 2000	38
EGAD00001001106	In the first part of this project, we will differentiate IPS cells from 5 human donors into macrophages, and extract RNA from unstimulated and LPS stimulated macrophages to perform RNA sequencing. We will also extract RNA before and after stimulation in blood-derived macrophages from 5 additional, unrelated healthy samples. In the second part of the project, RNA-seq data will be analysed to compare LPS response of these two macrophage populations. In summary, we will perform 75bp PE RNA-seq on 20 samples (10 pre and post stimulus), on the HiSeq 2500 platform. Samples will be multiplexed at 5 samples / lane, so we will require 4 flow cells in total. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000	18
EGAD00001001107	MMP-seq cell lines (FASTQ)	Illumina Genome Analyzer IIx	154
EGAD00001001108	MMP-seq tumor samples (FASTQ)	Illumina Genome Analyzer IIx	218
EGAD00001001109		Illumina HiSeq 2000	46
EGAD00001001110		Illumina HiSeq 2000	46
EGAD00001001111		Illumina HiSeq 2000	46
EGAD00001001112		Illumina HiSeq 2000	46
EGAD00001001113		Illumina HiSeq 2000	46
EGAD00001001114	DDD DATAFREEZE 2013-12-18: 1133 trios - exome sequence BAM files (Ref: DDD Nature 2015)		1
EGAD00001001115	SeqControl	Illumina HiSeq 2500	54
EGAD00001001118	Gastric Cancer (GC) is a highly heterogeneous disease. To identify potential clinically actionable therapeutic targets that may inform individualized treatment strategies, we performed whole-exome sequencing on 78 GCs of differing histologies and anatomic locations, as well as whole-genome sequencing on two GC cases, each with 3 primary tumours and 2 matching lymph node metastases. The data showed two distinct GC subtypes with either high-clonality (HiC) or low-clonality (LoC).	Illumina HiSeq 2000	168
EGAD00001001119	Whole Genome Bisulfite Sequencing	Illumina HiSeq 2000 Illumina HiSeq 2500	10
EGAD00001001120	Whole Genome Sequencing	Illumina HiSeq 2000 Illumina HiSeq 2500	26
EGAD00001001121	RNA Sequencing	Illumina HiSeq 2000	10
EGAD00001001122	FFPE normal panel generation for use with V3 cancer panel 0618521	Illumina HiSeq 2000	94
EGAD00001001123	Deep sequencing of two skin biopsies to study the landscape of somatic mutations in human adult tissues.	Illumina HiSeq 2000	2
EGAD00001001124	Our aim is to analyze the genome of human melanoma cell lines and short term cultures from human melanoma samples in order to identify genes that confer drug resistance to clinically relevant targeted therapies. We will perform whole-exome sequencing, copy number variation analysis and methylome analysis in a collection of human melanoma cell lines and short term cultures that will then be screened for drug sensitivity/resistance through a library of clinically relevant drugs and drug combinations. By the combined analysis of the genomic lesions and the drug sensitivity/resistance profile of different cell lines, we will look for genes whose mutation is associated with sensitivity or resistance to a specific drug in different samples.	Illumina HiSeq 2000	14
EGAD00001001125	Exome sequencing of Untreated BCC samples.	Illumina HiSeq 2000	91
EGAD00001001126			340
EGAD00001001127	ChIP-Seq data for 2 effector memory CD8-positive, alpha-beta T cell sample(s). 10 run(s), 10 experiment(s), 10 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	2
EGAD00001001128	Bisulfite-Seq data for 3 cytotoxic CD56-dim natural killer cell sample(s). 38 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	3
EGAD00001001129	RNA-Seq data for 10 mature neutrophil sample(s). 10 run(s), 10 experiment(s), 10 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	10
EGAD00001001130	DNase-Hypersensitivity data for 5 CD14-positive, CD16-negative classical monocyte sample(s). 5 run(s), 5 experiment(s), 5 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811	Illumina HiSeq 2000	5
EGAD00001001131	Bisulfite-Seq data for 1 memory B cell sample(s). 20 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001001132	RNA-Seq data for 3 inflammatory macrophage sample(s). 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	3
EGAD00001001133	Bisulfite-Seq data for 2 erythroblast sample(s). 35 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	2
EGAD00001001134	Bisulfite-Seq data for 1 precursor lymphocyte of B lineage sample(s). 8 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001001135	Bisulfite-Seq data for 2 endothelial cell of umbilical vein (resting) sample(s). 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	2
EGAD00001001136	ChIP-Seq data for 2 endothelial cell of umbilical vein (proliferating) sample(s). 13 run(s), 13 experiment(s), 13 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	2
EGAD00001001137	RNA-Seq data for 2 CD8-positive, alpha-beta T cell sample(s). 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	2
EGAD00001001138	ChIP-Seq data for 6 Acute promyelocytic leukemia sample(s). 25 run(s), 23 experiment(s), 23 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	6
EGAD00001001139	Bisulfite-Seq data for 3 inflammatory macrophage sample(s). 38 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	3
EGAD00001001140	RNA-Seq data for 4 megakaryocyte-erythroid progenitor cell sample(s). 4 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	4
EGAD00001001141	Bisulfite-Seq data for 1 hematopoietic multipotent progenitor cell sample(s). 8 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001001142	RNA-Seq data for 1 endothelial cell of umbilical vein (resting) sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001001143	Bisulfite-Seq data for 4 alternatively activated macrophage sample(s). 64 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	4
EGAD00001001144	ChIP-Seq data for 1 central memory CD4-positive, alpha-beta T cell sample(s). 6 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	1
EGAD00001001145	RNA-Seq data for 2 CD38-negative naive B cell sample(s). 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	2
EGAD00001001146	RNA-Seq data for 3 granulocyte monocyte progenitor cell sample(s). 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	3
EGAD00001001147	ChIP-Seq data for 7 CD4-positive, alpha-beta T cell sample(s). 46 run(s), 45 experiment(s), 45 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	7
EGAD00001001148	RNA-Seq data for 8 CD14-positive, CD16-negative classical monocyte sample(s). 8 run(s), 8 experiment(s), 8 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	8
EGAD00001001149	ChIP-Seq data for 7 mature neutrophil sample(s). 78 run(s), 60 experiment(s), 60 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	7
EGAD00001001150	Bisulfite-Seq data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 14 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001001151	Bisulfite-Seq data for 1 endothelial cell of umbilical vein (proliferating) sample(s). 21 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001001152	Bisulfite-Seq data for 2 Multiple myeloma sample(s). 16 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	2
EGAD00001001153	RNA-Seq data for 1 effector memory CD8-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001001154	ChIP-Seq data for 5 CD8-positive, alpha-beta T cell sample(s). 28 run(s), 28 experiment(s), 28 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	5
EGAD00001001155	ChIP-Seq data for 5 alternatively activated macrophage sample(s). 36 run(s), 35 experiment(s), 35 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	5
EGAD00001001156	RNA-Seq data for 6 hematopoietic stem cell sample(s). 13 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	6
EGAD00001001157	Bisulfite-Seq data for 3 CD4-positive, alpha-beta T cell sample(s). 61 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	3
EGAD00001001158	ChIP-Seq data for 4 cytotoxic CD56-dim natural killer cell sample(s). 16 run(s), 16 experiment(s), 16 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	4
EGAD00001001159	RNA-Seq data for 3 cytotoxic CD56-dim natural killer cell sample(s). 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	3
EGAD00001001160	Bisulfite-Seq data for 1 plasma cell sample(s). 11 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001001161	DNase-Hypersensitivity data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811	Illumina HiSeq 2000	1
EGAD00001001162	Bisulfite-Seq data for 1 Acute myeloid leukemia sample(s). 18 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001001163	RNA-Seq data for 1 effector memory CD4-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001001164	RNA-Seq data for 1 class switched memory B cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001001165	RNA-Seq data for 5 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 23 run(s), 5 experiment(s), 5 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	5
EGAD00001001166	RNA-Seq data for 1 endothelial cell of umbilical vein (proliferating) sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001001167	Bisulfite-Seq data for 3 Acute promyelocytic leukemia sample(s). 24 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	3
EGAD00001001168	ChIP-Seq data for 2 mature eosinophil sample(s). 12 run(s), 12 experiment(s), 12 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	2
EGAD00001001169	RNA-Seq data for 3 common myeloid progenitor sample(s). 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	3
EGAD00001001170	RNA-Seq data for 1 conventional dendritic cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001001171	RNA-Seq data for 1 memory B cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001001172	RNA-Seq data for 1 central memory CD4-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001001173	RNA-Seq data for 10 CD4-positive, alpha-beta T cell sample(s). 10 run(s), 10 experiment(s), 10 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	10
EGAD00001001174	RNA-Seq data for 1 regulatory T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001001175	RNA-Seq data for 1 central memory CD8-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001001176	Bisulfite-Seq data for 1 class switched memory B cell sample(s). 20 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001001177	RNA-Seq data for 7 erythroblast sample(s). 29 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	7
EGAD00001001178	RNA-Seq data for 1 Leukemia sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001001179	ChIP-Seq data for 10 CD14-positive, CD16-negative classical monocyte sample(s). 73 run(s), 69 experiment(s), 69 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	10
EGAD00001001180	Bisulfite-Seq data for 2 central memory CD8-positive, alpha-beta T cell sample(s). 27 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	2
EGAD00001001181	RNA-Seq data for 7 Acute promyelocytic leukemia sample(s). 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	7
EGAD00001001182	ChIP-Seq data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	1
EGAD00001001183	ChIP-Seq data for 2 endothelial cell of umbilical vein (resting) sample(s). 10 run(s), 10 experiment(s), 10 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	2
EGAD00001001184	RNA-Seq data for 5 common lymphoid progenitor sample(s). 20 run(s), 5 experiment(s), 5 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	5
EGAD00001001185	DNase-Hypersensitivity data for 2 monocyte sample(s). 4 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811	Illumina HiSeq 2000	2
EGAD00001001186	RNA-Seq data for 3 hematopoietic multipotent progenitor cell sample(s). 9 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	3
EGAD00001001187	ChIP-Seq data for 3 Chronic lymphocytic leukemia sample(s). 6 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	3
EGAD00001001188	ChIP-Seq data for 7 Acute myeloid leukemia sample(s). 23 run(s), 23 experiment(s), 23 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	7
EGAD00001001189	Bisulfite-Seq data for 4 CD8-positive, alpha-beta T cell sample(s). 56 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	4
EGAD00001001190	DNase-Hypersensitivity data for 1 Acute myeloid leukemia sample(s). 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811	Illumina HiSeq 2000	1
EGAD00001001191	RNA-Seq data for 8 monocyte sample(s). 8 run(s), 8 experiment(s), 8 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	8
EGAD00001001192	Bisulfite-Seq data for 5 macrophage sample(s). 72 run(s), 5 experiment(s), 5 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	5
EGAD00001001193	DNase-Hypersensitivity data for 2 inflammatory macrophage sample(s). 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811	Illumina HiSeq 2000	2
EGAD00001001194	ChIP-Seq data for 2 erythroblast sample(s). 14 run(s), 14 experiment(s), 14 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	2
EGAD00001001195	ChIP-Seq data for 1 effector memory CD8-positive, alpha-beta T cell, terminally differentiated sample(s). 4 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	1
EGAD00001001196	ChIP-Seq data for 13 macrophage sample(s). 55 run(s), 55 experiment(s), 55 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000 NextSeq 500	13
EGAD00001001197	ChIP-Seq data for 2 monocyte sample(s). 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000 NextSeq 500	2
EGAD00001001198	DNase-Hypersensitivity data for 14 macrophage sample(s). 18 run(s), 14 experiment(s), 14 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_dnaseseq_analysis_20140811	Illumina HiSeq 2000	14
EGAD00001001199	RNA-Seq data for 18 macrophage sample(s). 19 run(s), 18 experiment(s), 18 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	18
EGAD00001001200	Bisulfite-Seq data for 1 effector memory CD8-positive, alpha-beta T cell sample(s). 11 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001001201	Bisulfite-Seq data for 6 mature neutrophil sample(s). 79 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	6
EGAD00001001202	RNA-Seq data for 4 alternatively activated macrophage sample(s). 6 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	4
EGAD00001001203	Bisulfite-Seq data for 1 germinal center B cell sample(s). 8 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001001204	ChIP-Seq data for 6 inflammatory macrophage sample(s). 35 run(s), 35 experiment(s), 35 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	6
EGAD00001001205	Bisulfite-Seq data for 3 CD38-negative naive B cell sample(s). 29 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	3
EGAD00001001206	Bisulfite-Seq data for 6 CD14-positive, CD16-negative classical monocyte sample(s). 86 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	6
EGAD00001001207	ChIP-Seq data for 4 CD38-negative naive B cell sample(s). 14 run(s), 14 experiment(s), 14 alignment(s). Part of BLUEPRINT release January 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	4
EGAD00001001208	Targeted capture of cancer gene panel bait set in single cell derived organoids from colon tissue and colorectal cancer from 1 patient.	Illumina HiSeq 2000 Illumina HiSeq 2500	105
EGAD00001001209	To examine the reproducibility of nucleotide variant calls in replicate sequencing experiments of the same genomic DNA, we performed targeted sequencing of all known human protein kinase genes (kinome) (~3.3 Mb) using the SOLiD v4 platform. This data set contains 17 breast cancer samples that were sequenced in duplicate (n=14) or triplicate (n=3), in order to assess concordance of all calls and single nucleotide variant (SNV) calls.	AB SOLiD 4 System	37
EGAD00001001210	Medulloblastoma-associated DDX3 variant selectively alters the translational response to stress		28
EGAD00001001212	RNAseq profile of purified plasma cells from multiple myeloma patients and tonsils of healthy donors	Illumina HiSeq 2000	15
EGAD00001001213		Illumina HiSeq 2000	5
EGAD00001001214	Deep (>25x mean coverage) whole genome sequencing on 5-10 families drawn from the Scottish Family Health Study with four or more children.	Illumina HiSeq 2000	19
EGAD00001001215	Targeted sequencing follow-up of genomic lesions in multiple myeloma.	Illumina HiSeq 2000	424
EGAD00001001216	The aim of this project is to genotype and sequence single spermatozoa from two men, one in his twenties and the other in his seventies. The resulting data is used to quantify the mutations that have arisen in the gametes of both individuals in order to better understand the effect of aging on mutation rates and modes. Project Outline. In order to quantify mutations, semen from two individuals is sequenced. 48 single sperm cells are isolated from each individual, and their DNA is extracted. The resulting genomes are amplified using PicoPlex, GenomiPhi MDA, Repli-G MDA, and MALBAC. A QC step is applied to check the quality of WGA DNA using standard Sequenom plex (26 SNPs). A subset of 32 amplification products which pass the initial QC are genotyped using Affymetrix SNP6 chips. 12 of the genotyped amplification products are also sequenced. In addition, one multi-cell sample per individual is sequenced as a reference and for validation purposes. Altogether, 12 single cell sperm genomes and two multi-cell genomes are sequenced, coming to a total of 14 genomes. Of the single cell sperm genomes, 2 are sequenced to 50x coverage, and the other 10 to 25x coverage. Both multi-cell genomes are sequenced to 25x coverage.	Illumina HiSeq 2000	12
EGAD00001001217			15
EGAD00001001218			10
EGAD00001001220		Illumina HiSeq 1000	10
EGAD00001001221		Illumina HiSeq 2500	54
EGAD00001001222	TGCT Whole Exome Sequencing data	Illumina HiSeq 2500	84
EGAD00001001226	smRNA-Seq assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Canada as part of the International Human Epigenome Consortium.		28
EGAD00001001227	Strand-specific mRNA-Seq assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.		32
EGAD00001001228	Whole genome shotgun sequencing assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.		27
EGAD00001001229	ChIP-Seq (H3K27ac) assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000;ILLUMINA	48
EGAD00001001230	ChIP-Seq (H3K27me3) assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000;ILLUMINA	48
EGAD00001001231	ChIP-Seq (H3K36me3) assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000;ILLUMINA	48
EGAD00001001232	ChIP-Seq (H3K4me1) assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000;ILLUMINA	48
EGAD00001001233	ChIP-Seq (H3K4me3) assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000;ILLUMINA	48
EGAD00001001234	ChIP-Seq (H3K9me3) assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000;ILLUMINA	48
EGAD00001001235	ChIP-Seq (Input) assays for reference epigenomes generated by Centre for Epigenome Mapping Technologies at Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.		48
EGAD00001001236	Targeted capture and resequencing of 94 known myeloid genes across MPN trials (PT1 and Voriconazole study) and other MPN samples.	Illumina HiSeq 2000	1860
EGAD00001001237	This is a pilot project to determine whether the TAPG FFPE DNAs are suitable for deep sequencing. If successful, an investigation of SNP distribution in a larger cohort will follow.	Illumina HiSeq 2000	15
EGAD00001001238	Extension analysis to pursue candidate genes of interest in chordoma	Illumina HiSeq 2000	262
EGAD00001001239	Extension analysis to pursue candidate genes of interest in chordoma	Illumina HiSeq 2000	262
EGAD00001001240	VCF files of somatic variants from tumor-normal pairs of Asian lung cancer patients		30
EGAD00001001242	Pilot study to set up sequencing protocols for targeted pulldown methylation profiling	Illumina MiSeq	2
EGAD00001001243	Capture Hi-C identifies the chromatin interactome of colorectal cancer risk loci.	Illumina HiSeq 2000	9
EGAD00001001244	RNA-sequencing (RNA-seq) was performed with RNA extracted from fresh-frozen human tumor tissue samples. cDNA libraries were prepared from poly-A selected RNA applying the Illumina TruSeq protocol for mRNA. The libraries were then sequenced with a 2 x 100bp paired-end protocol to a minimum mean coverage of 30x of the annotated transcriptome.	Illumina HiSeq 2000	59
EGAD00001001245	DATA FILES FOR PCGP SJINF WES	Illumina HiSeq 2000	40
EGAD00001001246	DATA FILES FOR PCGP SJMEL WXS	Illumina HiSeq 2000	28
EGAD00001001247	DATA FILES FOR PCGP SJMEL RNASEQ	Illumina HiSeq 2000	7
EGAD00001001248	DATA FILES FOR PCGP SJETP WXS	Illumina HiSeq 2000	13
EGAD00001001249	WES of HCC by HiSeq 2000, total 71 samples including hepatocellular carcinoma cell lines and normal samples (peripheral blood or the adjacent tissues of cancer)	Illumina HiSeq 2000	71
EGAD00001001250	Low coverage (4-6x) sequencing on samples from population cohorts (Finrisk, Health2000) will be done at Wellcome Trust Sanger Institute (WTSI) using Illumina HiSeq sequencing technology. We will produce 100bp paired end reads. Variants will be called using the 1000 Genomes Project pipeline. The samples have been selected from a national representative set of 8028 samples from persons of 30 years or older, which were screened for psychotic and bipolar disorders using the Composite International Diagnostic Interview, self-reported diagnoses, medical examination, and national registers.	Illumina HiSeq 2000	731
EGAD00001001251	Low coverage (4-6x) sequencing on samples from population cohorts (Finrisk, Health2000) will be done at Wellcome Trust Sanger Institute (WTSI) using Illumina HiSeq sequencing technology. We will produce 100bp paired end reads. Variants will be called using the 1000 Genomes Project pipeline. The samples have been selected from a national representative set of approximately 30,300 samples and comprises 500 individuals of each gender in the extreme tail of high density lipoprotein (HDL) concentrations. Included individuals were between 25 and 65 years of age. Individuals with a diagnosis of diabetes or BMI>30 were excluded from the study.	Illumina HiSeq 2000	966
EGAD00001001252	DNA was derived from the primary tumour, lung metastasis, and peri-aortic lymph node metastasis. DNA from the spleen was used as a normal control. For WE sequencing we used hybrid capture (Nimblegen version 3.0) of the lymph node and lung metastases, primary tumour and spleen normal; we generated ~100-fold coverage.		4
EGAD00001001253	DNA was derived from the primary tumour, lung metastasis, and peri-aortic lymph node metastasis. DNA from the spleen was used as a normal control. WG sequencing produced ~30-fold (primary tumour, spleen normal) to 50-fold (lung metastasis) coverage.		3
EGAD00001001256	Clonal hematopoiesis was investigated in patients with aplastic anemia using next-generation sequencing and single-nucleotide polymorphism (SNP) array-based karyotyping.	Illumina HiSeq 2000	186
EGAD00001001257		Illumina HiSeq 2000	3
EGAD00001001258		Illumina HiSeq 2000	5
EGAD00001001259		Illumina HiSeq 2000	2
EGAD00001001260		Illumina HiSeq 2000	2
EGAD00001001261	Bisulfite-Seq of CD14-positive, CD16-negative classical monocyte samples for methylome saturation and COMET analysis	Illumina HiSeq 2000	2
EGAD00001001262	Unaligned bam of 31 samples derived from primary tumor	Illumina Genome Analyzer IIx Illumina HiSeq 2000	31
EGAD00001001263	Unaligned bam of 31 samples derived from blood	Illumina Genome Analyzer IIx Illumina HiSeq 2000	31
EGAD00001001264	We propose to definitively characterise the somatic genetics of ER+ve, HER2-ve breast cancer through generation of comprehensive catalogues of somatic mutations in 500 cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses.	Illumina HiSeq 2000	223
EGAD00001001265	The parent study for the genomic architecture of mesothelioma is project 925. This project was set up in parallel to project 925 in order to whole-genome sequence ten of the 59 tumours in that project.	HiSeq X Ten	18
EGAD00001001266	Whole genome sequencing of primary angiosarcoma	HiSeq X Ten	12
EGAD00001001267	Anaplastic meningiomas are a rare, malignant variant of meningioma. At present there is no effective treatment for this cancer. The aim of the study is to identify somatic mutations in anaplastic meningiomas. We plan to sequence a set of 500 known cancer genes in 50 anaplastic meningioma and corresponding peripheral blood DNA samples. Bioinformatics will be used to analyse the results to assess the probability of these mutations being causal and so likely of critical importance for the tumour growth. Identification of these mutations will guide selection of appropriate compounds to effectively treat the disease.	HiSeq X Ten	60
EGAD00001001268	H9 human embryonic stem cells (hESCs) were cultured in feeder-free chemically-defined conditions in medium containing 10ng/ml Activin A and 12ng/ml FGF2 (Vallier L. 2011, Methods in Molecular Biology, 690: 57-66). Chromatin immunoprecipitation was performed as described in Brown S. et al. 2011. Stem Cells 29: 1176-85 by using 5ug of anti-DPY30 antibody (Sigma, cat. number HPA043761). This protocol was performed in control hESCs (expressing a scrambled shRNA) and in hESCs stably expressing an shRNA against DPY30 (Sigma, clone n. TRCN0000131112). This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000	4
EGAD00001001269	Exome bam files of 75 Individuals From Multiply Affected Coeliac Families	Illumina Genome Analyzer II Illumina Genome Analyzer IIx	75
EGAD00001001271	Around 50 samples of pre-invasive lung cancer lesions showing subsequent clinical and pathological progression or regression	HiSeq X Ten	50
EGAD00001001272		Illumina HiSeq 2000	15
EGAD00001001273	Whole genome sequencing was performed with DNA extracted from fresh-frozen tumor and normal material. Short insert DNA libraries were prepared with the TruSeq DNA PCRfree sample preparation kit (Illumina) for paired-end sequencing at a minimum read length of 2x100bp. Human DNA libraries were sequenced to an average coverage of minimum 30x for both tumor and matched normal. Murine DNA libraries of tumor and matched normal were both sequenced to a coverage of 25x.	Illumina HiSeq 2000	100
EGAD00001001274	Brain samples for this dataset were provided by the Medical Research Council Sudden Death Brain and Tissue Bank (Edinburgh, UK). All four individuals sampled were of European descent, neurologically normal during life and confirmed to be neuropathologically normal by a consultant neuropathologist using histology performed on sections prepared from paraffin-embedded tissue blocks. Twelve regions of the central nervous system were sampled from each individual. The regions studied were: cerebellar cortex, frontal cortex, temporal cortex, occipital cortex, hippocampus, the inferior olivary nucleus (sub-dissected from the medulla), putamen, substantia nigra, thalamus, hypothalamus, intralobular white matter and cervical spinal cord.	Illumina HiSeq 2000	48
EGAD00001001275		Illumina HiSeq 2000	1
EGAD00001001276	McGill EMC Release 4 for cell type "induced pluripotent stem cell"	unspecified	8
EGAD00001001277	McGill EMC Release 4 in tissue "fat pad" for cell type "fat cell"	unspecified	1
EGAD00001001278	McGill EMC Release 4 in tissue "venous blood" for cell type "B cell"	unspecified	41
EGAD00001001279	McGill EMC Release 4 in tissue "venous blood" for cell type "CD4-positive helper T cell"	unspecified	55
EGAD00001001280	McGill EMC Release 4 in tissue "venous blood" for cell type "CD4-positive, alpha-beta T cell"	unspecified	40
EGAD00001001281	McGill EMC Release 4 in tissue "venous blood" for cell type "eosinophil"	unspecified	3
EGAD00001001282	McGill EMC Release 4 in tissue "venous blood" for cell type "Monocyte"	unspecified	82
EGAD00001001283	McGill EMC Release 4 in tissue "venous blood" for cell type "T cell"	unspecified	20
EGAD00001001284	McGill EMC Release 4 in tissue "Brodmann (1909) area 11"	unspecified	1
EGAD00001001285	McGill EMC Release 4 in tissue "Brodmann (1909) area 44"	unspecified	1
EGAD00001001286	McGill EMC Release 4 in tissue "Brodmann (1909) area 8;Brodmann (1909) area 9"	unspecified	1
EGAD00001001287	McGill EMC Release 4 in tissue "kidney"	unspecified	2
EGAD00001001288	McGill EMC Release 4 in tissue "skeletal muscle tissue"	unspecified	29
EGAD00001001289	McGill EMC Release 4 for assay "Bisulfite-seq": Methylation profiling by high-throughput sequencing	unspecified	44
EGAD00001001290	McGill EMC Release 4 for assay "RNA-seq": Transcriptome profiling by high-throughput sequencing	unspecified	261
EGAD00001001291	McGill EMC Release 4 for assay "mRNA-seq": Transcriptome profiling by high-throughput sequencing	unspecified	40
EGAD00001001292	McGill EMC Release 4 for assay "smRNA-seq": Transcriptome profiling by high-throughput sequencing	unspecified	6
EGAD00001001293	McGill EMC Release 4 for assay "ChIP-Seq Input"	unspecified	52
EGAD00001001294	McGill EMC Release 4 for assay "H3K27me3"	unspecified	32
EGAD00001001295	McGill EMC Release 4 for assay "H3K36me3"	unspecified	37
EGAD00001001296	McGill EMC Release 4 for assay "H3K4me1"	unspecified	41
EGAD00001001297	McGill EMC Release 4 for assay "H3K4me3"	unspecified	42
EGAD00001001298	McGill EMC Release 4 for assay "H3K27ac"	unspecified	36
EGAD00001001299	McGill EMC Release 4 for assay "H3K9me3"	unspecified	29
EGAD00001001300	McGill EMC Release 4 for assay "ATAC-seq": Sequencing of transposase-accessible chromatin as described by Buenrostro et al. (Nature Methods 10, 1213-1218 (2013) doi:10.1038/nmeth.2688)	unspecified	1
EGAD00001001301	Whole exome sequencing data of 5 patients diagnosed with FL that had undergone several relapse episodes without evidence of transformation	Illumina HiSeq 2500	29
EGAD00001001302		Illumina HiSeq 2500	2
EGAD00001001303	The dataset for the PROP1 study consists of samples of patients with combined pituitary hormone deficiency due to two most prevalent mutations in the PROP1 gene (c.301_302delGA and c.150delA) and healthy relatives and controls. All subjects were genotyped for 21 single nucleotide polymorphisms surrounding the PROP1 gene in order to assess the potential ancestral origin of the respective mutations. The genotype data are displayed in the vcf format.		328
EGAD00001001304	We used whole-genome bisulfite sequencing (WGBS) to generate unbiased DNA methylation maps of six purified B-cell subpopulations: hematopoietic progenitor cells (HPC); pre-B-II cells (preB2C); naive B cells from peripheral blood (naiBC); germinal center B cells (gcBC); memory B cells from peripheral blood (memBC) and plasma cells from bone marrow (bm-PC). WGBS was performed in 2 biological replicates from each subpopulation.	Illumina HiSeq 2000	10
EGAD00001001305	Dataset contains WES data from 3 astrocytoma patients: blood as control, primary tumor and recurrent tumor		9
EGAD00001001306	Human melanoma samples were collected before, during, and at progression on BRAF inhibitor therapy. RNA was extracted and analysed by RNA-seq. This has provided insights into different categories of BRAF inhibitor resistance mechanisms.	Illumina HiSeq 2000	38
EGAD00001001307	Genome and transcriptome sequence data from a metastatic colorectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001001308	Genome and transcriptome sequence data from a primary unknown cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study	MinION PromethION	1
EGAD00001001309	Genome and transcriptome sequence data from an appendix cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001001310	Genome and transcriptome sequence data from a peritoneal mesothelioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001001311	Genome and transcriptome sequence data from a peritoneal mesothelioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001001312	Fastq data for whole genome bisulfite sequencing assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000 Illumina HiSeq 2500	30
EGAD00001001313	We enriched a panel of cancer associated genes using the Custom Sure Select Target Enrichment Kit. Identified mutations were validated with deep sequencing in order to assess mutated allele frequencies more accurately.	Illumina MiSeq	10
EGAD00001001314	Sequence data from L1-amplicon libraries prepared from plasma DNA from a set of 24 female controls and 18 male controls without malignant disease, and from samples of breast cancer patients (n=28) and prostate cancer patients (n=61).	Illumina MiSeq	125
EGAD00001001315	Phenotype determination by SNP-Typing using PCR and snapshotPCR with subsequent fragment analysis. We investigated 400 individuals from Northern Germany and detected up to 12 different SNPs to determine eye, hair and skin colour. More than 1000 different runs on an ABI3130 were performed. This dataset includes: phenotype information for 400 samples; summary and complete genotype calls for 12 SNPs on 400 samples.		399
EGAD00001001316	Exome sequence analysis of individuals with severe early onset inflammatory bowel disease, and their families. Individuals are ascertained through the COLORS in IBD study, which includes centres throughout UK and Europe.	Illumina HiSeq 2000 Illumina HiSeq 2500	149
EGAD00001001317	This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000 Illumina MiSeq	12
EGAD00001001319	The aim of this study is to ascertain whether leukaemic mutations exist within the blood of people with otherwise normal haematopoiesis. To satisfy this aim we plan to look for 7 known leukaemic mutations in the whole blood DNA of a large cohort of blood donors who have normal haematopoiesis. Genomic regions around mutational sites have been amplified using a 2 step PCR process which involves barcoding of individual patients.	Illumina MiSeq	5817
EGAD00001001320	This is a study to test ATAC-seq protocols. CD4+ and CD8+ cells have been obtained from three different anatomical compartments. We aim to assay open-chromatin regions across these cells and perform comparative analyses. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000 Illumina MiSeq	138
EGAD00001001321	This dataset includes WGS & WTS alignment data generated from 1 ATC tumor, its matched peripheral blood specimen and 3 authenticated ATC cell lines, THJ-16T, THJ-21T and THJ-29T. In addition, it includes WTS data from extra 4 unique anaplastic cell lines, ACT-1, C643, HTh7 and T238.	Illumina HiSeq 2000 Illumina HiSeq 2500	13
EGAD00001001322	A comprehensive characterisation and analysis of human breast cancers through whole-genome sequencing.	Illumina HiSeq 2000	196
EGAD00001001326	Whole genome sequencing of single adult t-cell leukemia/lymphoma case	Illumina HiSeq 2000	2
EGAD00001001329	Aligned Sequence (bam format), Duplicates removed		28
EGAD00001001330	In this experiment we have sequenced tumour normal pairs from patients presenting with CRC who have a prior history of inflammatory bowel disease. The idea is to identify driver mutations, new genes and novel pathways associated with the development of these malignancies.	Illumina HiSeq 2000	70
EGAD00001001331	The aim of this work is to apply an integrated systems approach to understand the biological underpinnings of large joint (hip and knee) osteoarthritis which culminates in the need for total joint replacement (TJR). In this pilot we will assess the feasibility of the approach in the relevant tissue. We will obtain diseased and non-diseased tissue (cartilage and endochondral bone) following TJR, coupled with a blood sample, from 12 patients. We will characterise the 12 pairs of diseased and non-diseased tissue samples in terms of transcription (RNASeq). The pilot will help assess the feasibility of isolating sufficient levels of starting material for the different approaches, and will instigate the development of analytical approaches to synthesising the resulting data.	Illumina HiSeq 2000	24
EGAD00001001332	Development of a method for separation and parallel sequencing of the genomes and transcriptomes of single cells.	HiSeq X Ten Illumina HiSeq 2500 Illumina MiSeq	700
EGAD00001001333	Whole exome sequencing BAM files for samples from the BRIDGE Consortium with pathogenic or likely pathogenic variants on genes linked to bleeding or platelet disorders.	Illumina HiSeq 2000	28
EGAD00001001334	We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses.	Illumina HiSeq 2000	-
EGAD00001001335	We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses.	Illumina Genome Analyzer II Illumina HiSeq 2000	-
EGAD00001001336	We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses.	Illumina HiSeq 2000	-
EGAD00001001337	We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina MiSeq	88
EGAD00001001338	We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses.	Illumina Genome Analyzer II Illumina HiSeq 2000	-
EGAD00001001339	We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses.	Illumina HiSeq 2000	1
EGAD00001001340	We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses.	Illumina HiSeq 2000	-
EGAD00001001341	We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing coupled with integrated transcriptomic and methylation analyses.	Illumina HiSeq 2000	-
EGAD00001001343	Data from the study of subclonal metastatic expansion in prostate cancer. Whole genome shotgun sequencing of fifteen samples, tumour and whole blood, from the four initial patients.	Illumina HiSeq 2000	15
EGAD00001001344	Data from the study of subclonal metastatic expansion in prostate cancer. Whole genome shotgun sequencing of six samples, tumour and whole blood, from the three additional patients whose somatic variants were examined in depth.	Illumina HiSeq 2000	6
EGAD00001001345	Data from the study of subclonal metastatic expansion in prostate cancer. RNA-seq of twelve samples, tumour and benign tissue, from the four initial patients.	Illumina HiSeq 2000	12
EGAD00001001347	Exome sequencing of a case of lethal EBV-driven LPD	Illumina HiSeq 2000	3
EGAD00001001349	Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells	Illumina HiSeq 2000	4
EGAD00001001350	Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells	Illumina HiSeq 2000	8
EGAD00001001351	Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells	Illumina HiSeq 2000	2
EGAD00001001352	Data files for CONSERTING (WGS)	Illumina HiSeq 2000	38
EGAD00001001353	Frequent somatic transfer of mitochondrial DNA into the nuclear genome of human cancer cells	Illumina HiSeq 2000	2
EGAD00001001354	Whole exome sequencing of around 700 inflammatory bowel disease cases. This data can only be used for the identification of IBD/immune-mediated disease loci.	Illumina HiSeq 2000	702
EGAD00001001355	DDD DATAFREEZE 2013-12-18: 1133 trios - VCF files (Ref: DDD Nature 2015)		1
EGAD00001001356	Neuroblastoma, a clinically heterogeneous pediatric cancer, is characterized by distinct genomic profiles but few recurrent mutations. As neuroblastoma is expected to have a high degree of genetic heterogeneity, study of neuroblastoma's clonal evolution with deep coverage whole-genome sequencing of diagnosis and relapse samples will lead to a better understanding of the molecular events associated with relapse. Samples were included in this study if sufficient DNA from constitutional, diagnosis and relapse tumors was available for WGS. Whole genome sequencing was performed on trios (constitutional, diagnosis and relapse DNA) from eight patients using Illumina HiSeq 2500, yielding paired-end (PE) reads of 90x90 for six of them and 100x100 for two. Expected coverage for sample NB0175 100x100bp was 30X for tumor and constitutional samples. For the seven other patients expected coverage was 80X for tumor samples with PE 100x100, 100X in the other tumor samples and 50X for all constitutional samples (see table 1). Following alignment with BWA (Li et al., Oxford J, 2009 Jul) allowing up to 4% of mismatches, bam files were cleaned up according to the Genome Analysis Toolkit (GATK) recommendations (Van der Auwera et al., Current Protocols in Bioinformatics, 2013, picard-1.45, GenomeAnalysisTK-2.2-16). Variant calling was performed in parallel using 3 variant callers: GenomeAnalysisTK-2.2-16, Samtools-0.1.18 and MuTect-1.1.4 (McKenna et al., Genome Res, 2010; Li et al., Oxford J, 2009 Aug; Cibulskis et al., Nature, 2013). Annovar-v2012-10-23 with cosmic-v64 and dbsnp-v137 were used for the annotation and RefSeq for the structural annotation. For GATK and Samtools, single nucleotide variants (SNVs) with a quality under 30, a depth of coverage under 6 or with less than 2 reads supporting the variant were filtered out. MuTect with parameters following GATK and Samtools thresholds was used to filter out irrelevant variants.
SNVs within and around exons of coding genes overlapping splice sites. Then, variants reported in more than 1% of the population in the 1000 genomes (1000gAprl_2012) or Exome Sequencing Project (ESP6500) were discarded in order to filter polymorphisms. Finally, synonymous variants were filtered out. MuTect focuses on somatic variants by filtering against the constitutional sample. Mpileup comparison between constitutional and somatic DNAs allowed us to focus also on tumor specific SNVs with GATK and Samtools. Finally, every SNV called by our pipeline and also supported in any constitutional sample was filtered out in order to prevent putative constitutional DNA coverage deficiency. Then we analyzed CNVs (copy number variants) with HMMcopy-v0.1.1 (Gavin et al., Genome Res, 2012) and control-FREEC-v6.7 (Boeva et al., Bioinformatics 2011) with windows of 2000 bp and 1000 bp, respectively, and auto-correction of normal contamination of tumor samples for Control-FREEC. Finally we explored structural variants (SVs) including deletions, inversions, tandem duplications and translocations using DELLY-v0.5.5 with standard parameters (Rausch et al., Oxford J, 2012). In tumors, at least 10 supporting reads were required to make a call and 5 supporting reads for the sample NB0175 with a coverage of only 40X (see table 2). To predict SVs in constitutional samples for subsequent somatic filtering, only 2 supporting reads were required in order not to miss any. To identify somatic events, all the SVs in each normal sample were first flanked by 500 bp in both directions, and any SV called in a tumor sample that fell within the combined flanked regions of the respective normal sample was removed (see graph 1). Deletions with more than 5 genes impacted or larger than 1Mb and inversions or tandem duplications covering more than 4 genes, were removed. We focused on exonic and splicing events for deletions, inversions, and tandem duplications.
For translocations, we kept all SVs that occurred in intronic, exonic, 5'UTR, upstream or splicing regions. Bioinformatics detection of variations with the deep sequencing approach: once PE reads were merged and adaptors trimmed by SeqPrep with default parameters, merged reads were aligned with BWA (Li H. and Durbin R. 2009 PMID 19451168) allowing up to 1 difference in the 22-base-long seeds and reporting only unique alignments. Only reads having a mapping quality of 20 or more were further analysed. Variant calling software was not used, since we aimed to predict variations at low frequencies, observed in less than 1% of reads. Such variants require a custom approach. Using DepthOfCoverage functions of the Genome Analysis Toolkit (GATK) v2.13.2 (McKenna A, et al., 2010 Genome Research PMID: 20644199), we focused on high quality coverage of bases A, C, G and T at the targeted variant position. Only bases with a mapping quality higher than 20 and a base quality higher than 10 were counted toward the depth of coverage, in order to focus only on high quality data. Aiming to determine the background level of variability at the studied regions, 10 control samples were included in the analysis. The same approach and filtering criteria were applied as introduced above over the entire amplicons. In order to highlight variants, for each sample the frequencies of each base at each amplicon position were then compared to those observed in the set of controls. Statistical analyses were performed with the R statistical software (http://www.R-project.org). Fisher’s exact two-sided tests with a Bonferroni correction were performed to compare percentages of bases between the data sets, i.e. for a given base between a case and the controls.
Finally, variations were retained as significant once (i) a significant increase in the percentage of a variant base and (ii) a significant decrease in the percentage of its reference base were observed according to our p-value criterion (p < 0.05).	Illumina HiSeq 2500	25
EGAD00001001357	Genomic characterisation of a large series of cancer cell lines.	Illumina HiSeq 2000	462
EGAD00001001358	463 newly diagnosed patients from the UK Myeloma XI clinical trial (NCT01554852) underwent whole exome sequencing plus targeted capture of the IGH/K/L and MYC loci. 200 ng of DNA were processed using NEBNext DNA library preparation kit and hybridised to the SureSelect Human All Exon V5 Plus. Four samples were pooled and run on one lane of a HiSeq 2000 using 76-bp paired end reads. DNA from CD138+ selected bone marrow cells (myeloma tumour) as well as peripheral white blood cells were analysed and somatic mutations detected.	Illumina HiSeq 2000	926
EGAD00001001359	Dataset contains Exome-seq and RNA-seq from 2 GBM patients, as well as RNA-seq from the derived cultured cells (GNS).		6
EGAD00001001360	The majority of neuroblastoma patients have tumors that initially respond to chemotherapy, but a large proportion of patients will experience therapy-resistant relapses. The molecular basis of this aggressive phenotype is unknown. Whole genome sequencing of 23 paired diagnostic and relapsed neuroblastomas showed clonal evolution from the diagnostic tumor with a median of 29 somatic mutations unique to the relapse sample. Eighteen of the 23 relapse tumors (78%) showed RAS-MAPK pathway mutations. Seven events were detected only in the relapse tumor while the others showed clonal enrichment. In neuroblastoma cell lines we also detected a high frequency of activating mutations in the RAS-MAPK pathway (11/18, 61%) and these lesions predicted for sensitivity to MEK inhibition in vitro and in vivo. Our findings provide a rationale for genetic characterization of relapse neuroblastoma and show that RAS-MAPK pathway mutations may function as a biomarker for new therapeutic approaches to refractory disease.		221
EGAD00001001363	To generate an RNA-Seq dataset for organoids apically stimulated with Salmonella Typhimurium. These data are part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2500	12
EGAD00001001364	This dataset contains whole exome data from 8 esophageal adenocarcinoma tumors that have been subjected to multiregion sequencing, ranging from 3-8 regions per tumor. In total, 40 tumor samples and 8 normal blood samples have been sequenced on Illumina HiSeq 2500 at a median depth of 90x.	Illumina HiSeq 2500	47
EGAD00001001372	All humans outside Africa are descendants of the same single exit, usually dated at 50-70 thousand years ago. However, the route taken out of Africa is still debated. The two main candidates are a northern route via Egypt and the Levant, or a southern route via Ethiopia and the Arabian Peninsula. We are generating genetic data to evaluate these two possibilities. In this study we propose to generate low-coverage sequencing data for 100 Egyptian samples.	Illumina HiSeq 2000	100
EGAD00001001373	The mtDNA and Y chromosome of up to 15 Australian Aborigines, concentrating on individuals with indigenous lineages, will be sequenced using the standard whole-genome sequencing followed by filtering out of autosomal and X sequences, so that only mtDNA and the Y chromosome will be analysed and released.	Illumina HiSeq 2000	7
EGAD00001001374	The mtDNA and Y chromosome of up to 15 Australian Aborigines, concentrating on individuals with indigenous lineages, will be sequenced using the standard whole-genome sequencing followed by filtering out of autosomal and X sequences, so that only mtDNA and the Y chromosome will be analysed and released.	Illumina HiSeq 2500	6
EGAD00001001375	Samples will be from the BRF113683 (BREAK-3) study which is a Phase III Randomized, Open-label Study Comparing GSK2118436 to Dacarbazine (DTIC) in Previously Untreated Subjects With BRAF Mutation Positive Advanced (Stage III) or Metastatic (Stage IV) Melanoma (n=250 enrolled). NGS: [Agilent capture (Sanger V2 panel): 360 genes and 20 gene fusions; Illumina HiSeq sequencing]. CNV: [via NGS or Affy SNP 6.0 or Illumina Omni (TBD)]. Bioinformatics: Analysis will be performed using core Sanger informatics pipelines similar to those previously described (Papaemmanuil E et al. (2013) Blood. 22:3616 -3627). Briefly, copy number analysis will be performed using the ASCAT algorithm, and base substitutions, small insertions and deletions using the CAVEMAN and Pindel algorithms, respectively. Statistical approaches including generalized linear models will be used to predict clinical variables such as maximum clinical response and duration of response using genetic data. Sanger and EBI to conduct analysis; Raw data and correlation with clinical endpoints to be analyzed by both EBI/Sanger and GSK (unique pipeline analyses to increase call confidence)	Illumina HiSeq 2500	169
EGAD00001001379		Illumina HiSeq 2000	29
EGAD00001001380	All humans outside Africa are descendants of the same single exit, usually dated at 50-70 thousand years ago. However, the route taken out of Africa is still debated. The two main candidates are a northern route via Egypt and the Levant, or a southern route via Ethiopia and the Arabian Peninsula. We are generating genetic data to evaluate these two possibilities. In this study we propose to generate high-coverage sequencing data for 3 Egyptian samples.	Illumina HiSeq 2000	3
EGAD00001001381	This dataset includes 69 samples of whole-exome sequencing data of high-grade serous ovarian carcinoma (HGSOC). We included patients with advanced (International Federation of Gynecology and Obstetrics [FIGO] stage III-IV) HGSOC for which biopsies were obtained during debulking surgery, the first at initial diagnosis and the second at disease relapse. Where possible, matched normal DNA from each participating patient was obtained from a whole-blood sample. Written informed consent was obtained from all patients and approved by the local ethics committee.	Illumina HiSeq 2000	69
EGAD00001001382	TwinsUK whole exome sequencing using NimbleGen SeqCap EZ		248
EGAD00001001383	TwinsUK whole exome sequencing using NimbleGen 2.1M SeqCap		242
EGAD00001001384	Mutations that activate the RAF-MEK-ERK signaling pathway, in particular BRAFV600E, occur in many cancers, and mutant BRAF-selective inhibitors have clinical activity in these diseases. Activating BRAF alleles are usually considered to be mutually exclusive with mutant RAS, whereas inactivating mutations in the D594F595G596 motif of the BRAF activation segment can coexist with oncogenic RAS and cooperate via paradoxical MEK/ERK activation. We determined the functional consequences of a largely uncharacterized BRAF mutation, F595L, which was detected along with an HRASQ61R allele by clinical exome sequencing in a patient with histiocytic sarcoma and also occurs in epithelial cancers, melanoma, and neuroblastoma, and investigated its interaction with mutant RAS. We demonstrate that, unlike previously described DFG motif mutants, BRAFF595L is a gain-of-function variant with intermediate activity towards MEK that does not act paradoxically, but nevertheless cooperates with mutant RAS to promote oncogenic signaling. Of immediate clinical relevance, BRAFF595L shows divergent responses to different mutant BRAF-selective inhibitors, whereas signaling driven by BRAFF595L with and without mutant RAS is efficiently blocked by pan-RAF and MEK inhibitors. Mutation data from primary patient samples and cell lines show that BRAFF595L, as well as other BRAF mutations with intermediate activity, frequently coincide with mutant RAS in a broad spectrum of cancers. These data define a novel class of activating BRAF mutations that cooperate with oncogenic RAS in a non-paradoxical fashion to achieve an optimal level of MEK-ERK signaling, extend the spectrum of patients with systemic histiocytic disorders and other malignancies who are candidates for therapeutic blockade of the RAF-MEK-ERK pathway, and underscore the value of comprehensive genetic profiling for understanding the signaling requirements of individual cancers.	Illumina HiSeq 2500	2
EGAD00001001385	Exome sequencing in 3 Möbius patients	AB SOLiD 4 System	3
EGAD00001001386	Whole Genome Sequencing of Huh7 cell lines	Illumina HiSeq 2000 Illumina HiSeq 2500	2
EGAD00001001387	Using high-throughput sequencing technologies and analytical tools, we conduct an exome sequencing study that will help understand the population genetics of a Croatian island isolate, in a sample of 200 subjects from the Adriatic island of Vis who were selected to reflect islanders with at least four known ancestors in grandparental line who are original islanders.	Illumina HiSeq 2000	193
EGAD00001001388	Whole-genome bisulfite sequencing (WGBS) on 30 breast cancer cases from the BASIS project.	Illumina HiSeq 2000	-
EGAD00001001389	Genome wide CRISPR screen was performed to find resistance to targeted drugs for melanoma and lung	Illumina HiSeq 2500	15
EGAD00001001390	Human monocytes from a healthy male blood donor were obtained after written informed consent and anonymised. Library preparation was performed essentially as described in the “Whole‐genome Bisulfite Sequencing for Methylation Analysis (WGBS)” protocol as released by Illumina. The library was sequenced on an Illumina HiSeq2500 using 101 bp paired-end sequencing. Read mapping was done with BWA.		1
EGAD00001001391		Illumina HiSeq 2000	3
EGAD00001001393	The aim of this study is to assess translational changes in macrophages over a time course of Salmonella infection. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000	52
EGAD00001001394	Samples from Ross Innes et al. 2015 - doi:10.1038/ng.3357	Illumina HiSeq 2000	1
EGAD00001001395	Background: Invasive lobular breast cancer (ILBC) is the second most common histological subtype after ductal breast cancer (IDBC). In spite of significant clinical and pathological differences, ILBC is still treated as IDBC. Here, we aimed at identifying recurrent genomic alterations in ILBC with potential clinical implications. Methods: Starting from 630 ILBC primary tumors with a median follow up of 10 years, we interrogated oncogenic substitutions and indels of 360 cancer genes and genome-wide copy number alterations in 413 and 170 ILBC samples, respectively, and correlated those findings with clinical, pathological, and outcome features. The Cancer Genome Atlas database was used for comparison of frequency estimates. Results: Besides the high mutation frequency of CDH1 in 65% of the tumors, alterations in one of the three key genes of the PI3K pathway, PIK3CA, PTEN and AKT1, were present in more than half of the cases. ERBB2 and ERBB3 were mutated in 5.1 and 3.6% of the tumors. FOXA1 mutations and ESR1 copy number gains were detected in 9% and 25% of the samples. All these alterations were more frequent in ILBC than IDBC. The histological diversity of ILBC was associated with specific genomic alterations, such as enrichment for ERBB2 mutations in the mixed, non-classic subtype, and for ARID1A mutations and ESR1 gains in the solid subtype. Finally, ERBB2 and AKT1 mutations were associated with short-term risk of relapse, and chromosome 1q and 11p gain with increased and decreased breast cancer free survival, respectively. Conclusion: ERBB2, ERBB3 and AKT1 mutations represent high prevalence therapeutic targets in ILBC. FOXA1 mutations and ESR1 gains urgently deserve dedicated clinical investigation, especially in the context of endocrine treatment.	Illumina HiSeq 2000	541
EGAD00001001397	We sequenced 292 patients with NSCLC using whole-genome sequencing or exome sequencing.	Illumina HiSeq 2000	72
EGAD00001001398	We sequenced 205 patients with NSCLC using exome sequencing.	Illumina HiSeq 2000	147
EGAD00001001399	Data represent genome-wide DNA methylation profiles obtained by MethylCap-seq (Diagenode’s MethylCap-kit based purification followed by Illumina GAIIx sequencing), for 70 brain tissue samples, including 65 glioblastoma samples and 5 non-tumoral tissues (obtained from epilepsy surgery).	Illumina Genome Analyzer IIx	70
EGAD00001001400	Fastq data for whole genome shotgun sequencing assay for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000 Illumina HiSeq 2500	27
EGAD00001001401	Fastq data for smRNA-Seq assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000	28
EGAD00001001402	Fastq data for stranded mRNA-Seq assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000 Illumina HiSeq 2500	32
EGAD00001001403	Fastq data for ChIP-Seq (H3K27ac) assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000 Illumina HiSeq 2500	48
EGAD00001001404	Fastq data for ChIP-Seq (H3K27me3) assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000 Illumina HiSeq 2500	48
EGAD00001001405	Fastq data for ChIP-Seq (H3K36me3) assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000 Illumina HiSeq 2500	48
EGAD00001001406	Fastq data for ChIP-Seq (H3K4me1) assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000 Illumina HiSeq 2500	48
EGAD00001001407	Fastq data for ChIP-Seq (H3K4me3) assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000 Illumina HiSeq 2500	48
EGAD00001001408	Fastq data for ChIP-Seq (H3K9me3) assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000 Illumina HiSeq 2500	48
EGAD00001001409	Fastq data for ChIP-Seq (Input) assays for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2000 Illumina HiSeq 2500	48
EGAD00001001410	Whole-exome sequencing of 81 tumor/normal pairs of adult T-cell leukemia/lymphoma	Illumina HiSeq 2000	162
EGAD00001001411	RNA sequencing of 57 tumor samples of adult T-cell leukemia/lymphoma as well as 3 samples of HTLV-1 carrier and 3 samples of healthy volunteers.	Illumina HiSeq 2000	63
EGAD00001001412	Whole genome sequencing of 48 tumor/normal pairs obtained from adult T-cell leukemia/lymphoma. This data set includes 11 full-pass WGS and 37 low-pass WGS data.	HiSeq X Ten Illumina HiSeq 2000	96
EGAD00001001413	DDD DATAFREEZE 2013-12-18: 1133 trios - README, family trios, phenotypes, validated DNMs (Ref: DDD Nature 2015)		1
EGAD00001001415	DATA FILES FOR PCGP Dyer_iPSC WGS	Illumina HiSeq 2000	2
EGAD00001001416	DATA FILES FOR PCGP Dyer_iPSC TEBS	Illumina HiSeq 2000	18
EGAD00001001417	bam files associated with the study EGAS00001001205		6
EGAD00001001418	DATA FILES FOR PCGP Dyer_iPSC 5hmc	Illumina HiSeq 2000	8
EGAD00001001421	Clinical Implications of Genomic Alterations in the Tumour and Circulation of Pancreatic Cancer Patients	Illumina MiSeq	125
EGAD00001001422	HipSci - Bardet-Biedl Syndrome - Exome Sequencing - April 2015	Illumina HiSeq 2000	3
EGAD00001001423		Illumina HiSeq 2000	7
EGAD00001001424	We obtained paired longitudinal specimens from a total of 38 glioblastoma (GBM) patients (34 primary and 4 secondary GBM patients). Treatment-naive initial tumors were available for 35 cases; for the other 3 cases, we used the first available recurrent tumors in lieu of initial tumors. Tumor specimens were subjected to whole-exome sequencing (27 of 38 cases, with the matched normal/blood for 22 of the 27 cases) and transcriptome sequencing (30 of 38 cases).	Illumina HiSeq 2000 Illumina HiSeq 2500	141
EGAD00001001425	The objectives of this project are the identification of markers related to cancer therapy resistance in the blood of breast cancer patients and to study the genetic changes in cancer cells during this development of resistance. Whole genome amplified DNA from Circulating Tumor Cells (CTCs), selected during the course of systemic treatment from blood of metastatic breast cancer patients, will be exome sequenced. The patients selected for this study did not respond to therapy.	Illumina HiSeq 2000	149
EGAD00001001426	Systematic next generation sequencing efforts are beginning to define the genomic landscape across a range of primary tumours, but we know very little of the mutational evolution that contributes to disease progression. We therefore propose to obtain a comprehensive description of genomic, transcriptomic and epigenomic changes in a cohort of matched primary and metastatic colorectal cancers, and additionally to explore the extent to which those mutations identified as recurrent in the metastatic setting are able to subvert normal biological processes using both genetically engineered mouse models and established cancer cell lines. This study will enable us to define to what extent primary tumour profiling can capture the biological processes operative in matched metastases as well as the significance of intratumoural heterogeneity. This dataset contains all the data available for this study on 2015-07-02.	Illumina HiSeq 2000	446
EGAD00001001427	Targeted cancer gene sequencing of samples enrolled in the SSGXVIII trial from Finland.	Illumina HiSeq 2000	312
EGAD00001001428	Identification of human deubiquitylating enzymes whose knockout results in hypersensitivity to DNA damaging agents, by comparing the sequence reads of 'barcode region' from mixed cell culture.	Illumina HiSeq 2000	6
EGAD00001001429	Profiling subclonal architecture and phylogeny in tumors by whole-genome sequence data mining and single-cell genome sequencing	HiSeq X Ten	2
EGAD00001001430	Investigation into causal genes underlying anaplastic meningioma	Illumina HiSeq 2000	73
EGAD00001001431	SCLC - RNA sequencing data Publication Peifer et al., 2012, Nature Genetics	Illumina HiSeq 2000	15
EGAD00001001432	PCGP Germline Study Whole Genome Sequencing	Illumina HiSeq 2000	1337
EGAD00001001433	PCGP Germline Study Whole Exome Sequencing	Illumina HiSeq 2000	906
EGAD00001001435	Aligned whole genome bisulfite sequencing data for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.		30
EGAD00001001436		AB 5500 Genetic Analyzer	4
EGAD00001001437	HipSci - Healthy Normals - Exome Sequencing - April 2015	Illumina HiSeq 2000	122
EGAD00001001438	HipSci - Healthy Normals - RNA Sequencing - May 2015	Illumina HiSeq 2000	116
EGAD00001001439	Mammary cell samples from donors 28/32/33. Contains 12 MiSeq sequence files and 12 alignment files derived from HiSeq runs.	Illumina MiSeq	12
EGAD00001001440	This project entailed generation of high depth WGS (30x) of 100 individuals from the general Greek population.	HiSeq X Ten	100
EGAD00001001441	Despite the established role of the transcription factor MYC in cancer, little is known about the impact of a new class of transcriptional regulators, the long non-coding RNAs (lncRNAs), on the way MYC is able to influence cellular transcriptome. To this aim we have intersected RNA-sequencing data from two MYC-inducible cell lines and from a cohort of 91 mature B-cell lymphomas carrying, or not carrying, genetic variants resulting in MYC over-expression. By this approach, we identified 13 lncRNAs differentially expressed in IG-MYC-positive Burkitt lymphoma and regulated in the same direction by MYC in the model cell lines. Among them we focused on a lncRNA that we named MINCR, for MYC-Induced long Non-Coding RNA, showing a strong correlation with MYC expression in MYC-positive lymphomas and also in pancreatic ductal adenocarcinomas. To understand its cellular role we performed RNA interference (RNAi) experiments and found that MINCR knock-down is associated with a reduction in cellular viability, due to an impairment in cell cycle progression. Differential gene expression analysis following RNAi showed a strongly significant enrichment of cell cycle genes among the genes down-regulated following MINCR knock-down. Interestingly these genes are enriched in MYC binding sites in their promoters, suggesting that MINCR acts as a modulator of MYC transcriptional program. Accordingly, following MINCR knock-down, we observed a reduction in the binding of MYC to the promoters of selected cell cycle genes. Finally we provide evidence that down-regulation of AURKA, AURKB and CTD1 may explain the reduction in cellular proliferation observed upon MINCR knock-down. We therefore suggest that MINCR is a newly identified player in the MYC transcriptional network able to control the expression of cell cycle genes.	Illumina HiSeq 2000 Illumina HiSeq 2500	49
EGAD00001001442	This project is to explore the contribution of de novo mutations to severe structural malformations diagnosed prenatally using ultrasound. These malformations include heart, CNS, renal and GI abnormalities. In this pilot project we aim to exome sequence 30 parent-foetus trios to ~50X mean coverage and identify de novo functional variants using an algorithm developed in the Hurles group	Illumina HiSeq 2000	86
EGAD00001001443	RNASeq sequencing. Each library was sequenced using TruSeq SBS Kit v3-HS, in paired-end mode with a read length of 2 × 76 bp. We generated more than 20 million paired-end reads for each sample in a fraction of a sequencing lane on HiSeq2000 (Illumina Inc.) following the manufacturer’s protocol. Image analysis, base calling and quality scoring of the run were processed using the manufacturer’s software Real Time Analysis (RTA 1.13.48) and followed by generation of FASTQ sequence files.	Illumina Genome Analyzer II	199
EGAD00001001444	Atypical teratoid/rhabdoid tumor (ATRT) is one of the most common brain tumors in infants and young children. Although the prognosis of ATRT patients is poor, some patients respond very well to current treatments, suggesting inter-tumor molecular heterogeneity. To investigate this further, we genetically and epigenetically analyzed a large cohort of ATRTs (n = 170). Three distinct molecular subgroups of ATRTs, associated with differences in demographics, tumor location and type of SMARCB1 alterations, were identified using DNA-methylation or gene expression analyses. Whole genome DNA- and RNA-sequencing found no other recurrent mutations explaining the differences between subgroups. However, whole genome bisulfite-sequencing and H3K27Ac ChIP-sequencing of primary tumors revealed clear differences in methylation patterns and enhancer landscapes, leading to the identification of subgroup-specific regulatory networks.	Illumina HiSeq 2000 Illumina HiSeq 2500	55
EGAD00001001445	Deep sequencing of melanoma for driver mutations	Illumina MiSeq	3
EGAD00001001446	Genomic and transcriptomic characterization of drug-resistant colon cancer stem cell lines.	Illumina HiSeq 2000	4
EGAD00001001447	Whole genome sequencing of single cell derived organoids from normal colon tissue and colorectal cancer.	HiSeq X Ten	19
EGAD00001001448	Testing the feasibility of genome-scale sequencing in routinely collected formalin-fixed paraffin-embedded (FFPE) cancer specimens versus matched fresh-frozen samples using targeted pulldown capture prior to Illumina sequencing.	Illumina MiSeq	11
EGAD00001001449	PCR products were obtained from each target locus using genomic DNA from human iPS cells. Subsequently, PCR products are pooled and subjected to Illumina library preparation. The library will be sequenced either by HiSeq or MiSeq. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina MiSeq	6
EGAD00001001450	This study is to ascertain whether it is feasible to extract a single cell from a tumour, perform amplification, generate a library and sequence a targeted pulldown.	Illumina HiSeq 2000	3
EGAD00001001451	JMML targeted sequencing of candidate genes	Illumina MiSeq	75
EGAD00001001452	Anaplastic oligodendrogliomas (AOs) are rare primary brain tumors which are generally incurable, with heterogeneous prognosis and few treatment targets identified. Most oligodendrogliomas have chromosome 1p/19q co-deletion and IDH mutation. We analyzed 51 AOs by whole-exome sequencing, identifying previously reported frequent somatic mutations in CIC and FUBP1. We also identified recurrent mutations in TCF12 and in an additional series of 83 AO. Overall 7.5% of AO are mutated for TCF12, which encodes an oligodendrocyte-related transcription factor. 80% of TCF12 mutations identified were in either the bHLH domain, which is important for TCF12 function as a transcription factor, or were frame shift mutations leading to TCF12 truncated for this domain. We show that these mutations compromise TCF12 transcriptional activity and are associated with a more aggressive tumor type. Our analysis provides further insights into the unique and shared pathways driving AO.	Illumina HiSeq 2000	102
EGAD00001001453	The project is to evaluate the genomic binding sites of the histone demethylase JARID1C. This gene was recently identified in CGP as a novel recessive cancer gene in human renal cell carcinoma.	Illumina Genome Analyzer II	4
EGAD00001001454	Previously we performed deep WGS on 6 parents and 13 children from 3 large families from the Scottish Family Health Study to identify de novo mutations. This prelim is to cover the additional sequencing of one grandchild from one of these three families. The inclusion of a third generation individual will provide additional experimental validation for the de novo mutations found in the initial trio. As in the previous study, the DNA will be WGS to a depth of approximately 25X to achieve this purpose. These data can only be used for the investigation of the genetic causes of the reported clinical phenotypes in these patients	Illumina HiSeq 2000	1
EGAD00001001456	1000Genomes imputed data set of 581 cases and 417 controls for male-pattern baldness		1
EGAD00001001457	All samples from the "100" project	Illumina HiSeq 2000	24
EGAD00001001458	Whole genome sequencing of EBV-transformed B cells in order to determine whether EBV induction of activation-induced cytidine deaminase (AID) produces genome-wide mutations and/or chromosomal rearrangements.	HiSeq X Ten	12
EGAD00001001459	Transcriptome sequencing of tumour tissue, adjacent normal tissue and derived organoids/tumoroids from colorectal cancer. This dataset contains all the data available for this study on 2015-08-05.	Illumina HiSeq 2000	76
EGAD00001001460	Whole-exome sequencing of a cohort of families (probands and affected/unaffected relatives) suffering from one of two rare thyroid disorders: congenital hypothyroidism (CH) and resistance to thyroid hormone (RTH). This dataset contains all the data available for this study on 2015-08-05.	Illumina HiSeq 2000	62
EGAD00001001461	CBP has opposing functions during cerebellar development and is a targetable tumor suppressor at late stages of medulloblastoma initiation		30
EGAD00001001462	Exome sequencing of 142 samples with corresponding Sanger sequencing results for 416 variants and 288 negative sites. DNA library preps prepared with Illumina TruSeq sample preparation kit. The captured DNA libraries were PCR amplified using the supplied paired-end PCR primers. Sequencing was performed with an Illumina HiSeq2000 (SBS Kit v3, one pool per lane) generating 2x101-bp reads.	Illumina HiSeq 2500	142
EGAD00001001464	Exome Sequencing. 3 μg of genomic DNA from each sample were sheared and used for the construction of a paired-end sequencing library as described in the paired-end sequencing sample preparation protocol provided by Illumina. Enrichment of exonic sequences was then performed for each library using either the Sure Select Human All Exon 50 Mb or All Exon+UTRs v4 kits following the manufacturer’s instructions (Agilent Technologies). Exon-enriched DNA was pulled down by magnetic beads coated with streptavidin (Invitrogen), followed by washing, elution and 18 additional cycles of amplification of the captured library. Enriched libraries were sequenced (2 × 76 bp) in one lane of an Illumina GAIIx sequencer or in two lanes of a HiSeq2000 when using pools of eight samples.		-
EGAD00001001465	18 Exomes for discovery set and 60 Targeted panel for prevalence set	Illumina HiSeq 2000	127
EGAD00001001466	Whole Genome sequencing. 2 μg of genomic DNA from each sample was used for the construction of two short-insert paired-end sequencing libraries. Both types of libraries were sequenced in paired-end mode on Illumina GAIIx (2 × 151 bp) using Sequencing kit v4 or Illumina HiSeq2000 (2x101 bp) using TruSeq SBS Kit v3.		-
EGAD00001001467	WGS of 8 trios - affected child and both normal parents		24
EGAD00001001468	PAR-CLIP was performed on the Argonaute-2 protein (AGO2) in four lymphoma cell lines: Namalwa, Raji, SU-DHL-4, SU-DHL-6	Illumina HiSeq 2500	4
EGAD00001001469	RNA-Seq data for 1 T-cell acute leukemia sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	1
EGAD00001001470	ChIP-Seq data for 2 plasma cell sample(s). 13 run(s), 12 experiment(s), 12 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	2
EGAD00001001471	RNA-Seq data for 11 Multiple myeloma sample(s). 11 run(s), 11 experiment(s), 11 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	11
EGAD00001001472	ChIP-Seq data for 2 effector memory CD8-positive, alpha-beta T cell sample(s). 10 run(s), 10 experiment(s), 10 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	2
EGAD00001001473	Bisulfite-Seq data for 2 cytotoxic CD56-dim natural killer cell sample(s). 24 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	2
EGAD00001001474	RNA-Seq data for 14 mature neutrophil sample(s). 14 run(s), 14 experiment(s), 14 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	14
EGAD00001001475	DNase-Hypersensitivity data for 1 CD8-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_dnaseseq_analysis_20150820	Illumina HiSeq 2000	1
EGAD00001001476	DNase-Hypersensitivity data for 4 CD14-positive, CD16-negative classical monocyte sample(s). 4 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_dnaseseq_analysis_20150820	Illumina HiSeq 2000	4
EGAD00001001477	RNA-Seq data for 3 neutrophilic myelocyte sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	3
EGAD00001001478	RNA-Seq data for 1 CD8-positive, alpha-beta thymocyte sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	1
EGAD00001001479	Bisulfite-Seq data for 1 memory B cell sample(s). 20 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	1
EGAD00001001480	RNA-Seq data for 3 inflammatory macrophage sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	3
EGAD00001001481	ChIP-Seq data for 15 Acute Myeloid Leukemia sample(s). 75 run(s), 72 experiment(s), 72 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	15
EGAD00001001482	Bisulfite-Seq data for 6 Acute Myeloid Leukemia sample(s). 66 run(s), 6 experiment(s), 6 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	6
EGAD00001001483	RNA-Seq data for 1 CD3-negative, CD4-positive, CD8-positive, double positive thymocyte sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	1
EGAD00001001484	Bisulfite-Seq data for 2 erythroblast sample(s). 35 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	2
EGAD00001001485	ChIP-Seq data for 3 Acute Myeloid Leukemia - SAHA sample(s). 11 run(s), 11 experiment(s), 11 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	3
EGAD00001001486	Bisulfite-Seq data for 2 endothelial cell of umbilical vein (resting) sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	2
EGAD00001001487	ChIP-Seq data for 2 endothelial cell of umbilical vein (proliferating) sample(s). 12 run(s), 12 experiment(s), 12 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	2
EGAD00001001488	RNA-Seq data for 2 CD8-positive, alpha-beta T cell sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	2
EGAD00001001489	RNA-Seq data for 1 CD4-positive, alpha-beta thymocyte sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	1
EGAD00001001490	ChIP-Seq data for 6 Acute promyelocytic leukemia sample(s). 29 run(s), 27 experiment(s), 27 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	6
EGAD00001001491	Bisulfite-Seq data for 6 inflammatory macrophage sample(s). 83 run(s), 6 experiment(s), 6 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	6
EGAD00001001492	RNA-Seq data for 4 megakaryocyte-erythroid progenitor cell sample(s). 4 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	4
EGAD00001001493	Bisulfite-Seq data for 1 hematopoietic multipotent progenitor cell sample(s). 5 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	1
EGAD00001001494	Bisulfite-Seq data for 1 memory B cells sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	1
EGAD00001001495	ChIP-Seq data for 4 neutrophilic metamyelocyte sample(s). 18 run(s), 12 experiment(s), 12 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	4
EGAD00001001496	RNA-Seq data for 2 endothelial cell of umbilical vein (resting) sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	2
EGAD00001001497	Bisulfite-Seq data for 2 conventional dendritic cell sample(s). 30 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	2
EGAD00001001498	Bisulfite-Seq data for 5 alternatively activated macrophage sample(s). 79 run(s), 5 experiment(s), 5 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	5
EGAD00001001499	ChIP-Seq data for 1 central memory CD4-positive, alpha-beta T cell sample(s). 9 run(s), 7 experiment(s), 7 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	1
EGAD00001001500	RNA-Seq data for 2 CD38-negative naive B cell sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	2
EGAD00001001501	RNA-Seq data for 3 granulocyte monocyte progenitor cell sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	3
EGAD00001001502	ChIP-Seq data for 2 germinal center B cell sample(s). 12 run(s), 11 experiment(s), 11 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	2
EGAD00001001503	ChIP-Seq data for 1 CD3-positive, CD4-positive, CD8-positive, double positive thymocyte sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	1
EGAD00001001504	RNA-Seq data for 3 band form neutrophil sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	3
EGAD00001001505	ChIP-Seq data for 7 CD4-positive, alpha-beta T cell sample(s). 39 run(s), 39 experiment(s), 39 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	7
EGAD00001001506	RNA-Seq data for 8 CD14-positive, CD16-negative classical monocyte sample(s). 8 run(s), 8 experiment(s), 8 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	8
EGAD00001001507	Bisulfite-Seq data for 1 mature eosinophil sample(s). 15 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	1
EGAD00001001508	ChIP-Seq data for 9 mature neutrophil sample(s). 48 run(s), 45 experiment(s), 45 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	9
EGAD00001001509	Bisulfite-Seq data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 14 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	1
EGAD00001001510	Bisulfite-Seq data for 2 endothelial cell of umbilical vein (proliferating) sample(s). 36 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	2
EGAD00001001511	ChIP-Seq data for 4 band form neutrophil sample(s). 18 run(s), 17 experiment(s), 17 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000, NextSeq 500	4
EGAD00001001512	RNA-Seq data for 1 effector memory CD8-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	1
EGAD00001001513	ChIP-Seq data for 5 CD8-positive, alpha-beta T cell sample(s). 26 run(s), 26 experiment(s), 26 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	5
EGAD00001001514	ChIP-Seq data for 4 alternatively activated macrophage sample(s). 22 run(s), 22 experiment(s), 22 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	4
EGAD00001001515	RNA-Seq data for 6 hematopoietic stem cell sample(s). 13 run(s), 6 experiment(s), 6 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	6
EGAD00001001516	Bisulfite-Seq data for 3 CD4-positive, alpha-beta T cell sample(s). 61 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	3
EGAD00001001517	ChIP-Seq data for 4 neutrophilic myelocyte sample(s). 14 run(s), 14 experiment(s), 14 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	4
EGAD00001001518	ChIP-Seq data for 4 cytotoxic CD56-dim natural killer cell sample(s). 17 run(s), 17 experiment(s), 17 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	4
EGAD00001001519	ChIP-Seq data for 6 naive B cell sample(s). 34 run(s), 28 experiment(s), 28 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	6
EGAD00001001520	RNA-Seq data for 3 mature neutrophil - G-CSF/Dex. Treatment (16-20 hrs) sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	3
EGAD00001001521	RNA-Seq data for 3 cytotoxic CD56-dim natural killer cell sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	3
EGAD00001001522	Bisulfite-Seq data for 2 plasma cell sample(s). 17 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	2
EGAD00001001523	RNA-Seq data for 4 plasma cell sample(s). 4 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	4
EGAD00001001524	DNase-Hypersensitivity data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_dnaseseq_analysis_20150820	Illumina HiSeq 2000	1
EGAD00001001525	RNA-Seq data for 1 mature eosinophil sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	1
EGAD00001001526	RNA-Seq data for 1 effector memory CD4-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	1
EGAD00001001527	ChIP-Seq data for 3 mature neutrophil - G-CSF/Dex. Treatment (16-20 hrs) sample(s). 18 run(s), 18 experiment(s), 18 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	3
EGAD00001001528	ChIP-Seq data for 1 Leukemia sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	1
EGAD00001001529	Bisulfite-Seq data for 1 precursor B cell sample(s). 6 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	1
EGAD00001001530	Bisulfite-Seq data for 1 Acute Myeloid Leukemia - CTR sample(s). 18 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	1
EGAD00001001531	RNA-Seq data for 1 class switched memory B cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	1
EGAD00001001532	RNA-Seq data for 4 monocyte - None sample(s). 4 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	4
EGAD00001001533	ChIP-Seq data for 4 Acute Myeloid Leukemia - CTR sample(s). 21 run(s), 21 experiment(s), 21 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	4
EGAD00001001534	RNA-Seq data for 5 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 23 run(s), 5 experiment(s), 5 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	5
EGAD00001001535	RNA-Seq data for 2 endothelial cell of umbilical vein (proliferating) sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	2
EGAD00001001536	ChIP-Seq data for 1 Acute Myeloid Leukemia - MC2884 sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	1
EGAD00001001537	Bisulfite-Seq data for 3 Acute promyelocytic leukemia sample(s). 24 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	3
EGAD00001001538	RNA-Seq data for 3 common myeloid progenitor sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	3
EGAD00001001539	ChIP-Seq data for 2 mature eosinophil sample(s). 12 run(s), 12 experiment(s), 12 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	2
EGAD00001001540	RNA-Seq data for 1 conventional dendritic cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	1
EGAD00001001541	Bisulfite-Seq data for 1 effector memory CD8-positive, alpha-beta T cell, terminally differentiated sample(s). 15 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	1
EGAD00001001542	RNA-Seq data for 1 memory B cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	1
EGAD00001001543	RNA-Seq data for 1 central memory CD4-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	1
EGAD00001001544	RNA-Seq data for 10 CD4-positive, alpha-beta T cell sample(s). 10 run(s), 10 experiment(s), 10 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	10
EGAD00001001545	DNase-Hypersensitivity data for 1 alternatively activated macrophage sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_dnaseseq_analysis_20150820	Illumina HiSeq 2000	1
EGAD00001001546	RNA-Seq data for 1 regulatory T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	1
EGAD00001001547	RNA-Seq data for 1 central memory CD8-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	1
EGAD00001001548	Bisulfite-Seq data for 2 class switched memory B cell sample(s). 21 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	2
EGAD00001001549	DNase-Hypersensitivity data for 1 Acute Myeloid Leukemia sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_dnaseseq_analysis_20150820	Illumina HiSeq 2000	1
EGAD00001001550	RNA-Seq data for 7 erythroblast sample(s). 29 run(s), 7 experiment(s), 7 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	7
EGAD00001001551	RNA-Seq data for 1 Leukemia sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	1
EGAD00001001552	ChIP-Seq data for 9 CD14-positive, CD16-negative classical monocyte sample(s). 56 run(s), 53 experiment(s), 53 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	9
EGAD00001001553	Bisulfite-Seq data for 1 central memory CD8-positive, alpha-beta T cell sample(s). 13 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	1
EGAD00001001554	ChIP-Seq data for 1 adult endothelial progenitor cell sample(s). 8 run(s), 7 experiment(s), 7 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	1
EGAD00001001555	RNA-Seq data for 7 Acute promyelocytic leukemia sample(s). 7 run(s), 7 experiment(s), 7 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	7
EGAD00001001556	Bisulfite-Seq data for 1 naive B cell sample(s). 5 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	1
EGAD00001001557	ChIP-Seq data for 1 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 7 run(s), 6 experiment(s), 6 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	1
EGAD00001001558	RNA-Seq data for 5 common lymphoid progenitor sample(s). 20 run(s), 5 experiment(s), 5 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	5
EGAD00001001559	ChIP-Seq data for 2 endothelial cell of umbilical vein (resting) sample(s). 11 run(s), 11 experiment(s), 11 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	2
EGAD00001001560	DNase-Hypersensitivity data for 2 monocyte sample(s). 4 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_dnaseseq_analysis_20150820	Illumina HiSeq 2000	2
EGAD00001001561	RNA-Seq data for 3 hematopoietic multipotent progenitor cell sample(s). 9 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	3
EGAD00001001562	ChIP-Seq data for 5 Chronic lymphocytic leukemia sample(s). 24 run(s), 23 experiment(s), 23 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	5
EGAD00001001563	Bisulfite-Seq data for 1 central memory CD4-positive, alpha-beta T cell sample(s). 15 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	1
EGAD00001001564	Bisulfite-Seq data for 1 regulatory T cell sample(s). 15 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	1
EGAD00001001565	Bisulfite-Seq data for 1 monocytes - T=0days sample(s). 15 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	1
EGAD00001001566	RNA-Seq data for 2 neutrophilic metamyelocyte sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	2
EGAD00001001567	Bisulfite-Seq data for 1 effector memory CD4-positive, alpha-beta T cell sample(s). 15 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	1
EGAD00001001568	ChIP-Seq data for 1 CD8-positive, alpha-beta thymocyte sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	1
EGAD00001001569	ChIP-Seq data for 1 Acute lymphocytic leukemia - CTR sample(s). 7 run(s), 7 experiment(s), 7 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	1
EGAD00001001570	ChIP-Seq data for 1 CD3-negative, CD4-positive, CD8-positive, double positive thymocyte sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	1
EGAD00001001571	Bisulfite-Seq data for 4 CD8-positive, alpha-beta T cell sample(s). 56 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	4
EGAD00001001572	RNA-Seq data for 4 monocyte sample(s). 4 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	4
EGAD00001001573	DNase-Hypersensitivity data for 3 inflammatory macrophage sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_dnaseseq_analysis_20150820	Illumina HiSeq 2000	3
EGAD00001001574	ChIP-Seq data for 2 erythroblast sample(s). 12 run(s), 12 experiment(s), 12 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	2
EGAD00001001575	Bisulfite-Seq data for 8 macrophage sample(s). 117 run(s), 8 experiment(s), 8 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	8
EGAD00001001576	ChIP-Seq data for 12 macrophage sample(s). 49 run(s), 49 experiment(s), 49 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000, NextSeq 500	12
EGAD00001001577	ChIP-Seq data for 1 effector memory CD8-positive, alpha-beta T cell, terminally differentiated sample(s). 4 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	1
EGAD00001001578	ChIP-Seq data for 1 mesenchymal stem cell of the bone marrow sample(s). 9 run(s), 7 experiment(s), 7 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	1
EGAD00001001579	RNA-Seq data for 3 segmented neutrophil of bone marrow sample(s). 3 run(s), 3 experiment(s), 3 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	3
EGAD00001001580	ChIP-Seq data for 2 monocyte sample(s). 6 run(s), 6 experiment(s), 6 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000, NextSeq 500	2
EGAD00001001581	DNase-Hypersensitivity data for 16 macrophage sample(s). 20 run(s), 16 experiment(s), 16 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_dnaseseq_analysis_20150820	Illumina HiSeq 2000	16
EGAD00001001582	RNA-Seq data for 18 macrophage sample(s). 19 run(s), 18 experiment(s), 18 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	18
EGAD00001001583	Bisulfite-Seq data for 1 effector memory CD8-positive, alpha-beta T cell sample(s). 11 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	1
EGAD00001001584	ChIP-Seq data for 1 CD4-positive, alpha-beta thymocyte sample(s). 2 run(s), 2 experiment(s), 2 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	1
EGAD00001001585	Bisulfite-Seq data for 6 mature neutrophil sample(s). 79 run(s), 6 experiment(s), 6 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	6
EGAD00001001586	RNA-Seq data for 4 alternatively activated macrophage sample(s). 6 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	4
EGAD00001001587	Bisulfite-Seq data for 1 germinal center B cell sample(s). 6 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	1
EGAD00001001588	ChIP-Seq data for 4 segmented neutrophil of bone marrow sample(s). 20 run(s), 19 experiment(s), 19 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000 NextSeq 500	4
EGAD00001001589	ChIP-Seq data for 7 inflammatory macrophage sample(s). 36 run(s), 36 experiment(s), 36 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	7
EGAD00001001590	Bisulfite-Seq data for 4 CD38-negative naive B cell sample(s). 44 run(s), 4 experiment(s), 4 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	4
EGAD00001001591	Bisulfite-Seq data for 7 CD14-positive, CD16-negative classical monocyte sample(s). 101 run(s), 7 experiment(s), 7 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_bisulphite_analysis_CNAG_20150820	Illumina HiSeq 2000	7
EGAD00001001592	ChIP-Seq data for 2 Multiple myeloma sample(s). 16 run(s), 14 experiment(s), 14 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	2
EGAD00001001593	RNA-Seq data for 1 CD3-positive, CD4-positive, CD8-positive, double positive thymocyte sample(s). 1 run(s), 1 experiment(s), 1 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_rnaseq_analysis_crg_20150820	Illumina HiSeq 2000	1
EGAD00001001594	ChIP-Seq data for 6 CD38-negative naive B cell sample(s). 20 run(s), 20 experiment(s), 20 alignment(s) on human genome GRCh38. Part of BLUEPRINT release August 2015. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20150820/homo_sapiens/README_chipseq_analysis_ebi_20150820	Illumina HiSeq 2000	6
EGAD00001001595	ICGC PACA-CA Release 20	Illumina HiSeq 2000 Illumina HiSeq 2500	516
EGAD00001001596	Whole Exome Sequencing data from the germline of the patient as well as the tumors in bone marrow (T-ALL), Liver (Histiocytic Sarcoma) and ileum (non-Langerhans Cell Histiocytosis).	AB 5500xl Genetic Analyzer	4
EGAD00001001598	RNA-sequencing data from the hT-RPE-MycER cell line after MYC activation and after MINCR knock-down in conditions of MYC ON or OFF	Illumina HiSeq 2500	18
EGAD00001001600	PCR and MiSeq validation for early embryonic substitution candidates from 400 Breast cancer patients. This dataset contains all the data available for this study on 2015-09-03.	Illumina MiSeq	2
EGAD00001001601	The intersection of genome-wide association analyses with physiological and functional data indicates that variants regulating islet gene transcription influence type 2 diabetes (T2D) predisposition and glucose homeostasis. However, the specific genes through which these regulatory variants act remain poorly characterized. To identify such effector transcripts for T2D and glycemic traits, we generated expression quantitative trait locus (eQTL) data in 118 human islet samples using RNA-sequencing and high-density genotyping.	Illumina HiSeq 2000	118
EGAD00001001602		Illumina HiSeq 2000	1
EGAD00001001607	This dataset provides 16 trios (primary tumor, relapse and corresponding normals) for patients with neuroblastoma. For one patient, more than one relapse was available for the analyses.	Illumina HiSeq 2000	50
EGAD00001001608	Aligned BAM files of whole exome sequencing of 20 syCRCs and 10 normal counterparts. Each sample of 4 patients (S13, S3, S12 and S6) underwent two sequencing rounds.	Illumina HiSeq 2000 Illumina HiSeq 2500	42
EGAD00001001609	Maternal Plasma RNA Sequencing for Genomewide Transcriptomic Profiling and Identification of Pregnancy-Associated Transcripts		14
EGAD00001001612	After overexpression and knockdown of both described novel miRs nmiR-1 and nmiR-2 in BL cell lines (SU-DHL4 for nmiR-1 and Raji for nmiR-2), we performed regular RNA-Seq (including Mock controls for all cell lines) to identify their direct and indirect downstream mRNA targets.	Illumina HiSeq 2500	16
EGAD00001001613			10
EGAD00001001614			26
EGAD00001001615			10
EGAD00001001616			2
EGAD00001001618	Sequence data from two medullary thyroid carcinoma patients: WGS datasets generated from tumors and matched normal tissues and RNA-Seq from tumors are included.	Illumina HiSeq 2000 Illumina HiSeq 2500	6
EGAD00001001619	miRNA seq data of 43 cases out of dataset EGAD00001000650 (MMML)		43
EGAD00001001620	release_2: ICGC PedBrain: RNA sequencing	Illumina HiSeq 2000	45
EGAD00001001621	release_2: ICGC PedBrain: ChIP-Seq	Illumina HiSeq 2000	31
EGAD00001001622	BBMRI - BIOS project - Freeze 1 - Fastq files	Illumina HiSeq 2000	2199
EGAD00001001623	BBMRI - BIOS project - Freeze 1 - Bam files		2117
EGAD00001001624	release_2: ICGC PedBrain: whole exome sequencing and Target-Seq	Illumina HiSeq 2000	188
EGAD00001001625	release_2: ICGC PedBrain: whole genome sequencing	Illumina Genome Analyzer IIx Illumina HiSeq 2000	209
EGAD00001001626	RNA-Seq Illumina GAII dataset for the TraIT cell-line use case (added reverse and forward reads).	Illumina Genome Analyzer II	6
EGAD00001001627	This dataset contains RNA sequencing raw data from four parental tumors that were used for classification of gene expression subtypes (Verhaak, Cancer Cell 2010) using ssGSEA.	Illumina HiSeq 2000	4
EGAD00001001628		Illumina HiSeq 2500 Illumina MiSeq	299
EGAD00001001629	Whole-genome somatic rearrangement and point mutation analysis in cell lines with induced telomere fusions.	HiSeq X Ten	20
EGAD00001001630	release_2: ICGC PedBrain: whole genome bisulfite sequencing	Illumina HiSeq 2000	108
EGAD00001001631		Illumina MiSeq	334
EGAD00001001632	miRNA seq data of 13 cases (MMML)		13
EGAD00001001633	BAM files for two WES TRAIP patients	Illumina HiSeq 2000	2
EGAD00001001634	This dataset includes the whole genomes, sequenced to high depth (30x) of 25 individuals from Papua New Guinea. The individuals were chosen from several geographically distinct Papuan groups, focusing on the highland regions: Bundi, Kundiawa, Mendi, Marawaka and Tari.	HiSeq X Ten	25
EGAD00001001635	Whole genome sequencing detected structural rearrangements of TERT in 17/75 high stage neuroblastoma with 5 cases resulting from chromothripsis. Rearrangements were associated with increased TERT expression and targeted immediate up- and down-stream regions of TERT, placing in 7 cases a super-enhancer close to the breakpoints. TERT rearrangements (23%), ATRX deletions (11%) and MYCN amplifications (37%) identify three almost non-overlapping groups of high stage neuroblastoma, each associated with very poor prognosis. This submission contains only the newly sequenced samples. (study_refcenter: AMC)		42
EGAD00001001636	Whole-genome sequencing at 4x of 250 samples from the Greek isolate collection HELIC	Illumina HiSeq 2000	250
EGAD00001001637	Whole-genome sequencing at 1x of samples from the Cretan Greek isolate collection HELIC-MANOLIS. Genome-wide association studies of complex traits have been successful in identifying common variant associations, but a substantial heritability gap remains. The field of complex trait genetics is shifting towards the study of low frequency and rare variants, which are hypothesised to have larger effects. The study of these variants can be empowered by focusing on isolated populations, in which rare variants may have increased in frequency and linkage disequilibrium tends to be extended. This work focuses on an isolated population from Crete, Greece. Sequencing is very efficient in isolated populations, because variants found in a few samples will be shared by others in extended haplotype contexts, supporting accurate imputation.	Illumina HiSeq 2000	1003
EGAD00001001638	The HELIC study has been whole genome sequencing individuals from 2 Greek isolated populations at 1x depth. The genotype calling process crucially involves a VQSR step followed by imputation-based refinement. We have been investigating optimal ways to increase calling accuracy. To aid us in setting appropriate parameters for VQSR and other QC steps, we have carried out whole exome sequencing of a small number of HELIC samples.	Illumina HiSeq 2000	5
EGAD00001001639	Low depth (4x) Illumina HiSeq raw sequence data for 2000 Ugandans from various ethno-linguistic groups from rural South-West Uganda (related individuals included).	Illumina HiSeq 2000	2000
EGAD00001001642	RIKEN collection of WGS reads of 530 liver cancer and matched blood samples from 260 donors.	Illumina Genome Analyzer IIx Illumina HiSeq 2000	530
EGAD00001001643	RIKEN collection of WGS reads of 59 multi-centric liver cancers or intra-hepatic metastases and matched blood samples from 19 donors.	Illumina Genome Analyzer IIx Illumina HiSeq 2000	59
EGAD00001001644	MicroRNAs (miRs) have been recognized as promising biomarkers. It is unknown to what extent tumor-derived miRs are differentially expressed between primary colorectal cancers (pCRCs) and metastatic lesions, and to what extent the expression profiles of tumor tissue differ from the surrounding normal tissue. Next-generation sequencing (NGS) of 220 fresh-frozen samples, including paired primary and metastatic tumor tissue and non-tumorous tissue from 38 patients, revealed expression of 2245 known unique mature miRs and 515 novel candidate miRs. Unsupervised clustering of miR expression profiles of pCRC tissue with paired metastases did not separate the two entities, whereas unsupervised clustering of miR expression profiles of pCRC with normal colorectal mucosa demonstrated complete separation of the tumor samples from their paired normal mucosa. Two hundred and twenty-two miRs differentiated both pCRC and metastases from normal tissue samples (false discovery rate (FDR) <0.05). The highest expressed tumor-specific miRs were miR-21 and miR-92a, both previously described to be involved in CRC with potential as circulating biomarker for early detection. Only eight miRs, 0.5% of the analysed miR transcriptome, were differentially expressed between pCRC and the corresponding metastases (FDR <0.1), consisting of five known miRs (miR-320b, miR-320d, miR-3117, miR-1246 and miR-663b) and three novel candidate miRs (chr 1-2552-5p, chr 8-20656-5p and chr 10-25333-3p). These results indicate that previously unrecognized candidate miRs expressed in advanced CRC were identified using NGS. In addition, miR expression profiles of pCRC and metastatic lesions are highly comparable and may be of similar predictive value for prognosis or response to treatment in patients with advanced CRC.	Illumina HiSeq 2000	125
EGAD00001001645		Illumina Genome Analyzer II Illumina HiSeq 2000	28
EGAD00001001646	Fastq files corresponding to RNA-Seq dataset for PTPN1 project (EGAS00001000554)	Illumina Genome Analyzer Illumina Genome Analyzer II Illumina HiSeq 2000	10
EGAD00001001655	Genome and transcriptome sequence data from an atypical teratoid rhabdoid tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001001656	Genome and transcriptome sequence data from an atypical chronic lymphocytic leukemia patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001001657	Genome and transcriptome sequence data from a parotid gland cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001001658	Genome and transcriptome sequence data from an odontogenic ghost cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001001660	Whole exome sequencing was performed to explore the mutational landscape and potential molecular signature of HPV-positive versus HPV-negative OAC. Four hr-HPV-positive and 8 HPV-negative treatment-naive fresh-frozen OAC tissue specimens and matched normal tissue were analysed to identify somatic genomic mutations		24
EGAD00001001661	Genotype and exome data for an Australian Aboriginal population: a reference panel for health-based research.		72
EGAD00001001662	Whole genome sequences of ACC primagrafts, Histone modification maps and transcription factor binding maps for ACC primagrafts and primary tumors. Processed ChIP-seq data is available on GEO under accession number GSE76465.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina MiSeq NextSeq 500	58
EGAD00001001663	Low coverage (4x-8x) Illumina HiSeq curated sequence data from 3 African populations from the AGV project; 100 Baganda from Uganda (4x), 100 Zulu from South Africa (4x), and 120 Gumuz, Wolayta, Oromo, Somali and Amhara from Ethiopia (8x). Pre-processed, jointly called and filtered with GATK, refined with Beagle3, phased with SHAPEIT2.		1
EGAD00001001664	LGG Epilepsy Cohort WGS	Illumina HiSeq 2000	18
EGAD00001001665	LGG Epilepsy Cohort WXS	Illumina HiSeq 2000	61
EGAD00001001666	LGG Epilepsy Cohort RNA-Seq	Illumina HiSeq 2000	34
EGAD00001001667	Data from the paper Context-specific Effects of TGFβ/SMAD3 in Cancer Are Modulated by the Epigenome. Tufegdzic et al, Cell Reports 2015	Illumina MiSeq	12
EGAD00001001668	Data from the paper Context-specific Effects of TGFβ/SMAD3 in Cancer Are Modulated by the Epigenome. Tufegdzic et al, Cell Reports 2015	Illumina HiSeq 2500	12
EGAD00001001669	Data from the paper Context-specific Effects of TGFβ/SMAD3 in Cancer Are Modulated by the Epigenome. Tufegdzic et al, Cell Reports 2015	Illumina HiSeq 2500	42
EGAD00001001672	Part of RNA sequencing data of Malignant Lymphoma Study (ICGC)	Illumina HiSeq 2000	56
EGAD00001001673	Part of WGS seq data of Malignant Lymphoma study (ICGC)	Illumina HiSeq 2000 Illumina HiSeq 2500	112
EGAD00001001674		Illumina HiSeq 2500 Illumina MiSeq	299
EGAD00001001675	RNA-seq of peripheral blood samples from CLL patients.	Illumina HiSeq 2000	42
EGAD00001001676	Tagmentation-based whole-genome bisulfite sequencing of isolated cell types from healthy controls.	Illumina HiSeq 2000	12
EGAD00001001686	In the autozygosity exome sequencing of Born-in-Bradford samples of Pakistani origin there is a mother who is homozygous for an apparent truncating stop codon in PRDM9, the gene responsible for localising recombination during meiosis. We plan to deep sequence mother and child with X10, and physically phase the mother with PacBio sequencing. We will use this data to identify recombination locations, and test whether these are consistent with the known fine scale recombination map. Data Access is controlled by the Wellcome Trust Sanger Institute DAC and the Born In Bradford Executive Group.	HiSeq X Ten Illumina HiSeq 2500	2
EGAD00001001687		Illumina HiSeq 2000	56
EGAD00001001688		Illumina HiSeq 2500	34
EGAD00001001689		Illumina HiSeq 2500	27
EGAD00001001690	Tumor-Normal paired samples of PTC	Illumina HiSeq 2000	182
EGAD00001001691	Esophageal cancer is one of the most aggressive cancers and the sixth leading cause of cancer death worldwide. Approximately 70% of the global esophageal cancers occur in China and over 90% of the histopathological forms of this disease are esophageal squamous cell carcinoma (ESCC). Currently, there are limited clinical approaches for early diagnosis and treatment for ESCC, resulting in a 10% 5-year survival rate for the patients. Meanwhile, the full repertoire of genomic events leading to the pathogenesis of ESCC remains unclear. Here we show a comprehensive genomic analysis in 158 ESCC cases, as part of the International Cancer Genome Consortium (ICGC) Research Projects (http://icgc.org/icgc/cgp/72/371/1001734). We conducted whole-genome sequencing in 14 ESCC cases and whole-exome sequencing in 90 cases.	Illumina HiSeq 2000	208
EGAD00001001692	Whole exome sequencing of germline DNA was performed and subsequent polymorphisms in genes known and putatively involved in the innate immune response to fungi were identified	Illumina HiSeq 2500	1
EGAD00001001693	Fastq files of RNAseq of 182 samples of biliary tract cancer	Illumina HiSeq 2000	182
EGAD00001001694	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB10_C		1
EGAD00001001695	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB10_F		1
EGAD00001001696	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB10_M		1
EGAD00001001697	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB15_C		1
EGAD00001001698	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB15_F		1
EGAD00001001699	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB15_M		1
EGAD00001001700	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB1_C		1
EGAD00001001701	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB1_F		1
EGAD00001001702	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB1_M		1
EGAD00001001703	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB21_C		1
EGAD00001001704	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB21_F		1
EGAD00001001705	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB21_M		1
EGAD00001001706	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB22_C		1
EGAD00001001707	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB22_F		1
EGAD00001001708	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB22_M		1
EGAD00001001709	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB23_C		1
EGAD00001001710	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB23_F		1
EGAD00001001711	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB23_M		1
EGAD00001001712	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB24_C		1
EGAD00001001713	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB24_F		1
EGAD00001001714	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB24_M		1
EGAD00001001715	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB25_C		1
EGAD00001001716	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB25_F		1
EGAD00001001717	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB25_M		1
EGAD00001001718	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB27_C		1
EGAD00001001719	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB27_F		1
EGAD00001001720	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB27_M		1
EGAD00001001721	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB28_C		1
EGAD00001001722	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB28_F		1
EGAD00001001723	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB28_M		1
EGAD00001001724	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB30_C		1
EGAD00001001725	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB30_F		1
EGAD00001001726	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB30_M		1
EGAD00001001727	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB31_C		1
EGAD00001001728	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB31_F		1
EGAD00001001729	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB31_M		1
EGAD00001001730	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB33_C		1
EGAD00001001731	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB33_F		1
EGAD00001001732	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB33_M		1
EGAD00001001733	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB35_C		1
EGAD00001001734	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB35_F		1
EGAD00001001735	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB35_M		1
EGAD00001001736	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB38_C		1
EGAD00001001737	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB38_F		1
EGAD00001001739	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB40_C		1
EGAD00001001740	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB40_F		1
EGAD00001001741	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB40_M		1
EGAD00001001742	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB41_C		1
EGAD00001001743	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB41_F		1
EGAD00001001744	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB41_M		1
EGAD00001001745	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB42_C		1
EGAD00001001746	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB42_F		1
EGAD00001001747	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB42_M		1
EGAD00001001748	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB43_C		1
EGAD00001001749	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB43_F		1
EGAD00001001750	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB43_M		1
EGAD00001001751	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB44_C		1
EGAD00001001752	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB44_F		1
EGAD00001001753	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB44_M		1
EGAD00001001754	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB4_C		1
EGAD00001001755	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB4_F		1
EGAD00001001756	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB4_M		1
EGAD00001001757	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB50_C		1
EGAD00001001758	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB50_F		1
EGAD00001001759	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB50_M		1
EGAD00001001760	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB51_C		1
EGAD00001001761	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB51_F		1
EGAD00001001762	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB51_M		1
EGAD00001001763	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB52_C		1
EGAD00001001764	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB52_F		1
EGAD00001001765	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB52_M		1
EGAD00001001766	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB55_C		1
EGAD00001001767	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB55_F		1
EGAD00001001768	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB55_M		1
EGAD00001001769	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB57_C		1
EGAD00001001770	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB57_F		1
EGAD00001001771	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB57_M		1
EGAD00001001772	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB58_C		1
EGAD00001001773	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB58_F		1
EGAD00001001774	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB58_M		1
EGAD00001001775	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB60_C		1
EGAD00001001776	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB60_F		1
EGAD00001001777	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB60_M		1
EGAD00001001778	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB62_C		1
EGAD00001001779	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB62_F		1
EGAD00001001780	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB62_M		1
EGAD00001001781	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB8_C		1
EGAD00001001783	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: BvB8_M		1
EGAD00001001784	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW12_C		1
EGAD00001001786	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW12_M		1
EGAD00001001787	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW14_C		1
EGAD00001001788	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW14_F		1
EGAD00001001789	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW14_M		1
EGAD00001001790	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW15_C		1
EGAD00001001792	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW15_M		1
EGAD00001001793	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW18_C		1
EGAD00001001794	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW18_F		1
EGAD00001001795	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW18_M		1
EGAD00001001796	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW20_C		1
EGAD00001001797	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW20_F		1
EGAD00001001798	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW20_M		1
EGAD00001001799	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW22_C		1
EGAD00001001800	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW22_F		1
EGAD00001001802	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW24_C		1
EGAD00001001803	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW24_F		1
EGAD00001001804	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW24_M		1
EGAD00001001805	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW27_C		1
EGAD00001001806	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW27_F		1
EGAD00001001807	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW27_M		1
EGAD00001001808	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW29_C		1
EGAD00001001809	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW29_F		1
EGAD00001001810	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW29_M		1
EGAD00001001811	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW2_C		1
EGAD00001001812	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW2_F		1
EGAD00001001813	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW2_M		1
EGAD00001001814	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW32_C		1
EGAD00001001815	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW32_F		1
EGAD00001001816	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW32_M		1
EGAD00001001817	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW38_C		1
EGAD00001001818	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW38_F		1
EGAD00001001819	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW38_M		1
EGAD00001001820	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW3_C		1
EGAD00001001821	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW3_F		1
EGAD00001001822	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW3_M		1
EGAD00001001823	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW46_C		1
EGAD00001001824	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW46_F		1
EGAD00001001825	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW46_M		1
EGAD00001001826	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW47_C		1
EGAD00001001827	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW47_F		1
EGAD00001001828	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW47_M		1
EGAD00001001829	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW49_C		1
EGAD00001001830	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW49_F		1
EGAD00001001831	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW49_M		1
EGAD00001001833	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW4_F		1
EGAD00001001834	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW4_M		1
EGAD00001001835	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW50_C		1
EGAD00001001836	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW50_F		1
EGAD00001001837	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW50_M		1
EGAD00001001838	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW51_C		1
EGAD00001001839	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW51_F		1
EGAD00001001840	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW51_M		1
EGAD00001001841	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW52_C		1
EGAD00001001842	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW52_F		1
EGAD00001001843	50 trios were whole genome sequenced with Complete Genomics to a depth of 80x. For each trio the child was affected with severe ID, and the parents were unaffected. All trios were negative for array, targeted gene and whole exome screening. Dataset consists of sample: MW52_M		1
EGAD00001001844	Whole genome sequencing of 64 HER2-Positive Breast Cancer	Illumina HiSeq 2000	128
EGAD00001001845	Leeds Melanoma Cohort	Illumina HiSeq 2000	16
EGAD00001001846	2 BRAFV600E cell lines that have been made resistant to 1. the BRAF inhibitor PLX4720 and 2. the combination therapy of dabrafenib and trametinib seem to have an internal duplication in the kinase domain. We would like to know if this is caused by a translocation.	HiSeq X Ten	4
EGAD00001001847	4C-seq data was generated for regions of interest to confirm enhancer-gene promoter interactions	Illumina HiSeq 2000	1
EGAD00001001848	DDD DATAFREEZE 2014-11-04: 4293 trios - VCF files		1
EGAD00001001849	The genomic sequence of brain expressed miRNA genes was sequenced in Swedish schizophrenia patients	Illumina MiSeq	186
EGAD00001001850	Genomic DNA from Swedish control individuals was pooled. Then the genomic sequence of brain expressed miRNA genes was determined in the pools.	Illumina MiSeq	149
EGAD00001001851	The genomic sequence of brain expressed miRNA genes was sequenced in Belgian epilepsy patients.	Illumina MiSeq	163
EGAD00001001852	Genomic DNA from Belgian control individuals was pooled. Then the genomic sequence of brain expressed miRNA genes was determined in the pools.	Illumina MiSeq	39
EGAD00001001853	This dataset contains data from: 17 patients studied by WGS; 49 patients studied by WES; 9 (of the 49) patients studied by RNASeq at 2 time points; and the same 9 patients studied by ERRBS at 2 time points.	Illumina HiSeq 2000	199
EGAD00001001854	Exome sequencing of nine PCC/PGL tumors, SF and FFPE samples		18
EGAD00001001856			100
EGAD00001001857		Illumina HiSeq 2000	381
EGAD00001001858	Raw fastq files from WGS sequencing of CLL and matching blood normal for the ICGC Techval Benchmark1 study. Sequence data was provided to multiple centers for independent analysis and comparison.	Illumina HiSeq 2500	2
EGAD00001001859	Raw fastq files for sequence data generated at 5 sequencing centers from a Medulloblastoma sample and matching blood normal control.	Illumina HiSeq 2500	2
EGAD00001001860			19
EGAD00001001861	Exome Sequencing to Define the Landscape of Plasma Cells in Systemic Light chain Amyloidosis	Illumina HiSeq 2000	48
EGAD00001001862	RNA-seq of PDXs	Illumina HiSeq 2000	12
EGAD00001001863	Exome data of PDX models.	Illumina HiSeq 2500	4
EGAD00001001864	DATA FILES FOR PCGP MB WGS - Supersedes (EGAD00001000269)	Illumina HiSeq 2000	76
EGAD00001001865	Sequence Data of total RNA, miRNA, WGB, mRNA, NOMe, Chip (H3K27ac,H3K27me, H3K36me3, H3K4me1, H3K4me3, H3K9me3, Input) Short Description: Epigenetic profiling of human CD4+ memory T cells reveals their proliferative history and argues in favor of a progressive differentiation model driven by epigenetically controlled master regulators.	Illumina HiSeq 2000 Illumina HiSeq 2500 NextSeq 500	75
EGAD00001001869	We report the first combined analysis of whole genome sequence, detailed clinical history, and transcriptome sequence of multiple prostate cancer metastases in a single patient (A21). Whole genome and transcriptome sequence was obtained from 9 anatomically separate metastases, and targeted DNA sequencing was performed in cancerous and noncancerous foci within the primary tumor specimen removed 5 years prior to death. Transcriptome analysis revealed increased expression of AR-regulated genes in liver metastases that harbored an AR p.L702H mutation, suggesting a dominant effect by the mutation despite being present in only 1 of an estimated 16 copies per cell. The metastases harbored several alterations to the PI3K/AKT pathway, including a clonal truncal mutation in PIK3CG present in all metastatic sites studied. The list of truncal genomic alterations shared by all metastases included homozygous deletion of TP53, hemizygous deletion of RB1 and CHD1, and amplification of FGFR1. If the patient were treated today given this knowledge, use of second-generation androgen-directed therapies, cessation of glucocorticoid administration, and therapeutic inhibition of the PI3K/AKT pathway or FGFR1 receptor could provide personalized benefit. Three previously unreported truncal clonal missense mutations (ABCC4 p.R891L, ALDH9A1 p.W89R, and ASNA1 p.P75R) were expressed at the RNA level and assessed as druggable. The truncal status of mutations is critical for actionability, and can only be determined through analysis of multiple sites of metastasis. Our findings suggest that a large set of deeply analyzed cases could serve as a powerful guide to more effective prostate cancer basic science and personalized cancer medicine clinical trials.	Illumina HiSeq 2000	7
EGAD00001001870	Deep sequencing of 151 cancer genes in 6 synchronous CRC of 3 patients	Illumina MiSeq	6
EGAD00001001871	Megakaryocytes and erythroblasts derive from the same progenitor cell type but carry out very different functions. In order to understand how the different functional phenotypes arise we have characterised the epigenetic landscape of these cells.	Illumina HiSeq 2500	20
EGAD00001001872	Targeted exome sequencing of patient derived xenografts from primary colorectal tumours and liver metastases. This dataset contains all the data available for this study on 2016-01-06.	Illumina HiSeq 2000	333
EGAD00001001873	AML emerges as a consequence of accumulating independent genetic aberrations that direct regulation and/or dysfunction of genes resulting in aberrant activation of signalling pathways, resistance to apoptosis and uncontrolled proliferation. Given the significant heterogeneity of AML genomes, AML patients demonstrate a highly variable response rate and poor median survival in response to current chemotherapy regimens. For the past 4 years we have conducted gene expression profiling on purified bone marrow populations equating to normal haematopoietic stem and progenitor cells from healthy subjects and patients with de novo AML in order to identify AML signatures of aberrantly expressed genes in cancer versus normal. We are now applying a series of bioinformatic methodologies combined with clinical and conventional diagnostic data to establish novel genomics strategies for improved prognostication of AML. Additionally, we use our AML signatures to unravel oncogenic signalling pathway activities in AML patients and test inhibitory drugs for these pathways in preclinical therapeutic programmes. We consider that superimposing GEP and clinical data for our AML patient cohort with additional data on their mutational status will significantly improve the prognostic power of the study as well as unravel yet unknown mutations associated with aberrant signalling activities of oncogenic pathways.	Illumina HiSeq 2000	215
EGAD00001001874		Illumina HiSeq 2000	16
EGAD00001001876	Genome and transcriptome sequence data from a colorectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study. These data are included in the manuscript entitled, "Response to Angiotensin Blockade with Irbesartan in a Patient with Metastatic Colorectal Cancer".		1
EGAD00001001879	A pilot to establish the feasibility of using a custom Agilent targeted pulldown of 110 genes implicated in colorectal tumourigenesis to sequence for driver mutations in a set of 30 FFPE colorectal adenomas. If successful, we propose to sequence an additional 350 adenomas as part of an MRC research study in order to define the pattern of driver mutations across the spectrum of pathological subtypes including conventional adenomas, serrated adenomas and hyperplastic polyps	Illumina HiSeq 2000	30
EGAD00001001880	RIKEN collection of RNA-seq reads for 458 liver cancer samples and matched normal liver from 247 donors.	Illumina Genome Analyzer IIx Illumina HiSeq 2000	458
EGAD00001001881	RIKEN collection of WGS reads for 269 liver cancer tumors and matched normal blood or liver tissue from 258 donors. In total there are 1864 paired fastq sets sequenced on Illumina HiSeq 2000 or Genome Analyzer II instruments with paired reads of 75–101 bp. Quality control and duplication removal has not been performed.	Illumina Genome Analyzer IIx Illumina HiSeq 2000	528
EGAD00001001885	January 2016 update of RNA-Seq data (bams, fastqs) for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2500	17
EGAD00001001887	Exome sequencing VCF files describing mutations during glioma progression.		82
EGAD00001001889	*THIS DATA CAN ONLY BE USED FOR NON-COMMERCIAL CANCER RESEARCH* Sequencing of organoid cell lines derived from oesophageal tumour sections taken from patients diagnosed with primary oesophageal cancer who underwent tumour resection surgery.	HiSeq X Ten	9
EGAD00001001891	Whole genome bisulfite sequencing of pedbrain - medulloblastoma	Illumina HiSeq 2000	10
EGAD00001001892	BLUEPRINT Bisulfite-seq and Whole Genome Sequencing of mantle cell lymphoma	Illumina HiSeq 2000	4
EGAD00001001897	15x whole genome sequencing in samples from the Cretan Greek isolate collection HELIC MANOLIS	HiSeq X Ten	1482
EGAD00001001898	The study will investigate serial samples from the same patient taken at the time of MGUS or SMM diagnosis, and later at the time of evolution towards MM. Samples will be sequenced by whole genome along with a matched normal to obtain the highest possible amount of information to investigate genomic changes at disease evolution. This dataset contains all the data available for this study on 2016-01-27.	HiSeq X Ten	131
EGAD00001001899	HDAC and PI3K Antagonists Cooperate to Inhibit Growth of MYC-driven Medulloblastoma		102
EGAD00001001900	DNA sequencing reads of human adult stem cell cultures from liver, colon and small intestine. Including biopsy or blood samples of the donors.	HiSeq X Ten Illumina HiSeq 2500 NextSeq 500	61
EGAD00001001901	Monoclonal gammopathy of undetermined significance (MGUS) is a premalignant precursor of multiple myeloma (MM) with a 1% risk of progression per year. Although targeted analyses have shown the presence of specific genetic abnormalities such as IGH translocations, RB1 deletion, 1q gain, hyperdiploidy or RAS gene mutations, little is known about the molecular mechanism of malignant transformation. We have performed whole exome sequencing together with SNP array analysis in 33 flow-cytometry separated abnormal PC samples of MGUS patients to describe somatic gene mutations and chromosome changes at the genome-wide level. Non-synonymous mutations (NS-SNVs) and copy number alterations (CNAs) were present in 97.0% and in 63.6% of cases, respectively. Importantly, the number of somatic mutations was significantly lower in MGUS compared to MM (p<10^-4) and we have identified 6 myeloma significantly mutated genes which are KRAS, NRAS, DIS3, HIST1H1E, EGR1 and LTB in the MGUS dataset. We also found a positive correlation with increasing chromosome changes and somatic mutations. IGH translocations were present in 27.3% of cases comprising t(4;14), t(11;14), t(14;16) or t(14;20) and were in a similar frequency to MM, which corresponded with primary lesion hypothesis. Data from this study showed MGUS is a genetically comprehensive disease, however overall genetic instability is significantly lower compared to MM.	Illumina HiSeq 2000	66
EGAD00001001909	Paired-end whole exome sequencing (Illumina) of primary enucleated retinoblastoma and matching lymphocyte DNA was performed to find somatic alterations that are related to oncogenesis.	Illumina HiSeq 2500	143
EGAD00001001913	Exome sequencing data for Mesothelioma	Illumina HiSeq 2500	198
EGAD00001001914	RNA-seq data for mesothelioma cell lines after spliceostatin (SSA) or control (DMSO) treatment.	Illumina HiSeq 2000	12
EGAD00001001915	RNA-Seq data for Mesothelioma.	Illumina HiSeq 2000	211
EGAD00001001916	Targeted sequencing using SPET for Mesothelioma.	Illumina HiSeq 2000	207
EGAD00001001917	PacBio data for mesothelioma cell line NCI-H2595.	PacBio RS II	1
EGAD00001001918	Multi-region Illumina whole-exome and/or whole-genome sequencing on tumor regions collected from early-stage NSCLC patients who underwent definitive surgical resection prior to receiving adjuvant therapy. Patients covered by this dataset: L012, L013, L015, L017	Illumina HiSeq 1000	15
EGAD00001001920	TEST3 dataset containing 1 FASTQ file with mRNA reads.	Illumina HiSeq 2500	1
EGAD00001001921	All pituitary samples	Illumina HiSeq 2500	84
EGAD00001001922	RNA-seq from normal human tissues (2 x 250 bp)	Illumina HiSeq 2000	14
EGAD00001001923	RNA sequence data for conditionally reprogrammed cells from patient HUB_5	Illumina HiSeq 2500	1
EGAD00001001925	1461 Neuropathological and clinically characterised cases from the MRC Brain Bank		1461
EGAD00001001926	Esophageal Squamous Cell Carcinoma (ESCC) is one of the deadliest cancers worldwide. We performed 71 Whole-exome sequencing of Esophageal Squamous Cell Carcinoma on Chinese Patients.	Illumina HiSeq 2000	141
EGAD00001001927		Illumina HiSeq 2000	27
EGAD00001001928	This study will analyse the guide sequence which were used for making mutations in the Cas9-expressing cells. We used GeCKO v2 library which were released by Feng Zhang, 2014. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2500 Illumina MiSeq	61
EGAD00001001930	Cancer genes can affect ribosomal RNA processing and this can underlie their essentiality to cells, making them cell-essential in the same way as ribosomal genes themselves. We want to confirm this, in order to understand the results of our CRISPR drop-out screens. NOTE FROM BESPOKE TEAM: Run a single read 1 (forward read) of 30 bases, then an index 1 read as normal. This would fit a 50-cycle kit	Illumina MiSeq	6
EGAD00001001932	HipSci - Healthy Normals - Exome Sequencing - January 2016	Illumina HiSeq 2000	123
EGAD00001001933	HipSci - Healthy Normals - RNA Sequencing - January 2016	Illumina HiSeq 2000	118
EGAD00001001935	Cancer amplicon reads consisting of BAM paired end reads from primary multiple myeloma samples.	Illumina MiSeq	88
EGAD00001001936	First 1106 16S rDNA data for the Flemish Gut Flora Project	Illumina MiSeq	1061
EGAD00001001937	Targeted amplicon sequencing of samples as part of the study "Methanol-based fixation is superior to buffered formalin for next-generation sequencing of DNA from clinical cancer samples". The amplicon panel consists of 48 amplicons in TP53, PTEN, EGFR, PIK3CA, KRAS and BRAF genes as described previously [Forshew, STM 2012]. All libraries were pooled and quantified using DNA 1000 kit on Agilent 2100 Bioanalyzer and KAPA SYBR FAST ABI Prism qPCR Kit (KAPA Biosystems) on 7900HT Fast Real-Time PCR System (Applied Biosystems) according to the supplier's recommendations. Reads were aligned using bwa-mem v0.7.12-r1039 to the 1000 genomes version of human genome build GRCh37, retaining duplicate reads.	Illumina MiSeq	66
EGAD00001001938	Shallow whole-genome sequencing of samples from the study "Methanol-based fixation is superior to buffered formalin for next-generation sequencing of DNA from clinical cancer samples". DNA from each sample (100ng) was sheared on Covaris S220 (Covaris): duty cycle - 10%, intensity -5.0, bursts per sec - 200, duration - 300 sec, mode - frequency sweeping, power - 23V, temperature -5:5 C to 6 C, water level - 13. Libraries were prepared with the TruSeq Nano DNA LT Sample Prep Kit (Illumina) using a modified protocol - Sample Purification Beads were replaced by Agencourt AMPure XP beads (Beckman Coulter) and size selection after the End Repair was done to remove only the short fragments. Quality and quantity for constructed libraries were assessed with DNA 7500 kit on Agilent 2100 Bioanalyzer and with Kapa Quantification kit (KAPA Biosystems) on 7900HT Fast Real-Time PCR System (Applied Biosystems) according to the supplier's recommendations, respectively. Libraries from 18 barcoded samples were pooled together in equimolar amounts and each pool was loaded on a single lane of a HiSeq Single End Flowcell (Illumina), followed by cluster generation on a cBot (Illumina) and sequencing on a HiSeq 2500 (Illumina) in a single-read 50bp mode. Reads were aligned using bwa-mem v0.7.12-r1039 to the 1000 genomes version of human genome build GRCh37. Picard (http://picard.sourceforge.net) was used to remove duplicate reads.	Illumina HiSeq 2500	60
EGAD00001001939	Mapped whole transcriptome RNA-Seq data from 476 human samples of early stage urothelial carcinoma.	Illumina HiSeq 2000	476
EGAD00001001940	Un-mapped whole transcriptome RNA-Seq data from 476 human samples of early stage urothelial carcinoma.	Illumina HiSeq 2000	476
EGAD00001001941	Variants derived from mapped whole transcriptome RNA-Seq data from 476 human samples of early stage urothelial carcinoma.		476
EGAD00001001942	We performed target re-sequencing for 1.29 Mb interval of chromosome 9 (chr9:21299764–22590271, hg19). NimbleGen SeqCap EZ choice system was used as a target enrichment method (Roche Diagnostics). A DNA probe set complementary to the target region was designed by NimbleDesign. The libraries were sequenced on the Illumina MiSeq platform with 2×150-bp paired-end module (Illumina). Fastq files for 48 Japanese patients with endometriosis are deposited.	Illumina MiSeq	48
EGAD00001001943	Here, we studied well-phenotyped individuals from the Flemish Gut Flora Project (FGFP, N=1,106, Belgium) and the effect of environments on microbiome. The 69 major significant phenotypes found in this study are provided.		1068
EGAD00001001944	RNA sequencing of paediatric glioblastoma in the ICGC PedBrain project	Illumina HiSeq 2500	42
EGAD00001001947	Cetuximab is a targeted monoclonal antibody against the epidermal growth factor receptor (EGFR) which is used therapeutically for the treatment of KRAS wild-type colorectal cancer (CRC). The Cetuximab sensitive KRAS wild-type CRC cell line NCI-H508 has been treated with a fixed concentration of ENU for 24 hours and then selected with Cetuximab until drug resistant clones were ready to be picked and grown up as sub-clones of the parental cell line. These will have genes causally implicated in cancer sequenced to identify common point mutations in multiple independently derived drug resistant clones as a forward genetic screen for mechanisms of resistance to Cetuximab in CRC.	Illumina HiSeq 2000	16
EGAD00001001948	Cetuximab is a targeted monoclonal antibody against the epidermal growth factor receptor (EGFR) which is used therapeutically for the treatment of KRAS wild-type colorectal cancer (CRC). The Cetuximab sensitive KRAS wild-type CRC cell line NCI-H508 has been treated with a fixed concentration of ENU for 24 hours and then selected with Cetuximab until drug resistant clones were ready to be picked and grown up as sub-clones of the parental cell line. These will have genes causally implicated in cancer sequenced to identify common point mutations in multiple independently derived drug resistant clones as a forward genetic screen for mechanisms of resistance to Cetuximab in CRC	Illumina HiSeq 2000	16
EGAD00001001949	HipSci - Monogenic Diabetes - Exome Sequencing - April 2015	Illumina HiSeq 2000	1
EGAD00001001950	HipSci - Bardet-Biedl Syndrome - Exome Sequencing - January 2016	Illumina HiSeq 2000	3
EGAD00001001951	HipSci - Monogenic Diabetes - Exome Sequencing - January 2016	Illumina HiSeq 2000	1
EGAD00001001952	HipSci - Bardet-Biedl Syndrome - RNA Sequencing - April 2015	Illumina HiSeq 2000	2
EGAD00001001953	HipSci - Monogenic Diabetes - RNA Sequencing - April 2015	Illumina HiSeq 2000	1
EGAD00001001954	HipSci - Bardet-Biedl Syndrome - RNA Sequencing - January 2016	Illumina HiSeq 2000	3
EGAD00001001955	HipSci - Monogenic Diabetes - RNA Sequencing - January 2016	Illumina HiSeq 2000	1
EGAD00001001956	ICGC Release 21 for PACA-CA from OICR	Illumina HiSeq 2000 Illumina HiSeq 2500	516
EGAD00001001957	March 2016 update of Whole genome bisulfite sequencing assay data (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2500	18
EGAD00001001958	March 2016 update of whole genome shotgun sequencing data (bam/fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2500	17
EGAD00001001959	March 2016 update of smRNA-Seq assays data (bam/fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2500	20
EGAD00001001960	upcoming publication	Illumina HiSeq 2000	1
EGAD00001001961	Genome and transcriptome sequence data from a lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001001962	Genome and transcriptome sequence data from a lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001001963	Genome and transcriptome sequence data from a non small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001001964	Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001001965	Genome and transcriptome sequence data from a lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001001966	Genome and transcriptome sequence data from a non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study	PromethION	1
EGAD00001001967	Genome and transcriptome sequence data from an adenocarcinoma of right lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001001968	Genome and transcriptome sequence data from a non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study	PromethION	1
EGAD00001001969	Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001001973	Exome sequencing of 184 samples from consanguineous families with different congenital heart defects collected at KAIMRC, Riyadh, Saudi Arabia.	Illumina HiSeq 2000 Illumina HiSeq 2500	179
EGAD00001001977	DDD DATAFREEZE 2014-11-04: 4293 trios - phenotypic and family descriptions		1
EGAD00001001978	This dataset contains FASTQ files for multi-region exome-sequencing of EGFR-mutant lung adenocarcinomas from Asian patient. There are 16 patients and 95 samples in total, including 16 controls and 79 tumors. Multiple runs for each sample, and 368 fastq in total. Please refer to the sample-ID from filename for merging.	Illumina HiSeq 2000	95
EGAD00001001979	This dataset contains BAM file for multi-region exome-sequencing of EGFR-mutant lung adenocarcinomas from Asian patient. There are 16 patients and 95 samples in total, including 16 controls and 79 tumors.	Illumina HiSeq 2000	95
EGAD00001001980	This dataset contains BAM files of targeted Amplicon deep-sequencing data, for validation of the mutations found in WES. There are 16 patients and 95 samples in total, including 16 controls and 79 tumors.	Illumina HiSeq 2500	95
EGAD00001001981	This dataset contains FASTQ files of targeted Amplicon deep-sequencing data, for validation of the mutations found in WES. There are 16 patients and 95 samples in total, including 16 controls and 79 tumors. 140 fastq in total, multiple runs for some of the samples. Please refer to the sample-ID from filename for merging.	Illumina HiSeq 2500	95
EGAD00001001983	Immunoglobulin heavy chain gene high throughput sequencing of paediatric acute lymphoblastic leukaemia samples, for the purpose of MRD on the Illumina MiSeq platform. This dataset contains summary fastq files and raw bcl files from the MiSeq for this study. In the study we identify errors associated with multiplexing that could potentially impact on the accuracy of MRD analysis. We optimise a strategy combining high purity, sequence-optimised oligonucleotides, dual-indexing and an error-aware demultiplexing approach to minimise errors and maximise sensitivity.	Illumina MiSeq	491
EGAD00001001984	To identify recurrent somatic alterations in this unique subset of gastric cancers, whole exome and SNP6 analyses were performed using frozen cancer tissue. The somatic mutation analyses were also performed using blood of the same patients.	Illumina HiSeq 2500	160
EGAD00001001986	This study is meant to gain further knowledge in haematological cancers. Patient samples (mainly DNAs or PCR products) from haematological cancer patients will be sequenced, and the outputs will be correlated to their diagnosis and/or prognosis; the findings may also add more insight into the understanding of biology in this type of tumour. We will be sequencing Primary Testicular Lymphomas (PTL) to identify genetic drivers of this rare cancer	Illumina HiSeq 2500	7
EGAD00001001987	March 2016 update of Whole genome bisulfite sequencing assay data (bams) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.		18
EGAD00001001988	Cholangiocarcinoma whole genome sequencing data	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500	118
EGAD00001001991	Meta-genomic sequencing of 1,200 LifeLines-DEEP participants	Illumina HiSeq 2000	1135
EGAD00001001994	CCA targeted sequencing	Illumina HiSeq 2500	376
EGAD00001001995	Whole genome sequencing (30X) using Hiseq X TEN on 4 HCC cell lines, primary HCCs and early-passage PDCs	HiSeq X Ten	12
EGAD00001001996	RIKEN collection of WGS reads for 13 multicentric liver cancers or intrahepatic metastasis and matched blood samples for 12 donors.	Illumina HiSeq 2000	13
EGAD00001001998	This dataset consists of sequencing data on 15 patients with Sezary syndrome. On 12 of these patients, we have exome sequencing data while on 10 patients, we have RNA sequencing data. In total for seven patients, we have both exome as well as RNA sequencing data. We looked for gene mutations and fusion events in these patients to identify genes that could be involved in the pathogenesis of the disease.	Illumina HiSeq 2000 Illumina HiSeq 2500	30
EGAD00001001999	HipSci - Embryonic Stem Cells - Exome Sequencing - April 2016	Illumina HiSeq 2000	2
EGAD00001002000	HipSci - Embryonic Stem Cells - RNA Sequencing - April 2016	Illumina HiSeq 2000	2
EGAD00001002001	Mapped data (bam files) for high-throughput whole genome sequence data for 83 modern Aboriginal Australians		83
EGAD00001002002	To characterize the subclonal genomic architecture of non-androgen-deprived metastatic prostate cancer, we performed whole-genome sequencing (WGS) of pelvic lymph node metastases and matching noncancerous blood from 10 patients to an average sequencing depth of 55x. The patients are part of PELICAN (Project to ELIminate Lethal Cancer) study led by G. Steven Bova at Johns Hopkins University (USA) and Tampere University (Finland). As of September 2020, study using these data is: Wedge et al, Nature Genetics 2018 (PMID: 29662167)	Illumina HiSeq 2000	20
EGAD00001002003	Human subjects (COPD patients or apparently healthy controls) were investigated by bronchoscopy and a 5 mm brush was used to sample the subsegment airways of the right lung. The material obtained mainly consists of bronchial epithelial cells plus some contamination with leukocytes. For further details see Ziegler-Heitbrock et al, European Respiratory Journal, 40:823-829, 2012.	Illumina HiSeq 2000	544
EGAD00001002005	Using whole exome sequencing (WES), we identified homozygosity for a missense variant, VPS11: c.2536T>G (p.C846G), as the genetic cause of a leukoencephalopathy syndrome in two individuals from two unrelated Ashkenazi Jewish (AJ) families. Both patients exhibited highly concordant disease progression characterized by infantile onset leukoencephalopathy with brain white matter abnormalities, severe motor impairment, cortical blindness, intellectual disability, and seizures.		2
EGAD00001002006	Whole genome sequencing of paediatric glioblastoma in the ICGC PedBrain project	Illumina HiSeq 2500	115
EGAD00001002007	To determine the clinical and genetic landscape of CRLF2 deregulated acute lymphoblastic leukaemia (CRLF2-d ALL). We identified 172 patients with a CRLF2 rearrangement treated on either the UKALL2003 trial for children and adolescents (1-24 years) or the UKALLXII trial for adolescents and adults (15-59 years). Genomic technologies from conventional karyotyping, and FISH through to whole genome and exome sequencing were used to characterise the genomes of patients with CRLF2-d ALL. This is the largest study to date to investigate the genomic landscape of CRLF2-d ALL and define CRLF2-d as a unique subgroup of B-other ALL. We have confirmed the high incidence of CRLF2-d in Down syndrome-ALL and demonstrated the co-existence of CRLF2-d with other primary chromosomal rearrangements, suggesting that in these patients CRLF2-d can be a secondary genetic abnormality. Other defining features included enrichment of IKZF1, BTG1 and ADD3 deletions in IGH-CRLF2 patients and specific chromosomal gains seen at much higher frequencies than B-other ALL. We report recurrent established and new co-operating abnormalities and the novel involvement of USP9X and DDX3X in CRLF2-d ALL. It is clear from these data that CRLF2-d ALL is heterogeneous, requiring a combination of genetic abnormalities in functionally relevant genes, to work alongside the deregulated expression of CRLF2 in order to initiate and drive leukaemogenesis in this subtype. Although the functional relevance of many of the abnormalities presented here is currently unknown, many are likely to activate alternate pathways or sensitize patients to current therapies.	Illumina HiSeq 2000	11
EGAD00001002008	To determine the clinical and genetic landscape of CRLF2 deregulated acute lymphoblastic leukaemia (CRLF2-d ALL). We identified 172 patients with a CRLF2 rearrangement treated on either the UKALL2003 trial for children and adolescents (1-24 years) or the UKALLXII trial for adolescents and adults (15-59 years). Genomic technologies from conventional karyotyping, and FISH through to whole genome and exome sequencing were used to characterise the genomes of patients with CRLF2-d ALL. This is the largest study to date to investigate the genomic landscape of CRLF2-d ALL and define CRLF2-d as a unique subgroup of B-other ALL. We have confirmed the high incidence of CRLF2-d in Down syndrome-ALL and demonstrated the co-existence of CRLF2-d with other primary chromosomal rearrangements, suggesting that in these patients CRLF2-d can be a secondary genetic abnormality. Other defining features included enrichment of IKZF1, BTG1 and ADD3 deletions in IGH-CRLF2 patients and specific chromosomal gains seen at much higher frequencies than B-other ALL. We report recurrent established and new co-operating abnormalities and the novel involvement of USP9X and DDX3X in CRLF2-d ALL. It is clear from these data that CRLF2-d ALL is heterogeneous, requiring a combination of genetic abnormalities in functionally relevant genes, to work alongside the deregulated expression of CRLF2 in order to initiate and drive leukaemogenesis in this subtype. Although the functional relevance of many of the abnormalities presented here is currently unknown, many are likely to activate alternate pathways or sensitize patients to current therapies.	Illumina HiSeq 2000	22
EGAD00001002009	Exome sequencing of high-risk prostate cancer	Illumina HiSeq 2000	78
EGAD00001002010	high-throughput sequencing of methylated and hydroxymethylated DNA from tumor and non-tumor tissue of patients with high-risk prostate cancer	Illumina HiSeq 2000	32
EGAD00001002011	RNA sequencing data of whole blood samples from smoking and non-smoking mothers and their children at gestation/birth and follow-up years.		64
EGAD00001002012	ChIPseq data of whole blood samples from smoking and non-smoking mothers and their children at gestation/birth and follow-up years.		16
EGAD00001002014	Isolated populations have unique population genetics characteristics that can help boost power in genetic association studies for complex traits. Leveraging these advantageous characteristics requires an in-depth understanding of parameters that have shaped sequence variation in isolates. This study performs a comprehensive investigation of these parameters using low-depth whole genome sequencing (WGS) across multiple isolates.		6840
EGAD00001002015	The use of reference DNA standards generated from cancer cell lines sequenced in the Cancer Genome Project to establish the sensitivity, specificity, accuracy and reproducibility of the WTSI GCLP sequencing pipeline	Illumina HiSeq 2000	57
EGAD00001002016	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: LICA-FR.		12
EGAD00001002017	Genome and transcriptome sequence data from a breast primary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002018	Genome and transcriptome sequence data from a melanoma skin cancer - squamous cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002019	Genome and transcriptome sequence data from a patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002020	Genome and transcriptome sequence data from a metastatic NPC patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002021	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002022	Genome and transcriptome sequence data from a colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002023	Genome and transcriptome sequence data from a lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002024	Genome and transcriptome sequence data from an anal rectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002025	Genome and transcriptome sequence data from a colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002026	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002027	Genome and transcriptome sequence data from a colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002028	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002029	Genome and transcriptome sequence data from an ovarian granulosa patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002030	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002031	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002032	Genome and transcriptome sequence data from an adenoid cystic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002033	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002034	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002035	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002036	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002037	Genome and transcriptome sequence data from an adrenal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002038	Genome and transcriptome sequence data from a peripheral T-cell lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002039	Genome and transcriptome sequence data from an ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002040	Genome and transcriptome sequence data from a squamous cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002041	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002042	Genome and transcriptome sequence data from an endometrial cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002043	Genome and transcriptome sequence data from a recurrent glioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002044	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002045	Genome and transcriptome sequence data from a lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002046	Genome and transcriptome sequence data from a liposarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002047	Genome and transcriptome sequence data from a breast ductal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002048	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002049	Genome and transcriptome sequence data from an adrenal cortical carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002050	In this project we will use exome sequencing to identify somatic mutations in lesions from a patient with a germline mutation in the protection of telomeres 1 gene (POT1). This dataset contains all the data available for this study on 2016-04-20.	Illumina HiSeq 2000 Illumina MiSeq	36
EGAD00001002051	BRAF V600E colorectal cancers do not respond to the only currently FDA approved targeted therapy for CRC. There is currently a trial underway in the UK recruiting V600E CRC patients for treatment with a triple therapy combination of Cetuximab, Trametinib and Dabrafenib. We have mutagenized a pool of V600E CRC cell lines and treated with this triple therapy to select out drug resistant clones. We will now sequence these drug resistant clones with the aim of identifying common point mutations engendering resistance to this new therapy.	Illumina HiSeq 2500	20
EGAD00001002053	dataset CML WGS VCF		29
EGAD00001002054	dataset CML WES VCF		24
EGAD00001002055	Whole exome sequencing from matched tumor-control samples of 121 primary lymphoma samples. Sequencing was performed on Illumina HiSeq2000. The dataset contains FASTQ files.	Illumina HiSeq 2000	242
EGAD00001002056	Paired-end RNA sequencing using total RNA from 136 primary lymphoma samples. Sequencing was performed on the Illumina HiSeq2000 with 300bp insert size. The dataset contains FASTQ files.	Illumina HiSeq 2000	136
EGAD00001002057	dataset CML WGS paired-end fastq	Illumina HiSeq 2000	29
EGAD00001002058	dataset CML WGS paired-end bam	HiSeq X Ten Illumina HiSeq 2000	33
EGAD00001002059	dataset CML WES paired-end fastq	Illumina HiSeq 2000	24
EGAD00001002060	dataset CML WES paired-end bam	Illumina HiSeq 2000	24
EGAD00001002061	BMI1 ChIP-seq on human K562	Illumina HiSeq 2500	3
EGAD00001002062	BMI1 ChIP-seq on human K562	Illumina HiSeq 2500	3
EGAD00001002064	Zhong Shan Hospital liver tumor single cell sequencing: 111 single cell and 6 tissues	HiSeq X Ten	117
EGAD00001002065	Cetuximab is a targeted monoclonal antibody against the epidermal growth factor receptor (EGFR) which is used therapeutically for the treatment of KRAS wild-type colorectal cancer (CRC). The Cetuximab sensitive KRAS wild-type CRC cell line NCI-H508 has been treated with a fixed concentration of ENU for 24 hours and then selected with Cetuximab until drug resistant clones were ready to be picked and grown up as sub-clones of the parental cell line. These will have genes causally implicated in cancer sequenced to identify common point mutations in multiple independently derived drug resistant clones as a forward genetic screen for mechanisms of resistance to Cetuximab in CRC	Illumina HiSeq 2500	50
EGAD00001002066	KRAS mutant CRC is currently in clinical trial with a combination of a MEK and Akt inhibitor. These patients will likely develop resistance to this combination. We aim to identify the mechanisms of resistance via ENU mutagenesis, with a view to identifying additional therapeutics which have the ability to overcome this resistance.	Illumina HiSeq 2500	86
EGAD00001002067	Renal cell carcinoma (RCC) is a genomically heterogeneous tumor. In the present project, the question whether intratumoral heterogeneity follows a zonal pattern indicating spatial niches was addressed. Whole exome sequencing of 16 paired samples from tumor periphery and center revealed a number of region-specific functional SNVs and Indels. Therefore, RCCs are not composed of evenly admixed tumor cells but show topological differences in their clonal composition.	Illumina HiSeq 2500	16
EGAD00001002068	The dataset consists of 232 RNA-seq samples (whole blood) obtained from healthy females from the TwinsUK adult registry cohort. The samples were obtained at two time points separated on average by 22 months.	Illumina HiSeq 2000	232
EGAD00001002069	Complete genomics data for VCaP and PC346c.		2
EGAD00001002070	Whole genome sequencing CRAM files for four samples from the BRIDGE Consortium (SPEED project) with pathogenic variants in a gene associated with a movement disorder.	Illumina HiSeq 2000	4
EGAD00001002071	qDNAseq shallow sequencing dataset of the cell line use case.		5
EGAD00001002072	RNAseq on Illumina HiSeq2000/2500 of colorectal cancer metastasis sample	Illumina HiSeq 2000	23
EGAD00001002073	RNAseq on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample	Illumina HiSeq 2000	12
EGAD00001002074	RNAseq on Illumina HiSeq2000/2500 of WNT reporter of PDO culture derived from colorectal cancer metastasis sample	Illumina HiSeq 2000	3
EGAD00001002075	RNAseq on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from PDO culture derived from colorectal cancer metastasis sample	Illumina HiSeq 2000	1
EGAD00001002076	RNAseq of Patient-derived xenograft derived from colorectal cancer metastasis sample	Illumina HiSeq 2000	19
EGAD00001002077	RNAseq on Illumina HiSeq2000/2500 of colorectal cancer primary tumor sample	Illumina HiSeq 2000	87
EGAD00001002078	RNAseq on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer primary tumor sample	Illumina HiSeq 2000	28
EGAD00001002079	RNAseq on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from PDO culture derived from colorectal cancer primary tumor sample	Illumina HiSeq 2000	1
EGAD00001002080	RNAseq on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample	Illumina HiSeq 2000	37
EGAD00001002081	RNAseq on Illumina HiSeq2000/2500 of PDO culture derived from Patient-derived xenograft derived from colorectal cancer primary tumor sample	Illumina HiSeq 2000	4
EGAD00001002082	Whole-genome sequencing on Illumina HiSeq2000/2500 of Blood EDTA	Illumina HiSeq 2000	69
EGAD00001002083	Whole-genome sequencing on Illumina HiSeq2000/2500 of normal colon control tissue	Illumina HiSeq 2000	2
EGAD00001002084	Whole-genome sequencing on Illumina HiSeq2000/2500 of colorectal cancer metastasis sample	Illumina HiSeq 2000	23
EGAD00001002085	Whole-genome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample	Illumina HiSeq 2000	12
EGAD00001002086	Whole-genome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from PDO culture derived from colorectal cancer metastasis sample	Illumina HiSeq 2000	1
EGAD00001002087	Whole-genome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer metastasis sample	Illumina HiSeq 2000	19
EGAD00001002088	Whole-genome sequencing on Illumina HiSeq2000/2500 of colorectal cancer primary tumor sample	Illumina HiSeq 2000	87
EGAD00001002089	Whole-genome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer primary tumor sample	Illumina HiSeq 2000	25
EGAD00001002090	Whole-genome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from PDO culture derived from colorectal cancer primary tumor sample	Illumina HiSeq 2000	1
EGAD00001002091	Whole-genome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample	Illumina HiSeq 2000	38
EGAD00001002092	Whole-genome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from Patient-derived xenograft derived from colorectal cancer primary tumor sample	Illumina HiSeq 2000	5
EGAD00001002093	Whole-exome sequencing on Illumina HiSeq2000/2500 of Blood EDTA	Illumina HiSeq 2000	33
EGAD00001002094	Whole-exome sequencing on Illumina HiSeq2000/2500 of normal colon control tissue	Illumina HiSeq 2000	2
EGAD00001002095	Whole-exome sequencing on Illumina HiSeq2000/2500 of colorectal cancer metastasis sample	Illumina HiSeq 2000	14
EGAD00001002096	Whole-exome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample	Illumina HiSeq 2000	12
EGAD00001002097	Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from PDO culture derived from colorectal cancer metastasis sample	Illumina HiSeq 2000	1
EGAD00001002098	Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer metastasis sample	Illumina HiSeq 2000	19
EGAD00001002099	Whole-exome sequencing on Illumina HiSeq2000/2500 of colorectal cancer primary tumor sample	Illumina HiSeq 2000	55
EGAD00001002100	Whole-exome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer primary tumor sample	Illumina HiSeq 2000	25
EGAD00001002101	Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from PDO culture derived from colorectal cancer primary tumor sample	Illumina HiSeq 2000	1
EGAD00001002102	Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample	Illumina HiSeq 2000	38
EGAD00001002103	Whole-exome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from Patient-derived xenograft derived from colorectal cancer primary tumor sample	Illumina HiSeq 2000	5
EGAD00001002104	Whole-exome sequencing on AB 5500xl Genetic Analyzer of Blood EDTA	AB 5500xl Genetic Analyzer	76
EGAD00001002105	Whole-exome sequencing on AB 5500xl Genetic Analyzer of colorectal cancer metastasis sample	AB 5500xl Genetic Analyzer AB 5500xl-W Genetic Analysis System	16
EGAD00001002106	Whole-exome sequencing on AB 5500xl Genetic Analyzer of Patient-derived xenograft derived from colorectal cancer metastasis sample	AB 5500xl Genetic Analyzer	1
EGAD00001002107	Whole-exome sequencing on AB 5500xl Genetic Analyzer of colorectal cancer primary tumor sample	AB 5500xl Genetic Analyzer AB 5500xl-W Genetic Analysis System	66
EGAD00001002108	Exome and targeted amplicon sequencing data for tumor, germline and plasma samples from a patient with metastatic breast cancer.	Illumina HiSeq 2500 Illumina MiSeq	30
EGAD00001002109	TSACP TruSeq Amplicon Panel dataset for the TraIT cell line use case		5
EGAD00001002110	Chronic lymphocytic leukemia (CLL) is characterized by substantial clinical heterogeneity, despite relatively few genetic alterations. To provide a basis for studying epigenome deregulation in CLL, we established genome-wide chromatin accessibility maps for 88 CLL samples from 55 patients using the ATAC-seq assay, and we also performed ChIPmentation and RNA-seq profiling for ten representative samples. Based on the resulting dataset, we devised and applied a bioinformatic method that links chromatin profiles to clinical annotations. Our analysis identified sample-specific variation on top of a shared core of CLL regulatory regions. IGHV mutation status – which distinguishes the two major subtypes of CLL – was accurately predicted by the chromatin profiles, and gene regulatory networks inferred for IGHV-mutated vs. IGHV-unmutated samples identified characteristic differences between these two disease subtypes. In summary, we discovered widespread heterogeneity in the chromatin landscape of CLL, established a community resource for studying epigenome deregulation in leukemia, and demonstrated the feasibility of chromatin accessibility mapping in cancer cohorts and clinical research.	Illumina HiSeq 3000	138
EGAD00001002111	70 Whole exome sequencing from 9 patients with DIPG for project Spatial and Temporal Homogeneity of Driver Mutations in Diffuse Intrinsic Pontine Glioma	Illumina HiSeq 2500	70
EGAD00001002112	RNA-seq data from 195 pediatric BCP-ALL cases. Alignment: TopHat 2.0.7. Reference genome: hg19.	Illumina HiScanSQ	195
EGAD00001002113	Mate pair whole genome sequencing data from 15 pediatric BCP ALL cases. Reference genome: hg19. Alignment: BWA 0.7.9a.	NextSeq 500	15
EGAD00001002115	Targeted sequencing of 173 genes in 2433 primary breast tumours. Data includes 2433 tumour samples, 523 adjacent normal (breast) samples and 127 blood samples. Libraries were prepared with Illumina's Nextera custom enrichment kit targeting all the exons of the most frequently mutated breast cancer genes. Libraries were multiplexed (48 libraries per lane) and sequenced on Illumina HiSeq 2000 (100bp paired-end reads). Somatic mutations were called with a custom pipeline. We identified 40 mutation-driver (Mut-driver) genes, and determined associations between mutations, driver CNA profiles, clinical-pathological parameters and survival. We assessed the clonal states of Mut-driver mutations, and estimated levels of intra-tumour heterogeneity using mutant-allele fractions. The results emphasize the importance of genome-based stratification of breast cancer, and have important implications for designing therapeutic strategies. Reference: Pereira et al. (2016) The somatic mutation profiles of 2,433 breast cancers refines their genomic and transcriptomic landscapes. Nature Communications		3083
EGAD00001002116	Raw data (fastq files) from whole exome sequencing of AML patients (paired diagnosis and complete remission samples)	Illumina HiSeq 2000	12
EGAD00001002117	Raw data (fastq files) from targeted resequencing of AML patients at diagnosis	Illumina MiSeq	68
EGAD00001002118	Raw data (fastq files) from targeted resequencing of AML patients at relapse	Illumina MiSeq	24
EGAD00001002119	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: LAML-KR.		18
EGAD00001002120	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: ORCA-IN.		26
EGAD00001002121	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: BTCA-SG.		24
EGAD00001002122	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: BRCA-UK.		90
EGAD00001002123	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: MALY-DE.		202
EGAD00001002124	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: EOPC-DE.		113
EGAD00001002125	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: BOCA-UK.		148
EGAD00001002126	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: PRAD-UK.		116
EGAD00001002127	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: PBCA-DE.		496
EGAD00001002128	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: PRAD-CA.		244
EGAD00001002129	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: BRCA-EU.		158
EGAD00001002130	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: CLLE-ES.		194
EGAD00001002131	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: RECA-EU.		190
EGAD00001002132	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: PACA-AU.		192
EGAD00001002133	Dataset contains Whole Exome Sequencing (WES) data from 37 individuals as aligned BAM files. The reads have been aligned using bowtie2 to the human genome hg19 build.	Illumina HiSeq 2000	37
EGAD00001002135	ChIPseq data of Atypical teratoid/rhabdoid tumors (ATRT)	Illumina HiSeq 2000 Illumina HiSeq 2500	15
EGAD00001002136	RNA sequencing data of Atypical teratoid/rhabdoid tumors (ATRT)	Illumina HiSeq 2000	25
EGAD00001002137	WGBS data of Atypical teratoid/rhabdoid tumors (ATRT)	Illumina HiSeq 2000 Illumina HiSeq 2500	15
EGAD00001002138	WGS data of Atypical teratoid/rhabdoid tumors (ATRT)	Illumina HiSeq 2000	36
EGAD00001002142	Paired PCR-free whole genome sequencing data of a matched metastatic melanoma cell line (COLO829) and normal across three lineages and across separate institutions, with independent library preparations, sequencing, and analysis. The data was generated with mean mapped coverages of 99X for COLO829 and 103X for the paired normal across three institutions. Overall, common events include >35,000 point mutations, 446 small insertion/deletions, and >6,000 genes affected by copy number changes. We present this reference to the community as an initial standard for enabling quantitative evaluation of somatic mutation pipelines across institutions.		24
EGAD00001002143	We expanded our previous collection of longitudinal GBM patients (EGAS00001001041) by recruiting 21 additional patients. Tumor specimens were subjected to whole-exome sequencing (16 of 21 cases, with the matched normal/blood) and transcriptome sequencing (16 of 21 cases).	Illumina HiSeq 2500	86
EGAD00001002144	The morphology of the first humans in the Americas (Paleoamericans) differs from that of Native Americans, and has raised the question of whether or not there are also differences in origin or genetics. A few populations who survived until relatively recently have been suggested to retain Paleoamerican morphology. One of these populations is from La Jolla. Here, we have generated genome sequence data from four La Jolla individuals in order to investigate these questions. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000	4
EGAD00001002145	Whole exome sequencing data of primary, secondary and tertiary tumor from a patient.	Illumina HiSeq 2500	4
EGAD00001002146	The dataset contains the whole genome sequencing data of a family with two unaffected parents and two probands that showed Hereditary spastic paraplegias symptoms. Sequencing reads were aligned to human genome (GRCh38) using BWA-MEM, followed by indel-realignment and PCR-duplicates marking. Alignment results are available for download in BAM format.	HiSeq X Ten	4
EGAD00001002148	Directed differentiation of stem cells offers a scalable solution to the need for human cell models recapitulating islet biology and T2D pathogenesis. We profiled mRNA expression at six stages of an induced pluripotent stem cell (iPSC) model of endocrine pancreas development from two donors, and characterized the distinct transcriptomic profiles associated with each stage. Established regulators of endodermal lineage commitment, such as SOX17 (log2 fold change [FC] compared to iPSCs=14.2, p-value=4.9x10^-5) and the pancreatic agenesis gene GATA6 (log2 FC=12.1, p-value=8.6x10^-5), showed transcriptional variation consistent with their known developmental roles. However, these analyses highlighted many other genes with stage-specific expression patterns, some of which may be novel drivers or markers of islet development. For example, the leptin receptor gene, LEPR, was most highly expressed in published data from in vivo-matured cells compared to the endocrine pancreas-like cells (log2 FC=5.5, p-value=2.0x10^-12), suggesting a role for the leptin pathway in the maturation process. Endocrine pancreas-like cells showed significant stage-selective expression of adult islet genes, including INS, ABCC8, and GLP1R, and enrichment of relevant GO-terms (e.g. “insulin secretion”; odds ratio=4.2, p-value=1.9x10^-3); however, principal component analysis indicated that in vitro-differentiated cells were more immature than adult islets. Integration of the stage-specific expression information with genetic data from T2D genome-wide association studies revealed that 46 of 82 T2D-associated loci harbor genes present in at least one developmental stage, facilitating refinement of potential effector transcripts. Together, these data show that expression profiling in an iPSC islet development model can further understanding of islet biology and T2D pathogenesis.	Illumina HiSeq 2000	12
EGAD00001002149	Low coverage whole genome sequencing for the identification of somatic copy number alterations (SCNA) and focal amplification mapping in plasma DNA of prostate cancer patients	Illumina MiSeq	95
EGAD00001002150	Low coverage whole genome sequencing for the identification of somatic copy number alterations (SCNA) and focal amplification mapping of corresponding tumor material	Illumina MiSeq	8
EGAD00001002151	Whole transcriptome sequencing of 231 children with newly-diagnosed ALL	Illumina HiSeq 2000	231
EGAD00001002152	Whole exome sequencing for the matched germline and tumor DNA from 10 ALL cases with ZNF384 rearrangements.	Illumina HiSeq 2000	20
EGAD00001002153	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: PAEN-IT.		74
EGAD00001002154	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: PAEN-AU.		98
EGAD00001002155	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: LIRI-JP.		524
EGAD00001002156	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: ESAD-UK.		198
EGAD00001002157	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: MELA-AU.		140
EGAD00001002158	This is an in vitro genome-wide CRISPR/cas9 screen in human glioblastoma stem cells, screening for genes essential for survival of these cells. These cells express cas9 and have been transfected with a guide RNA library causing gene knockouts. We will analyse the sequencing data for depletion of guide RNAs. This dataset contains all the data available for this study on 2016-06-02.	Illumina HiSeq 2000	6
EGAD00001002159	Exome Seq for Study EGAS00001001844	Illumina HiSeq 2000	2
EGAD00001002160	Exome Seq for EGAS00001001845	Illumina HiSeq 2500	2
EGAD00001002161	Transcriptome from EGAS00001001845	Illumina HiSeq 2500	1
EGAD00001002162	Exome Seq from EGAS00001001846	Illumina HiSeq 2500	2
EGAD00001002163	Transcriptome from EGAS00001001846	Illumina HiSeq 2500	1
EGAD00001002164	Exome from EGA00001001848	Illumina HiSeq 2000	2
EGAD00001002165	Samples were sequenced from 33 multiple myeloma patients including tumor presentation and relapse samples and a matched patient control sample. Tumor DNA was isolated from CD138-positive plasma cells. Control DNA originated from peripheral blood leukapheresis products collected after induction therapy. Libraries were prepared using the SureSelectQXT sample prep kit and the SureSelect Clinical Research Exome kit (Agilent), with additional baits covering the Ig and MYC loci. Paired-end sequencing was performed to an average sequencing depth of 118× on a HiSeq2500 (Illumina).	Illumina HiSeq 2500	99
EGAD00001002166	Exome from EGAS00001001861	Illumina HiSeq 2000	17
EGAD00001002167	A KNIH001 mRNA-seq paired end data for islet cells	Illumina HiSeq 2000	1
EGAD00001002168	A KNIH002 mRNA-seq paired end data for islet cells	Illumina HiSeq 2000	1
EGAD00001002169	A KNIH003 mRNA-seq paired end data for islet cells	Illumina HiSeq 2000	1
EGAD00001002170	A KNIH004 mRNA-seq paired end data for islet cells	Illumina HiSeq 2000	1
EGAD00001002171	A KNIH005 mRNA-seq paired end data for islet cells	Illumina HiSeq 2000	1
EGAD00001002172	A KNIH006 mRNA-seq paired end data for beta cells	Illumina HiSeq 2000	1
EGAD00001002173	A KNIH007 mRNA-seq paired end data for adipocytes	Illumina HiSeq 2000	1
EGAD00001002174	A KNIH008 mRNA-seq paired end data for adipocytes	Illumina HiSeq 2000	1
EGAD00001002175	A KNIH009 mRNA-seq paired end data for preadipocytes	Illumina HiSeq 2000	1
EGAD00001002176	A KNIH010 mRNA-seq paired end data for podocytes	Illumina HiSeq 2000	1
EGAD00001002177	A KNIH011 mRNA-seq paired end data for podocytes	Illumina HiSeq 2000	1
EGAD00001002178	The study will analyse by exome sequencing 8 Greek family members with an excess of potentially damaging mutations relating to premature MI and no vessel disease, to identify genetic factors underlying this condition. This is a follow on from project GPMI-NVD	Illumina HiSeq 2000	8
EGAD00001002179	Background: A rare subgroup of HIV infected individuals naturally controls infection without treatment. These “elite controllers” constitute an important model for the natural control of HIV infection. Indeed, the study of these individuals may provide insights into strategies for the development of HIV vaccines. Although several HLA and chemokine alleles are known to be over-represented in elite controllers, only a small portion of HIV phenotypic variation is explained by known genetic variants. The elite controller phenotype is rare and distinct, representing the extreme of an infectious disease trait. As such, this phenotype may be partly explained by variation in host immune control, which may be characterized by differences in rare functional genetic variants. Genomic regions underlying elite control can be potentially identified by comparing the presence or frequency of variants in this group to that representing the opposite extreme. In this context, “rapid progressors” is a group defined by its rapid immunological and clinical disease progression. Aim: To extend an existing study, in order to identify DNA sequence variants involved in the control of HIV infection with greater statistical resolution. Specifically, we aim to sequence up to 200 exomes from multiple cohort studies within the EuroCoord CASCADE collaboration (a collaboration of 25 HIV seroconversion cohort studies across Europe).	Illumina HiSeq 2000	183
EGAD00001002180	Targeted pulldown of genes known to be recurrently mutated in AML & MDS from patient and normal samples using Agilent Sureselect and for some cases also using Illumina Truseq technology.	Illumina HiSeq 2000	288
EGAD00001002181	Barrett's oesophagus is common in the UK, affecting 2% of the population. Family history has been recorded among the 4000 Barrett's cases collected so far, and we have 241 families. Among them we have assessed 6 multiplex families with proven Barrett's, defined as having 1 proband and at least 3 affected first-degree members. We propose to exome sequence the probands of these six families to assess the presence of pathogenic rare coding variants.	Illumina HiSeq 2000	6
EGAD00001002182	The BMP antagonist Grem1 has been shown to be associated with a rare human polyposis syndrome (HMPS). We have shown that there is a 40 kb duplication on chromosome 15 found in some patients with HMPS. Traditional serrated adenomas (rare sporadic polyps) share some morphological features with HMPS polyps and it has long been hypothesised that they are the sporadic version of HMPS polyps. We have obtained one of these lesions and in this project we aim to characterise this tumour.	Illumina HiSeq 2000 Illumina HiSeq 2500	2
EGAD00001002183	This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000	96
EGAD00001002184	Sequencing of rare human histiocytic tumour	Illumina HiSeq 2000	2
EGAD00001002185	Exome sequencing of 32 patient samples from Sri Lanka with the condition haemoglobin E beta thalassaemia	Illumina HiSeq 2000	32
EGAD00001002186	Around 10% of patients who present in melanoma clinics have a first degree relative with a previous diagnosis of melanoma, while around 3% have three or more relatives who have been diagnosed with the disease. In this project we will whole genome sequence patients from large Dutch familial melanoma pedigrees to identify mutations in genes that drive melanomagenesis. The identification of these genes will facilitate the management of familial melanoma patients and their families.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina MiSeq	38
EGAD00001002187	To identify the transcriptome profile in this unique subset of gastric cancers, RNA-seq analyses were performed using frozen cancer tissue. Adjacent normal tissue from the same patients was used in differentially expressed gene selection and fusion gene prediction.	Illumina HiSeq 2500	138
EGAD00001002188	Paired-end BAM files of mitochondrial whole genome deep sequencing (mtWGDS) analysis	Illumina HiSeq 2500	105
EGAD00001002189	Paired-end BAM files of the sequencing analysis of the mtDNA polymerase gamma (POLG) gene in the MS-affected co-twins	Illumina MiSeq	54
EGAD00001002190	Single-end BAM files of the targeted deep sequencing analysis of several mtDNA candidate regions in blood and buccal-derived DNA of the corresponding twin pairs.	Illumina MiSeq	140
EGAD00001002191		Illumina HiSeq 2000	28
EGAD00001002192	Additional sequencing data for 173 donors in EGAS00001000154, a study of Pancreatic Ductal Adenocarcinoma. WGS libraries were used for high-cellularity cases, WXS sequencing to high depth on low-cellularity cases. HiSeq 2xxx platform was used in all cases. The analysis files associated with this dataset are merged, de-duplicated bams aligned against GRCh37, one tumour and one normal bam per donor.		346
EGAD00001002193	Single case of T-ALL carrying t(4;6), a novel translocation.	Illumina HiSeq 2000	1
EGAD00001002194	This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ We performed exome sequencing on serial samples from a patient with CMML who progressed to AML. The exome sequencing suggests that NPM1, TET2 and DNMT3a mutations were present in the dominant clone in the CMML sample and that NRAS is a new subclonal mutation in the AML sample. Diagnostic data shows the presence of a FLT3-ITD mutation in the AML sample, which is likely to have driven progression. Here we are performing re-sequencing of the putative driver and some passenger mutations which appear to be in the same clone to validate these mutations and to verify the relative quantification of these abnormalities.	Illumina MiSeq	10
EGAD00001002195	The aim of this project is to identify rare genetic variants of large effect implicated in complex diseases by focusing on the study of cardiovascular diseases and related quantitative traits in a well characterized isolated population in the Cilento area, Italy. The reference panel has been selected carefully in order to maximize the imputation coverage and quality on all the population samples. The selected individuals should meet three criteria: selected individuals should be chip-genotyped and closely related to the maximum number of chip-genotyped individuals so as to maximize imputation coverage; relatedness between selected individuals should be minimal, so as to minimize redundancy in genetic information of the reference panel. We perform exome sequencing on samples from 250 individuals from the Campora and Gioi-Cardile populations.	Illumina HiSeq 2000	247
EGAD00001002196	Our lab is currently using macrophages as a model system for understanding how genetic variation modulates the response to external environmental stimulus. We want to extend this beyond regular polyadenylated RNA to small RNAs such as miRNAs. This project would cover the costs of a pilot to study miRNA response to LPS stimulus, and will be performed as part of a rotation project in the lab. We will require a small number of miRNA libraries and a single lane of MiSeq	Illumina MiSeq	6
EGAD00001002197	Recent GWAS studies have made extensive use of large eQTL data sets to functionally annotate index SNPs. With a large number of association signals located outside coding regions there has been an intense search among sequence variants affecting gene expression at the transcriptional level. However, little progress has been made in mapping regulatory variants that affect protein levels at the translational or post-translational level. It is now possible to undertake a protein QTL scan for focused sets of e.g. oxidized proteins by mass spectrometry. We have established a collaboration with a longitudinal, family-based study in France, the Stanislas cohort, which comprises circa 1000 nuclear families (4,295 individuals) and has follow up data for 10 years (three visits). We have undertaken a pilot study in a focus set of 257 subjects from 79 families with the aim to integrate GWAS, transcriptomic and DNA methylation data with proteomic data on a set of 100 proteins measured in PBMCs. We have already generated GWAS data using Illumina's core-exome chip as well as DNA methylation profiles with the 450K array. We propose to use RNA seq to generate transcriptomic data of the corresponding PBMCs. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2500	155
EGAD00001002198	This set of samples is composed of eight young people (7-16 years old) that have developed melanoma with first-degree relatives that have also developed cancer, which suggests a genetic component to their disease. Here we want to sequence these samples in order to find the causative mutations. As these samples do not carry any of the high-penetrance mutations known to date, finding the gene(s) responsible will offer new insights into the genetic mechanisms underlying predisposition to melanoma.	HiSeq X Ten Illumina HiSeq 2000	7
EGAD00001002199	Sequencing of rare human histiocytic tumour	Illumina HiSeq 2000	2
EGAD00001002200	Whole exome sequencing of families with Congenital Heart Defects (182 trios). Collaboration with David Brook, University of Nottingham.	Illumina HiSeq 2000	541
EGAD00001002201	Data for paper: Epigenetic dynamics of monocyte to macrophage differentiation with Chip Seq, NOMe, mRNA, total RNA, noncoding RNA, whole genome bisulfite seq.	Illumina HiSeq 2000	8
EGAD00001002202	This dataset contains the fastq and bam files for 64 samples. The study group consisted of 17 obese women with normal glucose tolerance and 15 obese women with T2DM classified according to WHO standards. The groups were matched for age, BMI and waist circumference. All the women had been morbidly obese (BMI>40 kg/m2) for at least five years.	Illumina HiSeq 2000	64
EGAD00001002203	Sequence data is from 4 samples from an adult patient with TCF3-PBX1 t(1;19)-positive acute lymphoblastic leukemia. Exome sequencing was performed on a skin biopsy (normal tissue control) and leukemic bone marrow biopsies taken at diagnosis and at two relapse time points. RNA-sequence data is from leukemic bone marrow from two relapse biopsies.	Illumina HiSeq 2500	4
EGAD00001002204	1006 familial early-onset germline CRC patients sequenced by the Molecular and Population Genetics group of the Institute of Cancer Research	Illumina HiSeq 2500	1006
EGAD00001002205	The BLUEPRINT project is a large-scale project investigating epigenetic mechanisms involved in blood formation, in health and disease. The human variation workpackage (WP10, led by NS) of the project seeks to characterize the effect of common sequence variation on the epigenome status of a cell. To do this, the project will use highly purified blood cells to minimise "experimental noise" and therefore enhance the power to discover modest effects. Two peripheral blood cell types, the CD14+CD16- monocyte (an important central orchestrator of adaptive immunity and a bridge between innate and adaptive immunity) and the CD65+CD9- neutrophilic granulocyte (the frontline cell for innate immunity) have been selected for this purpose. The two types of cells will be obtained at high purity from adult blood (AB) of 200 healthy males and females, respectively. Cells will be purified by using already validated and fully operational protocols that are based on density gradient centrifugation of the buffy coat obtained from whole blood, followed by magnetic bead-based purification using monoclonal antibodies against Cluster of Differentiation (CD) lineage-specific cell surface markers. Units of 475 ml of AB will be obtained from consenting volunteers of the Cambridge BioResource (CBR), a panel of 10,000 healthy volunteers local to Cambridge who have already consented to participate in biomedical research and of whom biological samples (DNA, plasma, serum) and lifestyle data have been deposited in a repository and database, respectively. We are requesting funding from the Human Diversity project to sequence the genomes of the 200 CBR volunteers at low pass (6x coverage). Nuclei, DNA and RNA will be recovered from the purified cells and made available for RNA-seq, DNA-seq and ChIP-seq and genomic DNA for entire genome sequencing will be recovered from the DNA repository.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina MiSeq	155
EGAD00001002207	Our aim is to identify genes involved in resistance to anti-cancer therapies. In order to do this we have taken advantage of a lentiviral vector (LV)-based insertional mutagen to mutagenize cancer cell lines. LV-transduced cell lines were then treated with anti-cancer therapies and the emergence of resistant clones scored. DNA from pools of resistant clones was collected, subjected to custom capture by baits designed against the LV sequence, and then sequenced to identify the LV-genomic junction. We hope that the identification of recurrently targeted genes in resistant cell population will allow us to identify genes that mediate drug resistance.	Illumina HiSeq 2500 Illumina MiSeq	71
EGAD00001002208	Exome sequencing of short SGA children with IGF-I and insulin resistance. Collaboration with Professor David Dunger, University of Cambridge. Funded by NIHR.	Illumina HiSeq 2000	15
EGAD00001002210	Congenital anosmias can be complete (the lack of a sense of smell) or specific (the inability to detect specific smells). To date, only a single recessive gene underlying complete anosmia has been identified. Here we sequenced the exomes of 10 individuals from a single family, including three with complete anosmia, across three generations to identify the genetic basis of congenital anosmia in this family. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000	10
EGAD00001002211	Given the central importance of Africa to studies of human origins, genetic diversity and disease susceptibility, large-scale and representative characterisation of genetic diversity in Africa is needed. Analyses of ancient DNA from Africa would complement sequencing of modern African populations and provide unique opportunities to transform our understanding of the pre-history of the region. This approach would greatly refine our understanding of population structure and gene flow in Africa and globally, including genetic signatures of ancient admixture. This low coverage sequencing experiment will allow us to test and refine our pipeline for ancient DNA sequencing. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000	6
EGAD00001002212	Non-syndromic cases of congenital heart defects (CHD) exhibit variable modes of inheritance (Mendelian and non-Mendelian). Several studies have identified strong candidates in humans by taking a candidate gene approach as well as by using whole exome next generation sequencing (NGS). So far these studies could only explain a minor fraction of the observed phenotype in humans, most of them in syndromic cases and no single study has focused on the subset of cases with left ventricular outflow tract obstruction (LVOTO). To discover novel disease-causing genes a large cohort of patients with LVOTO, approximately 100 cases, 25 families and 100 trios have been exome sequenced. This study based on NGS sequencing data yielded several known and novel compelling candidate genes, such as MYH6, NR2F2 and MYH11, but also novel ones, such as ITGB4. To evaluate the significance of our findings in a replication cohort we assembled another 1614 cases with an LVOTO phenotype from our collaborators in Toronto, Berlin and Amsterdam. Targeted resequencing in this additional cohort will help to find additional cases with mutations in the identified candidate genes to strengthen genotype-phenotype association. We will use control data from the INTERVAL project for case/control analyses. The pulldowns will be performed as 24-plex ISC with 192 or greater indexes, and the sequencing will be performed with 192 samples per lane, requiring 9 lanes of sequencing.	Illumina HiSeq 2000	1376
EGAD00001002213	This study involves exome sequencing of blood/bone marrow DNA from patients with myeloid malignancies. Blood DNA samples have been taken from patients at different timepoints of disease phenotype. We hope to elucidate mechanisms of clonal evolution in these patients.	Illumina HiSeq 2000	32
EGAD00001002214	Whole transcriptome sequencing generated from patient, neurosphere and xenograft samples	Illumina HiSeq 2000	64
EGAD00001002215	Low coverage whole genome sequencing plasma DNA from 50 male, 54 female non-cancer donors. For the analysis of nucleosomal positioning all data from the non-cancer controls were merged. Furthermore, two patients with metastasized breast cancer were sequenced on a NextSeq with higher depth.	Illumina MiSeq NextSeq 550	108
EGAD00001002216	RNA-Seq on an Ion Torrent Proton of corresponding tumor material of two metastasized breast cancer patients (Breast7, Breast13).	Ion Torrent Proton	2
EGAD00001002217	Merged file of low-coverage WGS from 179 plasma DNA samples from non-cancer controls and cancer patients for assessment of size distribution of plasma nuclear DNA fragments.	Illumina MiSeq	1
EGAD00001002218	Sequencing data for ICGC Oesophageal Adenocarcinoma tissue samples - 129_cohort EAC whole genomic sequencing data - Publication Secrier & Li et al., 2016, Nature Genetics	Illumina HiSeq 2000	10
EGAD00001002219	Whole exome sequencing generated from 13 sets of patient, neurosphere and xenograft samples	Illumina HiSeq 2000	82
EGAD00001002220	Enteropathy-associated T-cell lymphoma (EATL), a rare and aggressive intestinal malignancy of intraepithelial T lymphocytes, comprises two disease variants (EATL-I and EATL-II) differing in clinical characteristics and pathological features. Here we report findings derived from whole exome sequencing of 15 EATL-II tumor-normal tissue pairs.		15
EGAD00001002221	Whole exome sequencing of a subset of participants from the INTERVAL study.	Illumina HiSeq 2000	4502
EGAD00001002225	This study involves targeted sequencing of samples from myeloid malignancies at different timepoints to assess clonal evolution of malignancy. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina MiSeq	147
EGAD00001002226	Odors are detected, firstly, by olfactory sensory neurons (OSNs) in the olfactory epithelium of the nose. These neurons then project directly to the olfactory bulb in the brain. Olfaction depends on cellular regeneration of the OE, olfactory bulb and hippocampus, and on their continual re-wiring. The olfactory neural pathway includes regions of the frontal, temporal and limbic brain, which in turn overlap with brain areas involved in brain disorders. OSNs are the only aspect of the human brain exposed to the external environment. This not only makes them vulnerable to environmental changes, but also accessible for biomedical studies. We have already sequenced and developed a protocol for analyzing the transcriptome of mouse main olfactory epithelium and single OSNs. We propose here to perform a similar study for samples from the human olfactory epithelium. We have developed a minimally invasive method for obtaining human OSNs, among other cells from the nasal epithelium. In this experiment, we have obtained cell samples from the olfactory epithelium, including OSN, from healthy volunteers. We would like to further characterize them by RNA sequencing. This will give us valuable insight into human olfaction. It will also provide a first step into a new avenue to study, and find biomarkers for, brain diseases through the analysis of these easily available neurons. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2500	8
EGAD00001002227	In collaboration with Dr David Savage, we have identified a patient with a very unusual phenotype, lacking almost all visceral fat, but showing a massive accumulation of white fat tissue behind her neck and significantly elevated liver fat. Whole exome sequencing of the proband and her unaffected parents and brother has been run previously, however no causative variant has been found and the sequencing coverage was generally poor. We propose to conduct whole genome sequencing of all 4 family members at a depth of 30X.	HiSeq X Ten	3
EGAD00001002228	Congenital anosmias can be complete (the lack of a sense of smell) or specific (the inability to detect specific smells). Here we obtained genomic DNA from families with multiple individuals with anosmia, suggesting they are congenital. These include those inherited in a manner consistent with dominant and recessive alleles. We have sequenced the exomes of both affected and unaffected family members on the Illumina platform.	Illumina HiSeq 2000	24
EGAD00001002229	Detection of BAP1 mutations in DNA from uveal melanoma and mesothelioma samples.	Illumina HiSeq 2000	22
EGAD00001002230	Patient-derived xenografts (n=96) were derived from metastatic melanoma patients. RNA expression profiling will be performed to study 1. HLA-typing and 2. the effect of the tumour microenvironment on tumour growth. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000	96
EGAD00001002231	Many studies over the past 10 years, culminating in the recent report of the International Stem Cell Initiative (ISCI, 2011) have shown that hPSC acquire genetic and epigenetic changes during their time in culture. Many of the genetic changes are non-random and recurrent, probably because they provide a selective growth advantage to the undifferentiated cells. Some are shared by embryonal carcinoma cells, the malignant counterparts of ES cells. The origins of these growth advantages are poorly understood, but may come from altered cell cycle dynamics, resistance to apoptosis or altered patterns of differentiation. Less is known about the nature and consequences of epigenetic changes, but it is likely that these similarly affect hPSC behaviour; e.g., enhanced expression of DLK1, an imprinted gene, is associated with altered hPSC growth (Enver et al 2005). Inevitably, these genetic and epigenetic changes will impact on our ability to use hPSC for regenerative medicine, either because malignant transformation of the undifferentiated cells or their differentiated derivatives to be used for transplantation compromises safety, or because they impede the function of those differentiated derivatives, or because they affect the efficiency with which the undifferentiated cells can be expanded and differentiated into desired cell types. Focusing initially upon the existing clinical grade hESC lines, later moving to iPSC, we will Consolidate and extend knowledge of the rate, type and functional impact of the genetic variations that occur during hPSC culture. We will use whole genome and exome sequencing as well as SNP arrays, together with clonal analysis and other cytogenetics techniques. Common changes will be compared with those found in the normal human population, at low frequency in the original cell population or observed during iPSC generation in the HIPSCI project currently based at the WTSI. 
These studies will provide a better understanding of the range of genetic changes that occur in hPSC beyond the CNVs already identified. In conjunction with cancer genome resources and expertise at WTSI, bioinformatic analyses of these hPSC data will allow us to assess potential impact on hPSC behaviour pertinent to applications in regenerative medicine, notably the likelihood that specific changes arising in undifferentiated PSC cultures may be associated with potential malignant transformation of differentiated progeny. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	HiSeq X Ten	80
EGAD00001002232	Mapping genetic evolution of pancreatic cancer precursor lesions such as IPMNs and PanINs.	Illumina HiSeq 2000	20
EGAD00001002233	RNA sequencing of peripheral immune cells from patients +/- an IBD risk variant. Peripheral immune cells +/- in vitro test compound treatment. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000	24
EGAD00001002234	This study involves mutagenizing C32, a melanoma cell line, with ENU to identify those mutations which engender resistance to a targeted treatment.	Illumina HiSeq 2000	84
EGAD00001002235	Many studies over the past 10 years, culminating in the recent report of the International Stem Cell Initiative (ISCI, 2011) have shown that hPSC acquire genetic and epigenetic changes during their time in culture. Many of the genetic changes are non-random and recurrent, probably because they provide a selective growth advantage to the undifferentiated cells. Some are shared by embryonal carcinoma cells, the malignant counterparts of ES cells. The origins of these growth advantages are poorly understood, but may come from altered cell cycle dynamics, resistance to apoptosis or altered patterns of differentiation. Less is known about the nature and consequences of epigenetic changes, but it is likely that these similarly affect hPSC behaviour; e.g., enhanced expression of DLK1, an imprinted gene, is associated with altered hPSC growth (Enver et al 2005). Inevitably, these genetic and epigenetic changes will impact on our ability to use hPSC for regenerative medicine, either because malignant transformation of the undifferentiated cells or their differentiated derivatives to be used for transplantation compromises safety, or because they impede the function of those differentiated derivatives, or because they affect the efficiency with which the undifferentiated cells can be expanded and differentiated into desired cell types. Focusing initially upon the existing clinical grade hESC lines, later moving to iPSC, we will Consolidate and extend knowledge of the rate, type and functional impact of the genetic variations that occur during hPSC culture. We will use whole genome and exome sequencing as well as SNP arrays, together with clonal analysis and other cytogenetics techniques. Common changes will be compared with those found in the normal human population, at low frequency in the original cell population or observed during iPSC generation in the HIPSCI project currently based at the WTSI. 
These studies will provide a better understanding of the range of genetic changes that occur in hPSC beyond the CNVs already identified. In conjunction with cancer genome resources and expertise at WTSI, bioinformatic analyses of these hPSC data will allow us to assess potential impact on hPSC behaviour pertinent to applications in regenerative medicine, notably the likelihood that specific changes arising in undifferentiated PSC cultures may be associated with potential malignant transformation of differentiated progeny. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000	80
EGAD00001002236	The disordered transcriptomes of cancer encompass direct effects of somatic mutation on transcription; co-ordinated secondary alterations in transcriptional pathways; and increased transcriptional noise. To catalogue the rules governing how somatic mutation Overall, 59% of 6980 exonic substitutions were expressed. Compared to other classes, nonsense mutations showed lower expression levels than expected with patterns characteristic of nonsense-mediated decay. 14% of 4234 genomic rearrangements caused transcriptional abnormalities, including exon skips, exon reusage, fusion transcripts and premature poly-adenylation. We found productive, stable transcription from sense-to-antisense gene fusions and gene-to-intergenic rearrangements, suggesting that these mutation classes may drive more transcriptional disruption than previously suspected. Systematic integration of transcriptome with genome data therefore reveals the rules by which transcriptional machinery interprets somatic mutation.	Illumina Genome Analyzer II Illumina HiSeq 2000	32
EGAD00001002237	The disordered transcriptomes of cancer encompass direct effects of somatic mutation on transcription; co-ordinated secondary alterations in transcriptional pathways; and increased transcriptional noise. To catalogue the rules governing how somatic mutation Overall, 59% of 6980 exonic substitutions were expressed. Compared to other classes, nonsense mutations showed lower expression levels than expected with patterns characteristic of nonsense-mediated decay. 14% of 4234 genomic rearrangements caused transcriptional abnormalities, including exon skips, exon reusage, fusion transcripts and premature poly-adenylation. We found productive, stable transcription from sense-to-antisense gene fusions and gene-to-intergenic rearrangements, suggesting that these mutation classes may drive more transcriptional disruption than previously suspected. Systematic integration of transcriptome with genome data therefore reveals the rules by which transcriptional machinery interprets somatic mutation.	Illumina Genome Analyzer II Illumina HiSeq 2000	59
EGAD00001002238	ChIP-Seq (H3K4me3, H3K4me1, H3K9me3, H3K27ac, H3K27me3, H3K36me3, Input) data for HL60 cell line generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency.	Illumina HiSeq 2000	1
EGAD00001002239	June 2016 data update (bam/fastq for CEMT0062, CEMT0068, CEMT0072, CEMT0086, CEMT0087 ChIP-Seq and RNA-Seq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2500	10
EGAD00001002240	Whole-exome sequencing of a RUNX1-mutated pedigree, including samples from mother, father and four offspring. Recurrent somatic JAK-STAT mutations were found among the diseased individuals.	Illumina HiSeq 2000	6
EGAD00001002241	Sequencing data for ICGC Oesophageal Adenocarcinoma tissue samples - chemo_cohort	Illumina HiSeq 2000	6
EGAD00001002242	This dataset contains RNA-seq and Hi-C data files of induced pluripotent stem (iPS) cells and iPS cell-derived neural progenitors (NPCs) derived from a germline chromothripsis patient and both parents. iPS cells of the patient (cell lines 14 and 15), the father (lines 23 (with two replicates) and 32) and mother (line 30) were differentiated to NPCs and RNA was collected on day 0, day 7 and day 10 of differentiation. In addition, Hi-C data for two iPS cell-derived NPC lines from the patient (14 and 15) and two lines from the father (23 and 32) was generated.	AB 5500xl Genetic Analyzer Illumina HiSeq 2500 NextSeq 500	22
EGAD00001002243	RNA-seq data for clinical samples	Illumina HiSeq 2500	2
EGAD00001002244	WGS data for cell lines and clinical samples	Illumina HiSeq 2500	4
EGAD00001002245	This data set consists of 82 whole genome low pass sequencing bams used in HF-GBM-Tumor-Neurosphere-Xenograft	Illumina HiSeq 2000	82
EGAD00001002246	The T2D-GENES/GoT2D 13K exome sequencing study includes ~13,000 samples, half T2D cases and half T2D controls, from five ancestries (~5K Europeans, ~2K each of African-American, East-Asian, South-Asian, and Hispanic). Samples underwent deep exome sequencing, with SNVs and INDELs called according to GATK best practices; variant sites were then filtered according to the GATK best practices, and then samples and variants underwent further filtering based on aggregate genotype quality as described in Fuchsberger et al. (e.g. low call rate, excess heterozygosity for samples, low call rate or coverage for variants). Please note that one of the samples in the T2D-GENES vcf does not have phenotype data.		13007
EGAD00001002247	The GoT2D study includes ~2800 samples, half T2D cases and half T2D controls, of Northern European ancestry sequenced using three technologies: deep whole exome sequencing, low-pass (4x) whole genome sequencing, and OMNI 2.5M genotyping. Samples were ascertained to be phenotypically "extreme" (e.g. leaner, younger cases and older, more obese controls). Genotypes (SNVs, INDELs, and SVs) were called separately for each technology and then integrated via genotype refinement into a single phased reference panel; samples and variants were then excluded based on QC procedures described in Fuchsberger et al. Please note that 2 of the samples in the GoT2D vcf do not have phenotype data.		2872
EGAD00001002248	A total of 49 tumor specimens from 20 patients were subjected to whole-exome and/or whole-transcriptome sequencing, including matched normal/blood. Tumor samples were acquired based on 4 categories: 1) locally adjacent tumors, 2) multifocal/multicentric tumors, 3) 5-ALA (+/-) tumors and 4) longitudinal tumors.	Illumina HiSeq 2500	104
EGAD00001002249	Single-Cell RNA Sequencing of 355 cells isolated from 7 tissue fragments of 3 patients corresponding to locally adjacent tumor, multifocal with recurrence and sections segregated by a marker of tumor cellularity (5-ALA).	Illumina HiSeq 2500	355
EGAD00001002250	mRNA-Seq, HiSeq 2000 dataset of the Cell-line use case	Illumina HiSeq 2000	1
EGAD00001002251	Exome sequencing of families with Congenital Heart Defects of diverse sub-phenotypes. Comprises both parent-offspring trios for sporadic cases and multiplex families. Collaboration with David Brook, University of Nottingham. Funded by the British Heart Foundation.	Illumina HiSeq 2000	646
EGAD00001002252	This data set contains next generation sequencing (NGS) data of two serial tumor samples (primary and a metastasis) from a patient with colorectal cancer showing an ERBB2 c.2264T>C (p.Leu755Ser). NGS was performed using the Illumina TruSeq Amplicon Cancer Panel (TSACP, Illumina) covering 212 amplicons in 48 cancer associated genes on the Illumina MiSeq sequencing platform. The dataset contains two BAM files.	Illumina MiSeq	2
EGAD00001002253	Thirty cutaneous SCC WES tumour samples with matched normal include 20 samples from South et al. JID and 10 new samples. These 30 samples have been used to support the findings in the TGFb Nature Communications paper (DOI: 10.1038/ncomms12493). They are also a part of the ongoing study of cSCC genomic landscape of 40 cSCC samples in total.	Illumina HiSeq 2500	60
EGAD00001002254	Single-end sequencing data (trimmed to 60bp) of 104 plasma samples from donors without tumors (male=50; female=54) were merged and used to establish coverage profiles around the TSS and to establish a gene expression prediction algorithm. Dataset includes merged alignments of low coverage whole genome sequencing from plasma DNA from 50 male, 54 female non-cancer donors. Furthermore, 2 patients with metastasized breast cancer were sequenced on a NextSeq with higher depth.	Illumina MiSeq	3
EGAD00001002255	Sequencing Data for DEEP Paper: "reChIP-seq reveals widespread bivalency of H3K4me3 and H3K27me3 in CD4+ memory T-Cells" Sample: 51_Hf01_BlCM_Ct (human, female, Blood, CD4+ central memory cell, normal control) Sequencing types are: total RNA, Whole Genome Bisulfite, ChipSeq (H3K27ac, H3K9me3, H3k36me3, H3K4me1, H3k27me3, H3K4me3, Input), reChipSeq (H3K27me3, H3K4me3)		1
EGAD00001002256	Corresponding data set is composed of whole exome sequencing of Korean ER positive breast cancer under 35. This set provides 100 alignment files from normal-tumor paired whole exome sequencing of 50 patients. This is a part of total project data set.	Illumina HiSeq 2500	100
EGAD00001002257	This dataset includes whole genome sequence information for three individuals (Mother, Father and Newborn) used in this study. Genomes were sequenced using Illumina HiSeq technology. Files included are fastq files in paired read format.	Illumina HiSeq 2000	3
EGAD00001002258		Illumina HiSeq 2000	2
EGAD00001002259		Illumina HiSeq 2000	37
EGAD00001002260	Sequencing data for ICGC Oesophageal Adenocarcinoma tissue samples - 129_rnaseq EAC expression data - Publication Secrier & Li et al., 2016, Nature Genetics	Illumina HiSeq 2000	15
EGAD00001002261	These files contain indels and structural variants on 769 GoNL samples (SV release 6, 2016-05-25).	Illumina HiSeq 2000	-
EGAD00001002262	26 cell lines derived from human Diffuse Large B Cell lymphomas (DLBCL) or Burkitt lymphomas (BL) were subjected to whole exome sequencing. Exome capture was carried out using the SeqCap EZ Exome Library 2.0 kit (Roche/Nimblegen) and 100 bp single-read sequencing was performed on a HiSeq2500 (Illumina). 82% of the coding region was covered at least 30x.	Illumina HiSeq 2500	26
EGAD00001002263	This is the first dataset for the Botseq sequencing project	Illumina HiSeq 2000 Illumina HiSeq 2500	39
EGAD00001002264	This data set consists of whole genome SMRT sequencing fastqs generated from 2 xenograft samples.	PacBio RS II	2
EGAD00001002265	A pulldown experiment with Agilent SureSelect probes designed on regions that were more likely to contain de novo mutations. 266 candidate sites were selected based on whole genome sequencing data. The probes also included the exons of genes that have been identified as neurodevelopmental disorder genes in DDD (the DDG2P genes) 1,336 targets. In addition, the design included the standard iPLEX sites.		4
EGAD00001002266	The contribution of genetic predisposing factors to the development of pediatric acute lymphoblastic leukemia (ALL), the most frequently diagnosed cancer in childhood, has not been fully elucidated. Children presenting with multiple de novo leukemias are more likely to suffer from genetic predisposition. Here, we selected five of these patients and analyzed the mutational spectrum of normal and malignant tissues.	AB 5500xl Genetic Analyzer AB SOLiD 4 System	14
EGAD00001002268	PCHiC	Illumina HiSeq 2000	53
EGAD00001002269	We expressed PDGFRAmut, wild-type PDGFRA and a GFP control from lentivirus, in two primary GBM patient-derived cell lines that we had cultured as monolayers.	Illumina HiSeq 4000	1
EGAD00001002270	We collected fresh tissue from an untreated GBM (SF10282) directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells.	Illumina HiSeq 2500	1
EGAD00001002271	We collected fresh tissue from an untreated GBM (SF10345) directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells.	Illumina HiSeq 2500	1
EGAD00001002272	We collected fresh tissue from an untreated GBM (SF10360) directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells.	Illumina HiSeq 2500	1
EGAD00001002273	We performed bulk exome-seq on a primary GBM and a blood sample from SF10345	Illumina HiSeq 2500	1
EGAD00001002274	We performed bulk exome-seq on a primary GBM and a blood sample from SF10360	Illumina HiSeq 2500	1
EGAD00001002275	We performed bulk exome-seq on a primary GBM and a blood sample from SF10282	Illumina HiSeq 2500	1
EGAD00001002276	Exome sequencing reads of two UFM individuals and their family members (11 individuals in total) belonging to two different Fragile X families. Alignment files in BAM format are provided.	Illumina HiSeq 2000	11
EGAD00001002277	Variation in the Glucose Transporter gene SLC2A2 is associated with glycaemic response to metformin		1
EGAD00001002278			58
EGAD00001002279	ChIP-Seq data for 3 monocyte - None sample(s). 17 run(s), 17 experiment(s), 17 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002280	ChIP-Seq data for 1 Acute Lymphocytic Leukemia - CTR sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002281	ChIP-Seq data for 5 plasma cell sample(s). 24 run(s), 23 experiment(s), 23 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	5
EGAD00001002282	ChIP-Seq data for 1 unswitched memory B cell sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002283	ChIP-Seq data for 3 effector memory CD8-positive, alpha-beta T cell sample(s). 16 run(s), 15 experiment(s), 15 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002284	Bisulfite-Seq data for 3 immature conventional dendritic cell - GM-CSF_IL4_T=6_days sample(s). 61 run(s), 4 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	3
EGAD00001002285	DNase-Hypersensitivity data for 1 CD8-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816	Illumina HiSeq 2000	1
EGAD00001002286	DNase-Hypersensitivity data for 28 CD14-positive, CD16-negative classical monocyte sample(s). 28 run(s), 28 experiment(s), 28 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816	Illumina HiSeq 2000	28
EGAD00001002287	RNA-Seq data for 1 T-cell Acute Lymphocytic Leukemia sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002288	RNA-Seq data for 3 neutrophilic myelocyte sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	3
EGAD00001002289	RNA-Seq data for 1 CD8-positive, alpha-beta thymocyte sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002290	DNase-Hypersensitivity data for 4 macrophage - T=6days LPS sample(s). 6 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816	Illumina HiSeq 2000	4
EGAD00001002291	Bisulfite-Seq data for 1 precursor lymphocyte of B lineage sample(s). 11 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002292	ChIP-Seq data for 3 Acute Myeloid Leukemia - SAHA sample(s). 14 run(s), 14 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002293	ChIP-Seq data for 2 regulatory T cell sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	2
EGAD00001002294	Bisulfite-Seq data for 2 endothelial cell of umbilical vein (resting) sample(s). 35 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002295	RNA-Seq data for 2 CD8-positive, alpha-beta T cell sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	2
EGAD00001002296	ChIP-Seq data for 3 monocyte - RPMI_LPS_T=24hrs_RPMI_T=5days_LPS_T=4hrs sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002297	ChIP-Seq data for 2 monocyte - RPMI_BG_T=1hr sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000 unspecified	2
EGAD00001002298	ChIP-Seq data for 1 Acute Promyelocytic Leukemia - SAHA sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002299	RNA-Seq data for 1 CD4-positive, alpha-beta thymocyte sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002300	DNase-Hypersensitivity data for 2 monocyte - T=0days sample(s). 4 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816	Illumina HiSeq 2000	2
EGAD00001002301	Bisulfite-Seq data for 1 monocyte - RPMI_BG_T=4hrs sample(s). 14 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002302	Bisulfite-Seq data for 1 monocyte - RPMI_LPS_T=24hrs sample(s). 22 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002303	Bisulfite-Seq data for 3 T-cell Prolymphocytic Leukemia sample(s). 45 run(s), 3 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	3
EGAD00001002304	ChIP-Seq data for 1 Acute Promyelocytic Leukemia sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002305	Bisulfite-Seq data for 6 alternatively activated macrophage sample(s). 94 run(s), 7 experiment(s), 12 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	6
EGAD00001002306	RNA-Seq data for 3 granulocyte monocyte progenitor cell sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	3
EGAD00001002307	ChIP-Seq data for 3 Activated B-Cell-Like Diffuse Large B-Cell Lymphoma sample(s). 12 run(s), 12 experiment(s), 12 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002308	RNA-Seq data for 8 CD14-positive, CD16-negative classical monocyte sample(s). 8 run(s), 8 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	8
EGAD00001002309	Bisulfite-Seq data for 2 mature eosinophil sample(s). 23 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002310	ChIP-Seq data for 1 conventional dendritic cell sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002311	Bisulfite-Seq data for 2 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 29 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002312	ChIP-Seq data for 1 Acute Promyelocytic Leukemia - MC2884 (24h) sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002313	Bisulfite-Seq data for 7 Acute Lymphocytic Leukemia sample(s). 132 run(s), 9 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	7
EGAD00001002314	ChIP-Seq data for 2 macrophage - T=6days B-glucan sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000 NextSeq 500	2
EGAD00001002315	RNA-Seq data for 6 naive B cell sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	6
EGAD00001002316	RNA-Seq data for 6 hematopoietic stem cell sample(s). 13 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	6
EGAD00001002317	ChIP-Seq data for 7 alternatively activated macrophage sample(s). 50 run(s), 49 experiment(s), 49 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	7
EGAD00001002318	ChIP-Seq data for 4 neutrophilic myelocyte sample(s). 28 run(s), 23 experiment(s), 23 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	4
EGAD00001002319	ChIP-Seq data for 3 Acute Promyelocytic Leukemia - ATRA sample(s). 21 run(s), 20 experiment(s), 20 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002320	RNA-Seq data for 1 effector memory CD8-positive, alpha-beta T cell, terminally differentiated sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002321	RNA-Seq data for 4 cytotoxic CD56-dim natural killer cell sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	4
EGAD00001002322	Bisulfite-Seq data for 5 plasma cell sample(s). 77 run(s), 5 experiment(s), 10 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	5
EGAD00001002323	RNA-Seq data for 7 plasma cell sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	7
EGAD00001002324	Bisulfite-Seq data for 1 monocyte - RPMI_LPS_T=24hrs_RPMI_T=5days sample(s). 15 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002325	Bisulfite-Seq data for 2 CD3-positive, CD4-positive, CD8-positive, double positive thymocyte sample(s). 29 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002326	RNA-Seq data for 2 mature eosinophil sample(s). 3 run(s), 3 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	2
EGAD00001002327	Bisulfite-Seq data for 1 monocyte - RPMI_BG_T=1hr sample(s). 14 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002328	ChIP-Seq data for 3 monocyte - RPMI_T=4hrs sample(s). 8 run(s), 8 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002329	ChIP-Seq data for 3 Burkitt Lymphoma sample(s). 13 run(s), 13 experiment(s), 13 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002330	Bisulfite-Seq data for 1 precursor B cell sample(s). 6 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002331	Bisulfite-Seq data for 3 neutrophilic myelocyte sample(s). 31 run(s), 3 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	3
EGAD00001002332	RNA-Seq data for 1 Acute Lymphocytic Leukemia - CTR sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002333	Bisulfite-Seq data for 2 Acute Myeloid Leukemia - CTR sample(s). 28 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002334	RNA-Seq data for 2 adult endothelial progenitor cell sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	2
EGAD00001002335	Bisulfite-Seq data for 1 monocyte - RPMI_T=24hrs sample(s). 14 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002336	RNA-Seq data for 5 Mantle Cell Lymphoma sample(s). 5 run(s), 5 experiment(s), 5 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	5
EGAD00001002337	RNA-Seq data for 4 macrophage - T=6days LPS sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	4
EGAD00001002338	RNA-Seq data for 4 monocyte - None sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	4
EGAD00001002339	RNA-Seq data for 6 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 24 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	6
EGAD00001002340	ChIP-Seq data for 7 Acute Myeloid Leukemia - CTR sample(s). 45 run(s), 44 experiment(s), 44 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	7
EGAD00001002341	RNA-Seq data for 2 endothelial cell of umbilical vein (proliferating) sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	2
EGAD00001002342	ChIP-Seq data for 1 Acute Promyelocytic Leukemia - MS275 sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002343	RNA-Seq data for 2 Acute Myeloid Leukemia - SAHA sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	2
EGAD00001002344	ChIP-Seq data for 1 Acute Myeloid Leukemia - MC2884 sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002345	RNA-Seq data for 2 conventional dendritic cell sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	2
EGAD00001002346	Bisulfite-Seq data for 2 effector memory CD8-positive, alpha-beta T cell, terminally differentiated sample(s). 34 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002347	RNA-Seq data for 1 memory B cell sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002348	RNA-Seq data for 10 CD4-positive, alpha-beta T cell sample(s). 10 run(s), 10 experiment(s), 10 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	10
EGAD00001002349	RNA-Seq data for 2 central memory CD4-positive, alpha-beta T cell sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	2
EGAD00001002350	DNase-Hypersensitivity data for 3 erythroblast sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816	Illumina HiSeq 2000	3
EGAD00001002351	RNA-Seq data for 1 regulatory T cell sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002352	RNA-Seq data for 1 Acute Promyelocytic Leukemia - MC2884 sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002353	RNA-Seq data for 3 Acute Promyelocytic Leukemia - CTR sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	3
EGAD00001002354	Bisulfite-Seq data for 3 class switched memory B cell sample(s). 43 run(s), 4 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	3
EGAD00001002355	DNase-Hypersensitivity data for 37 Acute Myeloid Leukemia sample(s). 38 run(s), 37 experiment(s), 37 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816	Illumina HiSeq 2000	37
EGAD00001002356	RNA-Seq data for 3 Acute Promyelocytic Leukemia - ATRA sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	3
EGAD00001002357	ChIP-Seq data for 1 Acute Promyelocytic Leukemia - MC2884 sample(s). 8 run(s), 8 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002358	RNA-Seq data for 8 erythroblast sample(s). 30 run(s), 8 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	8
EGAD00001002359	RNA-Seq data for 1 Acute Promyelocytic Leukemia - MC2392 sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002360	RNA-Seq data for 4 monocyte - T=0days sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	4
EGAD00001002361	Bisulfite-Seq data for 3 naive B cell sample(s). 39 run(s), 3 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	3
EGAD00001002362	ChIP-Seq data for 3 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 19 run(s), 18 experiment(s), 18 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002363	RNA-Seq data for 3 hematopoietic multipotent progenitor cell sample(s). 9 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	3
EGAD00001002364	Bisulfite-Seq data for 2 central memory CD4-positive, alpha-beta T cell sample(s). 41 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002365	RNA-Seq data for 1 blast forming unit erythroid sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002366	RNA-Seq data for 3 neutrophilic metamyelocyte sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	3
EGAD00001002367	Bisulfite-Seq data for 2 effector memory CD4-positive, alpha-beta T cell sample(s). 34 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002368	ChIP-Seq data for 3 monocyte - RPMI_BG_T=4hrs sample(s). 8 run(s), 8 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000 unspecified	3
EGAD00001002369	ChIP-Seq data for 2 CD3-negative, CD4-positive, CD8-positive, double positive thymocyte sample(s). 7 run(s), 5 experiment(s), 5 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	2
EGAD00001002370	Bisulfite-Seq data for 2 CD8-positive, alpha-beta thymocyte sample(s). 28 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002371	Bisulfite-Seq data for 1 monocyte - RPMI_T=6days sample(s). 14 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002372	ChIP-Seq data for 1 Acute Promyelocytic Leukemia - MC2884 (4h) sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002373	Bisulfite-Seq data for 1 macrophage - T=6days untreated sample(s). 15 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002374	DNase-Hypersensitivity data for 3 inflammatory macrophage sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816	Illumina HiSeq 2000	3
EGAD00001002375	RNA-Seq data for 2 monocyte sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	2
EGAD00001002376	ChIP-Seq data for 2 monocyte - RPMI_LPS_T=1hr sample(s). 8 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000 unspecified	2
EGAD00001002377	ChIP-Seq data for 2 erythroblast sample(s). 14 run(s), 14 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	2
EGAD00001002378	Bisulfite-Seq data for 3 band form neutrophil sample(s). 34 run(s), 3 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	3
EGAD00001002379	ChIP-Seq data for 4 Multiple Myeloma sample(s). 34 run(s), 28 experiment(s), 28 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	4
EGAD00001002380	RNA-Seq data for 3 segmented neutrophil of bone marrow sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	3
EGAD00001002381	ChIP-Seq data for 2 mesenchymal stem cell of the bone marrow sample(s). 16 run(s), 14 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	2
EGAD00001002382	DNase-Hypersensitivity data for 4 macrophage sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816	Illumina HiSeq 2000	4
EGAD00001002383	Bisulfite-Seq data for 2 effector memory CD8-positive, alpha-beta T cell sample(s). 30 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002384	ChIP-Seq data for 106 Chronic Lymphocytic Leukemia sample(s). 174 run(s), 163 experiment(s), 162 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	107
EGAD00001002385	Bisulfite-Seq data for 1 monocyte - RPMI_BG_T=24hrs_RPMI_T=5days sample(s). 18 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002386	ChIP-Seq data for 1 CD4-positive, alpha-beta thymocyte sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002387	RNA-Seq data for 7 alternatively activated macrophage sample(s). 9 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	7
EGAD00001002388	ChIP-Seq data for 3 monocyte - RPMI_LPS_T=24hrs sample(s). 8 run(s), 8 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000 unspecified	3
EGAD00001002389	ChIP-Seq data for 3 Lymphoma_Follicular sample(s). 11 run(s), 11 experiment(s), 11 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002390	ChIP-Seq data for 4 segmented neutrophil of bone marrow sample(s). 24 run(s), 23 experiment(s), 23 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000 NextSeq 500	4
EGAD00001002391	ChIP-Seq data for 2 osteoclast sample(s). 17 run(s), 14 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	2
EGAD00001002392	Bisulfite-Seq data for 4 Type 1 diabetes mellitus sample(s). 32 run(s), 4 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	4
EGAD00001002393	Bisulfite-Seq data for 3 segmented neutrophil of bone marrow sample(s). 34 run(s), 3 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	3
EGAD00001002394	Bisulfite-Seq data for 1 monocyte - RPMI_T=4hrs sample(s). 15 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002395	Bisulfite-Seq data for 2 monocyte - None sample(s). 70 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002396	Bisulfite-Seq data for 6 Chronic Lymphocytic Leukemia sample(s). 84 run(s), 6 experiment(s), 12 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	6
EGAD00001002397	ChIP-Seq data for 5 Mantle Cell Lymphoma sample(s). 35 run(s), 35 experiment(s), 35 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	5
EGAD00001002398	DNase-Hypersensitivity data for 4 macrophage - T=6days untreated sample(s). 6 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816	Illumina HiSeq 2000	4
EGAD00001002399	ChIP-Seq data for 2 monocyte - RPMI_LPS_T=4hrs sample(s). 5 run(s), 5 experiment(s), 5 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	2
EGAD00001002400	ChIP-Seq data for 9 T-cell Acute Lymphocytic Leukemia sample(s). 41 run(s), 41 experiment(s), 41 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	9
EGAD00001002401	RNA-Seq data for 14 Multiple Myeloma sample(s). 14 run(s), 14 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	14
EGAD00001002402	RNA-Seq data for 1 late basophilic and polychromatophilic erythroblast sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002403	Bisulfite-Seq data for 4 cytotoxic CD56-dim natural killer cell sample(s). 54 run(s), 5 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	4
EGAD00001002404	Bisulfite-Seq data for 2 adult endothelial progenitor cell sample(s). 38 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002405	Bisulfite-Seq data for 1 monocyte - T=0days sample(s). 15 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002406	ChIP-Seq data for 2 Acute Promyelocytic Leukemia - MC2392 sample(s). 14 run(s), 12 experiment(s), 12 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	2
EGAD00001002407	Bisulfite-Seq data for 2 Acute Promyelocytic Leukemia - CTR sample(s). 27 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002408	ChIP-Seq data for 3 monocyte - RPMI_BG_T=24hrs sample(s). 8 run(s), 8 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002409	RNA-Seq data for 14 mature neutrophil sample(s). 14 run(s), 14 experiment(s), 13 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	13
EGAD00001002410	Bisulfite-Seq data for 1 macrophage - T=6days B-glucan sample(s). 15 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002411	ChIP-Seq data for 3 T-cell Prolymphocytic Leukemia sample(s). 21 run(s), 21 experiment(s), 21 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002412	Bisulfite-Seq data for 3 mature neutrophil - G-CSF/Dex. Treatment (16-20 hrs) sample(s). 33 run(s), 3 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	3
EGAD00001002413	ChIP-Seq data for 3 monocyte - RPMI_T=6days sample(s). 10 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002414	RNA-Seq data for 1 unswitched memory B cell sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002415	ChIP-Seq data for 3 monocyte - Attached_T=1hr sample(s). 8 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000 unspecified	3
EGAD00001002416	Bisulfite-Seq data for 3 memory B cell sample(s). 47 run(s), 4 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	3
EGAD00001002417	RNA-Seq data for 6 inflammatory macrophage sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	6
EGAD00001002418	ChIP-Seq data for 38 Acute Myeloid Leukemia sample(s). 244 run(s), 226 experiment(s), 226 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	38
EGAD00001002419	Bisulfite-Seq data for 19 Acute Myeloid Leukemia sample(s). 338 run(s), 32 experiment(s), 38 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	19
EGAD00001002420	ChIP-Seq data for 1 monocyte - T=10day_RANK_M-CSF sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002421	ChIP-Seq data for 15 Acute Lymphocytic Leukemia sample(s). 79 run(s), 78 experiment(s), 78 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	15
EGAD00001002422	RNA-Seq data for 1 CD3-negative, CD4-positive, CD8-positive, double positive thymocyte sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002423	Bisulfite-Seq data for 2 erythroblast sample(s). 35 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002424	ChIP-Seq data for 2 endothelial cell of umbilical vein (proliferating) sample(s). 14 run(s), 14 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	2
EGAD00001002425	DNase-Hypersensitivity data for 4 macrophage - T=6days B-glucan sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816	Illumina HiSeq 2000	4
EGAD00001002426	RNA-Seq data for 3 T-cell Prolymphocytic Leukemia sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	3
EGAD00001002427	Bisulfite-Seq data for 2 osteoclast sample(s). 88 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002428	Bisulfite-Seq data for 3 mature conventional dendritic cell - GM-CSF_IL4_T=6_days_R848_T=24hrs sample(s). 60 run(s), 4 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	3
EGAD00001002429	Bisulfite-Seq data for 6 inflammatory macrophage sample(s). 83 run(s), 6 experiment(s), 12 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	6
EGAD00001002430	ChIP-Seq data for 3 class switched memory B cell sample(s). 21 run(s), 21 experiment(s), 21 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002431	ChIP-Seq data for 1 Acute Promyelocytic Leukemia - MC3324 sample(s). 2 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002432	Bisulfite-Seq data for 1 monocyte - RPMI_LPS_T=4hrs sample(s). 18 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002433	RNA-Seq data for 4 megakaryocyte-erythroid progenitor cell sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	4
EGAD00001002434	Bisulfite-Seq data for 2 hematopoietic multipotent progenitor cell sample(s). 16 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002435	ChIP-Seq data for 4 neutrophilic metamyelocyte sample(s). 32 run(s), 23 experiment(s), 23 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	4
EGAD00001002436	RNA-Seq data for 2 endothelial cell of umbilical vein (resting) sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	2
EGAD00001002437	Bisulfite-Seq data for 2 conventional dendritic cell sample(s). 30 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002438	RNA-Seq data for 3 CD38-negative naive B cell sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	3
EGAD00001002439	ChIP-Seq data for 3 central memory CD4-positive, alpha-beta T cell sample(s). 11 run(s), 9 experiment(s), 9 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002440	Bisulfite-Seq data for 1 thymocyte sample(s). 14 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002441	Bisulfite-Seq data for 1 monocyte - RPMI_BG_T=24hrs sample(s). 21 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002442	ChIP-Seq data for 4 germinal center B cell sample(s). 24 run(s), 22 experiment(s), 22 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	4
EGAD00001002443	RNA-Seq data for 7 Acute Myeloid Leukemia - CTR sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	7
EGAD00001002444	ChIP-Seq data for 9 CD4-positive, alpha-beta T cell sample(s). 68 run(s), 63 experiment(s), 63 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	9
EGAD00001002445	ChIP-Seq data for 3 CD3-positive, CD4-positive, CD8-positive, double positive thymocyte sample(s). 11 run(s), 11 experiment(s), 11 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002446	RNA-Seq data for 3 band form neutrophil sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	3
EGAD00001002447	Bisulfite-Seq data for 1 monocyte - RPMI_T=1hr sample(s). 15 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002448	ChIP-Seq data for 3 monocyte - RPMI_LPS_T=24hrs_RPMI_T=5days sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002449	ChIP-Seq data for 10 mature neutrophil sample(s). 105 run(s), 86 experiment(s), 86 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	10
EGAD00001002450	ChIP-Seq data for 3 central memory CD8-positive, alpha-beta T cell sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002451	Bisulfite-Seq data for 2 endothelial cell of umbilical vein (proliferating) sample(s). 36 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002452	RNA-Seq data for 3 germinal center B cell sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	3
EGAD00001002453	ChIP-Seq data for 2 monocyte - RPMI_T=1hr sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000 unspecified	2
EGAD00001002454	ChIP-Seq data for 4 band form neutrophil sample(s). 26 run(s), 23 experiment(s), 23 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000 NextSeq 500	4
EGAD00001002455	ChIP-Seq data for 7 CD8-positive, alpha-beta T cell sample(s). 38 run(s), 38 experiment(s), 38 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	7
EGAD00001002456	RNA-Seq data for 2 effector memory CD8-positive, alpha-beta T cell sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	2
EGAD00001002457	RNA-Seq data for 4 macrophage - T=6days untreated sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	4
EGAD00001002458	ChIP-Seq data for 2 macrophage - T=6days untreated sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000 NextSeq 500	2
EGAD00001002459	DNase-Hypersensitivity data for 2 Chronic Lymphocytic Leukemia sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816	Illumina HiSeq 2000	2
EGAD00001002460	Bisulfite-Seq data for 8 CD4-positive, alpha-beta T cell sample(s). 108 run(s), 8 experiment(s), 16 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	8
EGAD00001002461	RNA-Seq data for 1 Acute Promyelocytic Leukemia - MC2884 (24h) sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002462	ChIP-Seq data for 3 mature conventional dendritic cell - GM-CSF_IL4_T=6_days_R848_T=24hrs sample(s). 20 run(s), 19 experiment(s), 19 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002463	ChIP-Seq data for 6 cytotoxic CD56-dim natural killer cell sample(s). 34 run(s), 34 experiment(s), 34 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	6
EGAD00001002464	Bisulfite-Seq data for 2 CD3-negative, CD4-positive, CD8-positive, double positive thymocyte sample(s). 29 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002465	RNA-Seq data for 27 Acute Myeloid Leukemia sample(s). 27 run(s), 27 experiment(s), 27 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	27
EGAD00001002466	ChIP-Seq data for 15 naive B cell sample(s). 67 run(s), 59 experiment(s), 59 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	15
EGAD00001002467	RNA-Seq data for 3 mature neutrophil - G-CSF/Dex. Treatment (16-20 hrs) sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	3
EGAD00001002468	DNase-Hypersensitivity data for 3 CD34-negative, CD41-positive, CD42-positive megakaryocyte cell sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816	Illumina HiSeq 2000	3
EGAD00001002469	RNA-Seq data for 2 effector memory CD4-positive, alpha-beta T cell sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	2
EGAD00001002470	ChIP-Seq data for 3 mature neutrophil - G-CSF/Dex. Treatment (16-20 hrs) sample(s). 23 run(s), 23 experiment(s), 23 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002471	RNA-Seq data for 3 mature conventional dendritic cell - GM-CSF_IL4_T=6_days_R848_T=24hrs sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	3
EGAD00001002472	Bisulfite-Seq data for 1 Acute Promyelocytic Leukemia - ATRA sample(s). 9 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002473	RNA-Seq data for 2 mesenchymal stem cell of the bone marrow sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	2
EGAD00001002474	ChIP-Seq data for 3 monocyte - RPMI_T=24hrs sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000 unspecified	3
EGAD00001002475	Bisulfite-Seq data for 1 monocyte - Attached_T=1hr sample(s). 23 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002476	RNA-Seq data for 3 class switched memory B cell sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	3
EGAD00001002477	ChIP-Seq data for 2 mature eosinophil sample(s). 14 run(s), 14 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	2
EGAD00001002478	RNA-Seq data for 3 common myeloid progenitor sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	3
EGAD00001002479	RNA-Seq data for 2 osteoclast sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	2
EGAD00001002480	DNase-Hypersensitivity data for 1 CD4-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816	Illumina HiSeq 2000	1
EGAD00001002481	DNase-Hypersensitivity data for 2 alternatively activated macrophage sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816	Illumina HiSeq 2000	2
EGAD00001002482	RNA-Seq data for 1 central memory CD8-positive, alpha-beta T cell sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002483	Bisulfite-Seq data for 2 CD4-positive, alpha-beta thymocyte sample(s). 29 run(s), 2 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002484	ChIP-Seq data for 10 CD14-positive, CD16-negative classical monocyte sample(s). 80 run(s), 76 experiment(s), 76 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	10
EGAD00001002485	ChIP-Seq data for 3 immature conventional dendritic cell - GM-CSF_IL4_T=6_days sample(s). 20 run(s), 20 experiment(s), 20 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002486	Bisulfite-Seq data for 2 central memory CD8-positive, alpha-beta T cell sample(s). 36 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002487	ChIP-Seq data for 2 adult endothelial progenitor cell sample(s). 16 run(s), 14 experiment(s), 14 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	2
EGAD00001002488	ChIP-Seq data for 2 endothelial cell of umbilical vein (resting) sample(s). 13 run(s), 13 experiment(s), 13 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	2
EGAD00001002489	RNA-Seq data for 5 common lymphoid progenitor sample(s). 20 run(s), 5 experiment(s), 5 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	5
EGAD00001002490	ChIP-Seq data for 3 Acute Promyelocytic Leukemia - CTR sample(s). 22 run(s), 21 experiment(s), 21 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002491	ChIP-Seq data for 2 monocyte - T=0days sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000 NextSeq 500	2
EGAD00001002492	Bisulfite-Seq data for 2 regulatory T cell sample(s). 41 run(s), 3 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002493	ChIP-Seq data for 1 Acute Promyelocytic Leukemia - MS-275 (20h) sample(s). 5 run(s), 5 experiment(s), 5 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002494	ChIP-Seq data for 1 CD8-positive, alpha-beta thymocyte sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002495	ChIP-Seq data for 3 monocyte - RPMI_T=6days_LPS_T=4hrs sample(s). 8 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002496	Bisulfite-Seq data for 4 CD8-positive, alpha-beta T cell sample(s). 57 run(s), 5 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	4
EGAD00001002497	Bisulfite-Seq data for 3 neutrophilic metamyelocyte sample(s). 32 run(s), 3 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	3
EGAD00001002498	ChIP-Seq data for 2 macrophage - T=6days LPS sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000 NextSeq 500	2
EGAD00001002499	DNase-Hypersensitivity data for 2 Acute Lymphocytic Leukemia - CTR sample(s). 2 run(s), 2 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_dnaseseq_analysis_20160816	Illumina HiSeq 2000	2
EGAD00001002500	RNA-Seq data for 1 Acute Promyelocytic Leukemia - MS-275 (20h) sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002501	Bisulfite-Seq data for 6 macrophage sample(s). 88 run(s), 7 experiment(s), 12 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	6
EGAD00001002502	Bisulfite-Seq data for 1 monocyte - RPMI_LPS_T=1hr sample(s). 14 run(s), 1 experiment(s), 2 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	1
EGAD00001002503	ChIP-Seq data for 2 effector memory CD8-positive, alpha-beta T cell, terminally differentiated sample(s). 6 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	2
EGAD00001002504	ChIP-Seq data for 9 macrophage sample(s). 55 run(s), 55 experiment(s), 55 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	9
EGAD00001002505	Bisulfite-Seq data for 5 Mantle Cell Lymphoma sample(s). 65 run(s), 5 experiment(s), 10 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	5
EGAD00001002506	ChIP-Seq data for 3 Germinal Center B-Cell-Like Diffuse Large B-Cell Lymphoma sample(s). 10 run(s), 10 experiment(s), 10 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002507	RNA-Seq data for 6 macrophage sample(s). 7 run(s), 6 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	6
EGAD00001002508	Bisulfite-Seq data for 9 mature neutrophil sample(s). 116 run(s), 9 experiment(s), 18 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	9
EGAD00001002509	RNA-Seq data for 1 colony forming unit erythroid sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002510	ChIP-Seq data for 1 memory B cell sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002511	Bisulfite-Seq data for 3 germinal center B cell sample(s). 37 run(s), 4 experiment(s), 6 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	3
EGAD00001002512	ChIP-Seq data for 3 monocyte - RPMI_BG_T=24hrs_RPMI_T=5days_LPS_T=4hrs sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002513	RNA-Seq data for 1 Acute Promyelocytic Leukemia - MC2884 (4h) sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002514	ChIP-Seq data for 2 effector memory CD4-positive, alpha-beta T cell sample(s). 8 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	2
EGAD00001002515	ChIP-Seq data for 9 inflammatory macrophage sample(s). 58 run(s), 58 experiment(s), 58 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	9
EGAD00001002516	ChIP-Seq data for 1 Acute Promyelocytic Leukemia - MC2494 sample(s). 2 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	1
EGAD00001002517	ChIP-Seq data for 3 monocyte - RPMI_BG_T=24hrs_RPMI_T=5days sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	3
EGAD00001002518	RNA-Seq data for 7 Chronic Lymphocytic Leukemia sample(s). 7 run(s), 7 experiment(s), 7 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	7
EGAD00001002519	Bisulfite-Seq data for 2 mesenchymal stem cell of the bone marrow sample(s). 39 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	2
EGAD00001002520	Bisulfite-Seq data for 4 CD38-negative naive B cell sample(s). 51 run(s), 5 experiment(s), 8 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	4
EGAD00001002521	Bisulfite-Seq data for 5 Multiple Myeloma sample(s). 63 run(s), 7 experiment(s), 10 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	5
EGAD00001002522	RNA-Seq data for 4 macrophage - T=6days B-glucan sample(s). 4 run(s), 4 experiment(s), 4 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	4
EGAD00001002523	Bisulfite-Seq data for 6 CD14-positive, CD16-negative classical monocyte sample(s). 86 run(s), 6 experiment(s), 12 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_bisulphite_analysis_CNAG_20160816	Illumina HiSeq 2000	6
EGAD00001002524	ChIP-Seq data for 9 CD38-negative naive B cell sample(s). 48 run(s), 44 experiment(s), 44 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	9
EGAD00001002525	RNA-Seq data for 1 CD3-positive, CD4-positive, CD8-positive, double positive thymocyte sample(s). 1 run(s), 1 experiment(s), 1 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	1
EGAD00001002526	RNA-Seq data for 3 immature conventional dendritic cell - GM-CSF_IL4_T=6_days sample(s). 3 run(s), 3 experiment(s), 3 analysis(s) on human genome GRCh38. Part of BLUEPRINT release August 2016. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20160816/homo_sapiens/README_rnaseq_analysis_crg_20160816	Illumina HiSeq 2000	3
EGAD00001002527	DEEP (German Epigenome Project) sequence data of the following samples (Sequencing Types: ChIP-Seq, WGBS-Seq, RNA-Seq, sncRNA-Seq, NOMe-Seq, DNase-Seq): 41_Hf01_LiHe_Ct, 41_Hf02_LiHe_Ct, 41_Hf03_LiHe_Ct, 01_HepG2_LiHG_Ct1, 01_HepG2_LiHG_Ct2, 01_HepaRG_LiHR_D31, 01_HepaRG_LiHR_D32, 01_HepaRG_LiHR_D33, 43_Hm01_BlMo_Ct, 43_Hm03_BlMo_Ct, 43_Hm05_BlMo_Ct, 43_Hm03_BlMa_Ct, 43_Hm05_BlMa_Ct, 43_Hm03_BlMa_TO, 43_Hm05_BlMa_TO, 43_Hm03_BlMa_TE, 43_Hm05_BlMa_TE, 51_Hf01_BlCM_Ct, 51_Hf03_BlCM_Ct, 51_Hf04_BlCM_Ct, 51_Hf02_BlCM_Ct, 51_Hf05_BlCM_Ct, 51_Hf06_BlCM_Ct, 51_Hf06_BlCM_T1, 51_Hf06_BlCM_T2, 51_Hf03_BlEM_Ct, 51_Hf04_BlEM_Ct, 51_Hf02_BlEM_Ct, 51_Hf05_BlEM_Ct, 51_Hf06_BlEM_Ct, 51_Hf06_BlEM_T1, 51_Hf06_BlEM_T2, 51_Hf03_BlTN_Ct, 51_Hf04_BlTN_Ct, 51_Hf02_BlTN_Ct, 51_Hf05_BlTN_Ct, 51_Hf06_BlTN_Ct, 51_Hf06_BlTN_T1, 51_Hf06_BlTN_T2, 51_Hf07_BmTM4_Ct, 51_Hf08_BlTM4_Ct, 51_Hf08_BmTM4_SP1, 51_Hf08_BmTM4_SP2, 51_Hf05_BlTA_Ct, 44_Mm01_WEAd_C2, 44_Mm03_WEAd_C2, 44_Mm02_WEAd_C2, 44_Mm07_WEAd_C2, 44_Mm04_WEAd_C1, 44_Mm05_WEAd_C1	Illumina HiSeq 2000 Illumina HiSeq 2500	46
EGAD00001002528	WGS from EGAS00001001857	Illumina HiSeq 2000	18
EGAD00001002530	Additional files for "The Genomic Landscape of Core-Binding Factor Acute Myeloid Leukemias" (EGAS00001000349). This dataset includes the processed RNASeq data referenced in this paper.	Illumina HiSeq 2000	36
EGAD00001002531	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002532	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002533	Genome and transcriptome sequence data from an ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002534	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002535	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002536	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002537	Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002538	Genome and transcriptome sequence data from a uterine sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002539	Genome and transcriptome sequence data from an oligodendroglioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002540	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002541	Genome and transcriptome sequence data from a liposarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002542	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002543	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002544	Genome and transcriptome sequence data from an ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002545	Genome and transcriptome sequence data from a duodenal malignancy patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002546	Genome and transcriptome sequence data from a melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002547	Exome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002548	Genome and transcriptome sequence data from a lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002549	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002550	Genome and transcriptome sequence data from a primary unknown cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002551	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002552	Genome and transcriptome sequence data from a Ewing sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002553	Genome and transcriptome sequence data from an unknown cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002554	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002555	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002556	Genome and transcriptome sequence data from an ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002557	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002558	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002559	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002560	Genome and transcriptome sequence data from a cervical cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002561	Genome and transcriptome sequence data from a metastatic cervical cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002562	Genome and transcriptome sequence data from an osteogenic sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002563	Genome and transcriptome sequence data from a follicular lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002564	Genome and transcriptome sequence data from an adenocarcinoma of lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002565	Genome and transcriptome sequence data from an ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002566	Genome and transcriptome sequence data from a uveal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002567	Genome and transcriptome sequence data from a rectosigmoid adenocarcinoma (colorectal cancer) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002568	Genome and transcriptome sequence data from a metastatic endometrial cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002569	Genome and transcriptome sequence data from a primary unknown cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002570	Genome and transcriptome sequence data from a leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002571	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002572	Genome and transcriptome sequence data from an infiltrating ductal carcinoma of right breast patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002573	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002574	Genome and transcriptome sequence data from a ductal carcinoma of left breast patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002575	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002576	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002577	Genome and transcriptome sequence data from an adenocarcinoma of primary unknown cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002578	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002579	Genome and transcriptome sequence data from a carcinoma of left lower outer quadrant patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002580	Genome and transcriptome sequence data from a right breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002581	Genome and transcriptome sequence data from a metastatic myxofibrosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002582	Genome and transcriptome sequence data from a squamous cell carcinoma of anal canal patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002583	Genome and transcriptome sequence data from a retroperitoneal leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002584	Genome and transcriptome sequence data from a vulvar metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002585	Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002586	Genome and transcriptome sequence data from a squamous cell carcinoma of vulva patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002587	Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002588	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002589	Genome and transcriptome sequence data from a metastatic neuroendocrine carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002590	Genome and transcriptome sequence data from an adenocarcinoma of vulva patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002591	Genome and transcriptome sequence data from a neuroendocrine tumor likely pancreatic origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study	MinION	1
EGAD00001002592	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002593	Genome and transcriptome sequence data from a leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002594	Genome and transcriptome sequence data from a peritoneal mesothelioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002595	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the GE junction patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002596	Genome and transcriptome sequence data from a porocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002597	Genome and transcriptome sequence data from a pancreatic ductal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002598	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002599	Genome and transcriptome sequence data from a medullary thyroid carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002600	Genome and transcriptome sequence data from an adnexal tumor probable of Wolffian origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002601	Genome and transcriptome sequence data from an invasive ductal carcinoma of left breast patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002602	Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002603	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002604	Genome and transcriptome sequence data from a clear cell carcinoma of ovary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002605	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002606	Genome and transcriptome sequence data from an adenocarcinoma of unknown primary cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002607	Genome and transcriptome sequence data from a pancreatic cancer (likely PNET) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002608	Genome and transcriptome sequence data from a pleomorphic spindle cell sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002609	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002610	Genome and transcriptome sequence data from an invasive carcinoma of left breast patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002611	Genome and transcriptome sequence data from an adenocarcinoma of unknown primary cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002612	Genome and transcriptome sequence data from an esophageal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002613	Genome and transcriptome sequence data from a breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002614	Genome and transcriptome sequence data from a thymic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002615	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002616	Genome and transcriptome sequence data from a superficial pleomorphic liposarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002617	Genome and transcriptome sequence data from a small cell/neuroendocrine carcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002618	Genome and transcriptome sequence data from a metastatic rectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002619	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002620	Genome and transcriptome sequence data from a myxoid liposarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002621	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002622	Genome and transcriptome sequence data from a metastatic colorectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002623	Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002624	Genome and transcriptome sequence data from a squamous cell carcinoma of unknown primary cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002625	Genome and transcriptome sequence data from a Ewing sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002626	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002627	Genome and transcriptome sequence data from an adenoid cystic carcinoma of the trachea patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002628	Genome and transcriptome sequence data from a squamous cell carcinoma of anus patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002629	Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002630	Genome and transcriptome sequence data from a metastatic gastric adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002631	Genome and transcriptome sequence data from a serous endometrial cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002632	Genome and transcriptome sequence data from a testicular cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002633	Genome and transcriptome sequence data from an endometrial carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002634	Genome and transcriptome sequence data from a metastatic rectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002635	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002636	Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002637	Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002638	Genome and transcriptome sequence data from a metastatic prostate cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002639	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002640	Genome and transcriptome sequence data from a clival chordoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002641	Genome and transcriptome sequence data from a metastatic small cell carcinoma of unknown primary cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002642	Genome and transcriptome sequence data from a metastatic cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002643	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002644	Genome and transcriptome sequence data from a multifocal hepatocellular carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002645	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002646	Genome and transcriptome sequence data from an epithelioid mesothelioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002647	Genome and transcriptome sequence data from a metastatic pancreatic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002648	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002649	Variants called from RNA-seq data of meningioma tumors.		25
EGAD00001002650	Somatic variants called from whole-exome sequencing of meningioma-blood pairs		87
EGAD00001002651	Presurgical studies allow study of the relationship between mutations and response of estrogen receptor positive (ER+) breast cancer to aromatase inhibitors (AIs) but have been limited to small biopsies. Here in Phase I of this study, we perform exome sequencing on baseline, surgical core-cuts and blood from 60 patients (40 AI treated, 20 Controls). In poor responders (based on Ki67 change) we find significantly more somatic mutations than good responders. Subclones exclusive to baseline or surgical cores occur in approximately 30% of tumours. In Phase II we combine targeted sequencing on another 28 treated patients with Phase I. We find six genes frequently mutated: PIK3CA, TP53, CDH1, MLL3, ABCA13 and FLG with 71% concordance between paired cores. TP53 mutations are associated with poor response. We conclude that multiple biopsies are essential for confident mutational profiling of ER+ breast cancer and TP53 mutations are associated with resistance to oestrogen deprivation therapy.	Illumina HiSeq 2000	443
EGAD00001002652	50 ng of genomic double stranded DNA was enzymatically sheared to an average size of 200 bp. Further processing was performed using Illumina Nextera Rapid Capture Custom Kit (Illumina) and 100 bp paired-end sequencing was performed with 24 samples per lane on an Illumina HiSeq 2000 (Illumina) to reach a coverage of 100-1000x.		284
EGAD00001002653	Genomic DNA from leukemic and remission bone marrow mononuclear cells was isolated with the QIAamp DNA Blood Extraction Kit (Qiagen, Venlo, The Netherlands). Libraries were prepared with the Illumina TruSeq DNA Sample Prep and TruSeq Exome Enrichment Kits (Illumina, San Diego, CA, USA) according to the manufacturer's recommendations. 100 bp paired-end sequencing was performed on a HiSeq 2000 (Illumina) to about 80x coverage.		57
EGAD00001002654	This dataset contains RNA-seq, ATAC-seq, and ChIP-seq samples from the SJERG cohort. We applied ChIP-Seq for Dux4 on two B-cell ALL cell lines (REH, Nalm6) along with INPUT, and ATAC-Seq on two B-cell ALL cell lines (REH, Nalm6) and a xenograft of a B-cell ALL patient (ERG000016).	Illumina HiSeq 2000	13
EGAD00001002655	BLUEPRINT ChIP-Seq from two mantle cell lymphoma patients	Illumina HiSeq 2000	2
EGAD00001002656	Whole exome sequencing BAM files and whole genome sequencing CRAM files for 722 individuals from the NIHR-BioResource Rare Diseases Consortium (SPEED project) with inherited retinal disease.	Illumina HiSeq 2000	707
EGAD00001002657	Reverse Capture Hi-C	Illumina HiSeq 2000	8
EGAD00001002658	Highly purified mesenchymal cells (CD45-/7AAD-/CD235a-/CD31-/CD271+/CD105+) were prospectively FACS-isolated from bone marrow specimens of 45 low-risk myelodysplastic syndrome (LRMDS) cases. Gene expression profiles (GEPs) of the 45 LRMDS have been compared to GEPs derived from likewise highly purified mesenchymal cells obtained from bone marrow specimens of healthy donors for the identification of inflammatory signatures. Additionally, an overlap in inflammatory signatures has been determined by comparing the GEPs of these 45 LRMDS cases to the GEPs of 4 Shwachman-Diamond syndrome and 3 Diamond-Blackfan anemia cases, both representing different subclasses of congenital pre-leukemia syndromes with a tendency of leukemic progression and perturbed niche compartment. Finally, the GEPs and gene expression signatures have been utilized for prognostication and the prediction of leukemic progression.	Illumina HiSeq 2500	45
EGAD00001002659	Highly purified mesenchymal cells (CD45-/7AAD-/CD235a-/CD31-/CD271+/CD105+) were prospectively FACS-isolated from bone marrow specimens of 10 healthy donors (HDs). This data set is used as a baseline control to observe the differences between gene expression profiles (GEPs) of pre-leukemia cases (45 low-risk myelodysplastic syndrome, 4 Shwachman-Diamond and 3 Diamond-Blackfan anemia patients) and gene expression patterns observed in a normal, healthy context. Through differential expression and gene set enrichment analysis we determined that inflammatory signaling pathways are significantly more active in mesenchymal cells of pre-leukemia cases compared to their healthy counterparts. Finally, we determined through statistical modelling of healthy donors' GEPs which pre-leukemia cases have significantly more active inflammatory signaling and demonstrated a strong relation to survival statistics.	Illumina HiSeq 2500	10
EGAD00001002660	Highly purified mesenchymal cells (CD45-/7AAD-/CD235a-/CD31-/CD271+/CD105+) were prospectively FACS-isolated from bone marrow specimens of 4 Shwachman-Diamond syndrome (SDS) cases. This data set, comprising 4 SDS cases, is used as complement to 45 low-risk myelodysplastic syndrome (LRMDS) and 3 Diamond-Blackfan anemia (DBA) cases to demonstrate aberrant inflammatory signaling as a common mechanism in pre-leukemia syndromes to induce genotoxic stress in hematopoietic stem cells. In addition, this data set is used to determine different overlapping gene expression signatures in pre-leukemia syndromes compared to gene expression profiles of highly purified mesenchymal cells of healthy donors.	Illumina HiSeq 2500	4
EGAD00001002661	Highly purified mesenchymal cells (CD45-/7AAD-/CD235a-/CD31-/CD271+/CD105+) were prospectively FACS-isolated from bone marrow specimens of 3 Diamond-Blackfan anemia (DBA) cases. This data set, comprising 3 DBA cases, is used as complement to 45 low-risk myelodysplastic syndrome (LRMDS) and 4 Shwachman-Diamond syndrome (SDS) cases to demonstrate aberrant inflammatory signaling as a common mechanism in pre-leukemia syndromes to induce genotoxic stress in hematopoietic stem cells. In addition, this data set is used to determine different overlapping gene expression signatures in pre-leukemia syndromes compared to gene expression profiles of highly purified mesenchymal cells of healthy donors.	Illumina HiSeq 2500	3
EGAD00001002662	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: LINC-JP.		62
EGAD00001002663	BLUEPRINT: A human variation panel of genetic influences on epigenomes and transcriptomes in three immune cells (WGS)	Illumina HiSeq 2000	197
EGAD00001002664	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: CMDI-UK.		98
EGAD00001002665	Mapped sequence reads in BAM format for 64 individuals reporting Kanak ancestry recruited in New Caledonia sequenced at four times target coverage using the Illumina HiSeq 4000 platform.		64
EGAD00001002666	Genomic DNA from leukemic and remission bone marrow mononuclear cells was isolated with the QIAamp DNA Blood Extraction Kit (Qiagen, Venlo, The Netherlands). Libraries were prepared using Nextera Rapid Capture Exome Kit (Illumina, San Diego, USA). Paired-end sequencing of 100 bp reads was performed on a HiSeq 2000 (Illumina) to obtain at least 50x coverage.		105
EGAD00001002667	Additional files for "The Genomic Landscape of Core-Binding Factor Acute Myeloid Leukemias" (EGAS00001000349). This dataset includes the processed Excap data referenced in this paper.	Illumina HiSeq 2000	327
EGAD00001002668	Metagenomic shotgun sequencing of Irritable bowel syndrome patients and matched controls	Illumina HiSeq 2000	336
EGAD00001002669	Part of WGS data for Prostate (ICGC)	HiSeq X Ten Illumina HiSeq 2000	38
EGAD00001002670	ChIP-Seq data for 182 mature neutrophil sample(s). 2847 run(s), 366 experiment(s), 355 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000 Illumina HiSeq 2500	186
EGAD00001002671	RNA-Seq data for 212 CD4-positive, alpha-beta T cell sample(s). 212 run(s), 212 experiment(s), 212 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_rnaseq_analysis_sanger_20160816	Illumina HiSeq 2000 Illumina HiSeq 2500	212
EGAD00001002672	ChIP-Seq data for 172 CD14-positive, CD16-negative classical monocyte sample(s). 572 run(s), 345 experiment(s), 340 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000 Illumina HiSeq 2500	174
EGAD00001002673	ChIP-Seq data for 154 CD4-positive, alpha-beta T cell sample(s). 355 run(s), 265 experiment(s), 250 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000 Illumina HiSeq 2500	158
EGAD00001002674	RNA-Seq data for 197 CD14-positive, CD16-negative classical monocyte sample(s). 197 run(s), 197 experiment(s), 197 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_rnaseq_analysis_sanger_20160816	Illumina HiSeq 2000 Illumina HiSeq 2500	197
EGAD00001002675	RNA-Seq data for 205 mature neutrophil sample(s). 205 run(s), 205 experiment(s), 205 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_rnaseq_analysis_sanger_20160816	Illumina HiSeq 2000 Illumina HiSeq 2500	205
EGAD00001002676	DATA FILES FOR PCGP SJERG (WGS)	Illumina HiSeq 2000	44
EGAD00001002677	DATA FILES FOR PCGP SJERG (WXS)	Illumina HiSeq 2000	42
EGAD00001002678	The data set consists of low-pass whole genome sequence data of single CTCs, pools of CTCs and germline controls for a cohort of 31 SCLC patients at both baseline, and for 5 patients at relapse. In addition 9 CDX models and associated germline controls (where available) are included.	Illumina HiSeq 2500 Illumina MiSeq NextSeq 500	319
EGAD00001002679	This dataset contains WES files for the SJACT cohort associated with the paper "Genetic landscape of pediatric Adrenocortical Tumor". In this paper, we analyse 37 adrenocortical tumours (ACTs) by whole-genome, whole-exome and/or transcriptome sequencing.	Illumina HiSeq 2000	38
EGAD00001002680	This dataset contains RNA-Seq files for the SJACT cohort associated with the paper "Genetic landscape of pediatric Adrenocortical Tumor". In this paper, we analyse 37 adrenocortical tumours (ACTs) by whole-genome, whole-exome and/or transcriptome sequencing.	Illumina HiSeq 2000	26
EGAD00001002681	RNA-seq, ChIP-seq, and ATAC-seq files for PCGP SJERG paper titled "Deregulation of DUX4 and ERG in acute lymphoblastic leukemia"	Illumina HiSeq 2000	53
EGAD00001002682	BLUEPRINT DNA methylation profiles of monocytes, T cells and B cells in type 1 diabetes-discordant monozygotic twins (Bisulfite-Seq data).		8
EGAD00001002684	Whole genome sequencing of 98 tumour-normal pairs for the PAEN-AU pancreatic neuroendocrine cancer project.		196
EGAD00001002685	Breast cancer PDTX sequencing data from Bruna et al, Cell 2016 - Exome Sequencing - Shallow Whole Genome Sequencing - RRBS Methylation Sequencing	Illumina HiSeq 2500	393
EGAD00001002686	CD4 T-Cell ChIP-Seq	Illumina HiSeq 2000	42
EGAD00001002687	CD4 T-Cell RNA-Seq	Illumina HiSeq 2000	4
EGAD00001002689	ICGC Oesophageal Adenocarcinoma tissue samples	Illumina HiSeq 2000	1
EGAD00001002690	Exome sequencing for 10 patients: 10 tumors, 10 cell lines and 7 blood samples (for 3 patients blood was not available)	Illumina HiSeq 2000	27
EGAD00001002691	RNAseq data for 10 patients: 10 tumors and 10 cell lines	NextSeq 500	20
EGAD00001002692	DATA FILES FOR MULLIGHAN MEF2D RNASEQ STRANDED	Illumina HiSeq 2000	200
EGAD00001002693	Innate immune memory is the phenomenon whereby innate immune cells such as monocytes or macrophages undergo functional reprogramming after exposure to microbial components such as LPS. We apply an integrated epigenomic approach to characterize the molecular events involved in LPS-induced tolerance in a time dependent manner. ChIP-seq, RNA-seq, WGBS and ATAC-seq data were generated. This analysis identified epigenetic programs in tolerance and trained macrophages, and the potential transcription factors involved. Experimental set-up: time-course in vitro culture of human monocytes. Two innate immune memory states can be induced in culture through an initial exposure of primary human monocytes to either LPS or BG for 24 hours, followed by removal of stimulus and differentiation to macrophages for an additional 5 days. Cells were collected at baseline (day 0), 1 hour, 4 hours, 24 hours and 6 days.	Illumina HiSeq 2000 NextSeq 500 unspecified	71
EGAD00001002695	48 samples from the TRACK-HD cohort. All samples carry the Huntington’s disease expansion. The subjects were selected on the basis of rate of disease progression.	Illumina HiSeq 2000	48
EGAD00001002696	Recurrent breast cancer is almost universally fatal. We characterize the locally relapsed or distant metastatic cancers of 170 patients using massively parallel sequencing. We identify that the relapse-seeding clone disseminates late from the primary tumor. TP53 and AKT1 appear to be enriched in ER-positive cancers predisposed to relapse. Mutation acquisition continues at relapse as the same mutation signatures continue to operate and new signatures, such as that caused by radiotherapy, appear de novo. In 49% of cases we identify driver mutations private to the relapse and these are sampled from a wider range of cancer genes, including SWI-SNF complex and JAK-STAT signaling.	HiSeq X Ten Illumina HiSeq 2000	58
EGAD00001002697	Recurrent breast cancer is almost universally fatal. We characterize the locally relapsed or distant metastatic cancers of 170 patients using massively parallel sequencing. We identify that the relapse-seeding clone disseminates late from the primary tumor. TP53 and AKT1 appear to be enriched in ER-positive cancers predisposed to relapse. Mutation acquisition continues at relapse as the same mutation signatures continue to operate and new signatures, such as that caused by radiotherapy, appear de novo. In 49% of cases we identify driver mutations private to the relapse and these are sampled from a wider range of cancer genes, including SWI-SNF complex and JAK-STAT signaling.	Illumina HiSeq 2000	9
EGAD00001002698	Recurrent breast cancer is almost universally fatal. We characterize the locally relapsed or distant metastatic cancers of 170 patients using massively parallel sequencing. We identify that the relapse-seeding clone disseminates late from the primary tumor. TP53 and AKT1 appear to be enriched in ER-positive cancers predisposed to relapse. Mutation acquisition continues at relapse as the same mutation signatures continue to operate and new signatures, such as that caused by radiotherapy, appear de novo. In 49% of cases we identify driver mutations private to the relapse and these are sampled from a wider range of cancer genes, including SWI-SNF complex and JAK-STAT signaling.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina MiSeq	387
EGAD00001002699	This data set includes RNAseq data from 136 samples from the TRACK-HD cohort including premanifest, manifest and control subjects. Data can only be used for Huntington's disease related research.	Illumina HiSeq 2500	136
EGAD00001002704	DATA FILES FOR MULLIGHAN MEF2D RNASEQ UNSTRANDED	Illumina HiSeq 2000	217
EGAD00001002705	McGill EMC Release 6 data	unspecified	59
EGAD00001002707	Whole exome sequencing of a normal sample, primary tumor sample, and relapse tumor sample of a transformed non-Hodgkins follicular lymphoma patient with extraordinary response to treatment.	Illumina HiSeq 2000	3
EGAD00001002708	ATAC-seq data for 7 sample(s) from tonsil, on Genome GRCh38. 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	7
EGAD00001002709	ATAC-seq data for 136 sample(s) from venous blood, on Genome GRCh38. 141 run(s), 139 experiment(s), 139 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	136
EGAD00001002710	ATAC-seq data for 4 sample(s) from bone marrow, on Genome GRCh38. 4 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	4
EGAD00001002711	ChIP-Seq_H3K4me3 data for 133 mature neutrophil sample(s). 208 run(s), 136 experiment(s), 136 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	133
EGAD00001002712	ChIP-Seq_H3K27me3 data for 131 mature neutrophil sample(s). 321 run(s), 134 experiment(s), 134 analysis(s) on human genome GRCh37. Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/blueprint_Epivar/protocols/README_chipseq_analysis_ebi_20160816	Illumina HiSeq 2000	131
EGAD00001002713	DNase accessibility data for BLUEPRINT consortium immune cells included in eFORGE software tool	Illumina HiSeq 2000	25
EGAD00001002714	We recruited 100 healthy, male donors of self-reported European descent (EUB) and 100 of self-reported African descent (AFB) (Ghent, Belgium). For each participant, peripheral blood mononuclear cells (PBMCs) were isolated from whole blood on Ficoll-Paque density gradients. Monocytes were then positively selected with magnetic CD14 microbeads and exposed for 6 hours to different ligands activating TLR4 (LPS), TLR1/2 (Pam3CSK4), TLR7/8 (R848) and to a human seasonal influenza A virus (IAV). High-quality RNA was obtained from unstimulated and stimulated monocytes for 970 of the 1000 samples (200 x 5 conditions), and was sequenced on an Illumina HiSeq2000. On average, 34 million 101-bp single-end reads were obtained per sample.	Illumina HiSeq 2000	970
EGAD00001002715	Exome sequencing of isolate populations and Generation Scotland	Illumina HiSeq 2000	1027
EGAD00001002716	In this study we characterized genomic alterations in two to five metachronous bladder tumors from 29 patients initially diagnosed with early stage disease. Fourteen patients (32 tumors) had non-progressive disease (NPD) and 15 patients (34 tumors) had progressive disease (PD). Whole exome sequencing (WES, ~50x mean read depth) and whole transcriptome RNA-seq were performed (RNA was not available for 4 tumors). Data provided here consist of 122 BAM files for WES (83 tumors and 39 blood).	Illumina HiSeq 2000	122
EGAD00001002717	In this study we characterized genomic alterations in two to five metachronous bladder tumors from 29 patients initially diagnosed with early stage disease. Fourteen patients (32 tumors) had non-progressive disease (NPD) and 15 patients (34 tumors) had progressive disease (PD). Whole exome sequencing (WES, ~50x mean read depth) and whole transcriptome RNA-seq were performed (RNA was not available for 4 tumors). Data provided here consist of 71 unmapped BAM files from whole transcriptome RNA-seq.	Illumina HiSeq 2000	71
EGAD00001002718	In this study we characterized genomic alterations in two to five metachronous bladder tumors from 29 patients initially diagnosed with early stage disease. Fourteen patients (32 tumors) had non-progressive disease (NPD) and 15 patients (34 tumors) had progressive disease (PD). Whole exome sequencing (WES, ~50x mean read depth) and whole transcriptome RNA-seq were performed (RNA was not available for 4 tumors). Data provided here consist of 71 mapped BAM files from whole transcriptome RNA-seq.	Illumina HiSeq 2000	71
EGAD00001002719	This dataset contains whole-genome sequencing data files from colon organoid cultures, which were mutated using CRISPR-Cas9 for specific genes (APC, KRAS, TP53 and SMAD4) to generate in vitro transformed cancer cells. After introducing each mutation, the resulting cultures were subjected to whole-genome sequencing. In addition, some cultures were xenotransplanted in recipient mice. The resulting primary tumors and corresponding metastases were subjected to whole-genome sequencing.	HiSeq X Ten	30
EGAD00001002721	Whole genome sequencing of 300 individuals from 142 diverse populations	Illumina HiSeq 2000	21
EGAD00001002722	Exome sequencing for 26 patients with matched blood RNA-seq for 41 patients	Illumina HiSeq 2500	93
EGAD00001002724	September 2016 data update (bam/fastq/vcf) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2500	24
EGAD00001002725	Autism spectrum disorder (ASD) is a collection of neuro-developmental disorders characterized by deficits in social interaction and social communication, along with restricted and repetitive behaviour patterns. We globally interrogated the histone acetylomes of enhancers in a large cohort of ASD and control samples by analyzing tissue from three brain regions postmortem: prefrontal cortex (PFC), temporal cortex (TC) and cerebellum (CB). H3K27ac was selected as the representative acetylation mark and 288 ChIP-seq experiments were performed on these postmortem samples.	Illumina HiSeq 2000	291
EGAD00001002726	Cluster headache is a relatively rare headache disorder, typically characterized by multiple daily, short-lasting attacks of excruciating, unilateral (peri-)orbital or temporal pain associated with autonomic symptoms and restlessness. To better understand the pathophysiology of cluster headache, we used RNA sequencing to identify differentially expressed genes and pathways in whole blood of patients with episodic (n = 19) or chronic (n = 20) cluster headache in comparison with headache-free controls (n = 20).	Illumina HiSeq 4000	60
EGAD00001002727	1,591 single cells from 11 colorectal cancer patients were profiled using a Fluidigm-based single-cell RNA-seq protocol to characterize the cellular heterogeneity of colorectal cancer. 630 single cells from 7 cell lines were profiled similarly to benchmark de novo cell type identification algorithms.	Illumina HiSeq 3000	2221
EGAD00001002728	This dataset contains exome sequencing of bone marrow samples taken at multiple timepoints of disease progression from 13 AML patients. These samples were taken either before/after treatment, at diagnosis or at relapse.	Illumina Genome Analyzer IIx Illumina HiSeq 1500 Illumina HiSeq 2500	32
EGAD00001002729	Haplotype Reference Consortium Release 1.1 - subset for release via the EGA		11227
EGAD00001002730	SPEED - childhood dystonia KMT2B dataset	Illumina HiSeq 2000	5
EGAD00001002731	Whole exome sequencing of tumor- as well as PBMC-derived DNA of five melanoma patients for identification of naturally presented patient-specific neoepitopes	Illumina HiSeq 2000	10
EGAD00001002732	DNA methylation was analyzed for stem/progenitor cell types and terminally differentiated cell types of the human blood lineage (HSC, MPP, CMP, MEP, GMP, CLP, MLP0, MLP1, MLP2, MLP3, MK, CD4+ Tcell, CD8+ Tcell, Bcell, NK, Neut, Mono).	Illumina HiSeq 4000	63
EGAD00001002733	Gene expression was analyzed for stem/progenitor cell types and terminally differentiated cell types of the human blood lineage (HSC, MPP, CMP, GMP, CLP, MLP0, MLP1, MLP2, MLP3).	Illumina HiSeq 4000	13
EGAD00001002734	Whole Genome Sequencing data set for the study "Premalignant SOX2 in ovarian cancer patients"	Complete Genomics	39
EGAD00001002735	mRNA, total RNA, small noncoding RNA, NOMe-Seq and DNase-Seq data from following samples (not every Sequencing Type for every sample): 01_HepG2_LiHG_Ct1 41_Hf01_LiHe_Ct 41_Hf02_LiHe_Ct 41_Hf03_LiHe_Ct 51_Hf03_BlCM_Ct 51_Hf04_BlCM_Ct 51_Hf03_BlEM_Ct 51_Hf04_BlEM_Ct 51_Hf03_BlTN_Ct 51_Hf04_BlTN_Ct Metadata available at deep.dkfz.de	Illumina HiSeq 2000 Illumina HiSeq 2500	10
EGAD00001002736	WES of human: A mutation in VPS15 (PIK3R4) causes a ciliopathy and affects IFT20 release from the cis-Golgi. WES (Agilent SureSelect All Exon XT2 50 Mb kit) was performed on three affected siblings (II.1, II.3, II.5) and one healthy sister (II.4). Raw data (BAM files) are provided: - II.1.aligned.sorted.dedup.realign.recal.bam - II.3.aligned.sorted.dedup.realign.recal.bam - II.5.aligned.sorted.dedup.realign.recal.bam - II.4.aligned.sorted.dedup.realign.recal.bam	Illumina HiSeq 2500	4
EGAD00001002738	Background: In follicular lymphoma (FL), studies addressing the prognostic value of microenvironment-related immunohistochemical (IHC) markers and tumor cell-related genetic markers have yielded conflicting results, precluding implementation in practice. Therefore, the Lunenburg Lymphoma Biomarker Consortium (LLBC) performed a validation study for published markers. Methods: To maximize sensitivity, an end-of-spectrum design was applied for 122 uniformly immunochemotherapy-treated FL patients retrieved from international trials and registries; early failure (EF): progression or lymphoma-related death <2 years versus long remission: response duration of >5 years. IHC staining for T-cells and macrophages was performed on tissue microarrays from initial biopsy and scored with a validated computer-assisted protocol. Shallow whole-genome and deep targeted sequencing was performed on the same samples. Results: 96/122 cases with complete molecular and immunohistochemical data were included in the analysis. EZH2 wild-type (p=0.006), gain of chromosome 18 (p=0.002), low percentages of CD8+ cells (p=0.011) and CD163+ areas (p=0.038) were associated with EF. No significant differences in other markers were observed, thereby refuting previous claims on their prognostic significance. Conclusion: Using an optimized study design, this LLBC study validates wild-type EZH2 status, gain of chromosome 18, low percentages of CD8+ cells and CD163+ area as predictors of EF to immunochemotherapy in FL.	Illumina HiSeq 2000	96
EGAD00001002739	Aligned sequence data from 14 Prostate cancer samples with BRCA2 mutations		49
EGAD00001002740	We propose to definitively characterise the somatic genetics of breast cancer through generation of comprehensive catalogues of somatic mutations in breast cancer cases by high coverage genome sequencing.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina MiSeq	164
EGAD00001002741	Additional Xenograft files for PCGP SJERG	Illumina HiSeq 2000	11
EGAD00001002742	Whole-genome sequencing data from Chad and Lebanon.	HiSeq X Ten Illumina HiSeq 2500	15
EGAD00001002743	These samples comprise both melanoma cases and controls sequenced for a selection of loci linked to disease susceptibility. These bams are a subset of the sequencing restricted specifically to the GRCh37 coding areas of the BAP1 gene.		3186
EGAD00001002744	RNA sequencing data of human small intestinal macrophage subtypes	NextSeq 500	15
EGAD00001002745		Illumina HiSeq 2000 Illumina HiSeq 2500	7
EGAD00001002746		Illumina HiSeq 2000	13
EGAD00001002747	Whole-exome sequencing (WES) of 216 breast cancer metastasis-normal pairs from patients who underwent a biopsy in the context of the SAFIR01, SAFIR02, SHIVA or MOSCATO prospective trials (France).	Illumina HiSeq 2500 Illumina HiSeq 4000 NextSeq 500	432
EGAD00001002748	DDD DATAFREEZE 2014-11-04: 4293 trios - exome sequence CRAM files		1
EGAD00001002749	A KNIH001 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for islet cells	Illumina HiSeq 2000	1
EGAD00001002750	A KNIH002 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for islet cells	Illumina HiSeq 2000	1
EGAD00001002751	A KNIH003 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for islet cells	Illumina HiSeq 2000	1
EGAD00001002752	A KNIH004 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for islet cells	Illumina HiSeq 2000	1
EGAD00001002753	A KNIH005 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for islet cells	Illumina HiSeq 2000	1
EGAD00001002754	A KNIH006 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for beta cells	Illumina HiSeq 2000	1
EGAD00001002755	A KNIH007 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for adipocytes	Illumina HiSeq 2000	1
EGAD00001002756	A KNIH008 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for adipocytes	Illumina HiSeq 2000	1
EGAD00001002757	A KNIH009 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for preadipocytes	Illumina HiSeq 2000	1
EGAD00001002758	A KNIH010 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for podocytes	Illumina HiSeq 2000	1
EGAD00001002759	A KNIH011 Whole-Genome Bisulfite Sequencing (WGBS) paired end data for podocytes	Illumina HiSeq 2000	1
EGAD00001002760	A KNIH001 miRNA-seq single end data for islet cells	Illumina HiSeq 2500	1
EGAD00001002761	A KNIH002 miRNA-seq single end data for islet cells	Illumina HiSeq 2500	1
EGAD00001002762	A KNIH003 miRNA-seq single end data for islet cells	Illumina HiSeq 2500	1
EGAD00001002763	A KNIH004 miRNA-seq single end data for islet cells	Illumina HiSeq 2500	1
EGAD00001002764	A KNIH005 miRNA-seq single end data for islet cells	Illumina HiSeq 2500	1
EGAD00001002765	A KNIH006 miRNA-seq single end data for beta cells	Illumina HiSeq 2500	1
EGAD00001002766	A KNIH007 miRNA-seq single end data for adipocytes	Illumina HiSeq 2500	1
EGAD00001002767	A KNIH008 miRNA-seq single end data for adipocytes	Illumina HiSeq 2500	1
EGAD00001002768	A KNIH009 miRNA-seq single end data for preadipocytes	Illumina HiSeq 2500	1
EGAD00001002769	A KNIH010 miRNA-seq single end data for podocytes	Illumina HiSeq 2500	1
EGAD00001002770	A KNIH011 miRNA-seq single end data for podocytes	Illumina HiSeq 2500	1
EGAD00001002772	In this study we characterized genomic alterations in three bladder cancer patients with metastatic disease courses. Multiple regions were procured by laser microdissection or punctures from primary tumor, lymph node metastases and from distant metastases. Data provided here consist of 35 BAM files for WES (32 tumors, 2 blood and 1 adjacent normal).	NextSeq 500	35
EGAD00001002883	RNAseq on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer sample of a validation cohort of 60 PDX	Illumina HiSeq 2000	60
EGAD00001002884	RNAseq on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample at early/late passages	Illumina HiSeq 2000	8
EGAD00001002885	Raw sequence data, fastq format	Illumina HiSeq 2000	26
EGAD00001002886	Exome sequencing of North American Brain Expression Consortium (NABEC) subject.	Illumina HiSeq 2000	298
EGAD00001002890	Exome sequencing of 102 French-Canadians		102
EGAD00001002891	Genome and transcriptome sequence data from a leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002892	The data contains genome sequencing of clear cell renal cell carcinomas and normal kidney tissues. The samples were collected from patients from different European countries.	Illumina HiSeq 1000	21
EGAD00001002893	This dataset contains all RNA-seq runs for the BLN panel of cell lines and matched parental tumors. Tumor/cell line pairs have been authenticated using SNP profiles and all pairs were confirmed. Please note: The dataset also contains raw data from an early primary culture (BLN-1) where no stable cell line could be generated. Please also note different reference genomes.	Illumina HiSeq 2000	21
EGAD00001002896	Amplicon sequencing libraries from the study "Histological Transformation and Progression in Follicular Lymphoma: a Clonal Evolution Study". These are Illumina amplicon deep sequencing libraries (n = 118) to validate somatic predictions made in the whole genome sequencing libraries. Specifically, there are 72 tumor libraries and 46 normal libraries. Some patients may have multiple amplicon libraries sequenced.	Illumina HiSeq 2000	118
EGAD00001002897	Whole genome sequencing libraries from the study "Histological Transformation and Progression in Follicular Lymphoma: a Clonal Evolution Study". These are libraries from 41 patients. Specifically: 15 transformed follicular lymphoma (TFL), 6 early progressers (PFL), and 20 non-early progressers (NPFL). For TFL and PFL patients, trios consisting of diagnostic (T1), transformed/progressed (T2) and a matching normal are available (n = 63 libraries in total). For NPFL patients, a tumor-normal pair is available (n = 40 libraries).		103
EGAD00001002898	Oligocapture sequencing libraries from the study "Histological Transformation and Progression in Follicular Lymphoma: a Clonal Evolution Study". These are sequencing libraries from the extension cohort of 277 patients. Specifically, there are 402 tumor libraries and 82 normal libraries.		484
EGAD00001002899	ATAC-seq data for 1 sample(s) for monocyte RPMI_T=4hrs from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002900	ATAC-seq data for 1 sample(s) for monocyte RPMI_LPS_T=24hrs from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002901	ATAC-seq data for 2 sample(s) for monocyte RPMI_LPS_T=24hrs_RPMI_T=5days from venous blood, on Genome GRCh38. 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	2
EGAD00001002902	ATAC-seq data for 3 sample(s) for naive B cell from venous blood, on Genome GRCh38. 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	3
EGAD00001002903	ATAC-seq data for 3 sample(s) for naive B cell from tonsil, on Genome GRCh38. 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	3
EGAD00001002904	ATAC-seq data for 1 sample(s) for monocyte RPMI_BG_T=24hrs_RPMI_T=5days from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002905	ATAC-seq data for 3 sample(s) for unswitched memory B cell from venous blood, on Genome GRCh38. 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	3
EGAD00001002906	ATAC-seq data for 1 sample(s) for monocyte RPMI_BG_T=1hr from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002907	ATAC-seq data for 2 sample(s) for osteoclast from venous blood, on Genome GRCh38. 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	2
EGAD00001002908	ATAC-seq data for 2 sample(s) for class switched memory B cell from venous blood, on Genome GRCh38. 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	2
EGAD00001002909	ATAC-seq data for 1 sample(s) for monocyte RPMI_BG_T=24hrs from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002910	ATAC-seq data for 1 sample(s) for monocyte RPMI_BG_T=4hrs from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002911	ATAC-seq data for 1 sample(s) for germinal center B cell from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002912	ATAC-seq data for 2 sample(s) for plasma cell from tonsil, on Genome GRCh38. 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	2
EGAD00001002913	ATAC-seq data for 1 sample(s) for monocyte RPMI_LPS_T=1hr from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002914	ATAC-seq data for 1 sample(s) for monocyte RPMI_T=6days from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002915	ATAC-seq data for 1 sample(s) for monocyte RPMI_BG_T=24hrs_RPMI_T=5days_LPS_T=4hrs from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002916	ATAC-seq data for 106 sample(s) Chronic Lymphocytic Leukemia from venous blood, on Genome GRCh38. 111 run(s), 109 experiment(s), 109 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	106
EGAD00001002917	ATAC-seq data for 2 sample(s) for germinal center B cell from tonsil, on Genome GRCh38. 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	2
EGAD00001002918	ATAC-seq data for 5 sample(s) Mantle Cell Lymphoma from venous blood, on Genome GRCh38. 5 run(s), 5 experiment(s), 5 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	5
EGAD00001002919	ATAC-seq data for 1 sample(s) for monocyte RPMI_T=1hr from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002920	ATAC-seq data for 4 sample(s) Multiple Myeloma for plasma cell from bone marrow, on Genome GRCh38. 4 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	4
EGAD00001002921	ATAC-seq data for 1 sample(s) for monocyte RPMI_T=24hrs from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002922	ATAC-seq data for 1 sample(s) for monocyte RPMI_LPS_T=4hrs from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002923	ChIPmentation data for 2 sample(s) for memory B cell from venous blood, on Genome GRCh38. 6 run(s), 4 experiment(s), 4 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	2
EGAD00001002924	ChIPmentation data for 2 sample(s) for central memory CD8-positive, alpha-beta T cell from venous blood, on Genome GRCh38. 11 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	2
EGAD00001002925	ChIPmentation data for 1 sample(s) for immature conventional dendritic cell GM-CSF_IL4_T=6_days from venous blood, on Genome GRCh38. 2 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002926	ChIPmentation data for 1 sample(s) for effector memory CD4-positive, alpha-beta T cell from venous blood, on Genome GRCh38. 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002927	ChIPmentation data for 1 sample(s) for central memory CD4-positive, alpha-beta T cell from venous blood, on Genome GRCh38. 5 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002928	ChIPmentation data for 7 sample(s) Acute Lymphocytic Leukemia for precursor B cell from bone marrow, on Genome GRCh38. 13 run(s), 13 experiment(s), 13 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	7
EGAD00001002929	ChIPmentation data for 1 sample(s) for CD38-negative naive B cell from cord blood, on Genome GRCh38. 5 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002930	ChIPmentation data for 1 sample(s) Acute Lymphocytic Leukemia from bone marrow, on Genome GRCh38. 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002931	ChIPmentation data for 3 sample(s) Lymphoma_Follicular from lymph node, on Genome GRCh38. 7 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	3
EGAD00001002932	ChIPmentation data for 1 sample(s) for germinal center B cell from tonsil, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	Illumina HiSeq 2000	1
EGAD00001002933	ChIPmentation data for 1 sample(s) for class switched memory B cell from venous blood, on Genome GRCh38. 2 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002934	ChIPmentation data for 1 sample(s) for cytotoxic CD56-dim natural killer cell from venous blood, on Genome GRCh38. 2 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002935	ChIPmentation data for 2 sample(s) Acute Myeloid Leukemia for blast cell from bone marrow, on Genome GRCh38. 12 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	2
EGAD00001002936	ChIPmentation data for 5 sample(s) Acute Lymphocytic Leukemia for precursor B cell from venous blood, on Genome GRCh38. 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	5
EGAD00001002937	ChIPmentation data for 1 sample(s) for naive B cell from tonsil, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	Illumina HiSeq 2000	1
EGAD00001002938	ChIPmentation data for 2 sample(s) T-cell Acute Lymphocytic Leukemia from capillary blood, on Genome GRCh38. 2 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	2
EGAD00001002939	ChIPmentation data for 3 sample(s) Burkitt Lymphoma from lymph node, on Genome GRCh38. 10 run(s), 8 experiment(s), 8 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	3
EGAD00001002940	ChIPmentation data for 1 sample(s) for conventional dendritic cell from cord blood, on Genome GRCh38. 4 run(s), 2 experiment(s), 2 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002941	ChIPmentation data for 1 sample(s) for mature conventional dendritic cell GM-CSF_IL4_T=6_days_R848_T=24hrs from venous blood, on Genome GRCh38. 2 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002942	ChIPmentation data for 2 sample(s) for regulatory T cell from venous blood, on Genome GRCh38. 14 run(s), 9 experiment(s), 9 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	2
EGAD00001002943	ChIPmentation data for 1 sample(s) for effector memory CD8-positive, alpha-beta T cell, terminally differentiated from venous blood, on Genome GRCh38. 5 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002944	ChIPmentation data for 2 sample(s) Activated B-Cell-Like Diffuse Large B-Cell Lymphoma from lymph node, on Genome GRCh38. 3 run(s), 3 experiment(s), 3 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	2
EGAD00001002945	ChIPmentation data for 1 sample(s) for effector memory CD8-positive, alpha-beta T cell from venous blood, on Genome GRCh38. 2 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	1
EGAD00001002946	ChIPmentation data for 2 sample(s) Germinal Center B-Cell-Like Diffuse Large B-Cell Lymphoma from lymph node, on Genome GRCh38. 6 run(s), 6 experiment(s), 6 alignment(s). Part of BLUEPRINT (September 2016).	NextSeq 500	2
EGAD00001002947	ChIP-Seq data for 5 sample(s) for thymocyte from thymus, on Genome GRCh38. 17 run(s), 17 experiment(s), 17 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	5
EGAD00001002948	ChIP-Seq data for 1 sample(s) for conventional dendritic cell from cord blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	1
EGAD00001002949	ChIP-Seq data for 1 sample(s) Acute Lymphocytic Leukemia for precursor B cell from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	1
EGAD00001002950	ChIP-Seq data for 1 sample(s) for memory B cell from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	1
EGAD00001002951	ChIP-Seq data for 1 sample(s) for class switched memory B cell from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 1 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	1
EGAD00001002952	ChIP-Seq data for 4 sample(s) T-cell Acute Lymphocytic Leukemia from capillary blood, on Genome GRCh38. 7 run(s), 7 experiment(s), 7 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_chipseq_analysis_ebi_20140811	Illumina HiSeq 2000	4
EGAD00001002953	RNA-Seq data for 8 sample(s) Acute Lymphocytic Leukemia for precursor B cell from bone marrow, on Genome GRCh38. 8 run(s), 8 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	8
EGAD00001002954	RNA-Seq data for 1 sample(s) Acute Lymphocytic Leukemia from bone marrow, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001002955	RNA-Seq data for 1 sample(s) for monocyte T=0day from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001002956	RNA-Seq data for 2 sample(s) T-cell lymphoma for helper T cell from venous blood, on Genome GRCh38. 2 run(s), 2 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	2
EGAD00001002957	RNA-Seq data for 1 sample(s) for monocyte T=2day_RANK_M-CSF from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001002958	RNA-Seq data for 1 sample(s) Acute Myeloid Leukemia from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001002959	RNA-Seq data for 1 sample(s) for monocyte T=1day_M-CSF_S100A9_4hr_RANL from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001002960	RNA-Seq data for 1 sample(s) for monocyte T=6day_S100A9_RANKL_M-CSF from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001002961	RNA-Seq data for 1 sample(s) for monocyte T=10day_S100A9_RANKL_M-CSF from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001002962	RNA-Seq data for 1 sample(s) Acute Myeloid Leukemia for blast cell from bone marrow, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001002963	RNA-Seq data for 6 sample(s) Acute Lymphocytic Leukemia for precursor B cell from venous blood, on Genome GRCh38. 6 run(s), 6 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	6
EGAD00001002964	RNA-Seq data for 1 sample(s) for monocyte T=1day_4hr_RANK from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016). Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001002965	RNA-Seq data for 1 sample(s) for monocyte T=2day_S100A9_RANKL_M-CSF from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016).Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001002966	RNA-Seq data for 1 sample(s) for monocyte T=10day_RANK_M-CSF from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016).Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001002967	RNA-Seq data for 1 sample(s) for monocyte T=6day_RANK_M-CSF from venous blood, on Genome GRCh38. 1 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016).Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	1
EGAD00001002968	RNA-Seq data for 2 sample(s) Acute Myeloid Leukemia for myeloid cell from venous blood, on Genome GRCh38. 2 run(s), 2 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016).Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_rnaseq_analysis_crg_20140811	Illumina HiSeq 2000	2
EGAD00001002969	Bisulfite-Seq data for 1 sample(s) Acute Lymphocytic Leukemia for precursor B cell from bone marrow, on Genome GRCh38. 3 run(s), 1 experiment(s), 0 alignment(s). Part of BLUEPRINT (September 2016).Analysis documentation available at http://ftp.ebi.ac.uk/pub/databases/blueprint/releases/20140811/homo_sapiens/README_bisulphite_analysis_CNAG_20140811	Illumina HiSeq 2000	1
EGAD00001002972	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002973	Genome and transcriptome sequence data from a rectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002974	Genome and transcriptome sequence data from a metastatic gastric adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002975	Genome and transcriptome sequence data from a metastatic neuroendocrine carcinoma of unknown primary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002976	Genome and transcriptome sequence data from a metastatic cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002977	Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002978	Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002979	Genome and transcriptome sequence data from a GI primary (prev breast cancer) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002980	Genome and transcriptome sequence data from a metastatic fibrolamellar hepatocellular carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002981	Genome and transcriptome sequence data from a metastatic pancreatic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002982	Genome and transcriptome sequence data from a metastatic rectosigmoid adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002983	Genome and transcriptome sequence data from a metastatic carcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002984	Genome and transcriptome sequence data from a metastatic pancreatic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002985	Genome and transcriptome sequence data from an adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002986	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002987	Genome and transcriptome sequence data from a metastatic endocervical adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002988	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002989	Genome and transcriptome sequence data from a medullary thyroid cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002990	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002991	Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002992	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002993	Genome and transcriptome sequence data from a metastatic carcinoma of primary unknown patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002994	Genome and transcriptome sequence data from a metastatic squamous cell carcinoma of anus patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002995	Genome and transcriptome sequence data from a carcinosarcoma of the uterus patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002996	Genome and transcriptome sequence data from a leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002997	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002998	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001002999	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003000	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003001	Genome and transcriptome sequence data from a serous carcinoma of fallopian tube patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003002	Genome and transcriptome sequence data from a metastatic adult granulosa cell tumour patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003003	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003004	Genome and transcriptome sequence data from a glioblastoma multiforme patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003005	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003006	Genome and transcriptome sequence data from a metastatic medullary thyroid cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003007	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003008	Genome and transcriptome sequence data from a metastatic adenocarcinoma presumably of ovarian origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003009	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003010	Genome and transcriptome sequence data from a metastatic uterine leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003011	Genome and transcriptome sequence data from a squamous cell carcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003012	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the rectum patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003013	Genome and transcriptome sequence data from a metastatic gastric cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003014	Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003015	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003016	Genome and transcriptome sequence data from a metastatic ductal carcinoma of the breast patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003017	Genome and transcriptome sequence data from a metastatic large cell neuroendocrine tumour of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003018	Genome and transcriptome sequence data from a metastatic clear cell sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003019	Genome and transcriptome sequence data from a metastatic uveal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003020	Genome and transcriptome sequence data from a low grade serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003021	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003022	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003023	Genome and transcriptome sequence data from a metastatic renal cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003024	Genome and transcriptome sequence data from a liposarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003025	Genome and transcriptome sequence data from an endometrial adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003026	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003027	Genome and transcriptome sequence data from an anaplastic myxopapillary ependymoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003028	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003029	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003030	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003031	Genome and transcriptome sequence data from a metastatic collecting duct kidney cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003032	Genome and transcriptome sequence data from a metastatic gastric adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003033	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003034	Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003035	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003036	Genome and transcriptome sequence data from an ovarian adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003037	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003038	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003039	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003040	Genome and transcriptome sequence data from a chordoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003041	Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003042	Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003043	Genome and transcriptome sequence data from a breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003044	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003045	Genome and transcriptome sequence data from a metastatic lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003046	Genome and transcriptome sequence data from a sigmoid cancer and an ampullary cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003047	Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003048	Genome and transcriptome sequence data from a metastatic pancreatic neuroendocrine tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003049	Genome and transcriptome sequence data from a prostate cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003050	Genome and transcriptome sequence data from a serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003051	Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003052	Genome and transcriptome sequence data from a metastatic malignant peripheral nerve sheath tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003053	Genome and transcriptome sequence data from an adrenocortical carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003054	Genome and transcriptome sequence data from a low-grade serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003055	Genome and transcriptome sequence data from a small bowel carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003056	Genome and transcriptome sequence data from a solitary fibrous tumors (sarcoma) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study	PromethION	1
EGAD00001003057	Genome and transcriptome sequence data from a metastatic lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003058	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003059	Genome and transcriptome sequence data from a metastatic mullerian tumor of endometrium patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003060	Genome and transcriptome sequence data from a liposarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003061	Genome and transcriptome sequence data from an adenocarcinoma of the distal esophagus patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003062	Genome and transcriptome sequence data from an extraosseous osteosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003063	Genome and transcriptome sequence data from an atypical bronchial carcinoid patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003064	Genome and transcriptome sequence data from an ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003065	Genome and transcriptome sequence data from a metastatic adenoid cystic carcinoma of the palate patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003066	Genome and transcriptome sequence data from an appendiceal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003067	Genome and transcriptome sequence data from a metastatic gastroesophageal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003068	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003069	Genome and transcriptome sequence data from a pancreatic neuroendocrine tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003070	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003071	Genome and transcriptome sequence data from a pleural mesothelioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003072	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003073	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the pancreas patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003074	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003075	Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003076	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003077	Genome and transcriptome sequence data from a metastatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003078	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003079	Genome and transcriptome sequence data from a presumed metastatic lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003080	Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003081	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003082	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003083	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003084	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003085	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003086	Genome and transcriptome sequence data from a metastatic colorectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003087	Genome and transcriptome sequence data from a pancreatic neuroendocrine cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003088	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003089	Genome and transcriptome sequence data from a pancreatic neuroendocrine patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003090	Genome and transcriptome sequence data from a metastatic leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003091	Genome and transcriptome sequence data from a clear cell carcinoma of ovary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003092	Using sequencing and gene expression analyses, we identified a subgroup of HCA characterized by fusion of the INHBE and GLI1 genes and activation of the sonic hedgehog pathway. Molecular subtypes of HCAs associated with different patients’ risk factors for HCA, disease progression, and pathology features of tumors. This classification system might be used to select treatment strategies for patients with HCA. Related publication: Molecular Classification of Hepatocellular Adenoma Associates With Risk Factors, Bleeding, and Malignant Transformation. Nault, Jean-Charles; Laurent, Christophe et al. Gastroenterology, Volume 152, Issue 4, 880-894.e6. http://dx.doi.org/10.1053/j.gastro.2016.11.042	Illumina HiSeq 2000	21
EGAD00001003096	As part of the International Parkinson's Disease Genomics Consortium, exomes of Parkinson's disease (PD) patients and healthy controls were sequenced to study the genetic etiology of PD. This UK cohort consists of 70 PD patients. Researchers can apply for access to fastq files for this cohort.	Illumina HiSeq 2000	77
EGAD00001003097	High-coverage sequencing data from 47 Yemeni samples	HiSeq X Ten	47
EGAD00001003098	Low-coverage sequencing data from 99 Lebanese samples	Illumina HiSeq 2500	99
EGAD00001003099	RNAseq data set (Mollaoglu et al., MYC drives progression of small cell lung cancer to a variant neuroendocrine subtype with vulnerability to Aurora kinase inhibition). RNA isolation from primary tumors and healthy lungs was performed using the RNeasy Mini Kit (Qiagen) with the standard protocol. RNA was subjected to library construction with the Illumina TruSeq Stranded mRNA Sample Preparation Kit (cat# RS-122-2101, RS-122-2102) according to the manufacturer’s protocol. Chemically denatured sequencing libraries (25 pM) were applied to an Illumina HiSeq v4 single read flow cell using an Illumina cBot. Hybridized molecules were clonally amplified and annealed to sequencing primers with reagents from an Illumina HiSeq SR Cluster Kit v4-cBot (GD-401-4001). Following transfer of the flow cell to an Illumina HiSeq 2500 instrument (HCSv2.2.38 and RTA v1.18.61), a 50 cycle single-read sequence run was performed using HiSeq SBS Kit v4 sequencing reagents (FC-401-4002).	Illumina HiSeq 2000	14
EGAD00001003100	UKBEC 1st release of Exome data for 65 neuropathologically confirmed control individuals of European descent.	Illumina HiSeq 2000	65
EGAD00001003101	The need for a detailed catalogue of local variability for the study of rare diseases within the context of the Medical Genome Project motivated the whole exome sequencing of 267 unrelated individuals, representative of the healthy Spanish population.		267
EGAD00001003102	We sequenced the polyA+ fraction of the RNA of the leukocytes from 624 Sardinian individuals with RNAseq. Prior to library preparation we added either ERCC ExFold RNA Spike-In. An average of 60M reads per sample with 51 bp paired-end reads were generated on a HiSeq 2000 (Illumina). Sequencing reads were then aligned using STAR-2.2.0c2 to the h37d5 reference genome supplemented with the ERCC spike-in sequences. We further provided an exon-exon junction database that we generated from the GENCODE v14 annotation. In order to remove contamination from a parallel experiment, we discarded any reads that mapped to the genomic regions of CBLB (chr3:105370773-105592330) and BCL11A (chr2:60672555-60784156). Filtered aligned reads (BAM format) are shared.	Illumina HiSeq 2000	624
EGAD00001003103	Cohort of 19 ADPKD patients characterized using long-read sequencing. The variant identification provided high sensitivity in identifying PKD1 pathogenic variants, with a diagnostic yield of 94.7%. This dataset includes all sequencing data (BAM files) of the 19 patients, in addition to their raw variants (unfiltered) obtained from the long-read sequencing as well as Sanger sequencing (VCF file).	PacBio RS II	19
EGAD00001003106	Human HiC	Illumina HiSeq 2000	16
EGAD00001003107	We collected fresh tissue from an untreated GBM directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine after selection of CD11b+ cells using magnetic beads.	Illumina HiSeq 2500	1
EGAD00001003108	We collected fresh tissue from an untreated GBM directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine after selection of CD11b+ cells using magnetic beads.	Illumina HiSeq 2500	1
EGAD00001003109	We collected fresh tissue from an untreated GBM directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine after selection of CD11b+ cells using magnetic beads.	Illumina HiSeq 2500	1
EGAD00001003110	We collected fresh tissue from an untreated GBM directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine after selection of CD11b+ cells using magnetic beads.	Illumina HiSeq 2500	1
EGAD00001003111	We collected fresh tissue from an untreated GBM directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine after selection of CD11b+ cells using magnetic beads.	Illumina HiSeq 2500	1
EGAD00001003112	We collected fresh tissue from an untreated GBM (SF10592) directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells.	Illumina HiSeq 2500	1
EGAD00001003113	We collected fresh tissue from an untreated GBM (SF10679) directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells.	Illumina HiSeq 2500	1
EGAD00001003114	We collected fresh tissue from an untreated GBM (SF10281) directly from the operating room and subjected the biopsy to single-cell RNA-seq with the Fluidigm C1 machine, resulting in sequencing libraries from 96 individual cells.	Illumina HiSeq 2500	1
EGAD00001003115	Whole genome sequencing data of 15 French Caucasian and 10 African-Caribbean men with prostate cancer.	Illumina HiSeq 2000	50
EGAD00001003116	Benchmark data set containing five tumor/normal pairs of non-small cell lung cancer (NSCLC) patients. Tissue pairs were screened with bisulfite (BS) sequencing, MeDIP methylation enrichment sequencing and RNA sequencing in order to identify differentially methylated and expressed spots in the genomes.	Illumina HiSeq 2500	10
EGAD00001003117	In this study, we sequenced three NUT midline carcinoma genomes and their transcriptomes (NMC1, NMC2 and Ty-82), and two paired normal blood samples (for NMC1 and NMC2). Whole-genome sequencing libraries were generated by PCR-free methods and sequenced on HiSeq X machines. Transcriptome (mRNA) sequencing was performed on HiSeq 2500 machines. PCR duplicate-marked, indel-realigned, and base-recalibrated BAM files are provided in our dataset.	HiSeq X Ten Illumina HiSeq 2500	8
EGAD00001003118	Targeted capture sequencing for cases with MDS who were subjected to unrelated bone marrow transplantation via Japan marrow donor program		797
EGAD00001003119	TP53 targeted panel aligned reads consisting of BAM paired end reads from ovarian cancer tumor samples	Illumina MiSeq	76
EGAD00001003120	We used WGS (Complete Genomics) to characterise five metastatic tumours from a BRAF mutant melanoma patient who presented intrinsic resistance.	Complete Genomics	6
EGAD00001003121	Dataset is composed of FASTQ files from 165 samples of small round cell sarcomas which were RNA-sequenced (whole transcriptome) with either Illumina HiSeq 2500 (120 million reads per sample, paired-end 100 bp) or Illumina NextSeq 500 (110 million reads per sample, paired-end 150 bp)	Illumina HiSeq 2500 NextSeq 500	165
EGAD00001003122	December 2016 data update (bam/fastq for WGBS on samples CEMT0062, CEMT0068, CEMT0072, CEMT0086, CEMT0087) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2500	5
EGAD00001003125	WGS data of medulloblastoma tumor/control pairs.		224
EGAD00001003126	WGS data of medulloblastoma tumor/control pairs.		74
EGAD00001003127	WGS data of medulloblastoma tumor/control pairs.		482
EGAD00001003128	Exome sequencing data for medulloblastoma tumor/control pairs		35
EGAD00001003130	Whole exome sequencing data for patients with Bosma arhinia microphthalmia syndrome (BAMS). The dataset includes 21 samples from 7 families with BAMS; see Gordon et al, Nature Genetics, 2017.	Illumina HiSeq 2500 Illumina HiSeq 4000	21
EGAD00001003131	The dataset consists of two main sample groups. 1) The inter-tumour sample group contains a total of 97 samples from 27 patients. Each patient has a single normal and primary sample as well as one or more metastases. All samples were sequenced using IonTorrent PGM and a custom colorectal cancer (CRC) panel. 2) The intra-tumour sample group contains a total of 68 samples from a single tumour as well as a normal tissue sample. All 68 samples were sequenced using IonTorrent PGM and a custom CRC panel. Shallow whole genome sequencing was additionally applied to 10 of the samples using Illumina HiSeq 4000.	Illumina HiSeq 4000 Ion Torrent PGM	193
EGAD00001003132	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: GACA-CN.		84
EGAD00001003133	RRBS data of 86 Ewing patients (French). Illumina HiSeq 2000/2500 (Fastq files available). Sheffield et al. Nat Med. 2017 Jan 30	Illumina HiSeq 2000	86
EGAD00001003134	DATA FILES FOR GRUBER SJAMLM7 EXOME	Illumina HiSeq 2000	114
EGAD00001003135	DATA FILES FOR GRUBER SJAMLM7 RNASEQ	Illumina HiSeq 2000	86
EGAD00001003136	We carried out whole-genome oxidative bisulfite sequencing (WGoxBS) in the placentas of two healthy female and two healthy male pregnancies generating an average genome depth of coverage of 25x. The sex-specific differential methylation pattern observed in this region was validated in additional 8 healthy placentas (including 2 from the WGoxBS) using SureSelect in-solution target capture. For WGoxBS, placental genomic DNA (4 µg) from 4 healthy pregnancies was processed to achieve 10 kb fragments with the g-Tube (Covaris), according to the manufacturer's instructions. To increase the number of uniquely sequenced reads, two independent libraries were generated for each individual. Multiplexed sequencing was carried out on the Illumina MiSeq, HiSeq 2000, and HiSeq 2500 instruments with 2x100, 2x50 and 2x125 cycles using MiSeq Reagent Kit v3, HiSeq SBS Kit v3 and HiSeq SBS Kit v4, respectively. For SureSelect in-solution capture, placental genomic DNA (3.5 µg) from 8 healthy pregnancies (including 2 from the WGoxBS) was fragmented by the Covaris S220 system according to the SureSelect Methyl-Seq target enrichment protocol (Agilent). All 8 libraries were pooled and sequenced on the Illumina HiSeq 2500 instrument with 2 × 125 cycles using HiSeq SBS Kit v4 and a single lane of the Illumina HiSeq 4000 instrument with 2 × 150 cycles using HiSeq 3000/4000 SBS Kit following Illumina's guidelines (Illumina Application Note: Epigenetics February 2016).	Illumina HiSeq 2500	10
EGAD00001003137	Metastatic and primary tumour samples were collected from 4 patients with advanced breast cancer. Samples were collected at autopsy and also from biopsies taken during life. Tumour and germline samples are available. Whole exome sequencing was performed on all samples.		52
EGAD00001003138	A dataset consisting of Multi-regional Whole Exome Sequencing (WES) and Whole Genome Sequencing (WGS) data for 54 samples from 9 patients with hepatocellular carcinoma. The dataset includes 45 tumor samples and 9 normal blood samples. Selected somatic variants were validated by Sequenom. Patients covered are: Patient 1, Patient 2, Patient 3, Patient 4, Patient 5, Patient 6, Patient 7, Patient 8, Patient 9 and Patient 10.	Illumina HiSeq 2500	54
EGAD00001003139	200PG : WGS Aligned Sequence (fastq) : Aligned WG sequence data (bam) in this dataset are from the 124 CPCGene Tumour/Normal Pairs used in the 200PG Study. https://www.ncbi.nlm.nih.gov/pubmed/28068672		1
EGAD00001003140	We analyzed the spectrum and clinical significance of MYC and BCL2 mutations in 347 DLBCL cases from population-based cohort of BC, Canada.	Illumina MiSeq	347
EGAD00001003141	List of SNPs, and their frequencies, extracted from a low pass whole genome sequencing of 3,514 individuals.		1
EGAD00001003142	RNA sequencing of 31 patient-derived fibroblast cell lines from patients with inborn errors of cobalamin (vitamin B12) metabolism, and 7 control samples. The RNA-seq library was prepared using the TruSeq Stranded Total RNA Sample Preparation Kit (Illumina RS-122–2301) including Ribo-Zero Gold depletion to remove ribosomal RNA. Sequencing was done on an Illumina HiSeq 2000 sequencer, using 100 bp paired-end reads.	Illumina HiSeq 1500 Illumina HiSeq 2000	38
EGAD00001003143	Total stranded TruSeq RNA sequencing by Illumina of six tumor samples from six cases of pediatric Pilocytic astrocytoma. The data is published in the following paper: Tomic TT, Olausson J, Wilzen A, Sabel M, Truve K, Sjogren H, Dosa S, Tisell M, Lannering B, Enlund F, Martinsson T, Aman P, Abel F. A new GTF2I-BRAF fusion mediating MAPK pathway activation in pilocytic astrocytoma. PLoS One. 2017 Apr 27;12(4):e0175638.	Illumina HiScanSQ	6
EGAD00001003145	Sensory neurons are nerve cells that are activated by sensory input such as heat, light and convey information to the brain. Although a key cell type in complex organisms, human sensory neurons are challenging to study because they are impossible to obtain from living donors. We have collaborated with the Neucentis Pharmaceutical Research Unit to differentiate sensory neuron like cells from human induced pluripotent stem cells derived as part of the Human Induced Pluripotent Stem Cells Initiative. We will sequence RNA from 100 IPS lines derived from healthy individuals and perform RNA-seq on the differentiated cells to identify noncoding variants that alter gene expression in human sensory neurons.	Illumina HiSeq 2000 Illumina MiSeq	123
EGAD00001003146	We performed whole genome sequencing of nine OC patient-derived cell lines and one normal cell line (HOSEpiC) to analyze if the cell lines harbor OC-typical genomic aberrations absent in normal cells and to relate genomic features to drug sensitivities.		10
EGAD00001003148	Microfluidic direct library preparation (DLP) single-cell whole-genome BAM files for near-diploid immortalized lymphoblastoid cell line GM18507.	NextSeq 500	192
EGAD00001003149	Microfluidic direct library preparation (DLP) single-cell whole-genome BAM files for third-passage patient-derived primary triple-negative breast cancer xenograft SA501X3F.	Illumina HiSeq 2500	384
EGAD00001003150	Microfluidic direct library preparation (DLP) single-cell whole-genome BAM files for fourth-passage patient-derived primary triple-negative breast cancer xenograft SA501X4F.	Illumina HiSeq 2500	384
EGAD00001003151	Bulk whole-genome BAM files for 184-hTERT-L2, SA501X3F, and SA501X4F.	Illumina HiSeq 2500	3
EGAD00001003152	Microfluidic direct library preparation (DLP) single-cell whole-genome BAM files for near-diploid immortalized breast epithelial cell line 184-hTERT-L2.	Illumina HiSeq 2500	192
EGAD00001003153	Sequencing of untreated pancreatic cancer metastases and primary tumor sections.	Illumina HiSeq 2000 Illumina HiSeq 2500	49
EGAD00001003154	RNA-Seq files for SJOS study	Illumina HiSeq 2000	14
EGAD00001003155	WES files for SJMDS paper titled 'Genomic Landscape of Pediatric Myelodysplastic Syndromes'	Illumina HiSeq 2000	6
EGAD00001003156	WGS files for SJMDS paper titled 'Genomic Landscape of Pediatric Myelodysplastic Syndromes'	Illumina HiSeq 2000	4
EGAD00001003157	Alignment of Genome Denmark Phase II dataset to GRCh38. The dataset consists of 150 Danish individuals (50 trios) sequenced to 80X. The BAM-file contains data from multiple libraries created from one individual with libraries of 180, 500, 800, 2000, 5000, 10000 and 20000 bp. The libraries were created using standard Illumina protocols for paired end reads (180-800bp libraries) and mate pair libraries (2kb-20kb).		150
EGAD00001003158	Bam files consisting of aligned MeDIP-seq reads from cord blood cells and cord blood mononuclear cells of twins conceived through in vitro fertilisation	Illumina Genome Analyzer II	75
EGAD00001003159	Bam files consisting of aligned MeDIP-seq reads from cord blood cells and cord blood mononuclear cells of twins not conceived through in vitro fertilisation	Illumina Genome Analyzer II	105
EGAD00001003160	Exome data from patients and parents with DONSON mutations	Illumina HiSeq 2000	15
EGAD00001003161	HipSci - Bardet-Biedl Syndrome - Exome Sequencing - October 2016	Illumina HiSeq 2000 Illumina HiSeq 2500	3
EGAD00001003162	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: PACA-CA.		298
EGAD00001003163	Whole genome sequencing data of 20 carcinosarcomas.	Illumina HiSeq 2000	23
EGAD00001003164	Variant call set (vcf) for three (primary and two recurrent) tumors		3
EGAD00001003165	Whole genome sequencing was performed for 81 liver cancer cell lines. Additional whole exome sequencing was performed for a subset of 11 liver cancer cell lines. SK_HEP_1 was also provided, though considered not hepatic origin. These sequencing data provided the detailed genomic characterization of liver cancer models.	HiSeq X Ten	82
EGAD00001003168	Blood samples from eight lung cancer patients and one benign lung tumor patient were collected for this dataset. Blood samples were centrifuged first at 1,600 × g for 10 minutes, and then the plasma was transferred into new micro tubes and centrifuged at 16,000 × g for another 10 minutes. The plasma was collected and stored at -80°C. cfDNA was extracted from 5 ml plasma using the Qiagen QIAamp Circulating Nucleic Acids Kit and quantified by Qubit 3.0 Fluorometer (Thermo Fisher Scientific). Bisulfite conversion of cfDNA was performed using the EZ-DNA-Methylation-GOLD kit (Zymo Research). After that, the Accel-NGS Methy-Seq DNA library kit (Swift Bioscience) was used to prepare the sequencing libraries. The DNA libraries were then sequenced with 150 bp paired-end reads.	HiSeq X Ten	9
EGAD00001003174	This study comprises 116 liver cancer cases belonging to the LICA-CN project	Illumina HiSeq 2000	232
EGAD00001003176	For each subject, genomic DNA from whole blood, circulating cell-free DNA and tumor tissues (whenever possible) was subjected to targeted next-generation sequencing on Illumina MiSeq or HiSeq 4000 platforms. The sequencing results of whole blood were used to distinguish germline and somatic mutations. Specimens were collected from patients with different kinds of solid tumors, but most are lung cancer patients.	Illumina HiSeq 4000 Illumina MiSeq	1845
EGAD00001003180	HipSci - Monogenic Diabetes - RNA Sequencing - October 2016	Illumina HiSeq 2000 Illumina HiSeq 2500	1
EGAD00001003181	HipSci - Bardet-Biedl Syndrome - RNA Sequencing - October 2016	Illumina HiSeq 2000 Illumina HiSeq 2500	3
EGAD00001003186	Variants on the Y chromosome for 62 Danish males in VCF format from the GenomeDenmark Phase 2 cohort. Variants were called using reference-based approaches such as the haplotype-caller module from GATK and using alignment of de novo assemblies to the reference using ASMvar.		68
EGAD00001003187	TBD	Complete Genomics	9
EGAD00001003188	Variants and genotypes called in 50 Danish parent-offspring trios from 80x Illumina sequencing data using BayesTyper. Data was produced using different insert size libraries of the sizes 180, 500, 800, 2000, 5000, 10000 and 20000 bp. The sample IDs for the fathers and mothers are TrioID-01 and TrioID-02, respectively, and the IDs for the children are TrioID-0x, where x is a number between 3 and 7		150
EGAD00001003189	Whole genome sequencing of 8 HER2-Positive Breast Cancer (in complement to EGAD00001001844)	Illumina HiSeq 2000	16
EGAD00001003190	WGS blood data (fastq raw read sequences) for French ICGC leiomyosarcoma cancer sequencing project, 67 samples representing 67 donors. Sequencing was performed on Illumina HiSeq. The libraries were then sequenced with a 2 x 100bp paired-end protocol to a minimum mean coverage of 30x.	Illumina HiSeq 2000	67
EGAD00001003191	WGS cancer data (fastq raw read sequences) for French ICGC leiomyosarcoma cancer sequencing project, 78 samples representing 67 donors. Sequencing was performed on Illumina HiSeq. The libraries were then sequenced with a 2 x 100bp paired-end protocol to a minimum mean coverage of 50x.	Illumina HiSeq 2000	78
EGAD00001003192	RNA-Seq data (fastq raw read sequences) for French ICGC leiomyosarcoma cancer sequencing project, 78 samples representing 67 donors. Sequencing was performed on Illumina HiSeq. The libraries were then sequenced with a 2 x 75bp paired-end protocol to a minimum mean reads of 50 million.	Illumina HiSeq 2000	78
EGAD00001003193	Exome sequencing for 2 infertile brothers	Illumina HiSeq 2500	2
EGAD00001003194	This dataset contains whole exome sequence of six HCC patients from Qidong China who are very likely exposed to aflatoxin.	Illumina HiSeq 2500	12
EGAD00001003196	Amplicon-based fungal metagenomic sequencing for the identification of fungal species in brain tissue from Alzheimer's disease. The study consists of 14 samples, sequenced using Illumina's paired-end technology.	Illumina MiSeq	14
EGAD00001003200	Files from whole exome sequencing of 26 tumors and two matched normals from one melanoma patient. The 26 tumors include the untreated primary, cutaneous metastases and distant metastases to internal organs.	Illumina HiSeq 2500	28
EGAD00001003203	Aligned (hg19) sequencing data from 16 participants with FL/DLBCL.		37
EGAD00001003204	Understanding how cells sense and respond to their environment, and how these responses are modulated by genetic variation, are fundamental biological problems, particularly for understanding how pathogenic organisms invade and manipulate the cells of the human immune system. Macrophages recognize and respond to many important human pathogens including HIV-1, Mycobacterium tuberculosis and Salmonella. This study will focus on the cellular response of human macrophages to Salmonella infection and how this response is modulated by the genetic background of the individual as well as additional pro-inflammatory stimulus (interferon-gamma priming). We will acquire 100 human induced pluripotent stem cell lines from the HipSci project, differentiate the cells in vitro into macrophages and expose them to four environmental conditions: (i) no stimulation, (ii) interferon-gamma (18h), (iii) Salmonella typhimurium SL1344 (5h), (iv) interferon-gamma (18h) + Salmonella (5h). Subsequently, we will isolate RNA from the samples for sequencing.	Illumina HiSeq 2500	236
EGAD00001003205	160 WES and 25 WGS for HBV-related HCC, and 15 WES for ICC, belonging to the LICA-CN project	Illumina HiSeq 2000	402
EGAD00001003206	BACKGROUND TRACERx (TRAcking Cancer Evolution through therapy (Rx)) is a prospective cohort study designed to investigate intratumor heterogeneity (ITH) in relation to clinical outcome, and to determine the clonal nature of driver events and evolutionary processes in early stage non-small cell lung cancer (NSCLC). METHODS Multiregion high-depth whole-exome sequencing (M-seq) was performed on 100 early stage NSCLC tumors resected prior to systemic therapy. A total of 327 tumor regions were sequenced and analyzed to define evolutionary histories, obtain a census of clonal and subclonal events, and assess the relationship between ITH and recurrence-free survival (RFS). RESULTS Widespread ITH was observed for both somatic copy number alterations (median 48% [0.03-88%]) and mutations (median 30% [0.5-93%]). Driver mutations in EGFR, MET, BRAF and TP53 were almost always clonal. However, heterogeneous driver alterations occurring later in evolution were found in over 75% of tumors and were common in PIK3CA, NF1 and genes involved in chromatin modification and DNA response and repair. Genome doubling and ongoing dynamic chromosomal instability (CIN), illustrated by mirrored subclonal allelic imbalance, were identified as causes of ITH resulting in parallel evolution of driver copy number events, including amplifications of CDK4, FOXA1, and BCL11A. Elevated copy number heterogeneity was associated with shorter RFS (HR=4.9, P=0.00044), which remained significant in a multivariate analysis. CONCLUSIONS ITH mediated through CIN, rather than point mutational heterogeneity, was associated with increased risk of relapse, supporting its value as a prognostic predictor, and the need to target this high-risk phenotype.		427
EGAD00001003207	Whole genome sequencing data for MMML (28 tumor/control pairs)		56
EGAD00001003208	Whole genome sequencing data for MMML (12 tumor/control pairs)	Illumina HiSeq 2000	-
EGAD00001003210	Whole genome sequencing data for MMML (cell_line)		8
EGAD00001003211	Deep (>25x mean coverage) whole genome sequencing on 5-10 families drawn from the Scottish Family Health Study with four or more children.	HiSeq X Ten	57
EGAD00001003213	The olfactory gene repertoire is largely species-specific, shaped by the nature and necessity of chemosensory information for survival in each species' niche. We are intrigued by this interspecific variation and started to investigate the olfactory transcriptome in primates for evidence of selection at the level of receptor gene choice. Having collected this data from two primates, we now wish to extend the analysis to humans. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2500	9
EGAD00001003215	This data set contains whole exome sequences of individuals with self-stated parental relatedness from the East London Genes & Health cohort. Rare frequency functional variants in these healthy individuals will be studied with respect to the genetic health of the participants and loss-of-function analysis of human genes.	Illumina HiSeq 2000 Illumina HiSeq 2500	-
EGAD00001003216	Whole genome sequencing of tumour normal pairs of human undifferentiated sarcomas.	HiSeq X Ten	98
EGAD00001003217	Targeted resequencing at high depth (21 genes, 9 chromosomal regions): at least 4 FFPE samples per case and matched germline DNA: * 100 cases with detailed outcome data, including 15 cases with tumour relapse (515 samples) * 40 cases with matched pre-chemotherapy biopsies (240 samples) * 50 nephrogenic rests matched to above cases (50 samples) We expect a proportion (possibly 10%) of cases to be mutationally silent on the above studies, and propose to subsequently carry out integrated whole-genome, methylome and transcriptome studies on matched frozen tissue from these cases	Illumina HiSeq 2500	35
EGAD00001003218	This study comprises 80 brain cancer cases (160 samples) belonging to the GBM-CN project.	Illumina HiSeq 2000	80
EGAD00001003220	Whole genome, whole exome, and custom panel sequencing of high-grade meningioma cohort		188
EGAD00001003221	Aligned, merged and deduplicated BAM files from BGISeq-500 sequencing of six samples: matched tumour-normal pairs from three melanoma patients.		6
EGAD00001003222	Aligned, merged and deduplicated BAM files from HiSeqXTen sequencing of six samples: matched tumour-normal pairs from three melanoma patients.		6
EGAD00001003223	We collected tumor samples and adjacent normal mucosae from 5 patients with colorectal cancer during surgery from 2014 to 2016 in the First Affiliated Hospital of Chongqing Medical University (Chongqing, China) and the Research Institute of Surgery, Third Military Medical University (Chongqing, China). The qualified captured library of each sample was then loaded on Illumina HiSeq 2000 (Illumina, San Diego, CA) platforms and subjected to high-throughput sequencing.	Illumina HiSeq 2000	10
EGAD00001003224	We collected tumor samples and adjacent normal mucosae from 17 patients with colorectal cancer during surgery from 2014 to 2016 in the First Affiliated Hospital of Chongqing Medical University (Chongqing, China) and the Research Institute of Surgery, Third Military Medical University (Chongqing, China). The qualified captured library of each sample was then loaded on Illumina HiSeq 2000 (Illumina, San Diego, CA) platforms and subjected to high-throughput sequencing.	Illumina HiSeq 2000	34
EGAD00001003225	Whole Genome Sequencing Illumina HiSeq data from 111 men with prostate cancer. Samples were taken from primary tissue obtained at prostatectomy (target sequencing depth 50X) with matched blood control (target sequencing depth 30X). This data is from batches 4 to 6.	Illumina HiSeq 2000	221
EGAD00001003227	ICGC PCAWG Dataset for WGS BAM aligned using BWA MEM. Project: OV-AU.		146
EGAD00001003230	Small RNA expression profiles of the blood plasma-derived exosomes from B-cell chronic lymphocytic leukemia patients	Illumina HiSeq 2000	3
EGAD00001003231	Poly A transcriptome sequence of multifocal hepatocellular carcinoma	Illumina HiSeq 2000	7
EGAD00001003234	Aligned whole genome sequence from AML relapse project		33
EGAD00001003235	Raw exome sequence data (fastq) for the GATCI project	unspecified	172
EGAD00001003236	Raw whole genome sequence data (fastq) for the GATCI project	HiSeq X Ten	10
EGAD00001003237	Primary mucosal melanomas (MMs) arise from melanocytes located in mucosal membranes lining the respiratory, gastrointestinal and urogenital tracts. MMs frequently present late and have a poor prognosis; the 5-year survival rate is only 14%. MM makes up only ~1.4% of all melanomas and it is this rarity that makes knowledge of the genetic changes that contribute to its pathogenesis limited to a small number of exome/genome studies and other targeted studies. Thus to investigate the somatic alterations and mutation spectra in MM genomes, we have extracted genomic DNA from formalin-fixed, paraffin-embedded (FFPE) human MMs, and subjected them to whole exome sequencing. Given the propensity of MM to metastasize, we will also be sequencing metastatic MM lesions; primary and metastatic lesions from the same individual represent an excellent opportunity to identify potential drivers of metastasis in MM. Finally we will sequence 'normal' DNA from the same individual, where possible, to exclude germline variations.	Illumina HiSeq 2000	110
EGAD00001003239	This study involves mutagenizing C32, a melanoma cell line, with ENU to identify those mutations which engender resistance to a targeted treatment.	Illumina HiSeq 2000	80
EGAD00001003240	Study of cell lineage and embryogenesis using biopsy samples from sites across the whole body (post mortem). Sample donors are recruited sensitively through the Phoenix study and consent to samples being taken after their death for both the Phoenix study and this WTSI study.	HiSeq X Ten	-
EGAD00001003241	Toxoplasmosis is a zoonotic disease caused by a ubiquitous protozoan parasite called Toxoplasma gondii, which can infect all mammal and bird species throughout the world. Seroprevalence varies widely between countries. Studies have estimated that between 7-34% of people in the UK have been infected with T. gondii. The vast majority of these people will not have noticed any symptoms, however about 10% of people develop a mild to moderate self-limiting flu-like illness. Following the acute active stage of the infection the parasite persists in the body in the form of cysts, particularly in heart and skeletal muscle and nervous system tissues, for many years, and usually for life. In immunocompetent persons these cysts do not pose a health risk. We will use RNA-seq to quantify the transcriptional response of macrophages to T. gondii infection. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000	18
EGAD00001003242	This study comprises three different datasets: 1) 57 samples from the 1243 canapps cell line study, 2) 91 FFPE normal samples and 3) 87 samples from the SCORT WS2 dataset. The aim is to sequence these 235 samples in order to test the new V2 Colorectal bait design.	Illumina HiSeq 2000	92
EGAD00001003243	Corresponding data set is composed of RNA sequencing of Korean ER positive breast cancer under 35 years old. This set provides 50 alignment files of 50 tumor samples. This is a part of total project data set.	Illumina HiSeq 2500	50
EGAD00001003244	We aim to sequence the mRNA transcriptome of 22 human melanoma cell lines in biological triplicate in order to define the gene expression profile of each cell line. The data will be correlated to the mutation status and the sensitivity to a panel of drugs in order to identify genes whose deregulation is associated with drug resistance. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2000	66
EGAD00001003245	We aim to sequence the small RNAs of 22 human melanoma cell lines in biological triplicate in order to define the microRNA expression profile of each cell line. The data will be correlated to the mutation status and the sensitivity to a panel of drugs in order to identify genes whose deregulation is associated with drug resistance. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2500	66
EGAD00001003246	Whole exome sequencing of hepatosplenic T cell lymphoma (HSTL) tumors, paired normals, and cell lines, including (1) 68 exome capture, paired-end Illumina HiSeq sequencing, BAM files from HSTL tumor samples, (2) 20 exome capture, paired-end Illumina HiSeq sequencing, BAM files from HSTL paired normal samples, and (3) 2 exome capture, paired-end Illumina HiSeq sequencing, BAM files from HSTL cell lines.	Illumina HiSeq 2500	90
EGAD00001003247	Liberal variant calls generated with VarScan		37
EGAD00001003248	A BRAF V600E colorectal organoid which is sensitive to MAP kinase inhibition was mutagenised with the chemical mutagen ENU and then drug selected using a combination of Trametinib, Dabrafenib and Cetuximab. Single cell derived organoids were then manually picked and expanded in drug. Resistance was confirmed in a 14 day assay and DNA was collected. These then underwent targeted amplicon-based sequencing to confirm candidate resistance effectors from a screen in 2 2D BRAF V600E colorectal cell lines. Pools of resistant clones were also sequenced.	Illumina MiSeq	36
EGAD00001003250	1 cm biopsies from patients undergoing bladder cystectomy will be collected. The underlying muscle and stroma will be removed and the remaining epithelia dissected into small sequential areas which will be sent for ultra-deep exome sequencing using a panel of known cancer and viral genes. Sequence analysis using similar methods to Martincorena I et al (Science 2015, 348:880) will provide an idea of the somatic mutational landscape in these patient samples. Individual patient muscle samples will also be sequenced as a reference.	Illumina HiSeq 2000	55
EGAD00001003252	Sequencing of drug resistant organoids	Illumina HiSeq 2000	36
EGAD00001003253	Targeted gene screen of cell line tumour samples for testing the new V2 Colorectal gene panel.	Illumina HiSeq 2000	57
EGAD00001003254	R&D project to develop low input library construction methods.	Illumina HiSeq 2500	12
EGAD00001003255	Transcriptome of anaplastic meningiomas	Illumina HiSeq 2500	34
EGAD00001003256	Whole genome sequencing for 131 early onset prostate tumor/control pairs (ICGC)		262
EGAD00001003257	Hi-C and promoter capture Hi-C data for HT29 and LoVo. 2 replicates per cell line for the Hi-C. 3 replicates per cell line for the CHi-C.	Illumina HiSeq 2000	2
EGAD00001003258	ChIPseq data for H3K4me1 and H3K9me3 in HT29; H3K4me1, H3K27me3, H3K9me3, H3K36me3 in LoVo.	Illumina HiSeq 2000	2
EGAD00001003259	Regions of common inter-individual DNA methylation differences in human monocytes – potential function and genetic basis WGBS Data of Samples: 43_Hm03_BlMo_Ct, 43_Hm02_BlMo_Ct, 43_Hm05_BlMo_Ct, 43_Hm01_BlMo_Ct For details about sequencing or sample metadata check http://deep.dkfz.de/	Illumina HiSeq 2000	4
EGAD00001003260	The cell lines in this study are a combination of internally sequenced (cosmic) and externally sequenced cell lines known to be “double-wild-type” (lacking BRAF and NRAS somatic mutations). These sequences were realigned in this data set for consistency.		22
EGAD00001003261	These are seven sequencing files from whole exome and whole genome sequencing of five tissue samples collected from one pancreatic cancer patient	HiSeq X Ten Illumina HiSeq 2500	5
EGAD00001003262	High-coverage WES sequencing of DNA samples from 50 PTCs was performed on the Illumina HiSeq 2500 or 4000 System	Illumina HiSeq 2000	100
EGAD00001003263	ICGC DCC Release 24, PACA-CA Deep KRAS sequencing		82
EGAD00001003264	ICGC DCC Release 24, PACA-CA Exome sequence		190
EGAD00001003265	For CCOC cohorts, OvCaRe cases were reviewed, including frozen material, by at least two expert gynecopathologists prior to inclusion in the sequencing cohort who provided the confirmation on final selected cohort. Frozen H&E from Tokyo were also used for evaluation along with representative H&E photos and review done at the Jikei School of Medicine. All CCOC tumours are primary tumour samples. Library construction and sequencing Frozen specimens with >50% tumour cellularity (based on initial slide review) were used for cryosectioning and subsequent nucleic acid extraction. Patient tumour and normal blood samples derived from primary, untreated fresh frozen tumour specimens harvested at diagnosis during standard of care debulking surgery. Germline DNA was provided from peripheral blood buffy coat on all specimens except 13 from Tokyo, where non-cancer frozen tissue was used as a germline source. DNA extraction from both matched normal (blood) and tumour samples (frozen tissue) were performed using the QIAamp Blood and Tissue DNA kit (Qiagen) and quantified using a Qbit fluorometer and reagents (high-sensitivity assay). Three lanes of Illumina HiSeq 2500 v4 chemistry for normal samples and five lanes for tumour samples were obtained. The PCR-free protocol was adopted to eliminate the PCR-induced bias and improve coverage across the genome.	Illumina Genome Analyzer II Illumina HiSeq 2000	70
EGAD00001003266	For ENOC cohorts, OvCaRe cases, including frozen material, were reviewed prior to inclusion in the sequencing cohort by at least two expert gynecopathologists, who confirmed the final selected cohort. Frozen H&E from Tokyo were also used for evaluation along with representative H&E photos and review done at the Jikei School of Medicine. For ENOC, DAH985 and DG1288 are recurrent and both were treated with chemotherapy after their first surgery. DAH123 is an untreated sample, a metastasis from a primary endometrial tumour. All HGSC, GCT, CCOC and the remaining ENOC tumours are primary tumour samples. Library construction and sequencing: frozen specimens with >50% tumour cellularity (based on initial slide review) were used for cryosectioning and subsequent nucleic acid extraction. Patient tumour and normal blood samples were derived from primary, untreated fresh frozen tumour specimens harvested at diagnosis during standard of care debulking surgery. Germline DNA was provided from peripheral blood buffy coat on all specimens except 13 from Tokyo, where non-cancer frozen tissue was used as a germline source. DNA extraction from both matched normal (blood) and tumour samples (frozen tissue) was performed using the QIAamp Blood and Tissue DNA kit (Qiagen) and quantified using a Qubit fluorometer and reagents (high-sensitivity assay). Three lanes of Illumina HiSeq 2500 v4 chemistry for normal samples and five lanes for tumour samples were obtained. The PCR-free protocol was adopted to eliminate the PCR-induced bias and improve coverage across the genome.	Illumina HiSeq 2000	58
EGAD00001003267	For GCT cohorts, OvCaRe cases, including frozen material, were reviewed prior to inclusion in the sequencing cohort by at least two expert gynecopathologists, who confirmed the final selected cohort. Frozen H&E from Tokyo were also used for evaluation along with representative H&E photos and review done at the Jikei School of Medicine. All GCT tumours are primary tumour samples. Library construction and sequencing: frozen specimens with >50% tumour cellularity (based on initial slide review) were used for cryosectioning and subsequent nucleic acid extraction. Patient tumour and normal blood samples were derived from primary, untreated fresh frozen tumour specimens harvested at diagnosis during standard of care debulking surgery. Germline DNA was provided from peripheral blood buffy coat on all specimens except 13 from Tokyo, where non-cancer frozen tissue was used as a germline source. DNA extraction from both matched normal (blood) and tumour samples (frozen tissue) was performed using the QIAamp Blood and Tissue DNA kit (Qiagen) and quantified using a Qubit fluorometer and reagents (high-sensitivity assay). Three lanes of Illumina HiSeq 2500 v4 chemistry for normal samples and five lanes for tumour samples were obtained. The PCR-free protocol was adopted to eliminate the PCR-induced bias and improve coverage across the genome.	Illumina HiSeq 2000	20
EGAD00001003268	HGSC cases in the OvCaRe and CRCHUM Tumour Banks were selected according to the following criteria: (i) were administered platinum-taxane-based therapy; (ii) relapsed within 12 months (365 days) or had longer than 4.5 years (1642.5 days) of follow-up data; (iii) had at least 50% tumour content by H&E staining and expert pathology review. All cases were re-reviewed by expert pathologists to confirm the diagnosis of HGSC. Germline BRCA1 and BRCA2 status was determined for all patients through hereditary cancer screening programs. Case selection for the discovery cohort was designed to amplify biological differences by selecting cases from the extremes of the outcome distribution. All HGSC tumours are primary tumour samples. Library construction and sequencing: frozen specimens with >50% tumour cellularity (based on initial slide review) were used for cryosectioning and subsequent nucleic acid extraction. Patient tumour and normal blood samples were derived from primary, untreated fresh frozen tumour specimens harvested at diagnosis during standard of care debulking surgery. Germline DNA was provided from peripheral blood buffy coat on all specimens except 13 from Tokyo, where non-cancer frozen tissue was used as a germline source. DNA extraction from both matched normal (blood) and tumour samples (frozen tissue) was performed using the QIAamp Blood and Tissue DNA kit (Qiagen) and quantified using a Qubit fluorometer and reagents (high-sensitivity assay). Three lanes of Illumina HiSeq 2500 v4 chemistry for normal samples and five lanes for tumour samples were obtained. The PCR-free protocol was adopted to eliminate the PCR-induced bias and improve coverage across the genome.	Illumina HiSeq 2000	118
EGAD00001003269	High-coverage WGS sequencing of DNA samples from 90 pairs of GCs was performed on the Illumina HiSeq X Ten System.	Illumina HiSeq 2000	1332
EGAD00001003270	ICGC DCC Release 24, PACA-CA Whole Genome sequence merged alignments		95
EGAD00001003271	WGS of T-cell and NK-cell lymphoma. The tumor samples were sequenced on the Illumina HiSeq 2500 platform and the resulting FASTQ files have been uploaded.	Illumina HiSeq 2000	102
EGAD00001003272	March 2017 data update (bam/fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2500	8
EGAD00001003273	Low-coverage whole genome sequencing for the establishment of genome-wide copy number alterations in pleural effusions and respective primary tumors	Illumina MiSeq	20
EGAD00001003274	Whole genome sequencing data for MMML (tumor/control pairs and one cell_line)		315
EGAD00001003275	Targeted resequencing of samples was done with TruSeq custom amplicon low input kit (TSCA-LI, Illumina). The oligo capture probes were designed to include a prefix of 8 random nucleotides at the 5 end of each probe. The assay is designed such that each targeted locus is annealed with two probes, resulting in amplicons tagged with unique molecular identifiers (UMI) (22) of 16 bases. Raw FASTQ sequencing files were processed as following: (a) The first 8 bases were trimmed from each read and recorded with the corresponding base quality scores (BQ) in the attribute field. (b) Reads were aligned with BWA. (c) First round of PCR duplicate cleaning was performed with picard tools markDuplicates using the parameters BARCODE_TAG=BC TAGGING_POLICY=All REMOVE_DUPLICATES=true (d) Since in the previous step only duplicate reads with identical UMIs were removed, a second pass of filtering was done. Reads with identical mapping were considered unique only if their corresponding UMIs were different in at least 3 positions (i.e., UMI edit distance > 2). (e) Paired-end read pairs overlapping genomic positions were clipped to avoid overestimation of the sequencing coverage using bamUtils clipOverlap.	NextSeq 550	74
EGAD00001003276	Whole genome sequencing data for MMML (24 tumor/control pairs), fastq-files	Illumina HiSeq 2000 Illumina HiSeq 2500	-
EGAD00001003278	Whole Exome and Target Sequencing Data in 75 Samples from 5 Hepatocellular Carcinoma Patients. The sequencing was performed by Illumina HiSeq 4000. Background and aims: Intratumoral heterogeneity (ITH) challenges identifying mutations with target therapy potential whereas circulating cell-free DNAs (cfDNAs) could reflect nearly the entire mutation spectrum in given tumors. We investigated how to minimize the limit of ITH for profiling hepatocellular carcinoma (HCC). Methods: Thirty-two multi-regional HCC samples from five patients were subjected to whole exome sequencing (WES) and targeted deep sequencing (TDS). ITH extent was measured by the average percentage of non-ubiquitous mutations (present in parts of tumor regions). Matched cfDNAs were also analyzed by WES and TDS. Profiling efficiencies of single tumor specimen and cfDNA were compared and the one that better depicted the mutational landscape was selected to screen therapeutic targets. Results: We found variable extents of ITH in HCCs and observed branched and parallel evolution patterns. ITH level decreased at higher sequencing depth of TDS than that measured by WES (28.1% vs 34.9%, P < 0.01) but it remained unchanged upon additional samples analyzed. TDS of single tumor specimen detected an average of 70% of the total mutations in HCC. Although more mutations were detected in cfDNA under TDS than WES, an average of 47.2% of total HCC mutations uncovered by cfDNA suggested that tissue outperforms cfDNA and the latter may serve as an alternative in profiling the HCC genome. Consequently, TDS of single tumor tissue in 66 patients and cfDNAs in four unresectable HCCs identified 38.6% (26/66 and 1/4) patients bearing therapeutic targets. Conclusions: TDS of single tumor specimen could largely circumvent ITH to uncover mutations indicative of target therapy in HCC.	Illumina HiSeq 4000	124
EGAD00001003279	RNA sequencing data for 170 medulloblastoma tumor samples	Illumina HiSeq 2000	171
EGAD00001003280		NextSeq 550	16
EGAD00001003281	Genomic alterations driving tumorigenesis result from the interaction of environmental exposures and endogenous cellular processes. With a diversity of risk factors including viral infection, carcinogenic exposures and metabolic diseases, liver cancer is an ideal model to study these interactions. Whole genome sequencing of liver tumors identified 10 mutational signatures showing distinct relationships with environmental exposures, replication and transcription. Transcription-coupled damage was specifically associated with the liver-specific signature 16 and alcohol intake. A flood of indels was identified in very highly expressed hepato-specific genes, likely resulting from replication-transcription collisions. Reconstruction of sub-clonal architecture revealed mutational signature evolution during tumor development, exemplified by the vanishing of the aflatoxin-B1 signature in African migrants. These findings shed new light on the natural history of liver cancers.	Illumina HiSeq 2000	52
EGAD00001003282	Analysis scripts and output		37
EGAD00001003283	Whole genome sequencing data for MMML (healthy cell_line)		24
EGAD00001003284	Whole exome sequencing of enteropathy-associated T cell lymphoma (EATL) tumors and paired normals, as well as RNA-sequencing of EATL tumors: including (1) 69 exome capture, paired-end Illumina Hiseq sequencing, BAM files from EATL tumor samples, (2) 36 exome capture, paired-end Illumina Hiseq sequencing, BAM files from EATL paired normal samples, and (3) 32 RNAseq, paired-end Illumina Hiseq sequencing, BAM files from EATL tumor samples.	Illumina HiSeq 2500	137
EGAD00001003285	RNA sequencing data for MMML (3 tumor samples and 1 gcbcell)		5
EGAD00001003286	Whole genome sequencing data for MMML (7 tumors and 8 controls)		15
EGAD00001003290	Whole genome sequencing for 12 late onset prostate cancer tumor/control pairs (ICGC)		24
EGAD00001003291	This dataset represents RNA-sequencing data from 278 primary colon cancers obtained from fresh-frozen tumor sections. RNA-sequencing was performed using TruSeq library preparation and samples were sequenced on Illumina NextSeq and HiSeq. The data are available as Illumina NextSeq and HiSeq fastq files (_R1.fastq and _R2.fastq for each tumor sample, 556 files in total).	Illumina HiSeq 2500 NextSeq 500	278
EGAD00001003292	WGS sequencing for cases from the ICGC ESAD-UK project. Tumours 50x, normals 30x. HiSeq X BAM files. These samples are all available in ICGC release 24.	Illumina HiSeq 2000	34
EGAD00001003293	RNA-Seq and WXS from 6 glioblastoma patients	Illumina HiSeq 2500	11
EGAD00001003294	Integrated callset of high coverage Ethiopian genomes from the Pagani et al. 2015 AJHG paper (doi: http://dx.doi.org/10.1016/j.ajhg.2015.04.019)		5
EGAD00001003295	Integrated callset of high coverage Egyptian genomes from the Pagani et al. 2015 AJHG paper (doi: http://dx.doi.org/10.1016/j.ajhg.2015.04.019)		3
EGAD00001003296	Integrated callset of low coverage Ethiopian and Egyptian genomes from the Pagani et al. 2015 AJHG paper (doi: http://dx.doi.org/10.1016/j.ajhg.2015.04.019)		220
EGAD00001003297			9
EGAD00001003298	BAM outputs from RSEM (https://deweylab.github.io/RSEM/) analysis of RNASeq sequencing on HiSeq platform of tumour samples from 95 pancreatic adenocarcinoma cases.		96
EGAD00001003301	Whole exome sequencing of 10 metastatic biopsies from four TRACERx100 patients (see EGA dataset EGAS00001002247), collected either after relapse or death. The data from these samples are initially published with Abbosh, C. et al. Phylogenetic ctDNA analysis depicts early stage lung cancer evolution. Nature, http://dx.doi.org/10.1038/nature22364 (2017). Abstract: Earlier detection of relapse following primary surgery for non-small cell lung cancer and the characterization of emerging subclones seeding metastatic sites might offer new therapeutic approaches to limit tumor recurrence. The potential to non-invasively track tumor evolutionary dynamics in ctDNA of early-stage lung cancer is not established. Here we conduct a patient-specific approach to ctDNA profiling in the first 100 lung TRACERx (TRAcking Cancer Evolution through therapy (Rx)) study participants, including one patient co-recruited to the PEACE (Posthumous Evaluation of Advanced Cancer Environment) post-mortem study. We identify independent predictors of ctDNA release in early-stage non-small cell lung cancer and perform tumor volume limit of detection analyses. Through blinded profiling of post-operative plasma, we observe evidence of adjuvant chemotherapy resistance and identify patients destined to experience recurrence of their lung cancer. Finally, we show that phylogenetic ctDNA profiling tracks the subclonal nature of lung cancer relapse and metastases, providing a new approach for ctDNA driven therapeutic studies.		10
EGAD00001003302		Illumina HiSeq 3000	21
EGAD00001003303	The evolution of four breast cancers was analyzed using longitudinal samples collected over 2-15 years. Whole-genome sequencing and single-cell RNA-Seq were used to analyze evolution. We have deposited VCF files for SNV, indel, and structural variant calls from WGS data, and a text file showing transcripts per million (TPM) expression for the single-cell RNA-Seq data.		16
EGAD00001003304	We collected tumor samples and adjacent normal mucosae from 46 patients with colorectal cancer during surgery from 2014 to 2016 at the First Affiliated Hospital of Chongqing Medical University (Chongqing, China) and the Research Institute of Surgery, Third Military Medical University (Chongqing, China). The qualified captured library of each sample was then loaded on Illumina HiSeq 2000 (Illumina, San Diego, CA) platforms and subjected to high-throughput sequencing.	Complete Genomics	38
EGAD00001003305	Diffuse Intrinsic Pontine Glioma (DIPG) is a fatal brain cancer that arises in the brainstem of children with no effective treatment. To understand what drives DIPGs we integrated whole-genome-sequencing with methylation, expression and copy-number profiling.	AB SOLiD System Illumina HiSeq 2500	23
EGAD00001003306	Exome sequencing data of 15 French Caucasian and 10 African-Caribbean men with prostate cancer.	Illumina HiSeq 2000	50
EGAD00001003307	In this project we will use exome sequencing to identify somatic mutations in lesions from a patient with a germline mutation in the protection of telomeres 1 gene (POT1). This dataset contains all the data available for this study on 2017-04-27.	Illumina HiSeq 2000 Illumina MiSeq	40
EGAD00001003308	This is an in vitro genome-wide CRISPR/cas9 screen in human glioblastoma stem cells, screening for genes essential for survival of these cells. These cells express cas9 and have been transfected with a guide RNA library causing gene knockouts. We will analyse the sequencing data for depletion of guide RNAs. This dataset contains all the data available for this study on 2017-04-27.	Illumina HiSeq 2000	10
EGAD00001003309	The study will investigate serial samples from the same patient taken at the time of MGUS or SMM diagnosis, and later at the time of evolution towards MM. Samples will be sequenced by whole genome along with a matched normal to obtain the highest possible amount of information to investigate genomic changes at disease evolution. This dataset contains all the data available for this study on 2017-04-27.	HiSeq X Ten	139
EGAD00001003310	There are 66 pairs of LAML cases (Complete Genomics) in this project, which belongs to LAML-CN. The library was constructed using the Complete Genomics protocol.	Complete Genomics	66
EGAD00001003311	Dataset contains one sample derived from gDNA of human fibroblasts. Files are in FASTQ format and were generated using the Agilent SureSelect Human All Exon 50Mb Kit followed by next-generation sequencing on a HiSeq 2000 instrument (Illumina).	Illumina HiSeq 2000	1
EGAD00001003315	This dataset includes the high-throughput sequencing data from a study entitled "Clonal History and Genetic Predictors of Transformation into Small Cell Carcinomas from Lung Adenocarcinomas". Whole-genome sequencing libraries were generated by PCR-free methods, and sequencing was run on HiSeq X or HiSeq 2500 machines. PCR duplicate-marked, indel-realigned, and base-recalibrated BAM files are provided in our dataset.	HiSeq X Ten Illumina HiSeq 2500	16
EGAD00001003316	RNAseq of LC2AD with AD80 or DMSO. Plenker et al., Mechanistic insight into RET kinase inhibitors targeting the DFG-out conformation in RET-rearranged cancer	Illumina HiSeq 2000	1
EGAD00001003317	There are 22 pairs of LAML cases in this project, which belongs to LAML-CN. The library was constructed using the Illumina protocol.	Illumina HiSeq 2000	63
EGAD00001003318	RNA-sequencing alignment for SYSCOL colorectal adenoma-carcinoma samples		314
EGAD00001003320	Transcriptome sequencing of tumour tissue, adjacent normal tissue and derived organoids/tumoroids from colorectal cancer. This dataset contains all the data available for this study on 2017-05-04.	Illumina HiSeq 2000 Illumina HiSeq 2500	106
EGAD00001003321	This dataset contains all the data available for this study on 2017-05-04.	Illumina HiSeq 2000	523
EGAD00001003323	Runs that contain data for the sensitivity and specificity experiments for BiSeqS.	Illumina MiSeq	2
EGAD00001003324		Illumina HiSeq 2500	21
EGAD00001003325	Exome from EGAS00001002441	Illumina HiSeq 2500	2
EGAD00001003326	Azoospermia, characterized by the absence of spermatozoa in the ejaculate is a common cause of male infertility with a poorly characterized etiology. Exome sequencing analysis of two azoospermic brothers allowed the identification of a homozygous splice mutation in SPINK2, encoding a serine protease inhibitor believed to target acrosin, the main sperm acrosomal protease. In accord with these findings we observed that homozygous Spink2 KO male mice had azoospermia. Moreover, despite normal fertility, heterozygous male mice had a high rate of morphologically abnormal spermatozoa and a reduced sperm motility. Further analysis demonstrated that in the absence of Spink2, protease-induced stress initiates Golgi fragmentation and prevents acrosome biogenesis leading to spermatid differentiation arrest. We also observed a deleterious effect of acrosin overexpression in HEK cells, effect that was alleviated by SPINK2 coexpression confirming its role as acrosin inhibitor. These results demonstrate that SPINK2 is necessary to neutralize proteases during their cellular transit towards the acrosome and that its deficiency induces a pathological continuum ranging from oligoasthenoteratozoospermia in heterozygotes to azoospermia in homozygotes.	Illumina HiSeq 2000	2
EGAD00001003328	Clinical and genetic information of an individual with RVOT-VT and a KCNK2 (TREK1) gene mutation obtained after whole exome sequencing.		1
EGAD00001003329	The offspring of first cousin marriages have ~6% of their genome autozygous, i.e. homozygous identical by descent, or even more if there was further consanguinity in their ancestry. In the UK there are large populations with very high first cousin marriage rates of 20-50%. Sequencing the exomes of a sample of these individuals has the potential both to support genetic health programmes in these populations, and to provide genetic research information about rare loss of function mutations. This pilot study based on existing cohort samples from the Born In Bradford study will identify homozygous individuals for almost all variants down to an allele frequency around 1%, plus individuals carrying hundreds of new homozygous rare loss-of-function variants, and will support development of community relations and ethics for a wider study currently being designed. The data deposited in the EGA consist of low coverage whole exome sequencing on these samples. This dataset contains all the data available for this study on 2017-05-11.	Illumina HiSeq 2000 Illumina HiSeq 2500	3188
EGAD00001003330	The samples will be sequenced for a targeted panel of cancer relevant genes (n ~ 370) and analysed for somatic mutations. This dataset contains all the data available for this study on 2017-05-11.	Illumina HiSeq 2000	416
EGAD00001003331	Whole-exome sequencing of a cohort of families (probands and affected/unaffected relatives) suffering from one of two rare thyroid disorders: congenital hypothyroidism (CH) and resistance to thyroid hormone (RTH). This dataset contains all the data available for this study on 2017-05-11.	Illumina HiSeq 2000 Illumina HiSeq 2500	78
EGAD00001003332	PCR and MiSeq validation for early embryonic substitution candidates from 400 breast cancer patients. This dataset contains all the data available for this study on 2017-05-11.	Illumina MiSeq	4
EGAD00001003334	Targeted exome sequencing of patient derived xenografts from primary colorectal tumours and liver metastases. This dataset contains all the data available for this study on 2017-05-11.	Illumina HiSeq 2000	573
EGAD00001003335	As a resource for assessing exon CNV calling methods in targeted NGS data, we present the ICR96 exon CNV validation series. The dataset includes high-quality sequencing data from a targeted NGS assay (the TruSight Cancer Panel) together with Multiplex Ligation-dependent Probe Amplification (MLPA) results for 96 independent samples. 66 samples contain at least one validated exon CNV and 30 samples have validated negative results for exon CNVs in 26 genes. The dataset includes 46 exon CNVs in BRCA1, BRCA2, TP53, MLH1, MSH2, MSH6, PMS2, EPCAM and PTEN, giving excellent representation of the cancer predisposition genes most frequently tested in clinical practice. Moreover, the validated exon CNVs include 25 single-exon CNVs, the most difficult exon CNVs to detect.	Illumina HiSeq 2500	96
EGAD00001003336	BAM outputs from RSEM (https://deweylab.github.io/RSEM/) analysis of RNASeq sequencing on HiSeq platform of tumour samples from 29 pancreatic neuroendocrine cases.		29
EGAD00001003337	T cells isolated from peripheral blood, tumors and adjacent normal tissues from six hepatocellular carcinoma patients. The SmartSeq2 and Tang2009 protocols were used to amplify RNA from single T cells. High depth enables simultaneous expression profiling and TCR assembly.	Illumina HiSeq 2500 Illumina HiSeq 4000	5063
EGAD00001003338	This is a test dataset derived from public data of the 1000 Genomes Project. Its purpose is not to allow for any inference about cohort data or results, but to aid bioinformaticians in the technical development and testing of tools, as well as data consumers in learning how to access information. This dataset consists of 2508 samples from the 1000 Genomes Project (https://www.nature.com/articles/nature15393). Samples' (e.g. NA18534) data can be accessed through the IGSR portal (e.g. https://www.internationalgenome.org/data-portal/sample/NA18534) or their corresponding folder at the 1000 Genomes' FTP site (e.g. http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000_genomes_project/data/CHB/NA18534/exome_alignment/). There are several different types of data this dataset encompasses: Variant Calling Format (VCF, or its binary counterparts BCF) files, both joint (e.g. ALL_chr22_20130502_2504Individuals.vcf.gz) and split (HG01775.chrY.vcf.gz); exome sequencing CRAM files (e.g. NA18534.GRCh38DH.exome.cram); whole genome sequencing CRAM/BAM files (e.g. NA19239.cram). Additionally, there are multiple files that were sliced to create shorter files, which allows for a quick download, formatted as "{FILE-INFO}__{NUMBER-OF-READS}r__{CHR}.{START-COORDINATE}-{END-COORDINATE}.{FILETYPE}" (e.g. "HG01500.GRCh38DH__90r__3.10000-10500__4.10000-10500.cram"). These files can be downloaded directly through the EGA-download-client PyEGA3 (https://github.com/EGA-archive/ega-download-client).	AB SOLiD 4 System unspecified	6
EGAD00001003339	Whole exome library making will be performed on genomic DNA derived from radiotherapy induced sarcoma samples and matched normal DNA from the same patients. Next Generation sequencing will be performed on the resulting libraries and mapped to build 37 of the human reference genome to facilitate the identification of mutations. This dataset contains all the data available for this study on 2017-05-17.	Illumina HiSeq 2000	7
EGAD00001003340	DDD DATAFREEZE 2016-10-03: 7831 trios - VCF files		1
EGAD00001003341	Sequence data from fungal infection isolated from neural tissue in ALS patients.	Illumina MiSeq	34
EGAD00001003342	Identification of fusion transcripts by RNA-sequencing and Whole genome sequencing of a breast cancer patient sample (METABRIC ID MB-0152)	Illumina HiSeq 2000	3
EGAD00001003344	Transcriptome profiling of 25 prostate tumor samples by RNA-Seq	Illumina HiSeq 2000	25
EGAD00001003345	Exome sequence data for 57 HIV elite long term non-progressors and rapid progressors. Complete dataset of improved BAMs mapped to hs37d5 and including phenotype information.		57
EGAD00001003347	This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ This dataset contains all the data available for this study on 2017-05-24.	Illumina HiSeq 2000	75
EGAD00001003348	The differentiation of distinct multifocal hepatocellular carcinoma (HCC): multicentric disease vs. intrahepatic metastases, in which the management and prognosis varies substantively, remains problematic. We aim to stratify multifocal HCC and identify novel diagnostic and prognostic biomarkers by performing whole genome and transcriptome sequencing, as part of a multi-omics strategy.	Illumina HiSeq 2000	8
EGAD00001003349	ChIP-seq data (H3K4Me1, H3K4Me3, H3K27Ac histone modifications) in experimental triplicates on multiple myeloma cell line KMS11 and plasma cell leukaemia cell lines L363 and JJN3. ChIP reactions were performed on a Diagenode SX-8G IP-Star Compact using Diagenode automated Ideal Kit. ChIP libraries were generated using HTP Illumina library preparation kit, and sequenced using Illumina HiSeq 2000 with 100 bp single-ended reads. ChIP-seq files are in BED format.	Illumina HiSeq 2000	9
EGAD00001003350	DDD DATAFREEZE 2016-10-03: 7831 trios - phenotypic and family descriptions		1
EGAD00001003351	To comprehensively investigate the genetic relationship between PTC tumors and benign nodules, we collected a total of 127 fresh-frozen biopsy samples from 28 patients with concurrent thyroid benign nodule and PTC (n=20) or simple benign nodule (n=8). We carried out whole-exome sequencing on all 127 biopsy samples and RNA sequencing on a total of 40 samples.	Illumina HiSeq 2500	127
EGAD00001003353	BAM outputs from STAR (https://github.com/alexdobin/STAR) analysis of RNASeq sequencing on HiSeq platform of 56 tumour samples from 46 melanoma cases. Gene model = Ensembl version 70		-
EGAD00001003354	From 9 patients undergoing hip joint replacement surgery for osteoarthritis, we collected 3 cartilage samples each: a low-grade sample (no obvious evidence of damage or fibrillation); a high-grade sample (damaged and fibrillated cartilage); an osteophytic sample (overlaid bony protrusions mainly around the margins of the articular surface). Multiplexed libraries were sequenced on Illumina HiSeq 2000 (75bp paired-end read length) and a cram file was produced for each sample. This dataset contains all the data available for this study on 2017-06-09.	Illumina HiSeq 2500	27
EGAD00001003355	From 17 patients undergoing knee joint replacement surgery for osteoarthritis, we collected 4 samples each: intact cartilage, degraded cartilage, synovium, and meniscus. We also collected blood for DNA analysis. Multiplexed libraries were sequenced on Illumina HiSeq 2000 (75bp paired-end read length) and a cram file was produced for each sample. This dataset contains all the data available for this study on 2017-06-09.	Illumina HiSeq 2500	72
EGAD00001003356	To date, there are two hypotheses about the pathogenesis of intravenous leiomyomatosis (IVL) and its relationship to uterine myoma. One theory suggests that IVL arises from smooth muscle cells in the vessel wall. The other indicates that IVL derives from the uterine myometrium. However, owing to technological limitations, few studies have explored the underlying relationship in depth. In this study, to identify the molecular relationship between IVL and uterine myoma, we conducted transcriptome (RNA) sequencing and bioinformatic analysis.	Illumina HiSeq 2000	20
EGAD00001003357	Aligned, merged and deduplicated BAM files from HiSeq whole exome sequencing of 106 samples: matched tumour-normal pairs from 53 melanoma patients.		-
EGAD00001003358	The dataset consists of samples from papillary thyroid cancer patients. A total of 181 DNA samples from blood/normal and cancer tissue are subjected to whole exome sequencing using Illumina. The fastq files generated were aligned with reference genome ‘hg19’, duplicates were marked, realignment around indels and quality recalibration were performed to produce good quality variants. The recalibrated “.bam” files are included with this dataset.		189
EGAD00001003359	In this study, we present the results of a custom “pan-cardiomyopathy panel” in a molecular screening of 38 unrelated patients, 16 affected by DCM, 14 by HCM, and 8 by ARVC. The panel was designed using the Design Studio Tool (Illumina, San Diego, CA, USA). Coding regions and intron–exon boundaries of 115 genes, known to be associated with 7 DCM, HCM, and ARVC as well as channelopathies, were selected for targeted gene enrichment. For genes with multiple transcripts, all exons included in transcripts expressed in cardiac muscle were considered in the gene panel design. Total DNA was extracted from peripheral blood samples using the Wizard Genomic DNA Purification Kit (Promega, Mannheim, Germany) according to the manufacturer’s instructions, quantified, and qualitatively checked using NanoDrop 2000c (Thermo Fisher Scientific, Waltham, MA, USA). Custom targeted gene enrichment and DNA library preparation were performed using the Nextera Capture Custom Enrichment kit (Illumina) according to the manufacturer’s instructions. Targeted regions were sequenced using the Illumina MiSeq platform, generating approximately two million 150-bp paired-end reads for each sample (Q30 ≥90%).	Illumina MiSeq	38
EGAD00001003360	Bam files containing mitochondrial alignments, extracted from CPCGene Whole Genome Alignments		432
EGAD00001003361	VCF files containing mitochondrial variant calls using MToolbox		432
EGAD00001003362	RNAseq on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample (EPO2_cohort)	Illumina HiSeq 2000	49
EGAD00001003363	Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample (EPO2_cohort)	Illumina HiSeq 2000	114
EGAD00001003364	RNAseq on Illumina HiSeq2000/2500 of colorectal cancer metastasis sample (OT2_cohort)	Illumina HiSeq 2000	4
EGAD00001003365	RNAseq on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample (OT2_cohort)	Illumina HiSeq 2000	1
EGAD00001003366	RNAseq on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer metastasis sample (OT2_cohort)	Illumina HiSeq 2000	10
EGAD00001003367	RNAseq on Illumina HiSeq2000/2500 of colorectal cancer primary tumor sample (OT2_cohort)	Illumina HiSeq 2000	7
EGAD00001003368	RNAseq on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer primary tumor sample (OT2_cohort)	Illumina HiSeq 2000	1
EGAD00001003369	RNAseq on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample (OT2_cohort)	Illumina HiSeq 2000	13
EGAD00001003370	Whole-genome sequencing on Illumina HiSeq2000/2500 of Blood EDTA (OT2_cohort)	Illumina HiSeq 2000	10
EGAD00001003371	Whole-genome sequencing on Illumina HiSeq2000/2500 of normal colon control tissue (OT2_cohort)	Illumina HiSeq 2000	1
EGAD00001003372	Whole-genome sequencing on Illumina HiSeq2000/2500 of colorectal cancer metastasis sample (OT2_cohort)	Illumina HiSeq 2000	4
EGAD00001003373	Whole-genome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample (OT2_cohort)	Illumina HiSeq 2000	1
EGAD00001003374	Whole-genome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer metastasis sample (OT2_cohort)	Illumina HiSeq 2000	8
EGAD00001003375	Whole-genome sequencing on Illumina HiSeq2000/2500 of colorectal cancer primary tumor sample (OT2_cohort)	Illumina HiSeq 2000	7
EGAD00001003376	Whole-genome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample (OT2_cohort)	Illumina HiSeq 2000	12
EGAD00001003377	Whole-exome sequencing on Illumina HiSeq2000/2500 of Blood EDTA (OT2_cohort)	Illumina HiSeq 2000	10
EGAD00001003378	Whole-exome sequencing on Illumina HiSeq2000/2500 of normal colon control tissue (OT2_cohort)	Illumina HiSeq 2000	1
EGAD00001003379	Whole-exome sequencing on Illumina HiSeq2000/2500 of colorectal cancer metastasis sample (OT2_cohort)	Illumina HiSeq 2000	4
EGAD00001003380	Whole-exome sequencing on Illumina HiSeq2000/2500 of PDO culture derived from colorectal cancer metastasis sample (OT2_cohort)	Illumina HiSeq 2000	1
EGAD00001003381	Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer metastasis sample (OT2_cohort)	Illumina HiSeq 2000	10
EGAD00001003382		MinION	26
EGAD00001003383	Whole-exome sequencing on Illumina HiSeq2000/2500 of colorectal cancer primary tumor sample (OT2_cohort)	Illumina HiSeq 2000	7
EGAD00001003384	Whole-exome sequencing on Illumina HiSeq2000/2500 of Patient-derived xenograft derived from colorectal cancer primary tumor sample (OT2_cohort)	Illumina HiSeq 2000	14
EGAD00001003385	Whole-exome sequencing on AB 5500xl Genetic Analyzer of Blood EDTA (OT2_cohort)	AB 5500 Genetic Analyzer	1
EGAD00001003386	Whole-exome sequencing on AB 5500xl Genetic Analyzer of colorectal cancer primary tumor sample (OT2_cohort)	AB 5500 Genetic Analyzer	1
EGAD00001003387		MinION	19
EGAD00001003388	Aligned, merged and deduplicated BAM files from HiSeq whole genome sequencing of 366 samples: matched tumour-normal pairs from 183 melanoma cases comprising 48 primary melanomas, 15 cell lines, and 120 metastases. Sequencing was performed on the Illumina HiSeq 2000 and Xten platforms at Australian and Korean sequencing centres. Data was aligned to the human genome (GRCh37) using BWA-MEM.		-
EGAD00001003389	WGS and WXS files for Dyer ATRX study	Illumina HiSeq 2000	6
EGAD00001003390	DCM cases (149 human DCM samples): human heart biopsies from 149 patients with dilated cardiomyopathy (DCM) were subjected to RNA sequencing in order to assess transcriptome variation. We used Illumina HiSeq2000 technology. Each sample dataset contains the output from tophat-1.4.1 (one .bam file with the aligned reads and two .fq files, one with the unaligned forward reads and one with the unaligned reverse reads). We reveal extensive differences in gene expression and splicing between dilated cardiomyopathy patients and controls.	Illumina HiSeq 2000	149
EGAD00001003391	DCM controls (113 human non-DCM samples): human heart biopsies from 113 non-diseased controls were subjected to RNA sequencing in order to assess transcriptome variation. We used Illumina HiSeq2000 technology. Each sample dataset contains the output from tophat-1.4.1 (one .bam file with the aligned reads and two .fq files, one with the unaligned forward reads and one with the unaligned reverse reads). We reveal extensive differences in gene expression and splicing between dilated cardiomyopathy patients and controls.	Illumina HiSeq 2000	113
EGAD00001003392	High-coverage WGS sequencing of DNA samples from 51 pairs of GCs was performed on the Illumina HiSeq X Ten System.	Illumina HiSeq 2000	102
EGAD00001003393	This dataset contains bam files for RNA-seq experiments for 6 neuroblastoma PDXs (Patient Derived Xenograft) and 3 pairs of neuroblastoma tumors at diagnosis and at relapse.	Illumina HiSeq 2500 Illumina HiSeq 4000 NextSeq 500	12
EGAD00001003394	This dataset contains bam files for ChIP-seq experiments for 6 neuroblastoma PDXs (Patient Derived Xenograft). It includes the bam files for the H3K27ac mark as well as the bam files of the corresponding input DNA for each sample.	Illumina HiSeq 2500	6
EGAD00001003395	This dataset consists of the exome sequencing data for 30 tumour and germline DNA pairs derived from relapsed/refractory DLBCL.		60
EGAD00001003396	WGS minibam files for SJLIFE	Illumina HiSeq 2000	3036
EGAD00001003397	Twenty samples were collected in pairs, i.e., HCC tissue and adjacent non-cancerous tissue. The collected tissue samples were stored in liquid nitrogen. First, 50 mg of tissue was lysed in TRIzol (Invitrogen) to extract RNA following the manufacturer’s instructions. Next, ribosomal RNA was depleted using a RiboZero Gold kit (Epicentre Bio-technologies). RNA integrity was assessed with an Agilent Bioanalyzer 2100. An RNA-Seq library was generated with the rRNA-depleted samples using an Illumina standard RNA Sample Prep kit according to the manufacturer’s instructions. The library was subsequently sequenced on an Illumina HiSeq2500 as 125-bp paired-ends with approximately 300-bp size selection.	Illumina HiSeq 2500	20
EGAD00001003399	RNAseq dataset of 34 samples (6 normals, 7 stroma-enriched, 21 malignant cells-enriched) from patients with resected pancreatic ductal carcinoma.	Illumina HiSeq 4000	34
EGAD00001003400	We present targeted NGS panel data from 170 samples that were processed using the TruSight™ Cancer (TSC) panel (Illumina, San Diego, CA, USA), which targets 94 genes and 284 SNPs associated with a predisposition towards cancer. The samples are enriched for CNVs in the genes of interest. All CNVs have previously been assessed with MLPA and can therefore be considered as confirmed.	Illumina MiSeq	170
EGAD00001003404	RRBS sequencing of 7 tumour regions and a normal sample from a single TRACERx patient.	Illumina HiSeq 2500	8
EGAD00001003405	High-coverage WGS sequencing of DNA samples from 23 pairs of GCs was performed on the Illumina HiSeq X Ten System.	Illumina HiSeq 2000	46
EGAD00001003406	DDD DATAFREEZE 2016-10-03: 7831 trios - exome sequence CRAM files		1
EGAD00001003407	Whole-genome sequencing and phasing of admixed Aboriginal Australian genomes and Papua New Guinean genomes using 10x Genomics Chromium technology. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ This dataset contains all the data available for this study on 2017-06-27.	HiSeq X Ten	4
EGAD00001003408	Chip-Seq sequencing data of Atypical teratoid/rhabdoid tumors (ATRT)	Illumina HiSeq 2000	19
EGAD00001003409	Amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) are part of a clinical, pathological and genetic continuum. The purpose of the present study was to assess the mutation burden in known ALS and/or FTD disease-causing genes in 54 patients (16 with available postmortem neuropathological diagnosis) with concurrent ALS and FTD (ALS/FTD) not carrying the C9orf72 hexanucleotide repeat expansion, the most important genetic cause in both diseases.	Illumina HiSeq 2500	54
EGAD00001003410	ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: PACA-AU.		81
EGAD00001003411	ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: PACA-AU.		81
EGAD00001003412		Illumina HiSeq 2000	152
EGAD00001003413		Illumina HiSeq 2000	145
EGAD00001003414	June 2017 data update (bam/fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2500 NextSeq 500	40
EGAD00001003415	ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: OV-AU.		93
EGAD00001003416	ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: OV-AU.		93
EGAD00001003417	This dataset includes genomic information of 19 adult cerebellar glioblastomas (C-GBMs). Whole-exome sequencing data are available for 9 C-GBMs and their 8 corresponding matched blood samples, and glioma-specific targeted DNA sequencing (GliomaSCAN) data from an additional 10 C-GBMs are also available. Among them, whole-transcriptome sequencing was conducted for 6 C-GBM tumors.	Illumina HiSeq 2500	34
EGAD00001003419	Whole exome sequencing from primary human JMML samples	Illumina HiSeq 2000	50
EGAD00001003421	Sequence data of 28 Samples (19 chronic lymphocytic leukemia, 9 control) Including RNA-Seq and ChIP-Seq of following histone modifications: H3, H3K4me1, H3K4me3, H3K9ac, H3K9me3, H3K27ac, H3K27me3, H3K36me3 Project see: http://www.cancerepisys.org/		28
EGAD00001003422	WXS from barcoded cells that are FACS sorted from GBM-719 xenografts, and the germline reference from patient GBM-719. The 4 xenografts are named according to passage (secondary or tertiary) and treatment (vehicle control or temozolomide).	Illumina HiSeq 2500	5
EGAD00001003423	Pulmonary arterial hypertension (PAH) is a rare disorder with a poor prognosis. Deleterious variation within genes encoding components of the transforming growth factor-β pathway underlies the majority of heritable forms of PAH. Identifying the missing genetic contribution is challenging, even with genes of large effect size, since it likely involves mutations in genes confined to small numbers of PAH cases. In this study, we performed whole genome sequencing, comparing 1038 PAH index cases to 6385 subjects with other rare diseases. Rare variant analysis identified mutations in novel causal genes, namely ATP13A3, AQP1 and SOX17, and provided independent validation of a critical role for GDF2 in PAH. We detected mutations predicted to be disruptive of function in most, but not all, previously reported PAH genes. Taken together these findings provide new insights into the molecular basis of PAH, and support a central role for endothelial dysregulation in disease pathogenesis.	Illumina HiSeq 2000	149
EGAD00001003425	An EGFR-mutant NSCLC cell line which is sensitive to AZD9291 inhibition was mutagenised with the chemical mutagen ENU and then drug selected using AZD9291. Single cell derived colonies were then manually picked and expanded in drug. Resistance was confirmed in a 14 day assay and DNA was collected. These then underwent targeted amplicon-based sequencing to confirm candidate resistance effectors hypothesised from currently available literature. This dataset contains all the data available for this study on 2017-07-05.	Illumina MiSeq	177
EGAD00001003426	High depth whole genome sequencing from GemCode (10x Genomics) DNA libraries containing long range linkage information for one Baganda trio and one Baganda child (parent already sequenced at high depth). This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ This dataset contains all the data available for this study on 2017-07-05.	Illumina HiSeq 2500	16
EGAD00001003427	Genome-wide profiling of DNA methylation levels by RRBS in 349 samples, derived from 112 glioblastoma (IDH wildtype) patients, 13 IDH mutated brain tumor patients, and 5 normal brain controls. For each patient, samples from at least two and up to six tumor resections are available. For 6 patients, multiple regions of each tumor were sampled.	Illumina HiSeq 2000 Illumina HiSeq 3000 Illumina HiSeq 4000	349
EGAD00001003428	RNAseq data from the study: "Widespread DNA hypomethylation and differential gene expression in Turner syndrome".	Illumina HiSeq 2000 NextSeq 500	37
EGAD00001003429	RNA analysis of two patients 11 and 15 with WGS done on Illumina HiSeq2000. For research purposes and authorised users only.	Illumina HiSeq 2000	2
EGAD00001003430	RNA analysis of six patients 34, 35, 36, 37, 38 and 39 with WGS done on Illumina HiSeq2500. For research purposes and authorised users only.	Illumina HiSeq 2500	6
EGAD00001003431	High-coverage WGS sequencing of DNA samples from 45 pairs of GCs was performed on the Illumina HiSeq X Ten System.	Illumina HiSeq 2000	88
EGAD00001003432	ChIP-Seq data for the paper titled "Orthotopic Patient-Derived Xenografts of Pediatric Solid Tumors"	Illumina HiSeq 2000	20
EGAD00001003433	RNA-Seq data for the paper titled "Orthotopic Patient-Derived Xenografts of Pediatric Solid Tumors"	Illumina HiSeq 2000	98
EGAD00001003434	Whole Exome Sequencing for the paper titled "Orthotopic Patient-Derived Xenografts of Pediatric Solid Tumors"	Illumina HiSeq 2000	149
EGAD00001003435	Whole Genome Sequencing for the paper titled "Orthotopic Patient-Derived Xenografts of Pediatric Solid Tumors"	Illumina HiSeq 2000	150
EGAD00001003436	Seven files of patients 3, 21, 29, 30, 31, 32 and 33 with WGS done on Illumina MiSeq with high coverage. For research purposes and authorised users only.	Illumina MiSeq	7
EGAD00001003437	Fourteen files of patients 1, 2, 4, 6, 7, 8, 9, 12, 14, 16, 17, 18, 19 and 27 with WGS done on Illumina MiSeq with low coverage. For research purposes and authorised users only.	Illumina MiSeq	14
EGAD00001003438	Three files of patients 20, 23 and 25 with WGS done on Illumina HiSeq 2000. For research purposes and authorised users only.	Illumina HiSeq 2000	3
EGAD00001003439	Three files of patients 10, 11 and 13 with WGS done on Illumina HiSeq X Ten. For research purposes and authorised users only.	HiSeq X Ten	3
EGAD00001003440	One file of patient 16 with WGS done on Illumina HiSeq X Ten. For research purposes and authorised users only.	HiSeq X Ten	1
EGAD00001003441	A total of 584 tumor specimens and/or patient-derived cells across 14 cancer types were subjected to whole-exome/targeted-exome and/or whole-transcriptome sequencing.	Illumina HiSeq 2500	584
EGAD00001003443	Massively parallel nanowell-based single-cell gene expression profiling	Illumina HiSeq 2500	14
EGAD00001003444	This dataset contains both standard RNA-Seq and small RNA-Seq of TSC-related cortical tubers and age-matched cortical controls. For the standard RNA-Seq, paired-end sequencing was carried out. Each sample was split across multiple lanes. For the files available here, the multiple lanes have been merged together, resulting in one forward and one reverse .fastq file for each sample. Small RNA-Seq was carried out on the same samples that underwent standard RNA-Seq. Again, paired-end sequencing was carried out. The files here are raw and will need to undergo quality control and trimming.	Illumina HiSeq 2500	44
EGAD00001003445	Clear cell renal cancer is characterized by near-universal loss of the short arm of chromosome 3 (3p). This event arises through unknown mechanisms, but critically results in the loss of several tumor suppressor genes. We analyzed whole genomes from 95 biopsies across 33 patients with clear cell renal cancer (ccRCC) recruited into the Renal TRACERx study. We find novel hotspots of point mutations in the 5'-UTR of TERT, targeting a MYC-MAX repressor, that result in telomere lengthening. The most common structural abnormality generates simultaneous 3p loss and 5q gain (36% patients), typically through chromothripsis. Using molecular clocks, we estimate this occurs in childhood or adolescence, generally preceding emergence of the most recent common ancestor by years to decades. Similar genomic changes are seen in inherited kidney cancers. Modeling differences in age-incidence between inherited and sporadic cancers suggests that the number of cells with 3p loss capable of initiating sporadic tumors is no more than a few hundred. Targeting essential genes in deleted regions of chromosome 3p could represent a potential preventative strategy for renal cancer.	HiSeq X Ten	164
EGAD00001003446	This dataset includes deep coverage (>60x) whole exomes of 15 human embryonic stem cell lines. Genomic DNA was purified and fragmented using the Illumina Nextera system for library preparation and sequenced using 150bp paired-end reads. Sequencing reads were aligned to the hg19 reference genome using the BWA MEM alignment program.	HiSeq X Ten	15
EGAD00001003448	Strand-specific RNA-seq data from 19 gastric tumors and their adjacent normal tissues, plus 16 gastric cancer cell lines, one normal gastric cell line, and 3 normal stomach RNAs	Illumina HiSeq 2500	58
EGAD00001003452	The samples include paired tumor and normal tissues from 205 patients (201 for normal and primary tumor tissues; 4 for normal, primary tumor and liver metastatic tissues). High-coverage WES or whole genome sequencing of DNA samples was performed on the Illumina HiSeq 2000 system.	Illumina HiSeq 2000	30
EGAD00001003453	16S sequencing of stool samples of LifeLines-DEEP, domain V4	Illumina MiSeq	1010
EGAD00001003454	Validation of HLA variation of 8 individuals from the GenomeDenmark Phase 2 study. Validation is performed by Sanger sequencing of selected amplicons (5-10 amplicons per sample).	AB 3730xL Genetic Analyzer	8
EGAD00001003455	The MHC vcf call set was generated using a modified AsmVar and BayesTyper pipeline. In contrast to the original pipeline, where variant calling is performed using alignment of collapsed assemblies to a reference genome, the MHC call set was produced using alignment of phased MHC haplotypes. Two iterations of BayesTyper were run: a first iteration for each haplotype separately and a second iteration performing joint variant calling on all haplotypes. The sample IDs for the fathers and mothers are TrioID-01 and TrioID-02, respectively, and the IDs for the children are TrioID-0x, where x is a number between 3 and 7.		25
EGAD00001003456	There are 5 WGS and 35 WES sample pairs from the First Affiliated Hospital of Kunming Medical University, which belong to the ICGC project COCA-CN.	Illumina HiSeq 2000	80
EGAD00001003457	Placental biopsies (n = 64 female placentas, n = 67 male placentas) were selected from healthy pregnancies from the POPs cohort. These patients had no evidence of hypertension at booking and during pregnancy, did not experience pre-eclampsia, Hemolysis, Elevated Liver enzymes, and Low Platelets (HELLP) syndrome, gestational diabetes, or diabetes mellitus type I or type II and other obstetric complications. They delivered live babies with a birth weight percentile in the normal range (20-80th percentile), with no evidence of slowing in fetal growth trajectory. Chorionic villi from the corresponding placentas (free from decidua, visible infarction, calcification, hematoma, or damage) were collected and processed within 30 minutes of separation from the uterus. After repeated washes in chilled phosphate buffered saline, the samples were placed in RNAlater (Applied Biosystems) and stored at -80°C. Total placental RNA was extracted using mirVana Isolation Kit (Ambion). For each placenta, approximately 5 mg of tissue were homogenized in the Lysis/Binding solution for 20 sec at 6 m/s using a bead beater (FastPrep24) and Lysing Matrix D Tubes (MP Biomedicals). The samples were then spun at 13,000 rpm for 5 min at 4°C and the supernatants recovered. Afterwards, the manufacturer's instructions were followed. Immediately after the RNA extraction, placental RNA samples were DNase-treated using DNA-free DNA Removal Kit (Ambion), aliquoted, and stored at -80°C. Quantity and quality of the RNA samples were assessed using the Agilent 2100 Bioanalyzer, the Agilent RNA 6000 Nano Kit (Agilent Technologies), and Qubit fluorometer. Libraries were prepared starting with 300-500 ng of good quality total RNA (RIN ≥7.5) using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Human/Mouse/Rat (Illumina), according to the manufacturer's instructions. The kit contains 96 uniquely indexed adapter combinations in order to allow pooling of multiple samples prior to sequencing. After determining their size (with the Agilent 2100 Bioanalyzer and the Agilent High Sensitivity DNA Kit by Agilent Technologies) and concentration (by qPCR with the KAPA Illumina ABI Prism Library Quantification Kit, Kapa Biosystems), libraries were pooled and sequenced (single-end, 125 bp) using a Single End V4 Cluster Kit and an Illumina HiSeq2500 or HiSeq4000 instrument.	Illumina HiSeq 4000	147
EGAD00001003458	Fastq data on the genomic heterogeneity of multiple synchronous lung cancers (MSLCs). Whole-genome sequencing (WGS) was performed on 3 tumour samples, one regional lymph node metastasis sample and a peripheral blood sample from the same patient with MSLCs.	Illumina HiSeq 2000	6
EGAD00001003459	Single cell transcriptomics of PBMCs of 47 donors from the Lifelines Deep cohort (general population, Northern part of the Netherlands). Cells of five or six different donors were pooled together in one sample pool, resulting in eight different sample pools. In total, 28,855 cells were captured and their transcriptomes were sequenced to an average depth of 74k. Genotype data was available for each donor, which allowed us to use the Demuxlet method that uses variable SNPs between the pooled individuals to determine which cell belongs to which individual. Since genotype information is lacking for 2 individuals, the transcriptomes of only 45 individuals could be retrieved.	Illumina HiSeq 4000	8
EGAD00001003460		Illumina HiSeq 2000 Illumina HiSeq 2500	13
EGAD00001003461	H3K27ac ChIP-seq and input genome sequencing was performed in 19 primary prostate tumours classified as intermediate risk. Sequencing of ChIP DNA was performed on an Illumina HiSeq 2000 as either single end 50 bp reads (for 7 samples) or paired end 100 bp reads (for 12 samples). Input DNA from all samples was sequenced using single-end 50 bp reads. The files provided are in fastq format.	Illumina HiSeq 2000	38
EGAD00001003462	Placental biopsies (n = 64 female placentas, n = 67 male placentas) were selected from healthy pregnancies from the POPs cohort. A quality control process was also applied for the RNA-Seq datasets: reads were trimmed with Trim Galore!, which uses cutadapt internally, and were mapped to the same version of the human genome reference (hg19). TopHat2, a splice-aware mapper built on top of the Bowtie2 short-read aligner, was used in the mapping process, in which a so-called two-pass (or two-scan) alignment protocol was applied to rescue unmapped reads from the initial mapping step. In the second mapping, previously unmapped reads were re-aligned to the exon-intron junctions detected in the first mapping by TopHat2 and combined across all 131 placenta samples. The initial and second mapped reads were merged by samtools.	Illumina HiSeq 4000	147
EGAD00001003463	These are the vcf files of exome sequencing of the two probands who were found to harbor mutations in KLB. Sample EGAN00001564799 is proband 1 and sample EGAN00001564800 is proband 11 in the KLB paper. Exome capture was performed using the SureSelect All Exon capture (Agilent Technologies, Santa Clara, CA USA) and sequenced on the HiSeq2500 (Illumina, San Diego CA USA).		2
EGAD00001003464	For RNA-Seq, total RNA was isolated following LDC67 or JQ1 treatment. 3’RNAseq libraries were prepared with the QUANT SEQ FWD 3´mRNA-Seq Kit (Lexogen, Austria) and sequenced on an Illumina HiSeq 4000.	Illumina HiSeq 2000	3
EGAD00001003466	This dataset contains 21 tumor-normal pairs of exome sequencing data of HCC patients from Chang Gung Memorial Hospital, Taiwan.	Illumina HiSeq 2500	42
EGAD00001003467	This dataset contains 77 tumor-normal pairs of exome sequencing data of HCC patients from National Taiwan University, Taiwan.	Illumina HiSeq 2500	154
EGAD00001003468	A CKD23_C_Mesan_WGBS paired end data for Mesangial cells(kidney)	HiSeq X Ten	1
EGAD00001003469	A CKD24_C_Podo_WGBS paired end data for Podocytes(CD90(-) Podocalyxin(+), kidney)	HiSeq X Ten	1
EGAD00001003470	A CKD25_C_Podo_WGBS paired end data for Podocytes(CD90(-) Podocalyxin(+), kidney)	HiSeq X Ten	1
EGAD00001003471	A CKD27_C_Mesan_WGBS paired end data for Mesangial cells(kidney)	HiSeq X Ten	1
EGAD00001003472	A DB31_N_Alpha_WGBS paired end data for alpha cells(PSA-NCAM(-), pancreas)	HiSeq X Ten	1
EGAD00001003473	A IPS01_N_Fibroblast_WGBS paired end data for iPSC(Oct4)	HiSeq X Ten	1
EGAD00001003474	A IPS02_N_NPC_WGBS paired end data for Neural progenitor cells(Nestin)	HiSeq X Ten	1
EGAD00001003475	A IPS03_N_ENeuron_WGBS paired end data for Early neuron cells(Tuj1)	HiSeq X Ten	1
EGAD00001003476	A IPS04_X_Fibroblast_WGBS paired end data for iPSC(Oct4)	HiSeq X Ten	1
EGAD00001003477	A IPS05_X_NPC_WGBS paired end data for Neural progenitor cells(Nestin)	HiSeq X Ten	1
EGAD00001003478	A IPS06_X_ENeuron_WGBS paired end data for Early neuron cells(Tuj1)	HiSeq X Ten	1
EGAD00001003479	A OB56_N_PreA_WGBS paired end data for Preadipocytes(fat)	HiSeq X Ten	1
EGAD00001003480	A OB57_D_PreA_WGBS paired end data for Preadipocyte(fat)	HiSeq X Ten	1
EGAD00001003481	A CKD23_C_Mesan_mRNA-Seq paired end data for Mesangial cells(kidney)	Illumina HiSeq 2500	1
EGAD00001003482	A CKD24_C_Podo_mRNA-Seq paired end data for Podocytes(CD90(-) Podocalyxin(+), kidney)	Illumina HiSeq 2500	1
EGAD00001003483	A CKD25_C_Podo_mRNA-Seq paired end data for Podocytes(CD90(-) Podocalyxin(+), kidney)	Illumina HiSeq 2500	1
EGAD00001003484	A CKD27_C_Mesan_mRNA-Seq paired end data for Mesangial cells(kidney)	Illumina HiSeq 2500	1
EGAD00001003485	A DB31_N_Alpha_mRNA-Seq paired end data for alpha cells(PSA-NCAM(-), pancreas)	Illumina HiSeq 2500	1
EGAD00001003486	A OB56_N_PreA_mRNA-Seq paired end data for Preadipocytes(fat)	Illumina HiSeq 2500	1
EGAD00001003487	A OB57_D_PreA_mRNA-Seq paired end data for Preadipocyte(fat)	Illumina HiSeq 2500	1
EGAD00001003488	A IPS01_N_Fibroblast_mRNA-Seq paired end data for iPSC(Oct4)	Illumina HiSeq 2500	1
EGAD00001003489	A IPS02_N_NPC_mRNA-Seq paired end data for Neural progenitor cells(Nestin)	Illumina HiSeq 2500	1
EGAD00001003490	A IPS03_N_ENeuron_mRNA-Seq paired end data for Early neuron cells(Tuj1)	Illumina HiSeq 2500	1
EGAD00001003491	A IPS04_X_Fibroblast_mRNA-Seq paired end data for iPSC(Oct4)	Illumina HiSeq 2500	1
EGAD00001003492	A IPS05_X_NPC_mRNA-Seq paired end data for Neural progenitor cells(Nestin)	Illumina HiSeq 2500	1
EGAD00001003493	A IPS06_X_ENeuron_mRNA-Seq paired end data for Early neuron cells(Tuj1)	Illumina HiSeq 2500	1
EGAD00001003494	A DB31_N_Alpha_smRNA-Seq single end data for alpha cells(PSA-NCAM(-), pancreas)	Illumina HiSeq 2500	1
EGAD00001003495	A OB56_N_PreA_smRNA-Seq single end data for Preadipocytes(fat)	Illumina HiSeq 2500	1
EGAD00001003496	A OB57_D_PreA_smRNA-Seq single end data for Preadipocyte(fat)	Illumina HiSeq 2500	1
EGAD00001003497	A CKD23_C_Mesan_smRNA-Seq single end data for Mesangial cells(kidney)	Illumina HiSeq 2500	1
EGAD00001003498	A CKD24_C_Podo_smRNA-Seq single end data for Podocytes(CD90(-) Podocalyxin(+), kidney)	Illumina HiSeq 2500	1
EGAD00001003499	A CKD25_C_Podo_smRNA-Seq single end data for Podocytes(CD90(-) Podocalyxin(+), kidney)	Illumina HiSeq 2500	1
EGAD00001003500	A CKD27_C_Mesan_smRNA-Seq single end data for Mesangial cells(kidney)	Illumina HiSeq 2500	1
EGAD00001003501	A IPS01_N_Fibroblast_smRNA-Seq single end data for iPSC(Oct4)	Illumina HiSeq 2500	1
EGAD00001003502	A IPS02_N_NPC_smRNA-Seq single end data for Neural progenitor cells(Nestin)	Illumina HiSeq 2500	1
EGAD00001003503	A IPS03_N_ENeuron_smRNA-Seq single end data for Early neuron cells(Tuj1)	Illumina HiSeq 2500	1
EGAD00001003504	A IPS04_X_Fibroblast_smRNA-Seq single end data for iPSC(Oct4)	Illumina HiSeq 2500	1
EGAD00001003505	A IPS05_X_NPC_smRNA-Seq single end data for Neural progenitor cells(Nestin)	Illumina HiSeq 2500	1
EGAD00001003506	A IPS06_X_ENeuron_smRNA-Seq single end data for Early neuron cells(Tuj1)	Illumina HiSeq 2500	1
EGAD00001003507	All the samples were obtained from the Pregnancy Outcome Prediction study, a prospective cohort study of nulliparous women attending the Rosie Hospital, Cambridge (UK) for their dating ultrasound scan between January 14, 2008, and July 31, 2012. Ethical approval for the study was given by the Cambridgeshire 2 Research Ethics Committee (reference number 07/H0308/163) and all participants provided written informed consent. Cases of preeclampsia (PET) were defined on the basis of the 2013 ACOG criteria and cases of small for gestational age (SGA) infants were confined to severe SGA, i.e. a customized birth weight <5th percentile. Chorionic villi from the corresponding placentas (free from decidua, visible infarction, calcification, hematoma, or damage) were collected and processed within 30 minutes of separation from the uterus. After repeated washes in chilled phosphate buffered saline, the samples were placed in RNAlater (Applied Biosystems) and stored at -80°C. Total placental RNA was extracted using mirVana Isolation Kit (Ambion). For each placenta, approximately 5 mg of tissue were homogenized in the Lysis/Binding solution for 20 sec at 6 m/s using a bead beater (FastPrep24) and Lysing Matrix D Tubes (MP Biomedicals). The samples were then spun at 13,000 rpm for 5 min at 4°C and the supernatants recovered. Afterwards, the manufacturer's instructions were followed. Immediately after the RNA extraction, placental RNA samples were DNase-treated using DNA-free DNA Removal Kit (Ambion), aliquoted, and stored at -80°C. Quantity and quality of the RNA samples were assessed using the Agilent 2100 Bioanalyzer, the Agilent RNA 6000 Nano Kit (Agilent Technologies), and Qubit fluorometer. Libraries were prepared starting with 300-500 ng of good quality total RNA (RIN ≥7.5) using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Human/Mouse/Rat (Illumina), according to the manufacturer's instructions. The kit contains 96 uniquely indexed adapter combinations in order to allow pooling of multiple samples prior to sequencing. After determining their size (with the Agilent 2100 Bioanalyzer and the Agilent High Sensitivity DNA Kit by Agilent Technologies) and concentration (by qPCR with the KAPA Illumina ABI Prism Library Quantification Kit, Kapa Biosystems), libraries were pooled and sequenced (single-end, 125 bp) using a Single End V4 Cluster Kit and an Illumina HiSeq2500 or HiSeq4000 instrument.	Illumina HiSeq 4000	52
EGAD00001003508	All the samples were obtained from the Pregnancy Outcome Prediction–a prospective cohort study of nulliparous women attending the Rosie Hospital, Cambridge (UK) for their dating ultrasound scan between January 14, 2008, and July 31, 2012. Ethical approval for the study was given by the Cambridgeshire 2 Research Ethics Committee (reference number 07/H0308/163) and all participants provided written informed consent. Cases of preeclampsia (PET) were defined on the basis of the 2013 ACOG criteria and cases of small for gestational age (SGA)infants were confined to severe SGA, i.e. a customized birth weight <5th percentile. Chorionic villi from the corresponding placentas (free from decidua, visible infarction, calcification, hematoma, or damage) were collected and processed within 30 minutes of separation from the uterus. After repeated washes in chilled phosphate buffered saline, the samples were placed in RNA later (Applied Biosystems) and stored at -80°C. Total placental RNA was extracted using mirVana Isolation Kit (Ambion). For each placenta, approximately 5 mg of tissue were homogenized in the Lysis/Binding solution for 20 sec at 6 m/s using a bead beater (FastPrep24) and Lysing Matrix D Tubes (MP Biomedicals). The samples were then spun at 13,000 rpm for 5 min at 4°C and the supernatants recovered. Afterwards, the manufacturer's instructions were followed. Immediately after the RNA extraction, placental RNA samples were DNase-treated using DNA-free DNA Removal Kit (Ambion), aliquoted, and stored in -80°C. Quantity and quality of the RNA samples were assessed using the Agilent 2100 Bioanalyzer, the Agilent RNA 6000 Nano Kit (Agilent Technologies), and Qubit fluorometer. Libraries were prepared starting with 300-500 ng of good quality total RNA (RIN ≥7.5) using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Human/Mouse/Rat (Illumina), according to the manufacturer's instructions. 
The kit contains 96 uniquely indexed adapter combinations to allow pooling of multiple samples prior to sequencing. After determining their size (with the Agilent 2100 Bioanalyzer and the Agilent High Sensitivity DNA Kit by Agilent Technologies) and concentration (by qPCR with the KAPA Illumina ABI Prism Library Quantification Kit, Kapa Biosystems), libraries were pooled and sequenced (single-end, 125 bp) using a Single End V4 Cluster Kit and an Illumina HiSeq2500 or HiSeq4000 instrument.	Illumina HiSeq 4000	91
EGAD00001003509	Whole Exome Sequencing reads consisting of BAM paired end reads from Follicular Lymphoma samples.		11
EGAD00001003510	BAM files with sequencing reads derived from Illumina whole genome sequencing of two DNA samples from lymphoblastoid cell lines from two patients with congenital disease. Whole genome sequencing was performed using Illumina HiSeq X Ten and samples were prepared using TruSeq library prep.	HiSeq X Ten	2
EGAD00001003511	BAM files with sequencing reads derived from Oxford Nanopore MinION whole genome sequencing of two DNA samples from lymphoblastoid cell lines from two patients with congenital disease. Samples were prepared using 1D and 2D library preps.	MinION	2
EGAD00001003512	This dataset includes bam files from 58 samples. These bam files include all read pairs where at least one of the reads aligns within 1kb of the HTT repeat expansion. These samples were sequenced using 2x150bp reads on an Illumina HiSeqX sequencer and aligned using bwa. Twelve of the samples used TruSeq Nano library preparation and 46 samples used TruSeq DNA PCR-free sample preparation.	HiSeq X Ten	58
EGAD00001003513	This dataset includes bam files from 3,001 samples. These bam files include all read pairs where at least one of the reads aligns within 1kb of the C9orf72 repeat expansion. Additionally, these bam files also contain reads that are aligned to any of 29 pre-determined off target locations where the aligners are known to mis-align reads associated with this repeat expansion. These samples were sequenced using a combination of 2x100bp reads on an Illumina HiSeq2000 and 2x150bp reads on an Illumina HiSeqX sequencer and aligned using the Isaac aligner.	HiSeq X Ten Illumina HiSeq 2000	3001
EGAD00001003514	HipSci - Healthy Normals - Exome Sequencing - July 2017	Illumina HiSeq 2000 Illumina HiSeq 2500	123
EGAD00001003515	HipSci - Bardet-Biedl Syndrome - Exome Sequencing - July 2017	Illumina HiSeq 2000 Illumina HiSeq 2500	3
EGAD00001003516	HipSci - Monogenic Diabetes - Exome Sequencing - July 2017	Illumina HiSeq 2000 Illumina HiSeq 2500	1
EGAD00001003517	HipSci - Alport Syndrome - Exome Sequencing - July 2017	Illumina HiSeq 2500	7
EGAD00001003518	HipSci - Battens Disease - Exome Sequencing - July 2017	Illumina HiSeq 2500	4
EGAD00001003519	HipSci - Bleeding and Platelet Disorders - Exome Sequencing - July 2017	Illumina HiSeq 2500	7
EGAD00001003520	HipSci - Congenital Hyperinsulinia - Exome Sequencing - July 2017	Illumina HiSeq 2500	5
EGAD00001003521	HipSci - Hereditary Cerebellar Ataxias - Exome Sequencing - July 2017	Illumina HiSeq 2500	11
EGAD00001003522	HipSci - Hereditary Spastic Paraplegia - Exome Sequencing - July 2017	Illumina HiSeq 2500	1
EGAD00001003523	HipSci - Hypertrophic Cardiomyopathy - Exome Sequencing - July 2017	Illumina HiSeq 2500	18
EGAD00001003524	HipSci - Kabuki Syndrome - Exome Sequencing - July 2017	Illumina HiSeq 2500	6
EGAD00001003525	HipSci - Macular Dystrophy - Exome Sequencing - July 2017	Illumina HiSeq 2500	3
EGAD00001003526	HipSci - Primary Immune Deficiency - Exome Sequencing - July 2017	Illumina HiSeq 2500	8
EGAD00001003527	HipSci - Retinitis Pigmentosa - Exome Sequencing - July 2017	Illumina HiSeq 2500	2
EGAD00001003528	HipSci - Usher Syndrome - Exome Sequencing - July 2017	Illumina HiSeq 2500	27
EGAD00001003529	HipSci - Healthy Normals - RNA Sequencing - July 2017	Illumina HiSeq 2000 Illumina HiSeq 2500	118
EGAD00001003530	HipSci - Monogenic Diabetes - RNA Sequencing - July 2017	Illumina HiSeq 2000 Illumina HiSeq 2500	1
EGAD00001003531	HipSci - Bardet-Biedl Syndrome - RNA Sequencing - July 2017	Illumina HiSeq 2000 Illumina HiSeq 2500	3
EGAD00001003532	HipSci - Alport Syndrome - RNA Sequencing - July 2017	Illumina HiSeq 2500	7
EGAD00001003533	HipSci - Battens Disease - RNA Sequencing - July 2017	Illumina HiSeq 2500	4
EGAD00001003534	HipSci - Congenital Hyperinsulinia - RNA Sequencing - July 2017	Illumina HiSeq 2500	5
EGAD00001003535	HipSci - Kabuki Syndrome - RNA Sequencing - July 2017	Illumina HiSeq 2500	6
EGAD00001003536	HipSci - Primary Immune Deficiency - RNA Sequencing - July 2017	Illumina HiSeq 2500	8
EGAD00001003537	HipSci - Hereditary Spastic Paraplegia - RNA Sequencing - July 2017	Illumina HiSeq 2500	6
EGAD00001003538	HipSci - Hereditary Cerebellar Ataxias - RNA Sequencing - July 2017	Illumina HiSeq 2500	11
EGAD00001003539	HipSci - Bleeding and Platelet Disorders - RNA Sequencing - July 2017	Illumina HiSeq 2500	7
EGAD00001003540	HipSci - Hypertrophic Cardiomyopathy - RNA Sequencing - July 2017	Illumina HiSeq 2500	18
EGAD00001003541	HipSci - Macular Dystrophy - RNA Sequencing - July 2017	Illumina HiSeq 2500	1
EGAD00001003542	HipSci - Retinitis Pigmentosa - RNA Sequencing - July 2017	Illumina HiSeq 2500	2
EGAD00001003543	HipSci - Usher Syndrome - RNA Sequencing - July 2017	Illumina HiSeq 2500	27
EGAD00001003544	Whole exome sequencing data for 30 PDOX models (28 early passages, 3 late passages (1 overlap)), 3 cell lines, and 20 matching human tumors	Illumina HiSeq 2000 Illumina HiSeq 4000	53
EGAD00001003545	Low-coverage whole genome sequencing data for 30 PDOX models (28 early passages, 4 late passages (2 overlaps)), 3 cell lines, and 21 matching human tumors	Illumina HiSeq 2000	56
EGAD00001003546	ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: LIRI-JP.		130
EGAD00001003547	ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: LIRI-JP.		130
EGAD00001003548	ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: CLLE-ES.		74
EGAD00001003549	ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: CLLE-ES.		74
EGAD00001003550	Cell line exome sequencing	Illumina HiSeq 2500	176
EGAD00001003551	The samples include paired tumor and normal tissues from 106 patients. High-coverage whole exome sequencing or whole genome sequencing of DNA samples was performed on the Illumina HiSeq 2000 system.	Illumina HiSeq 2000	212
EGAD00001003553	Follicular lymphoma (FL) is an incurable B cell malignancy characterized by advanced stage disease and a heterogeneous clinical course. Recent genomic studies have focused on profiling “single” FL biopsies over several time-points; however, multi-site sampling in solid cancers has demonstrated profound spatial intra-tumor heterogeneity (ITH) with implications for precision medicine based initiatives. This study examined the extent of spatial heterogeneity in FL by whole exome sequencing 22 synchronously removed spatially separated biopsies from 9 patients. We observed significant differences in the extent of ITH across cases, with two distinct patterns of high and low spatial heterogeneity emerging. Site-specific alterations in genes with biological, prognostic or therapeutic relevance included TNFRSF14, PIK3CD, TNFAIP3, PTEN, EP300 and XBP1. In-depth characterization of these variants using deep-sequencing techniques confirmed their discordant nature, suggesting on-going genetic diversification driving evolution after widespread tumor dissemination. There was evidence of tumors comprising multiple competing subclones, with distinct clusters of mutations demonstrating differential expansions within spatially-separated sites. For cases where spatial tumors were examined at two time-points (FL and transformation to diffuse large B cell lymphoma (DLBCL)), the degree of heterogeneity increased with transformation. Collectively, our results demonstrate that spatial ITH is prevalent in FL. The existence of site-specific aberrations suggests that a single biopsy may not be sufficient in all patients to capture the full genomic complexity present and these spatial variations need to be considered in biomarker-led clinical studies.	Illumina HiSeq 2500	31
EGAD00001003555	40 paired normal and tumour whole-exome sequencing samples were used to investigate the genomic landscape of cutaneous squamous cell carcinoma.	Illumina HiSeq 2500	80
EGAD00001003556	We will perform RNAseq to evaluate the effects of the loss of a list of TSGs on the transcriptome. This dataset contains all the data available for this study on 2017-08-10.	Illumina HiSeq 2500	25
EGAD00001003557	This dataset belongs to the 2014 whole-genome-sequenced AML data, which is aligned to the human reference (human_g1k_v37.fasta). There are 67 paired CR samples from Chunnam University. All samples have passed QC and recalibration steps during alignment to the reference.	Illumina HiSeq 2000	134
EGAD00001003558	ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: RECA-EU.		100
EGAD00001003559	ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: RECA-EU.		100
EGAD00001003560	ICGC PCAWG Dataset for RNA-Seq BAM aligned using Star. Project: MALY-DE.		99
EGAD00001003561	ICGC PCAWG Dataset for RNA-Seq BAM aligned using TopHat2. Project: MALY-DE.		99
EGAD00001003562	This dataset includes bam files from 120 samples. These samples were sequenced using 2x150bp reads on an Illumina HiSeqX sequencer and aligned using the Isaac aligner. All samples were processed with TruSeq DNA PCR-free sample preparation.	HiSeq X Ten	118
EGAD00001003563	Whole exome sequencing of diffuse intrinsic pontine glioma (DIPG) cells isolated from the pons and from a sub-ventricular zone site of spread within the frontal lobe from the same individual (SU-DIPG-XIII)	Illumina HiSeq 2000	3
EGAD00001003564	The aim of the project is the definition of the molecular defect in a cohort of Rett-like patients negative for mutations in known disease genes. To this aim, a number of unrelated trios (patients plus parents) will be analysed by exome sequencing. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ This dataset contains all the data available for this study on 2017-08-16.	Illumina HiSeq 2500	46
EGAD00001003565	The project is focused on the axonal forms of Charcot-Marie-Tooth (CMT) disease. We have selected 13 families (7 from Spain and 6 from Czech Republic) that have been clinically assessed in depth and previously tested for mutations in known CMT genes without causal variants characterised. In these patients we expect to discover several CMT2 genes. Thus, we requested exome sequencing of 45 DNAs: 27 exomes in families from Spain and 18 exomes in the families from Czech Republic. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ This dataset contains all the data available for this study on 2017-08-16.	Illumina HiSeq 2500	45
EGAD00001003567	Reduced Representation Bisulfite Sequencing for WEHI-AML-1 and WEHI-AML-2. RRBS libraries were made with the NuGEN Ovation RRBS Methyl-Seq System. Bisulfite conversion was performed with the Qiagen Epitect kit. Sequencing was performed on an Illumina HiSeq2500.		7
EGAD00001003568	Genome sequencing at diagnosis and post induction for WEHI-AML-1 and WEHI-AML-2. Whole genome sequencing was performed on an Illumina HiSeq X Ten.		4
EGAD00001003569	Transcriptome sequencing for WEHI-AML-1 and WEHI-AML-2. RNA libraries were generated using the Illumina TruSeq RNA Sample Preparation Kit v2 and sequenced on an Illumina HiSeq2500.		9
EGAD00001003570	Exome sequencing for WEHI-AML-1 and WEHI-AML-2. Exome capture was performed with the Human All Exon v5_UTR Capture Library and the Agilent Technologies SureSelectXT2 Target Enrichment System, with sequencing on an Illumina HiSeq2500.		9
EGAD00001003571	The data consists of 678189 genome-wide polymorphic variants of 3658 individuals from ERF/GRIP region in a variant call format (vcf) file. ERF has been genotyped with different genotyping platforms: Illumina 318 k, 350 k, 610 k and Affymetrix 200 k.		3658
EGAD00001003573	RNA sequencing data for the PDOX model EPD-613FH	Illumina HiSeq 2500	1
EGAD00001003574	Clonal evolution study of Intrahepatic cholangiocarcinoma: 69 PDPCs and 6 tissues.	Illumina HiSeq 4000	81
EGAD00001003579	Samples were prepared using Safe-SeqS technology. All samples were run on an Illumina MiSeq instrument. Fastq files for read 1 and the index read are present (R and I respectively).	Illumina MiSeq	49
EGAD00001003580	WGS sequencing for 303 cases (620 samples) from the ICGC ESAD-UK project. Tumours 50x, normals 30x. HiSeq X BAM files. These samples are all available in ICGC release 26.	Illumina HiSeq 2000	38
EGAD00001003581	Using low input SMART-seq protocol, the whole transcriptome of human small intestine macrophage subtypes is characterized.	NextSeq 500	33
EGAD00001003582	Genomics-Driven Precision Medicine for Advanced Pancreatic Cancer - Early Results from the COMPASS Trial - RNA-Seq unmapped reads	Illumina HiSeq 2500	50
EGAD00001003583	516 DNA samples were collected from individuals upon enrollment into the European Prospective Investigation into Cancer and Nutrition study between 1993 and 1998 across 17 different centers. 126 bp paired-end read sequencing data from the Illumina platform were converted to fastq format; the 2 bp molecular barcode at the start of each read of the pair was trimmed and written into the read name. The thymine nucleotide required for ligation was removed from the sequences. Burrows-Wheeler Aligner (BWA-mem) was used for alignment of the processed fastq files to the reference hg19 genome, followed by indel realignment using GATK. An in-house algorithm was written to collapse read families that share the same molecular barcode sequence.		516
EGAD00001003584	Genomics-Driven Precision Medicine for Advanced Pancreatic Cancer - Early Results from the COMPASS Trial - RNA-Seq mapped reads		-
EGAD00001003585	Genomics-Driven Precision Medicine for Advanced Pancreatic Cancer - Early Results from the COMPASS Trial - WGS mapped reads		-
EGAD00001003586	Whole Genomes Define Concordance in Matched Primary, Xenograft, and Organoid Models of Pancreas Cancer - WGS mapped reads		54
EGAD00001003587	This data belongs to the 2015 whole-exome-sequenced AML data, which is aligned to the human reference (human_g1k_v37.fasta). There are 40 paired NR samples from Chunnam University. All samples have passed QC and recalibration steps during alignment to the reference.	HiSeq X Ten	80
EGAD00001003589	Ultra-Fast Patient-Derived Xenografts Identify Functional and Spatial Tumour Heterogeneities that Drive Therapeutic Resistance - WXS mapped reads		27
EGAD00001003590	Ultra-Fast Patient-Derived Xenografts Identify Functional and Spatial Tumour Heterogeneities that Drive Therapeutic Resistance - WXS unaligned reads	Illumina HiSeq 2500	27
EGAD00001003591	Merged bam files for PACA-CA Whole Genome Sequencing, for DCC release 25		211
EGAD00001003592	Merged bam files for PACA-CA Whole Exome Sequencing, for DCC release 25		216
EGAD00001003593		Complete Genomics	24
EGAD00001003596	The MITOEXME project aims to improve protocols for molecular diagnosis of patients with OXPHOS disorders with a focus on next generation sequencing methods and to increase the knowledge of pathophysiological mechanisms by identification of new targets and cellular studies. In this project we will sequence the exomes of 120 patients. This dataset contains all the data available for this study on 2017-08-29.	Illumina HiSeq 2000	125
EGAD00001003597	Promoter capture HiC on KMS11 (multiple myeloma)	Illumina HiSeq 2000	1
EGAD00001003598	This data belongs to the 2017 AML prospective data, which is aligned to the human reference (human_g1k_v37.fasta). There are 10 paired tumor/normal samples from SNUH. All samples have passed QC and recalibration steps during alignment to the reference.	HiSeq X Ten	20
EGAD00001003599	This data belongs to the 2017 AML genome data, which is aligned to the human reference (human_g1k_v37.fasta). There are 10 paired tumor/normal samples from SNUH. All samples have passed QC and recalibration steps during alignment to the reference.	HiSeq X Ten	20
EGAD00001003600	Exome sequencing data for 1001 DLBCL patients and RNA sequencing data for 775 DLBCL patients	Illumina HiSeq 2500	1776
EGAD00001003601	The dataset for Direct Detection of Early-Stage Cancers using Circulating Tumor DNA includes 602 bam files from next-generation sequencing on the Illumina HiSeq2500 or MiSeq. The samples analyzed include cancer cell lines as well as plasma and tissue specimens from healthy individuals and patients with cancer.	Illumina HiSeq 2500 Illumina MiSeq	550
EGAD00001003602	Dataset consisting of: (1) N=234 genome-wide chromatin accessibility (ATAC-seq) profiles for distinct N=21 healthy old and N=28 healthy young subjects. ATAC-seq biological samples provided for the following tissues: PBMC (N=24), CD14+ monocytes (N=18), CD8+ memory T cells (N=7), CD8+ naive T cells (N=7), CD4+ memory T cells (N=7), CD4+ naive T cells (N=7), and naive B cells (N=7). (2) N=39 genome-wide transcription (RNA-seq) data for distinct N=15 healthy old and N=24 healthy young subjects' PBMCs.	Illumina HiSeq 2500	273
EGAD00001003603	Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003604	Genome and transcriptome sequence data from a metastatic gallbladder cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003605	Genome and transcriptome sequence data from a metastatic colonic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003606	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the rectum patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003607	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003608	Genome and transcriptome sequence data from a metastatic small cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003609	Genome and transcriptome sequence data from a metastatic serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003610	Genome and transcriptome sequence data from a mullerian mixed tumor with carcinosarcoma of the ovaries patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003611	Genome and transcriptome sequence data from a metastatic cecal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003612	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003613	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003614	Genome and transcriptome sequence data from a metastatic non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003615	Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003616	Genome and transcriptome sequence data from an adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003617	Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003618	Genome and transcriptome sequence data from a mesothelioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003619	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003620	Genome and transcriptome sequence data from a metastatic colorectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003621	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003622	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003623	Genome and transcriptome sequence data from a metastatic rectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003624	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003625	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003626	Genome and transcriptome sequence data from a retroperitoneal mucinous cystic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003627	Genome and transcriptome sequence data from a salivary duct carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003628	Genome and transcriptome sequence data from a metastatic adenocarcinoma of appendiceal origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003629	Genome and transcriptome sequence data from a metastatic gastric cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003630	Genome and transcriptome sequence data from a radiation-induced pleomorphic sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003631	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003632	Genome and transcriptome sequence data from a chronic lymphocytic leukemia patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003633	Genome and transcriptome sequence data from a metastatic rectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003634	Genome and transcriptome sequence data from a solitary fibrous tumor (sarcoma) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003635	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003636	Genome and transcriptome sequence data from a metastatic paraganglioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003637	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003638	Genome and transcriptome sequence data from a metastatic prostate cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003639	Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003640	Genome and transcriptome sequence data from a metastatic adenoid cystic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003641	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003642	Genome and transcriptome sequence data from a metastatic neuroendocrine carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003643	Genome and transcriptome sequence data from a metastatic cecal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003644	Genome and transcriptome sequence data from a metastatic spindle cell sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003645	Genome and transcriptome sequence data from an anaplastic ependymoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003646	Genome and transcriptome sequence data from a squamous cell carcinoma of the GE junction patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003647	Genome and transcriptome sequence data from an anal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003648	Genome and transcriptome sequence data from a glioblastoma multiforme patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003649	Genome and transcriptome sequence data from a metastatic colon adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003650	Genome and transcriptome sequence data from a metastatic non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003651	Genome and transcriptome sequence data from a metastatic colon adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003652	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003653	Genome and transcriptome sequence data from a non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003654	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003655	Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003656	Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003657	Genome and transcriptome sequence data from an adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003658	Genome and transcriptome sequence data from a primary unknown patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study	PromethION	1
EGAD00001003659	Genome and transcriptome sequence data from an ependymoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003660	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003661	Genome and transcriptome sequence data from an advanced adenocarcinoma of lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003662	Genome and transcriptome sequence data from a left cavernous sinus invasive skull meningioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003663	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003664	Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003665	Genome and transcriptome sequence data from a metastatic rectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003666	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003667	Genome and transcriptome sequence data from a metastatic gastrointestinal stromal tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003668	Genome and transcriptome sequence data from a metastatic rhabdomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003669	Genome and transcriptome sequence data from a metastatic mucinous adenocarcinoma of the rectum patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003670	Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003671	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003672	Genome and transcriptome sequence data from a metastatic clear cell ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003673	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the GE junction patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study	PromethION	1
EGAD00001003674	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003675	Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003676	Genome and transcriptome sequence data from a metastatic adrenocortical carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003677	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003678	Genome and transcriptome sequence data from a thymoma carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003679	Genome and transcriptome sequence data from a metastatic adenoid cystic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003680	Genome and transcriptome sequence data from a low grade serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003681	Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003682	Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003683	Genome and transcriptome sequence data from a metastatic high grade sarcomatous neoplasm nos patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003684	Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003685	Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003686	Genome and transcriptome sequence data from a metastatic neuroendocrine tumor arising from small bowel patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003687	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003688	Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003689	Genome and transcriptome sequence data from a metastatic epithelioid angiomyolipoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003690	Transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003691	Genome sequence data from a metastatic squamous cell carcinoma of the oropharynx patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003692	Genome and transcriptome sequence data from a metastatic gastrointestinal stromal tumour patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003693	Genome and transcriptome sequence data from a metastatic rectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003694	Genome and transcriptome sequence data from a pleomorphic sarcomatoid epithelioid carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003695	Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003696	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003697	Genome and transcriptome sequence data from a metastatic meningioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003698	Genome and transcriptome sequence data from a locally advanced right breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003699	Genome and transcriptome sequence data from a metastatic lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003700	Genome and transcriptome sequence data from a thymic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003701	Genome and transcriptome sequence data from a metastatic myoepithelial carcinoma of parotid patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003702	Genome and transcriptome sequence data from a high grade serous carcinoma of the fallopian tube/ovary/peritoneum patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003703	The incidence of acute myeloid leukemia (AML) increases with age and mortality exceeds 90% when diagnosed after age 60. Only 10-15% of cases evolve from a pre-existing myeloproliferative or myelodysplastic disorder; the remaining cases arise de novo without a detectable prodrome and are diagnosed upon development of bone marrow failure. Analysis of diagnostic blood samples has demonstrated that de novo AML is preceded by the accumulation of somatic mutations in pre-leukemic hematopoietic stem and progenitor cells (preL-HSPCs) that subsequently undergo clonal expansion. If individuals in this pre-leukemic phase could be identified, methods for determination of risk and monitoring for progression to overt AML could be developed. However, recurrent AML mutations also accumulate during aging in healthy individuals who never develop AML, referred to as age related clonal hematopoiesis (ARCH). To distinguish individuals with preL-HSPCs at high risk of developing AML from those with ARCH, we undertook deep targeted sequencing of genes recurrently mutated in AML in blood samples from 133 individuals in the European Prospective Investigation into Cancer and Nutrition (EPIC) study taken on average 6 years before they developed AML (pre-AML group), together with 683 matched healthy individuals (Control group). Pre-AML cases displayed accelerated age-correlated accumulation of somatic mutations. The identity, number and variant allele frequency (VAF) of mutations differed between the two groups, and were incorporated into a computational model of AML risk prediction that accurately distinguished pre-AML cases from controls on average 7 years prior to AML development. Our findings provide proof of concept that early prediction of AML development is feasible in high-risk populations, paving the way for early disease detection, monitoring, and potentially prevention.	Illumina HiSeq 2000 Illumina HiSeq 2500	628
EGAD00001003704	RNA sequencing of purified human group 3 innate lymphoid cells from non-reactive lymph nodes and spleen, inflamed tonsils and peripheral blood.	Illumina HiSeq 2500	20
EGAD00001003705	10 single-cell placental RNA libraries were generated using the Chromium Single Cell 3′ Reagent Kit (10X Genomics). All single-cell libraries were sequenced with a customized paired end with dual indexing (98/14/8/10-bp) format according to the recommendation by 10X Genomics. The data were aligned using the Cell Ranger Single-Cell Software Suite (version 1.0). Moreover, plasma RNA from 22 samples was extracted using the RNeasy Mini Kit (Qiagen). cDNA reverse transcription, second-strand synthesis, and RNA-sequencing (RNA-seq) library construction were performed using the Ovation RNA-seq System V2 (NuGEN) kit according to the manufacturer’s protocol. For alignment of the plasma RNA library, adaptor sequences and low-quality bases on the fragment ends (i.e., quality score < 5) were trimmed, and reads were aligned to the human reference genome (hg19) using the TopHat (v2.0.4) software. All aligned reads were deposited in bam file format.	Illumina HiSeq 2000 NextSeq 500	32
EGAD00001003706	PRAD-CA, DCC Release 26 : This dataset contains fastq files with Whole genome sequencing data for the CPC-Gene Project. Data from each sample was generated using multiple whole genome libraries and sequenced across multiple runs	Illumina HiSeq 2000 Illumina HiSeq 2500 unspecified	89
EGAD00001003708	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003709	Genome and transcriptome sequence data from a high-grade serous fallopian tube carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003710	Genome and transcriptome sequence data from a metastatic colorectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003711	Genome and transcriptome sequence data from a bilateral breast lobular cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003712	Genome and transcriptome sequence data from a primary of unknown origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003713	Genome and transcriptome sequence data from a low-grade serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003714	Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003715	Genome and transcriptome sequence data from a metastatic cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003716	Genome and transcriptome sequence data from a melanoma of the right buccal mucosa patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003717	Genome and transcriptome sequence data from a metastatic non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003718	Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003719	Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003720	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003721	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003722	Genome and transcriptome sequence data from a primary unknown patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003723	Genome and transcriptome sequence data from a squamous cell carcinoma of the anus patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003724	Genome and transcriptome sequence data from a T-cell rich B cell lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003725	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the rectosigmoid patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003726	Genome and transcriptome sequence data from a large-cell neuroendocrine lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003727	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003728	Genome and transcriptome sequence data from a metastatic gastrointestinal stromal tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003729	Genome and transcriptome sequence data from a peripheral T-cell lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003730	Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003731	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003732	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003733	Genome and transcriptome sequence data from a metastatic uterine leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003734	Genome and transcriptome sequence data from a spindle cell carcinoma of the left parotid patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003735	Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003736	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003737	Genome and transcriptome sequence data from a sinus adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003738	Genome and transcriptome sequence data from a Ewing sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003739	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003740	Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003741	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003742	Genome and transcriptome sequence data from an adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003743	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003744	Genome and transcriptome sequence data from a pleomorphic xanthoastrocytoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001003745	Exome sequencing fastq files from 6 mutation carriers and 5 non-carriers from 2 families. One µg DNA was used for library preparation using the TruSeq DNA LT Sample Prep Kit v2 according to the manufacturer’s instructions (Illumina). Hybridization was performed using Nimblegen SeqCap EZ Exome v3 (Roche) and Paired-end Sequencing (2x100 bp) on the Illumina HiSeq 2000 with TruSeq v3 chemistry (Illumina).	Illumina HiSeq 2000	11
EGAD00001003746	Sequencing was performed using OncoPanel v.2 (OPv2), an Agilent SureSelect custom designed bait set consisting of the coding regions of 504 genes, previously linked to human cancer. Sequencing was performed on an Illumina HiSeq 2500. 14 highly differentiated, fusion-negative rhabdomyosarcoma tumor samples, and 8 non-matched normal skeletal muscle samples were sequenced. BAM files are available for download.	Illumina HiSeq 2500	22
EGAD00001003747	Optimisation of ex vivo Memory B cell Expansion/Differentiation for Interrogation of Rare Peripheral Memory B Cell Subset Responses 1) This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ This dataset contains all the data available for this study on 2017-09-13.	Illumina MiSeq	38
EGAD00001003748	Sequencing of B-cell receptor repertoires in healthy individuals and patients with chronic lymphocytic leukemia. 1) This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ This dataset contains all the data available for this study on 2017-09-13.	Illumina MiSeq	387
EGAD00001003749	Isotype-resolved sequencing of B cell receptor in measles virus infection 1) This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ This dataset contains all the data available for this study on 2017-09-13.	Illumina MiSeq	182
EGAD00001003750	This is the first whole exome sequencing analysis of a primary meningeal melanocytic tumour (MMT) alongside the patient's germline. Here we report the CRAM files from the tumour and germline.	Illumina HiSeq 2500	2
EGAD00001003751	Whole genome sequencing data for primary tumors, matching control material from blood and their corresponding organoid. Whole transcriptome data for organoids.	HiSeq X Ten NextSeq 500	102
EGAD00001003752	single nucleotide variant calls from somatic sniper, vcf format		34
EGAD00001003753	single nucleotide variant calls from somatic sniper, vcf format. input for subclonal reconstruction		20
EGAD00001003754	structural variant calls from Delly, vcf format		37
EGAD00001003755	This dataset provides whole genome sequencing data of normal/tumor pairs from 9 patients with uterine or ovarian carcinosarcoma using the HiSeq 2000 sequencing system. It includes 27 samples (9 normals, 16 uterine tumors and 2 ovarian tumors). Through separate whole genome sequencing of carcinomatous and sarcomatoid components, we analyse and compare the genomic alterations of these components.	Illumina HiSeq 2000	27
EGAD00001003756	Prostate Cancer - RNA-Seq unmapped reads	Illumina HiSeq 2000	3
EGAD00001003757	BBMRI - BIOS project - Freeze 2 - Fastq files	Illumina HiSeq 2000	3686
EGAD00001003758	BBMRI - BIOS project - Freeze 2 - Bam files	Illumina HiSeq 2000	3686
EGAD00001003759	ATAC-seq data for 5 non-diabetic human pancreatic islet samples	Illumina HiSeq 2500	5
EGAD00001003760	There are 88 paired samples from HCC patients, including tumors and matched adjacent normal tissues, which were sequenced on the Illumina HiSeq 2000 platform.	Illumina HiSeq 2000	176
EGAD00001003761	This dataset contains fastq files with Whole genome sequencing data for the CPC-Gene Project. Data from each sample was generated using multiple whole genome libraries and sequenced across multiple runs	Illumina HiSeq 2000 Illumina HiSeq 2500 unspecified	89
EGAD00001003762	Whole Exome sequencing of paediatric High Grade Gliomas	Illumina HiSeq 2000	99
EGAD00001003763	15 whole exome sequencing datasets from five patients. Data is provided as bam files. Libraries were generated using the SeqCap EZ Exome v3.0 kit and sequenced on an Illumina sequencer		15
EGAD00001003764	Four RNA-sequencing datasets from two patients with initial low-grade glioma and copy number alteration at IDH1 upon recurrence. Data is provided as bam files.		4
EGAD00001003765	Whole-exome sequencing of 20 samples of actinic keratosis (10) and cutaneous squamous cell carcinoma (10) was performed to investigate a potential relationship between DNA methylation-based subtypes and genetic mutation patterns. 7 samples were shown to belong to the stem cell-like subclass (4 AK and 2 SCC), 12 - to the keratinocyte-like subtype (6 AK and 6 SCC) and one SCC sample is unclassified (was not included in the methylation analysis). Exome regions were captured using Agilent Low Input Exome-Seq Human v5 kit and sequenced on Illumina Hiseq4000 with paired-end 100-nucleotide reads.	Illumina HiSeq 4000	20
EGAD00001003769	This dataset is a time-series of EGFR-mutant NSCLC clinical specimens from an individual patient profiled using tumor-based whole exome sequencing and the data is in BAM format. DNA was extracted from FFPE for primary tumor and frozen tumor tissue samples and matched non-tumor tissue using the Qiagen Allprep DNA/RNA Mini Kit. The library preparation protocol was based on the Agilent SureSelect Library Prep and Capture System. DNA was resuspended in a low TE buffer and sheared (Duty Cycle 5%; Intensity 175; Cycles/Burst: 200; Time: 300s, Covaris S2 Ultrasonicator). Bar-coded exome libraries were prepared using the Agilent Sure Select V5 library kit per the manufacturer’s specifications. The libraries were run on the HiSeq2500. Raw paired end reads (100bp) in FastQ format generated by the Illumina pipeline were aligned to the full hg19 genomic assembly obtained from UCSC, gencode 14, using bwa version 0.7.12. Picard tools version 1.117 was used to sort, remove duplicate reads and generate QC statistics. Tumor DNA was sequenced to median depth of 303X (range 114.39-383.41) and the matched germline DNA to average depth of 231.65.	Illumina HiSeq 2500	8
EGAD00001003770	We performed RNA-seq on polyA-enriched mRNA isolated from the original liver biopsy tissue (Liver tissue), primary liver cells (PLC), hepatocyte-like cells (HLCs) differentiated from induced pluripotent stem cells (iPSCs), and iPSCs. RNA libraries were prepared using the Illumina TruSeq Stranded mRNA Sample Preparation protocol (ref. RS-122-2101, Illumina, San Diego CA, US) and sequenced using the Illumina HiSeq2500 platform following the manufacturer’s protocol. Samples were sequenced in paired-end mode to a length of 2x76 base pairs. Images from the instrument were processed using the manufacturer’s software to generate FASTQ sequence files.	Illumina HiSeq 2500	24
EGAD00001003776	186 tumor/normal matched samples from whole exome sequencing and 178 samples (168 tumors, 10 normals) from whole transcriptome sequencing	Illumina HiSeq 2500	550
EGAD00001003778		Illumina HiSeq 2000	1
EGAD00001003779	Whole genome sequencing (WGS) data of human small intestinal organoid cultures, which were deleted for the XPC gene using CRISPR-Cas9. Contains WGS data of 1 clone and 1 subclone.	HiSeq X Ten	2
EGAD00001003780	RNA-seq data obtained from directed differentiation of a subset of FiPSCs and BiPSCs cell lines towards islet-like cells. RNA was collected at two key developmental stages: definitive endoderm (DE) and pancreatic progenitors (PP).	Illumina HiSeq 2500	16
EGAD00001003781	Paired whole exome sequencing for 32 primary MDS, 14 MDS/MPN, and 8 AML-MRC cases (total = 54). Normal comparator genomic DNA was extracted from lymphocytes purified by flow cytometry. Bulk myeloid cells were used as a source of tumor gDNA. Files uploaded are mapped BAM files.	Illumina HiSeq 2000	94
EGAD00001003782	When available (25 primary MDS, 12 MDS/MPN, and 6 AML-MRC cases), high quality RNA (stranded-total) was submitted for RNA-seq. RNA was extracted from bulk myeloid cells which was used as the tumor population. Files uploaded are mapped BAM files.	Illumina HiSeq 2000	43
EGAD00001003783	Recent studies using next-generation sequencing strategies have described the landscape of genetic alterations in diffuse large B-cell lymphoma (DLBCL). However, little is known about the clinical relevance of recurrent mutations and copy number alterations and their transcriptional footprints. This study examines the frequency, interaction and clinical impact of recurrent genetic aberrations in DLBCL using high-resolution technologies in a large population-based cohort.	Illumina HiSeq 2000 Illumina HiSeq 2500	376
EGAD00001003784	BBMRI - BIOS project - Freeze 2 - Bam files - unrelated samples	Illumina HiSeq 2000	3559
EGAD00001003785	BBMRI - BIOS project - Freeze 2 - Fastq files - unrelated samples	Illumina HiSeq 2000	3559
EGAD00001003786	BBMRI - BIOS project - Freeze 2 - Bam files - GoNL samples	Illumina HiSeq 2000	420
EGAD00001003787	BBMRI - BIOS project - Freeze 2 - Fastq files - GoNL samples	Illumina HiSeq 2000	420
EGAD00001003788	Whole Exome Sequencing of 9 Colorectal Cancer (CRC) samples performed on Illumina HiSeq4000 consisting of aligned paired reads. RNAseq data sequenced on Illumina NextSeq500 consisting of FASTQ single reads from 3 CRC colon samples. A total of 12 samples from five patients (we matched normal tissue or pbmc and tumors) were sequenced on Illumina NextSeq500.	Illumina HiSeq 4000 NextSeq 500	24
EGAD00001003789	Exome reads consisting of FASTQ paired-end reads from 5 FHD/FHDL patients	Illumina HiSeq 2000	16
EGAD00001003790	RNA-seq reads consisting of FASTQ paired-end reads from 5 FHD/FHDL patients	Illumina HiSeq 2000	13
EGAD00001003791	The SAHGP characterises the genomes of 24 individuals (8 Coloured and 16 black southeastern Bantu-speakers) using deep whole genome sequencing (WGS).		24
EGAD00001003792	The dataset for High Grade Serous Ovarian Carcinomas Originate in the Fallopian Tube includes 46 bam files from next-generation sequencing on the Illumina HiSeq2500. The samples analyzed include multiple lesions from nine patients, five with high grade serous ovarian carcinoma and four who are BRCA-carriers.	Illumina HiSeq 2500	46
EGAD00001003793	By differential gene expression analysis followed by protein expression and functional studies, we define that the naive T cells having divided the least since thymic emigration express complement receptors (CR1 and CR2) known to bind complement C3b- and C3d-decorated microbial products and, following activation, produce IL-8 (CXCL8), a major chemoattractant for neutrophils in bacterial defense. We also observed an IL-8–producing memory T cell subpopulation coexpressing CR1 and CR2 and with a gene expression signature resembling that of RTEs. JCI Insight. 2017;2(16):e93739. https://doi.org/10.1172/jci.insight.93739	Illumina HiSeq 2500	24
EGAD00001003794			8
EGAD00001003795	This dataset includes Nimblegen SeqCap EZ Exome v3 data for each lesion of three patients with multicentric glioma. For two patients, each lesion was sequenced along with whole blood. For a third patient, 3 pieces from the right lesion and 4 pieces from the left were sequenced along with whole blood. In each case BAM files that have been aligned with BWA mem alignment are available.		15
EGAD00001003797	This dataset contains WES data (.bam files) and associated phenotype information from 10 patients included in our microbiome study who went on to anti PD-1 immunotherapy for the treatment of metastatic melanoma at the University of Texas MD Anderson Cancer Center. Both tumor and matching germ line normal were sequenced on each patient using Illumina HiSeq 2500. The average coverage was 283X in tumors and 135X in germline (tumor+germline overall:209, Range: 0-1552).	Illumina HiSeq 2500	20
EGAD00001003799	We performed whole-exome sequencing and whole epigenome sequencing (RRBS) of samples collected at different time points during radiotherapy from thirty-four ESCC patients. We compared the genetic and epigenetic features of the biopsy samples from different time points to reveal the changes in ESCC that received radiotherapy.	Illumina HiSeq 2500	180
EGAD00001003800	Whole Exome Sequencing was performed in a dilution series containing known amounts of human and mouse DNA, 3x 100% human 0% mouse, 2x 90/10, 3x 50/50, 2x 25/75 and 3x 0/100. A set of breast cancer clinical samples, matched normal tissue and matched PDTXs (total number = 14) were also analysed. Paired-end 75bp sequences for the dilution series and paired-end 125bp for the clinical samples were obtained on Illumina HiSeq2500; fastq files are provided. A triplicate analysis of the transcriptome using RNA-seq was also performed for the Universal Human RNA Reference and the Universal Mouse RNA Reference samples. Paired-end 150bp fastq files obtained on Illumina HiSeq4000 are provided.	Illumina HiSeq 2500 Illumina HiSeq 4000	12
EGAD00001003801	RNAseq Data set	Illumina HiSeq 2000	40
EGAD00001003802	106 FFPE tumor samples from small bowel were sequenced with Illumina HiSeq 4000. Exome capture was performed with NimbleGen SeqCap EZ Exome Library v3 Kit. Reads were aligned with BWA–MEM v.0.7.12 to GRCh37 reference genome. Variant calls were produced with GATK HaplotypeCaller. Variant calls were filtered against all data from gnomAD database using allele frequency threshold 0.0001 in order to remove germline variation.		106
EGAD00001003803	This dataset contains VCF files from a variant calling analysis of 19 neuroblastoma patients. WES or WGS data of the primary tumor were compared to WES cfDNA analysis at the time of diagnosis and at a 2nd timepoint (complete remission, partial remission, disease progression or relapse). For 4 patients, WGS of germline, tumor at diagnosis and tumor at relapse DNA was performed on Illumina HiSeq2500, with 100-bp paired-end reads. For the other patients, WES was performed using either an Agilent SureSelect Human All Exon v5 or a Roche Nimblegen SeqCap EZ Exome V3 kit on Illumina HiSeq2000, with 100-bp paired-end reads. SNVs observed in any of the primary tumors or cfDNA samples studied by WES were targeted using a capture sequencing panel at all intermediate time points.		146
EGAD00001003804	Exome fastq files of 98 hepatocellular carcinoma and matched normal (BCM, HCC-JP)	Illumina HiSeq 2000	196
EGAD00001003805	A whole genome mutation analysis of cortical kidney tissue, an early passage kidney organoid culture derived from the kidney tissue sample, and a late passage of the same organoid culture.	HiSeq X Ten	3
EGAD00001003806	cDNA depleted RNA (500ng total RNA input) was fragmented to 150-200 nucleotides in first strand buffer for 3 minutes at 94°C. Random hexamer primed first strand was generated in presence of dATP, dGTP, dCTP and dTTP. Second strand was generated using dUTP instead of dTTP to tag the second strand. Subsequent steps to generate the sequencing libraries were performed with the KAPA HTP Library Preparation Kit for Illumina sequencing with minor modifications, i.e., after indexed adapter ligation to the dsDNA fragments, the library was treated with USER enzyme (NEB_M5505L) in order to digest the second strand derived fragments. After amplification of the libraries, samples with unique sample indexes were pooled and sequenced paired-end 2x50bp on a HiSeq2500 system following standard Illumina guidelines.	Illumina HiSeq 2500	36
EGAD00001003807	Whole transcriptome RNA sequencing (RNA-seq) of human induced pluripotent stem cell lines from three independent donors at seven islet developmental stages: definitive endoderm (DE), primitive gut tube (GT), posterior foregut (PF), pancreatic endoderm (PE), endocrine progenitors (EP), endocrine-like cells (EN), and beta-like cells (BLC).	Illumina HiSeq 2000	24
EGAD00001003808		Illumina HiSeq 2500	47
EGAD00001003809	This dataset includes 186 whole genome sequencing samples which combine to create 93 pairs. Each pair is comprised of two sequencing experiments carried out on the same donor to the NIHR BioResource Rare Disease cohort. These samples have been used to validate the Telomerecat method (a method for estimating telomere length from whole genome sequencing).	Illumina HiSeq 2000	52
EGAD00001003810	An RNA Seq study of the effects of HDAC inhibitor Quisinostat on six different synovial sarcoma cell lines	NextSeq 500	12
EGAD00001003811	Our project will examine the role of PIK3CA mutations and their sensitivity to endocrine therapies and its role, with the addition of complete ovarian suppression. We plan to test our hypotheses using tumour samples collected from patients enrolled in the SOFT/IBCSG24-02 clinical study (Suppression of Ovarian Function Trial, NCT00066690). SOFT is a phase III trial that randomised 3066 premenopausal women to evaluate if adding ovarian suppression to adjuvant endocrine therapy will improve clinical outcomes. This dataset contains all the data available for this study on 2017-11-22.	Illumina HiSeq 2500	81
EGAD00001003812	Whole genome sequencing of samples from isolated populations from Croatia. The samples are sequenced using the Illumina HiSeq X Ten system. This dataset contains all the data available for this study on 2017-11-22.	HiSeq X Ten	20
EGAD00001003813	The data contain whole exome sequencing of 27 Greenlanders in nine trios. Data were produced by Agilent SureSelect capture followed by paired-end Illumina HiSeq 2000 sequencing to a depth of 90.1X. More details on processing and analysis can be found in Moltke et al, Nature 2014 (PMID 25043022).	Illumina HiSeq 2000	27
EGAD00001003814	The data contain whole deep RNA sequencing of leukocytes from 17 Greenlanders. RNA was purified from peripheral blood with the PAXGene Blood miRNA Kit (Qiagen). The RNA sequencing library was prepared following the instructions of the TruSeq RNA Sample Prep Kit v2 (Illumina). For mRNA isolation and fragmentation 200 ng of total RNA was purified by oligo-dT beads. The qualified libraries were amplified on cBot to generate the cluster on the flowcell (TruSeq PE Cluster Kit V3–cBot–HS, Illumina). The amplified flow cell was sequenced paired-end on the HiSeq 4000 System (TruSeq SBS KIT-HS V3, Illumina).	Illumina HiSeq 4000	17
EGAD00001003815	Whole exome sequencing	Illumina HiSeq 2000	48
EGAD00001003816	HALT AML mRNA - RNASeq mapped reads		22
EGAD00001003818	BAM files of targeted next-generation DNA sequencing data of 13 chordoid gliomas of the third ventricle (2 paired tumor-normal samples and 11 tumor-only samples). Genomic DNA was extracted from formalin-fixed, paraffin-embedded blocks of tumor tissue from 13 patients with chordoid glioma of the third ventricle using the QIAamp DNA FFPE Tissue Kit (Qiagen). Genomic DNA was also extracted from leukocytes in a peripheral blood sample from one of the patients and a non-neoplastic gastric biopsy specimen from one of the patients. Capture-based next-generation DNA sequencing was performed at the University of California, San Francisco Clinical Cancer Genomics Laboratory, using an assay that targets all coding exons of approximately 500 cancer-related genes, select introns of 47 genes, and TERT promoter with a total sequencing footprint of 2.8 Mb (UCSF500 Cancer Panel). Sequencing libraries were prepared from genomic DNA, and target enrichment was performed by hybrid capture using a custom oligonucleotide library (Nimblegen SeqCap EZ Choice). Captured libraries were sequenced as paired-end 100 bp reads on an Illumina HiSeq 2500 instrument. Duplicate sequencing reads were removed computationally to allow for accurate allele frequency determination and copy number calling.	Illumina HiSeq 2500	15
EGAD00001003819	The dataset includes a subset of 762 individuals that were found to be closely related (≤3rd degree), including 263 Chinese and 499 Malays from the Singapore Living Biobank. These samples were whole-exome sequenced on the Illumina HiSeq 2000 platform (125bp paired end), with the exonic regions captured using the Nimblegen SeqCap EZ Exome v3 kits. All the files are in the BAM format.	Illumina HiSeq 2000	762
EGAD00001003820	Whole transcriptome, strand-specific RNA-seq libraries were prepared from total RNA purified using RNeasy mini kit (Qiagen) using Ribo-Zero technology (Epicentre, an Illumina company) for depletion of rRNA followed by library preparation using the ScriptSeq RNA-Seq Library preparation Kit from Illumina. The paired raw sequence reads were processed using TopHat2 and mapped to the human reference genome HG19.	NextSeq 500	16
EGAD00001003821	WES was performed using the KAPA-Hyper prep kit from Illumina (Roche, Basel, Switzerland) for library construction, followed by exome capture using Nimblegen SeqCap EZ Human Exome Library v3.0 (Roche). Reads were mapped using BWA MEM against the human reference genome HG19.	NextSeq 500	42
EGAD00001003822	The dataset comprises 8 breast cancer, 11 ovarian cancer, 1 benign tumour, 18 normal tissue, 2 endometrium, and 23 white blood cell samples. Genome wide methylation analysis was performed by Reduced Representation Bisulfite Sequencing (RRBS) on Illumina HiSeq 2500. Data is provided as FASTQ files	Illumina HiSeq 2500	63
EGAD00001003823	Somatic mutations were called using whole exome Sequencing (WES) data from colorectal cancer samples (dataset EGAD00001003821) using MuTect2, with matched constitutional WES-data obtained from leukocytes samples as reference.		37
EGAD00001003824	Whole genome sequencing data on 10 human cancer cell lines	Complete Genomics Illumina Genome Analyzer IIx	14
EGAD00001003825	Patients with T-cell prolymphocytic leukemia (T-PLL) were profiled with multiple OMICS approaches based on Next-Generation Sequencing (NGS). In total, data from RNA-Seq, Whole-Exome Sequencing, Whole-Genome Sequencing and amplicon panel analyses in 134 samples are available. All samples were processed as paired-end libraries on Illumina sequencing machines. The data are available as paired FastQ files.	Illumina HiSeq 2000 Illumina MiSeq	134
EGAD00001003827	The data set contains bam files aligned using bwa-0.7.8 mem -t 8 -R.	HiSeq X Ten	4
EGAD00001003828	This dataset contains paired fastq files for LMS tumor samples	Illumina HiSeq 2000 Illumina HiSeq 2500	37
EGAD00001003829	The data set contains paired end fastq files for whole exome sequencing data for Leiomyosarcoma tumor and control samples	Illumina HiSeq 2000 Illumina HiSeq 2500	96
EGAD00001003831		NextSeq 500	6
EGAD00001003832	Patient information: SSc patients were recruited at the Department of Rheumatology of the Leiden University Medical Center (Leiden, The Netherlands). All patients met the American Rheumatism Association classification criteria for SSc (Subcommittee for scleroderma criteria 1980), and were classified according to LeRoy and Medsger criteria as either limited or diffuse cutaneous disease (LeRoy EC, Black C, Fleischmajer R, Jablonska S, Krieg T, Medsger TA Jr, Rowell N 1988). Institutional review board approval and written informed consent were obtained before patients entered this study. Two 4 mm skin biopsies were taken from a standardized location on the most proximal part of the lower arm, distal from the elbow. In 10 patients the skin biopsy came from a clinically affected area and in 4 patients the skin was locally unaffected. One sample was used for RNA sequencing and one sample was used for immunohistochemistry. Skin biopsies from healthy individuals were commercially sourced (Tissue Solutions, UK) and collected from donors undergoing skin resection surgery and after informed consent. To match the healthy skin with patients as much as possible, skin biopsies from healthy controls were also taken from a similar position (the under-arm (for 4 controls) and leg (for 2 controls)). Healthy skin donors were selected to match the age and sex of the SSc patient cohort. Biopsies from patients and controls were equally treated and were both stored at -80°C until RNA isolation was performed. RNA from frozen skin biopsies was isolated using RNeasy kit from fibrous tissue (Qiagen, the Netherlands). RNA quantity was determined by using SimplyNano 2000 and quality was assessed on Tapestation (Agilent, the Netherlands). All samples included in the study had a RIN score above 7.0. Transcriptome characterisation and analysis: RNA sequencing was performed using polyA selection and a stranded protocol using Ion Torrent next generation sequencing technology (Service XS, The Netherlands). The Ion PI Template OT2 200 Kit v3 and Ion PI Sequencing 200 Kit v3 were used according to the manufacturer’s instructions. 20 samples were run on 11 PI chips. PI chip analyses, base calling and quality checks were performed using the Torrent Server Suite. An average of 42 million 100 bp reads was generated per sample. Following quality control, reads were aligned to the human genome (Homo sapiens GRCh38.78) using Bowtie2 and STAR (Dobin et al. 2013; Langmead and Salzberg 2012). Reads were first aligned with STAR. For the unmapped reads from STAR, a second alignment step was performed using bowtie2 (local very sensitive options).	Ion Torrent Proton	20
EGAD00001003834	This dataset contains whole genome sequencing FASTQ data for 12 cholangiocarcinoma tumor samples, and their matched normal samples. These 12 samples are in addition to 59 samples available in dataset EGAD00001001988, and consist of patients from Thailand, Romania, and Singapore. Paired-end sequencing data was generated by Illumina Hiseq 2000 and 2500, with insert sizes of 170 and 350.	Illumina HiSeq 2000 Illumina HiSeq 2500	24
EGAD00001003835	Whole genome sequencing data of 25 prostate tumor and corresponding normal samples, aligned with the CGP BWA-mem workflow.	Illumina HiSeq 2000	50
EGAD00001003837	This dataset, named Stockholm tumor progression cohort, contains exome-sequencing samples of matched primary and metastasis samples from 20 metastatic breast cancer patients. All patients have one or more sequenced normal samples as well. The total number of samples is 125. The dataset has been used, apart from other studies, to explore tumor evolution patterns in metastatic breast cancer at Karolinska Institute Stockholm.	Illumina HiSeq 2500	125
EGAD00001003838		Illumina HiSeq 2000	19
EGAD00001003839		Illumina HiSeq 2000	26
EGAD00001003840		Illumina HiSeq 2000	3
EGAD00001003841	One sample of human genomic DNA. DNA extracted from whole blood. Reads obtained using an exome enrichment kit (Truseq, Illumina) and sequencing of 100bp paired-end reads on a HiSeq 2500 sequencing system (Illumina).	Illumina HiSeq 2500	1
EGAD00001003845	A SMC01_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003846	A SMC02_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003847	A SMC03_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003848	A SMC04_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003849	A SMC05_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003850	A SMC06_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003851	A SMC07_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003852	A SMC08_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003853	A SMC09_ChIP-Seq(H3K27me3) paired end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003854	A ADMSC01_ChIP-Seq(H3K27me3) paired end data for adipose-derived mesenchymal stromal cells	Illumina HiSeq 2500	1
EGAD00001003855	A ADMSC02_ChIP-Seq(H3K27me3) paired end data for adipose-derived mesenchymal stromal cells	Illumina HiSeq 2500	1
EGAD00001003856	A ADMSC03_ChIP-Seq(H3K27me3) paired end data for adipose-derived mesenchymal stromal cells	Illumina HiSeq 2500	1
EGAD00001003857	A ADMSC04_ChIP-Seq(H3K27me3) paired end data for adipose-derived mesenchymal stromal cells	Illumina HiSeq 2500	1
EGAD00001003858	A SMC01_smRNA-Seq single end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003859	A SMC02_smRNA-Seq single end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003860	A SMC03_smRNA-Seq single end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003861	A SMC04_smRNA-Seq single end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003862	A SMC05_smRNA-Seq single end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003863	A SMC06_smRNA-Seq single end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003864	A SMC07_smRNA-Seq single end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003865	A SMC08_smRNA-Seq single end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003866	A SMC09_smRNA-Seq single end data for skeletal muscle cells	Illumina HiSeq 2500	1
EGAD00001003867	A ADMSC01_smRNA-Seq single end data for adipose-derived mesenchymal stromal cells	Illumina HiSeq 2500	1
EGAD00001003868	A ADMSC02_smRNA-Seq single end data for adipose-derived mesenchymal stromal cells	Illumina HiSeq 2500	1
EGAD00001003869	A ADMSC03_smRNA-Seq single end data for adipose-derived mesenchymal stromal cells	Illumina HiSeq 2500	1
EGAD00001003870	A ADMSC04_smRNA-Seq single end data for adipose-derived mesenchymal stromal cells	Illumina HiSeq 2500	1
EGAD00001003871	A SMC01_WGBS paired end data for skeletal muscle cells	HiSeq X Ten	1
EGAD00001003872	A SMC02_WGBS paired end data for skeletal muscle cells	HiSeq X Ten	1
EGAD00001003873	A SMC05_WGBS paired end data for skeletal muscle cells	HiSeq X Ten	1
EGAD00001003874	A SMC06_WGBS paired end data for skeletal muscle cells	HiSeq X Ten	1
EGAD00001003875	A SMC07_WGBS paired end data for skeletal muscle cells	HiSeq X Ten	1
EGAD00001003876	A SMC08_WGBS paired end data for skeletal muscle cells	HiSeq X Ten	1
EGAD00001003877	A SMC09_WGBS paired end data for skeletal muscle cells	HiSeq X Ten	1
EGAD00001003878	A ADMSC01_WGBS paired end data for adipose-derived mesenchymal stromal cells	HiSeq X Ten	1
EGAD00001003879	A ADMSC02_WGBS paired end data for adipose-derived mesenchymal stromal cells	HiSeq X Ten	1
EGAD00001003880	A ADMSC03_WGBS paired end data for adipose-derived mesenchymal stromal cells	HiSeq X Ten	1
EGAD00001003881	A ADMSC04_WGBS paired end data for adipose-derived mesenchymal stromal cells	HiSeq X Ten	1
EGAD00001003882	EBiSC Whole Genome Sequencing raw FASTQ	HiSeq X Five	70
EGAD00001003883	Background: Lung carcinoma-in-situ (CIS) lesions are the pre-invasive precursor to lung squamous cell carcinoma. However, only half progress to invasive cancer in three years, while a third spontaneously regress. Whether modern molecular profiling techniques can identify those pre-invasive lesions that will subsequently progress and distinguish them from those that will regress is unknown. Methods: Progressive and regressive CIS lesions were laser-captured and their genome, epigenome and transcriptome interrogated. We analysed 83 progressive lesions, 41 regressive and 33 normal epithelial control samples. DNA methylation and gene expression profiles were further validated using publicly available lung cancer data. Results: Somatic mutation burden was higher in progressive lesions than regressive CIS lesions, across base substitutions, rearrangements, and copy number changes. Driver mutations were present in both progressive and regressive CIS lesions, but were more numerous in progressive cases. Progressive and regressive CIS lesions had distinct epigenomic and transcriptional profiles, with a strong chromosomal instability signature. Gene expression, methylation and copy number profiles can all predict accurately which CIS lesions will progress to lung cancer. Conclusion: Pre-invasive CIS lesions that will subsequently progress to invasive lung cancer can be distinguished from those that will regress using molecular profiling. Progression is associated with a strong chromosomal instability signature. These findings inform the development of novel therapeutic targets.	HiSeq X Ten	69
EGAD00001003884	The genetic basis of many rare childhood cancers remains unknown. These include a spectrum of infant soft tissue tumors without canonical gene fusions, encompassing congenital mesoblastic nephroma (CMN) of the kidney and infantile fibrosarcoma (IFS). Here, we integrated whole genome and transcriptome sequencing and identified diagnostic markers and novel therapeutic strategies.	HiSeq X Ten	37
EGAD00001003885	The genetic basis of many rare childhood cancers remains unknown. These include a spectrum of infant soft tissue tumors without canonical gene fusions, encompassing congenital mesoblastic nephroma (CMN) of the kidney and infantile fibrosarcoma (IFS). Here, we integrated whole genome and transcriptome sequencing and identified diagnostic markers and novel therapeutic strategies.	Illumina HiSeq 2500	19
EGAD00001003886	In the present study, we have examined fungal and bacterial infection in brain tissue from 10 AD patients and 16 control subjects by next-generation sequencing NGS using MiSeq sequencing platform (Illumina).	Illumina MiSeq	41
EGAD00001003887	Sequencing was performed using OncoPanel v.2 (OPv2), an Agilent SureSelect custom designed bait set consisting of the coding regions of 504 genes previously linked to human cancer. Sequencing was performed on an Illumina HiSeq 2500. 8 highly differentiated, fusion-negative rhabdomyosarcoma tumor samples were sequenced. BAM files are available for download.	Illumina HiSeq 2500	8
EGAD00001003888	A SMC03_WGBS paired end data for skeletal muscle cells	HiSeq X Ten	1
EGAD00001003889	A SMC04_WGBS paired end data for skeletal muscle cells	HiSeq X Ten	1
EGAD00001003890	We conducted whole genome sequencing (WGS) to characterize the genomic alterations of 36 never-smoker Chinese patients with lung adenocarcinomas (LUADs). This dataset contains clean fastq files of 36 never-smoker Chinese patients with lung adenocarcinomas (LUADs).	HiSeq X Ten	72
EGAD00001003891	Transcriptome sequencing was performed on 214 patients with myelodysplasia in this study. RNA was obtained from bone marrow CD34+ cells (n=100) and/or bone marrow mononuclear cells (n=165). Transcriptome sequencing was performed for both cell fractions in 51 patients. A total of 211 patients were genotyped by targeted deep sequencing. We also studied bone marrow CD34+ cells and bone marrow mononuclear cells obtained from three healthy adults each.	Illumina HiSeq 2500	266
EGAD00001003892	Hepatocellular carcinoma specimens, intrahepatic cholangiocarcinoma specimens and liver normal tissues collected from 7 samples, including 44 fastq files from whole exome sequencing.	HiSeq X Ten	21
EGAD00001003894	The dataset (vcf files) consists of rare germline variants of 68 Finnish acute myeloid leukemia patients. We performed exome sequencing and filtered the germline variants against ExAC total MAF<0.01 in two gene panels. The 35 genes in the panels studied here have previously been associated with hematological malignancies and/or solid tumors. The dataset contains only variants of the two gene panels.		68
EGAD00001003895	The dataset contains RNA bam files of Renal Cell Carcinoma patients, which belong to "An Empirical Approach Leveraging Tumorgrafts to Dissect the Tumor Microenvironment in Renal Cell Carcinoma Identifies Missing Link to Prognostic Inflammatory Factors"	Illumina HiSeq 2000	59
EGAD00001003898	This dataset provides whole genome sequencing data of normal/tumors pairs from 4 patients with uterine or ovarian carcinosarcoma using the HiSeq 2000 sequencing system. It includes 10 samples (4 normals, 4 uterine tumors and 2 ovarian tumors). Through separate whole genome sequencing of carcinomatous and sarcomatoid components, we analyse and compare the genomic alterations of these components.	Illumina HiSeq 2000	10
EGAD00001003900		Illumina HiSeq 2000	83
EGAD00001003901		Illumina HiSeq 2000	3
EGAD00001003902		Illumina HiSeq 2000	10
EGAD00001003903	Targeted sequencing of 284 patients with AV nodal reentry tachycardia (AVNRT). Sixty-seven genes, plausibly involved in AVNRT pathophysiology, were targeted using the HaloPlex target enrichment system. Raw paired end fastq files are provided in this dataset.	Illumina MiSeq	284
EGAD00001003904	Comprehensive transcriptional characterization of bone marrow endothelial cells by RNA sequencing was performed to determine the molecular properties/signatures of endothelium during bone marrow recovery and niche formation. Regenerative bone marrow endothelium was FACS-isolated from bone marrow aspirates of Acute Myeloid Leukemia patients 17 days after receiving chemotherapy (n=3). Niche-forming endothelial cells were FACS-isolated from fetal bones (gestational age 15-20 weeks) (n=3). Healthy adult bone marrow endothelial cells (n=7) were used as steady-state controls. cDNA was prepared using the SMARTer procedure (SMARTer Ultra Low RNA Kit, Clontech). The provided file type is FASTQ.	Illumina HiSeq 2500	13
EGAD00001003905	RNA-Seq files accompanying the paper titled "Somatic Histone H3 Mutations in Diffuse Intrinsic Pontine Gliomas and Non-Brainstem Paediatric Glioblastomas".	Illumina HiSeq 2000	66
EGAD00001003906	October 2017 data update (bam/fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	Illumina HiSeq 2500	28
EGAD00001003907	A Hematogenous Route for Medulloblastoma Leptomeningeal Metastases		79
EGAD00001003908	Six samples from the DEV cell line: 2 controls, 2 transduced with IL4R WT and 2 transduced with IL4R mutant (I242N). This DEV cell line is not commercially available and was acquired from a colleague in the Netherlands.		6
EGAD00001003909	Raw lane level fastq files from Whole genome sequencing in support of ICGC PRAD-CA Variant calls	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 unspecified	86
EGAD00001003910	Deep single-cell RNA sequencing data for 11,138 T cells from tumour, adjacent normal tissue and peripheral blood of treatment-naive CRC patients. The DATA ACCESS AGREEMENT is provided at https://github.com/zhangyybio/single-T-cell-data-access. Applicants can request access to the data by directly downloading it or by sending an email to cancerpku@pku.edu.cn. The process that is used to approve an application includes verifying the institution, participants and research purposes of the application. In general this process will take about two weeks. In principle, any academic research institution complying with the laws and bioethics regulation policies of China will be approved.	Illumina HiSeq 4000	11138
EGAD00001003911	We generated human induced pluripotent stem cell (iPSC) lines with a GFP reporter inserted in the endogenous NKX6.1 locus. Characterisation of the reporter lines demonstrated faithful GFP labelling of NKX6.1 expression during pancreas and motor neuron differentiation. We performed three independent in vitro differentiations towards the pancreatic endocrine lineage. We FACS-purified GFP positive and negative cells from stage 7 cultures, and generated Smart-Seq2 RNA-sequencing libraries for the pre-sorted cells, as well as the two GFP-sorted cell populations. Gene expression profiling by RNA-sequencing reveals that the NKX6.1-positive population closely resembles mature human beta cells and the functional evaluation of purified populations shows that the glucose-responsive beta-like cells are enriched within the NKX6.1-positive population. These reporter lines provide a valuable resource to the scientific community for the derivation of functional relevant pancreas and neuronal cell subtypes.	Illumina HiSeq 4000	15
EGAD00001003912	This data belongs to the 2018 AML-ETO patients' genome data, which is aligned to the human reference (human_g1k_v37.fasta). There are 12 paired tumor/normal samples from SNUH. All samples have passed QC and recalibration steps while aligning to reference.	Illumina HiSeq 2000	24
EGAD00001003913	74 CD49f single-cell methylomes are from cord blood of donor1, and 84 from cord blood of donor2. Samples from donor1 have one sequencing lane, and samples from donor2 have five sequencing lanes. This dataset was generated using Post-Bisulfite Adapter Ligation (PBAL), a bisulfite based whole genome protocol. In total this dataset consists of 494 runs.	Illumina HiSeq 2500	158
EGAD00001003914	This dataset provides whole genome sequencing data of tumor/normal pairs from 20 patients with hepatoblastoma using the Illumina NovaSeq sequencing system. It includes 40 samples (20 normals and 20 hepatoblastoma tumors). Our comprehensive analysis identified somatic mutations, structural variations, copy number variations and non-coding variants in hepatoblastoma.	Illumina NovaSeq 6000	40
EGAD00001003915	The dataset contains raw sequences (FASTQ files) from the Illumina 2x150bp paired-end RNA sequencing profiles of 11 fetal human brain samples at 7, 9, 12, 15 and 21 gestational weeks	Illumina HiSeq 2000 Illumina HiSeq 2500	9
EGAD00001003916	Cancer exomes consisting of FASTQ paired-end reads from ovary samples	Illumina HiSeq 2500	19
EGAD00001003917	Germline exomes consisting of FASTQ paired-end reads from blood samples	Illumina HiSeq 2500	19
EGAD00001003918	Cancer RNA-seq consisting of FASTQ paired-end reads from ovary samples	Illumina HiSeq 2500	16
EGAD00001003919	We performed whole genome, whole or targeted exome sequencing for 289 individuals from India. This included 152 clinically diagnosed MODY and 137 control samples. Whole genome libraries were constructed using TruSeqNano DNA Library Preparation Kit (Illumina, CA) and sequenced on Illumina HiSeq2500 (Illumina, CA). The whole exome analysis was performed using Agilent SureSelect (Santa Clara, CA) Human All Exome kit v5 (50 Mb). Exome capture libraries were sequenced on HiSeq 2500 (Illumina, CA). Targeted exome sequencing was performed using custom probes corresponding to 1965 genes implicated in pancreatic cell biology and/or diabetes.	Illumina HiSeq 2500	289
EGAD00001003920	WGS sequence data from cell lines BT-54/BT-88/BT-92/BT-142		7
EGAD00001003923	The discovery of the BRAF V600E mutation in almost all cases of hairy-cell leukemia has led to the widespread adoption of the BRAF inhibitor vemurafenib for treatment of chemotherapy-resistant cases. Impressive responses are reported; however, acquired resistance is common. Whilst diverse mechanisms of vemurafenib resistance have been elucidated in melanoma, the basis of resistance in HCL is unclear. Here we apply whole genome and deep targeted sequencing to investigate resistance mechanisms and potential therapeutic strategies in a patient with acquired resistance to vemurafenib.	Illumina HiSeq 2500	15
EGAD00001003924	The discovery of the BRAF V600E mutation in almost all cases of hairy-cell leukemia has led to the widespread adoption of the BRAF inhibitor vemurafenib for treatment of chemotherapy-resistant cases. Impressive responses are reported; however, acquired resistance is common. Whilst diverse mechanisms of vemurafenib resistance have been elucidated in melanoma, the basis of resistance in HCL is unclear. Here we apply whole genome and deep targeted sequencing to investigate resistance mechanisms and potential therapeutic strategies in a patient with acquired resistance to vemurafenib.	HiSeq X Ten	3
EGAD00001003925	This data belongs to the 2014 AML-WGS patients' genome data, which is aligned to the human reference (human_g1k_v37.fasta). There are 10 paired tumor/normal samples from SNUH. All samples have passed QC and recalibration steps while aligning to reference.	HiSeq X Ten	20
EGAD00001003926	Patient-derived organoids model treatment response of metastatic gastrointestinal cancers (80 targeted exome capture samples and 2 whole-genome sequencing samples)	HiSeq X Ten Illumina HiSeq 2500	82
EGAD00001003927	Merged bam files for PACA-CA Whole Genome Sequencing, for DCC release 27		246
EGAD00001003928	This data belongs to the 2016 AML prospective_v1 patients' genome data, which is aligned to the human reference (human_g1k_v37.fasta). There are 5 paired tumor/normal samples from SNUH. All samples have passed QC and recalibration steps while aligning to reference.	HiSeq X Ten	10
EGAD00001003929	Exome sequencing data from homologous recombination deficient primary breast cancers as assessed by the functional RAD51 based HR test. There are 12 tumour samples, of which 10 also have matching normal.	Illumina HiSeq 2500	22
EGAD00001003931	Sequencing data from 1,005 cancer patients and 812 healthy controls. All samples were prepared using Safe-SeqS technology and sequenced on an Illumina MiSeq and/or HiSeq instrument. The paired FASTQ files correspond to read 1 and the index read (R and I, respectively).	Illumina HiSeq 4000	212
EGAD00001003932	This data belongs to the 2014 AML patients' exome data, which is aligned to the human reference (human_g1k_v37.fasta). There are 51 paired tumor/normal samples from SNUH. All samples have passed QC and recalibration steps while aligning to reference.	Illumina HiSeq 2000	102
EGAD00001003933	Whole exome sequencing (WES), shallow whole genome sequencing (sWGS), ultra-deep targeted sequencing (TS), RNA whole transcriptome sequencing (RNAseq) bam files. Targeted TCR sequencing in RNA (RNA-TCRseq) cram files.	Illumina HiSeq 2500 Illumina MiSeq	267
EGAD00001003934	EBiSC Whole Genome Sequencing processed VCF including VEP consequences		70
EGAD00001003935	Sequencing of V4 hypervariable region of 16S gene of microbiota present in feces of IBD patients	Illumina MiSeq	315
EGAD00001003936	Sequencing of V4 hypervariable region of 16S gene from microbiota present in intestinal biopsies of IBD patients	Illumina MiSeq	107
EGAD00001003937	BBMRI - BIOS project - Freeze 2 - Bam files - Imprinting analysis	Illumina HiSeq 2000	131
EGAD00001003940	This dataset contains whole genome sequencing data from 24 patients. For each patient a tumour and control sample has been sequenced on an Illumina HiSeq 2000 instrument in paired-end mode. Up to three lanes per sample have been sequenced, resulting in 112 FASTQ files.	Illumina HiSeq 2000	48
EGAD00001003941	Whole-Genome Sequencing of a Healthy Aging Cohort.	Complete Genomics	511
EGAD00001003942	EBiSC Whole Genome Sequencing processed CRAM		70
EGAD00001003943	The oral and gut microbiomes of melanoma patients were characterized before the initiation of anti-PD1 immunotherapy, and compared to treatment response. Validation studies were performed in germ-free mice using stool from patients who responded/did not respond to anti-PD1 immunotherapy. All baseline oral (n=86) and gut (n=43) microbiome samples were subject to 16S sequencing of the V4 region (merged fastq files have been made available through this portal). Whole genome shotgun sequencing (WGS) was performed on a subset of fecal samples (n=25); these files are also available (paired end reads). Also available are 16S sequencing results of stool samples from donors (n=2) used in fecal microbiota transplant and murine samples (n=12) from germ-free mice transplanted with stool from responder/non-responder patients. The fastq files associated with this dataset are stored at ENA under the following links: Fecal 16S – PRJEB22894 https://www.ebi.ac.uk/ena/browser/view/PRJEB22894 Oral 16S – PRJEB22874 https://www.ebi.ac.uk/ena/browser/view/PRJEB22874 Murine 16S – PRJEB22895 https://www.ebi.ac.uk/ena/browser/view/PRJEB22895 Fecal WGS – PRJEB22893 https://www.ebi.ac.uk/ena/browser/view/PRJEB22893		167
EGAD00001003944	Data set of 22 tumor/normal pairs of non-small cell lung cancer (NSCLC) patients. All tissue pairs were screened with MeDIP methylation enrichment sequencing and validations were performed with targeted bisulfite re-sequencing.	Illumina HiSeq 2500	50
EGAD00001003945	Bam files for PACA-CA RNA Seq analysis, for DCC release 27		219
EGAD00001003946	DNA from 10 human pancreatic islet samples was processed for Whole-genome Bisulphite Sequencing. The resulting libraries were sequenced on an Illumina Hiseq 2000 to generate 100bp paired-end read data. The resulting fastq.gz and mapped bam files were deposited.	Illumina HiSeq 2000	10
EGAD00001003947	18 human pancreatic islet preparations derived from 17 donors were processed for ATAC-seq. The data was generated on an Illumina Hiseq 2500 sequencing machine to generate 50bp paired end read data. The resulting fastq.gz and mapped bam files were deposited.	Illumina HiSeq 2500	18
EGAD00001003948	Merged bam files for PACA-CA Whole Genome Sequencing, for DCC release 27		39
EGAD00001003950	The dataset consists of samples from papillary thyroid cancer patients. A total of 292 DNA samples from blood/normal and cancer tissue are subjected to whole exome sequencing using Illumina. The fastq files generated were aligned with reference genome ‘hg19’, duplicates were marked, realignment around indels and quality recalibration were performed to produce good quality variants. The recalibrated “.bam” files are included with this dataset.		290
EGAD00001003951	Whole genome sequencing of 4 childhood T-ALL patients, which was further used in single-cell analysis in the paper "Single cell sequencing reveals the origin and the order of mutation acquisition in T-cell acute lymphoblastic leukemia".	Illumina HiSeq 2500	4
EGAD00001003953	Fastq files for the whole genome sequencing data (Illumina HiSeq 2500; 32.6-fold) for two diffuse gastric cancers revealing the fusion breakpoints. 2102T: CTNND1-ARHGAP26 gene fusion (g.chr11:57,578,103-g.chr5:142,358,707) 354T: ANXA2-MYO9A gene fusion (g.chr15:60,656,550-g.chr15:72,157,966)	Illumina HiSeq 2500	2
EGAD00001003955	This dataset comprises single-cell RNA sequencing of the human Lin-CD34+38-45RA-90+49f+ phenotype isolated from 2 normal cord donors. Library preparation was performed following a modified CEL-Seq2 protocol.	NextSeq 500	2
EGAD00001003956	Illumina platform sequencing data of SureSelect exome libraries prepared from 3 samples from one donor: a normal, primary breast cancer, and cell line derived from metastasis		3
EGAD00001003957	Raw lane level fastq files from Whole genome sequencing in support of ICGC PRAD-CA Variant calls	Illumina HiSeq 2000 Illumina HiSeq 2500 unspecified	1
EGAD00001003958	Whole exome sequencing data for 18 mucoepidermoid carcinoma samples. The samples were used for Illumina TruSeq library construction and captured using Agilent V4 exome panel. The PE fastq files are provided.	Illumina HiSeq 2000	18
EGAD00001003959	Whole genome sequencing data for 25 adenoid cystic carcinoma samples. The samples were used for Illumina TruSeq library construction and were sequenced on an Illumina HiSeq 2000. The PE fastq files are provided.	Illumina HiSeq 2000	25
EGAD00001003960	This dataset contains the 2014 lung squamous patients' exome data, aligned to the human reference (human_g1k_v37.fasta). There are 104 paired tumor/normal samples from SMC. All samples passed QC and recalibration steps during alignment to the reference.	Illumina HiSeq 2000	208
EGAD00001003961	Whole exome data from tumor/normal pairs for adult type ovarian granulosa cell tumor sequencing project. This data set contains 24 tumor whole exomes and 20 matched normal whole exomes generated using the Agilent V4 exome hybrid capture platform, with sequencing performed on an Illumina HiSeq 2000. This dataset contains BAM files generated by aligning paired-end reads to the hg19 reference genome.	Illumina HiSeq 2000	44
EGAD00001003962	January 2018 data update (bam/fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	HiSeq X Ten Illumina HiSeq 2500	34
EGAD00001003963	March 2018 cumulative data release (bams,fastqs) for reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency as part of the International Human Epigenome Consortium	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 NextSeq 500	193
EGAD00001003964	CD8+CD69+CD103+ and CD8+CD69+CD103- T cells were flow sorted from 1 primary triple negative breast cancer, 1 primary HER2 amplified breast cancer, and 1 triple negative liver metastasis. Prior to flow sorting, fresh tumor samples were digested to produce single cell homogenates on the day of surgery. Following RNA extraction, RNASeq was performed using polyA selection.	NextSeq 500	6
EGAD00001003965	Whole genome sequencing of Control (blood), Tumor and metastasis triplets for 12 samples.	Illumina HiSeq 2000	36
EGAD00001003966	This dataset contains RNA sequencing data from 24 patients. Up to two lanes per tumour sample have been sequenced on an Illumina HiSeq2000 instrument in paired-end mode resulting in 58 Fastq files.	Illumina HiSeq 2000	24
EGAD00001003967	Targeted Gene Panel for 171 PTCLs	Illumina HiSeq 2000	171
EGAD00001003968	The Janus Serum Bank (JSB) is a population-based cancer research biobank. This dataset contains small RNA sequencing (RNA-seq) data of 520 JSB samples from cancer-free individuals. Sequencing libraries were indexed and 12 samples were sequenced per lane on a HiSeq 2500 (Illumina) to an average depth of 18 million reads per sample. The dataset files are raw FASTQ files from the sequencing machine (50bp, single-end sequencing)	Illumina HiSeq 2500	520
EGAD00001003969	RNA-seq analyses were performed on cDNA libraries prepared from PolyA+ RNA using the Illumina TruSeq protocol for mRNA. The final libraries were sequenced with a paired-end 2×75 bp protocol aiming at 8.5 Gb per sample for a 30x mean coverage of the annotated transcriptome. All sequencing reactions were conducted on an Illumina HiSeq instrument (Illumina, San Diego, CA, USA).	Illumina HiSeq 2000	7
EGAD00001003970	For whole-exome sequencing 1 µg of DNA from fresh-frozen tumors was fragmented by sonication technology (for DNA from fresh-frozen tumors: Bioruptor, Diagenode, Liège, Belgium; for DNA from FFPE material: Covaris). The fragments were end-repaired and adaptor-ligated, including incorporation of sample index barcodes. After size selection, libraries were subjected to an enrichment process with SureSelect XT (Agilent). The final libraries were sequenced with a paired-end 2×75 bp protocol for an average coverage of 100-120x.	Illumina HiSeq 2000	7
EGAD00001003971	ICGC-TCGA DREAM Somatic Mutation Calling - Tumour Heterogeneity Challenge - WGS mapped reads		59
EGAD00001003972	Fastq files for PACA-CA RNA Seq analysis, for DCC release 27	Illumina HiSeq 2500	219
EGAD00001003973	This dataset contains whole exome sequencing data from 24 patients. The Agilent SureSelect Human All Exon 50-Mb target enrichment kit was used to capture all human exons for deep sequencing. For each patient a tumour and a control sample have been sequenced on an Illumina HiSeq2000 instrument in paired-end mode. Up to three lanes per sample have been sequenced resulting in 118 Fastq files.	Illumina HiSeq 2000	-
EGAD00001003974	Raw data files for the German Epigenome Project (DEEP), IHEC/EpiRR submission of 2017. metadata available at: http://deep.dkfz.de/#/experiments	Illumina HiSeq 2000 Illumina HiSeq 2500 NextSeq 500	17
EGAD00001003975	Raw lane level fastq files from Whole genome sequencing in support of ICGC PRAD-CA Variant calls	HiSeq X Ten Illumina HiSeq 2500 unspecified	42
EGAD00001003976	Well-differentiated, dedifferentiated, and matched normal tissues from liposarcoma (51 specimens) from 17 patients were obtained for whole exome sequencing. Tumors from 9 patients were used for RNA sequencing. The bam files are made available in this dataset.	Illumina HiSeq 2000	51
EGAD00001003977	RNA was extracted from formalin-fixed and paraffin embedded tumors of a large cohort of bladder cancer patients before treatment with anti-PD-L1. RNA was sequenced using a capture based approach (exome capture, RNA access).	Illumina HiSeq 2500	348
EGAD00001003978	This dataset contains the WGS lung cancer patients' genome data, aligned to the human reference (human_g1k_v37.fasta). There are 30 paired tumor/normal samples from Samsung Hospital. All samples passed QC and recalibration steps during alignment to the reference.	Illumina HiSeq 2000	60
EGAD00001003979	This dataset contains ChIP sequencing data from 24 patients. ChIP of 5–10 mg flash-frozen primary ependymoma tumour was performed using 5 mg H3K27ac antibody per ChIP experiment. The enriched DNA has been sequenced on an Illumina HiSeq2000 instrument in paired-end mode. Up to two lanes per sample have been sequenced resulting in 70 Fastq files.	Illumina HiSeq 2000	-
EGAD00001003980	Desmoplastic small round cell tumor (DSRCT) RNAseq data. 14 tumor samples.	Illumina HiSeq 2000	14
EGAD00001003981	This dataset pertains to transcriptome sequencing of paired RNA samples. RNA was isolated from the tumor and adjacent normal tissues of 12 patients (24 samples). We performed rRNA removal followed by total RNA sequencing on the Illumina HiSeq platform. We have uploaded TopHat2 aligned BAM files.	Illumina HiSeq 2500	24
EGAD00001003982	This dataset contains whole-genome sequencing data of tumors from 9 patients with mycosis fungoides. The data was generated using the Illumina HiSeq X-Ten platform.	HiSeq X Ten	9
EGAD00001003983	This dataset contains RNA-sequencing data of tumors from 8 patients with mycosis fungoides. The data was generated using the Illumina HiSeq 4000 platform.	Illumina HiSeq 4000	8
EGAD00001003984	Each tumor sample was cut into three pieces, yielding two end-pieces for cryovials and a middle portion placed in 10% buffered formalin. End pieces were homogenized manually and with a paddle blender (Stomacher). All paraffin-embedded blocks, including formalin-fixed tumor samples and molecular-fixed fallopian tubes, were sectioned and stained with hematoxylin and eosin prior to expert histopathological review to confirm the presence of high grade serous carcinoma. Homogenized end pieces were then flash frozen and later used for WGS. For all tumor and matched normal (peripheral blood) samples, DNA was extracted with the Qiagen AllPrep DNA/RNA kit (tumor samples from patients 25,26,28-32) or the Qiagen Blood and Tissue Extraction Kit (tumor samples from patients 1-4,7,9-17, and all blood samples). For all tumor and normal samples, DNA extraction was followed by library construction and sequencing using Illumina HiSeq2500 whole genome shotgun v4 chemistry with paired-end 125bp reads.	Illumina HiSeq 2500	89
EGAD00001003985	Each tumor sample was cut into three pieces, yielding two end-pieces for cryovials and a middle portion placed in 10% buffered formalin. End pieces were homogenized manually and with a paddle blender (Stomacher). All paraffin-embedded blocks, including formalin-fixed tumor samples and molecular-fixed fallopian tubes, were sectioned and stained with hematoxylin and eosin prior to expert histopathological review to confirm the presence of high grade serous carcinoma. Homogenized end pieces were then flash frozen, and RNA was extracted using the miRNeasy Mini kit. Nanodrop was used to assess quality (260/280) and quantity. Total RNA samples were also QC checked using the Caliper HT RNA HiSens assay. Samples ranging from 60-255ng RNA were re-arrayed into a 96-well plate. 5'-RACE PCR was carried out as described in "The interface of malignant and immunologic clonal dynamics in high-grade serous ovarian cancer" (Zhang et al.). Briefly, this involved first round and nested PCR with TRB (TCR beta chain) and IGH (immunoglobulin heavy chain) gene-specific primers. The indexed libraries were sequenced on the Illumina HiSeq platform with paired-end 250bp reads using v2 chemistry reagents.	Illumina HiSeq 2500 NextSeq 500	442
EGAD00001003986	A total of 192 positions per patient were deeply sequenced in each corresponding tumor sample (including 4 experimental controls and SNVs predicted to originate at each node of the sample phylogeny, see Zhang et al. for details). Genomic DNA templates were used as starting material to generate PCR products. PCR was set up using Phusion DNA polymerase according to the manufacturer’s specifications. The standard PCR conditions used were an initial denaturation at 98C for 30 seconds, followed by 35 cycles of 98C for 10 seconds, 60C for 15 seconds and 72C for 8 seconds, and a final extension at 72C for 10 minutes. PCR products were cleaned up using PCRClean DX beads. Amplicons were pooled by template for sequencing sample preparation. Sample preparation involved a second round of amplification using Phusion DNA polymerase with 6 PCR cycles, with primers specified in Zhang et al. DNA quality was assessed using the Caliper LabChip GX HighSensitivity Assay and DNA quantity was measured using a Qubit dsDNA HS assay kit on a Qubit fluorometer. The indexed libraries were pooled together and sequenced on the Illumina NextSeq500 platform with paired-end 150bp reads using v2 chemistry reagents.	NextSeq 500	180
EGAD00001003987	This dataset pertains to whole exome sequencing of paired DNA samples of gingivo-buccal oral cancer patients. DNA was isolated from the tumor and blood tissues of 47 patients (94 samples). We performed Nextera exome capture and sequenced exome libraries on the Illumina HiSeq platform. We have uploaded BWA-ALN aligned BAM files.	Illumina HiSeq 2500	94
EGAD00001003988	Paired end Whole Exome Sequencing of fine-needle aspirates from 51 Multiple Myeloma patients.	Illumina HiSeq 2000	176
EGAD00001003989	Longitudinal biopsies from a melanoma patient who initially responded to MEK plus CDK4/6 inhibitor therapy were whole exome sequenced to identify potential resistance mutations. The biopsies included normal tissue, pre-treatment, on-treatment, and several post-resistance timepoints.	Illumina HiSeq 2500	6
EGAD00001003990	Shallow sequencing of metastatic colorectal cancer samples for the Angiopredict and Nobev cohorts described in: van Dijk et al., JCO, in revision	Illumina HiSeq 2000 Illumina HiSeq 2500	186
EGAD00001003991	Complete clinical phenotypic description of all patients; the number listed represents all the samples linked to the 609 patients present in the dataset. Please consult the key file to visualise the sample-patient relationship		1094
EGAD00001003992	Whole Human Islet paired-end RNA-seq of 64 human pancreatic donors.	Illumina Genome Analyzer IIx Illumina HiSeq 2500	64
EGAD00001003993	The present series corresponds to 161 RNA-seq samples from tumors with matched WES or WGS. Hepatocellular carcinoma (HCC) accounts for more than 90% of liver cancers, and is a major health problem. It is the 3rd cause of cancer-related mortality. Advances in genomic analyses have formed a comprehensive understanding of different underlying pathobiological layers resulting in hepatocarcinogenesis. Thus, the development of next-generation sequencing technologies has made it possible to generate more comprehensive catalogues of somatic alteration events (single nucleotide substitutions, structural variations, and epigenetic changes) in liver cancer genome than ever before.	Illumina HiSeq 2000	161
EGAD00001003994	The present series corresponds to 24 whole genome sequencing (12 Tumoral/Non-tumoral pairs). Hepatocellular carcinoma (HCC) accounts for more than 90% of liver cancers, and is a major health problem. It is the 3rd cause of cancer-related mortality. Advances in genomic analyses have formed a comprehensive understanding of different underlying pathobiological layers resulting in hepatocarcinogenesis. Thus, the development of next-generation sequencing technologies has made it possible to generate more comprehensive catalogues of somatic alteration events (single nucleotide substitutions, structural variations, and epigenetic changes) in liver cancer genome than ever before.	Illumina HiSeq 2000	28
EGAD00001003995	Fifteen pleomorphic invasive lobular carcinoma samples and their matched normal controls were subjected to targeted exome sequencing using the Beijing Genomics Institute TumorCare gene panel. Genomic DNA samples were randomly fragmented and captured libraries of each exome were sequenced on an Illumina Hiseq2000 system. CRAM files are provided for each tumor and normal pair.	Illumina HiSeq 2000	30
EGAD00001003996	Illumina platform sequencing of whole genome libraries prepared from normal, Barrett's oesophagus and oesophageal cancer samples from 44 donors		1
EGAD00001003997	From 2nd trimester human foetuses we derived liver and intestinal stem cells. These were clonally expanded until enough material was available for whole genome sequencing. For each foetus, reference tissue (skin or bulk liver) was also sequenced to determine all germline variants. These were subtracted from the clones to determine all somatic mutations that had been acquired during embryonic and fetal development.	HiSeq X Ten NextSeq 500	50
EGAD00001003999	Deep single-cell RNA sequencing data for 12346 T cells from tumour, adjacent normal tissue and peripheral blood of treatment-naïve NSCLC patients	Illumina HiSeq 2500 Illumina HiSeq 4000	12346
EGAD00001004000	Targeted gene screen of cell line tumours for testing the new V4 Colorectal gene panel. This dataset contains all the data available for this study on 2018-03-07.	Illumina HiSeq 2500	53
EGAD00001004001	Targeted gene screen of FFPEs, cell lines and primary CRC tumours for testing the new V4 Colorectal gene panel. This dataset contains all the data available for this study on 2018-03-07.	Illumina HiSeq 2500	92
EGAD00001004007	Data supporting: "Esophageal adenocarcinoma organoid cultures recapitulate human disease heterogeneity and provide a model for clonality studies and precision therapeutics." Li et al. WGS (BAM files) RNAseq (BAM files) Tumours, organoids, normals	Illumina HiSeq 2000	53
EGAD00001004008	This dataset includes NPC blood/tumor pair sequencing BAM files: 21 pairs, 42 BAM files in total.	Illumina HiSeq 2000	42
EGAD00001004011	This dataset contains the 2015 AML-ETO patients' genome data, aligned to the human reference (human_g1k_v37.fasta). There are 10 paired tumor/normal samples from SNUH. All samples passed QC and recalibration steps during alignment to the reference.	Illumina HiSeq 2000	20
EGAD00001004012	This dataset contains the additional 2015 AML-ETO patients' genome data, aligned to the human reference (human_g1k_v37.fasta). There are 2 paired tumor/normal samples from SNUH. All samples passed QC and recalibration steps during alignment to the reference.	Illumina HiSeq 2000	4
EGAD00001004013	Organoids are self-organizing 3D structures grown from stem cells that recapitulate essential aspects of organ structure and function. Here we describe a method to establish long-term culture conditions of human airway epithelial organoids that contain all major cell populations and allow personalized human disease modelling. We collected macroscopically inconspicuous lung tissue from non-small-cell lung cancer (NSCLC) patients undergoing medically indicated surgery and isolated epithelial cells to engineer 3D organoids. We exploit the potential to derive sub-clones from AOs to demonstrate the feasibility of CRISPR gene editing. Finally, we show that AOs readily allow modelling of viral infections such as RSV and for the first time demonstrate the possibility to study neutrophil-epithelium interaction in an organoid model. Taken together, we anticipate that human AOs will find broad applications in the study of adult human airway epithelium in health and disease.	HiSeq X Ten	4
EGAD00001004014	Whole Exome Sequencing Data from paediatric solid tumors	Illumina HiSeq 2500	54
EGAD00001004016	Sebaceous carcinomas (SeC) are cutaneous malignancies that, in rare cases, metastasize and prove fatal. Here we report whole exome sequencing on 32 SeC, revealing distinct mutational classes that explain both cancer ontogeny and clinical course. A UV-damage signature predominated in 10/32 samples, while 9 were instead defined by microsatellite instability (MSI) mutations. UV-damage SeC exhibited poorly differentiated, infiltrative histopathology compared to MSI signature SeC (p = 0.003), features previously associated with dissemination. Strikingly, UV-damage SeC transcriptomes and anatomic distribution closely resemble those of cutaneous squamous cell carcinomas (SCC), implicating sun-exposed keratinocytes as a cell of origin. Like SCC, this UV-damage subclass harbors a high somatic mutation burden with >50 mutations/Mb, predicting immunotherapeutic response. In contrast, ocular SeC acquire far fewer mutations without a dominant signature, but show frequent truncating mutations in the ZNF750 epidermal differentiation regulator. Our data exemplify how different mutational processes convergently drive histopathologically related but clinically distinct cancers.	Illumina HiSeq 2500	79
EGAD00001004018	The aim of CAGEKID is to carry out comprehensive detection of DNA markers for conventional (clear cell) renal carcinoma. The project includes complete analysis of somatic and constitutional DNA variation, methylation patterns and expression in a large number of constitutional/tumor pairs. CAGEKID is a part of the International Cancer Genome Consortium, ICGC.		708
EGAD00001004020	Amplicon data of tumor samples generated for validation of WES findings and further sub clonal mapping	Ion Torrent PGM	78
EGAD00001004021		Illumina HiSeq 2000	32
EGAD00001004022		Illumina HiSeq 2000	10
EGAD00001004023		Illumina HiSeq 2000	52
EGAD00001004027	This dataset contains the WES lung cancer patients' genome data, aligned to the human reference (human_g1k_v37.fasta). There are 36 paired tumor/normal samples from Samsung Hospital. All samples passed QC and recalibration steps during alignment to the reference.	Illumina HiSeq 2000	72
EGAD00001004028	WGS sequencing for 63 cases (126 samples) from the ICGC ESAD-UK project Tumours 50x Normals 30x HiSeq X BAM files These samples are earmarked for inclusion in ICGC release 27 (deferred to release 28)	Illumina HiSeq 2000	1
EGAD00001004029	WGS sequencing for 43 cases (86 samples) from the ICGC ESAD-UK project Tumours 50x Normals 30x HiSeq X BAM files These samples are earmarked for inclusion in ICGC release 28	Illumina HiSeq 2000	86
EGAD00001004031	AngioPredict CNV and Exome data	Illumina HiSeq 2500	527
EGAD00001004032	Fastq files of whole-genome bisulfite sequence of non-cancerous tissue of HBV-associated hepatocellular carcinoma	Illumina Genome Analyzer IIx Illumina HiSeq 2000	3
EGAD00001004033	Fastq files of whole-genome bisulfite sequence of tumor tissue of HBV-associated hepatocellular carcinoma	Illumina Genome Analyzer IIx Illumina HiSeq 2000	5
EGAD00001004034	RNA-seq data (bam files) from the hypothalamus of 4 individuals with Prader-Willi syndrome and 4 age-matched control individuals. Detailed information about the study design, case-control matching and RNA-seq data processing is provided in the accompanying publication [Bochukova et al (2018) Cell Reports].	Illumina HiSeq 2000	8
EGAD00001004035	Exome sequencing was performed on 15 unrelated female patients suffering from primary infertility due to Ovarian Meiotic Defects (OMD). Each reference number corresponds to one of the tested subjects. DNA was extracted from saliva using the Oragene saliva DNA collection kit (DNAgenotek Inc., Ottawa, Canada). Exome capture was performed with the Agilent V5 kit and sequencing was performed on Illumina HiSeq 2000.	Illumina HiSeq 2000	15
EGAD00001004036	Whole exome sequencing of non-brainstem paediatric high grade glioma from the HERBY phase II randomised trial. DNA from 86 cases was subjected to Illumina paired end whole exome sequencing using a customised SureSelect Human All Exon V6 capture set. Germline DNA from whole blood was sequenced for 83 cases. 26 cases were sequenced from both fresh frozen tissue and FFPE material, 10 were sequenced from only fresh frozen material and 50 from only FFPE. Data is provided as bwa aligned BAM files	Illumina HiSeq 2000	195
EGAD00001004037	Aim to characterise cancer gene landscape in CLL, particularly in cases with mutated POT1 gene. Treatment-naive CLL cases will be interrogated by targeted exome sequencing using a cancer gene panel. This dataset contains all the data available for this study on 2018-03-14.	Illumina HiSeq 2500	123
EGAD00001004038	Identification of genes involved in congenital disorders of glycosylation and 3-methylglutaconic aciduria. There are more than 100 genes known for congenital disorders of glycosylation and new disorders are discovered each year. We included patients with a so far unsolved glycosylation disease. The diagnostic group 3-methylglutaconic aciduria is a heterogeneous group of disorders mostly caused by abnormal phospholipid synthesis or in association with mitochondrial dysfunction. We included patients with a so far unsolved disease and 3-methylglutaconic aciduria. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-03-14.	Illumina HiSeq 2500	31
EGAD00001004039	Albinism is a genetically heterogeneous rare genetic condition affecting 1:17000 in the Western world (but more frequent in Africa) whose main feature is a profound visual impairment, characterised by foveal hypoplasia, abnormal chiasmatic connections, nystagmus and photophobia. All these features result in severely altered visual acuity (<0,1), absent depth perception and poor night vision. People with albinism are primarily visually handicapped. In addition, for some types of albinism, the visual phenotype can be presented with partial or total hypopigmentation, hence resulting in a secondary phenotype which can lead to skin cancer if skin is not adequately protected. Recently a new syndrome has been described, FHONDA, with the same visual abnormalities of albinism but without pigment alteration. The traditional classification differentiates Oculocutaneous albinism (OCA), where hypopigmentation involves hair, skin and eyes versus Ocular Albinism (OA), where hypopigmentation only affects the eyes. These are non-syndromic types of albinism. Some syndromic forms (Hermansky-Pudlak=HPS, Chediak-Higashi=CHS) affect cells beyond pigment cells, present in the lungs, immune system, platelets and intestines, resulting in more severe phenotypes that can be fatal. Mutations in at least 19 genes are associated with the corresponding types of albinism. Most hospitals will only diagnose the most frequent cases using traditional Sanger, MLPA approaches. Some will use CGH arrays. We aim to diagnose all cases of albinism through the Albinochip proposal, which combines a Sequenom first step of known mutations combined with subsequent NGS approaches. In some cases we fail to find a second mutation; these are good candidates for further full exome analyses. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-03-14.	Illumina HiSeq 2500	48
EGAD00001004040	Whole Exome Sequencing of trios (proband + parents) or probands only with Neonatal Diabetes Mellitus (NDM) or Congenital Hyperinsulinism of Infancy (CHI) of unknown genetic origin. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-03-14.	Illumina HiSeq 2500	57
EGAD00001004041	As a contribution to the International Cancer Genome Consortium, exome sequencing of 142 Japanese gastric cancers with various histological subtypes has been conducted. This study aims to identify unique and common driver genes and molecular subtypes in Japanese gastric cancer. Please refer to the ICGC website for details: http://icgc.org/icgc/cgp/69/420/1012357	Illumina HiSeq 2500	142
EGAD00001004042	As a contribution to the International Cancer Genome Consortium, exome sequencing of 102 Japanese gastric cancers with various histological subtypes has been conducted. This study aims to identify unique and common driver genes and molecular subtypes in Japanese gastric cancer. Please refer to the ICGC website for details: http://icgc.org/icgc/cgp/69/420/1012357	Illumina HiSeq 2500	102
EGAD00001004043	The dataset consists of 64 fastq files from 23 patients with acute promyelocytic leukemia. Exome sequencing was conducted on several stages (Diagnosis, Remission, Relapse) for each patient. For 5 patients, only Diagnosis and Relapse samples are available.	Illumina HiSeq 1000	64
EGAD00001004044	Files from whole exome sequencing of eight tumors from eight pancreatic cancer patients along with matched PanIN precursor lesion(s) and a matched normal tissue.	Illumina HiSeq 2000	28
EGAD00001004045	Whole Genome Sequencing has been applied in 32 SRCC patients and the raw data have been subjected to standard procedures. Files with genomic variant calling were obtained at the last step.	HiSeq X Ten;ILLUMINA	64
EGAD00001004046	Analysis of the reference epigenomes and regulatory landscape of CLL as a whole and its major clinico-biological subtypes (with mutated and unmutated IGHV) in the light of the normal B-cell differentiation. We have extensively characterized the reference epigenomes of seven primary chronic lymphocytic leukemia samples (CLLs) with mutated (n=5) and unmutated IGHV (n=2) as well as several mature B-cell subpopulations (naive B cells from blood and tonsil, germinal center B cells, memory B cells and plasma cells from tonsil) using genome-wide maps of six histone marks (H3K4me3, H3K4me1, H3K27ac, H3K36me3, H3K9me3 and H3K27me3), DNA accessibility (ATAC-seq), DNA methylation (whole-genome bisulfite sequencing) and gene expression (RNA-seq). Furthermore, we have mapped the regulatory chromatin landscape of 100 additional CLL cases using chIP-seq of H3K27ac and ATAC-seq and linked these data to additional layers of information (whole-genome and/or whole-exome sequencing (WGS/WES), RNA-seq and DNA methylation microarrays) studied in the context of the International Cancer Genome Consortium (ICGC).	Illumina HiSeq 2000 NextSeq 500	303
EGAD00001004047	Peripheral blood mononuclear cells (PBMC) of CLL patients were isolated by density-gradient centrifugation over Linfosep (Biomedics, Madrid, Spain). B cells were purified with a CD19+ magnetic-bead system (MidiMACS, Miltenyi Biotec, Bergisch Gladbach, Germany) according to the manufacturers’ instructions. Mean B-cell purity was >99% and the mean percentage of CD5+/CD19+ cells after purification was >98%, as measured by flow cytometry. Total RNA was extracted from purified cells in a single step using TriPure Isolation Reagent (Roche Applied Science, Vilvoorde, Belgium). Whole transcriptome sequencing libraries were prepared using the TruSeq Stranded Total RNA Sample Preparation Kit (Illumina). Libraries underwent 2 × 76 bp paired-end sequencing on a HiSeq 2500 instrument (Illumina). The median number of paired-end reads was 60.5 million (range, 49.7-79.7 million).	Illumina HiSeq 2500	32
EGAD00001004048	This dataset contains raw sequences (BAM files) of P1 trio: mother, father and affected child (P1). Whole exome sequencing (WES).	Illumina HiSeq 2500	3
EGAD00001004051	fastq of 345 Japanese gastric cancer	Illumina HiSeq 2000	345
EGAD00001004052	Ultra low coverage sequencing results from the project 'Rapid multiplex small DNA sequencing on the MinION nanopore sequencing platform'. Sequencing data of sample NA12877 and NA12878 generated from 3 nanopore sequencing runs are included in this dataset.	MinION	6
EGAD00001004055	The dataset "RNA-seq colorectal adenomas NKI-AvL TGO series NGS-ProToCol" includes 2 x 30 fastq files from paired-end total RNA sequencing on Illumina HiSeq2500 for 30 snap-frozen colorectal adenomas.	Illumina HiSeq 2500	30
EGAD00001004056	The dataset "RNA-seq colorectal carcinomas NKI-AvL TGO series NGS-ProToCol" includes 2 x 30 fastq files from paired-end total RNA sequencing on Illumina HiSeq2500 for 30 snap-frozen colorectal carcinomas.	Illumina HiSeq 2500	30
EGAD00001004057	The dataset "RNA-seq normal adjacent colon NKI-AvL TGO series NGS-ProToCol" includes 2 x 18 fastq files from paired-end total RNA sequencing on Illumina HiSeq2500 for 18 snap-frozen normal adjacent colon tissues.	Illumina HiSeq 2500	18
EGAD00001004058	The dataset "RNA-seq colorectal adenomas NKI-AvL TGO series Gut2009" includes 2 x 32 fastq files from paired-end mRNA sequencing on Illumina HiSeq2500 for 32 snap-frozen colorectal adenomas.	Illumina HiSeq 2500	32
EGAD00001004059	The dataset "RNA-seq colorectal carcinomas NKI-AvL TGO series Gut2009 " includes 2 x 29 fastq files from paired-end mRNA sequencing on Illumina HiSeq2500 for 29 snap-frozen colorectal carcinomas.	Illumina HiSeq 2500	29
EGAD00001004061	200PT : WG Aligned Sequence (bam)/ Aligned WG sequence data in this dataset are from CPCGene Tumour/Normal Pairs used in the 200PT Study		1
EGAD00001004062	This dataset includes whole genome sequencing of 198 epileptic individuals. Library preparation and whole-genome sequencing: gDNA was cleaned up using ZR-96 DNA Clean & Concentrator™-5 Kit (Zymo) prior to being quantified using the Quant-iT™ PicoGreen dsDNA Assay Kit (Life Technologies) and its integrity assessed on agarose gels. Libraries were generated using the TruSeq DNA PCR-Free Library Preparation Kit (Illumina) according to the manufacturer’s recommendations. Libraries were quantified using the Quant-iT™ PicoGreen dsDNA Assay Kit (Life Technologies) and the Kapa Illumina GA with Revised Primers-SYBR Fast Universal kit (Kapa Biosystems). Average fragment size was determined using a LabChip GX (PerkinElmer) instrument. The libraries were denatured in 0.05N NaOH and diluted to 8pM using HT1 buffer. The clustering was done on an Illumina cBot and the flowcell was run on a HiSeq 2500 for 2x125 cycles (paired-end mode) using v4 chemistry and following the manufacturer's instructions. A phiX library was used as a control and mixed with libraries at 0.01 level. Bioinformatics: The Illumina control software was HCS 2.2.58, the real-time analysis program was RTA v. 1.18.64. Program bcl2fastq v1.8.4 was used to demultiplex samples and generate fastq reads. The filtered reads were aligned to reference Homo_sapiens assembly b37. Each readset was aligned to create a Binary Alignment Map file (.bam).	Illumina HiSeq 2500	198
EGAD00001004063	EZH2, H3K4me3, H3K27ac and H3K27me3 ChIP-seq data consisting of fastq single-end reads from peripheral blood CLL cells	Illumina HiSeq 2000 NextSeq 500	34
EGAD00001004064	This dataset contains high-throughput RNA-sequencing of 12 samples, each sample comprising neural precursor cells derived from human induced pluripotent stem cells from individuals with and without the 16p13.11 microduplication (a copy number variant associated with a range of neurodevelopmental disorders). 4 samples derive from patients carrying the 16p13.11 microduplication, and 8 derive from unaffected family controls. RNA samples were processed to deplete rRNA using the TruSeq Stranded Total RNA with Ribo-Gold kit. Libraries were then sequenced using the NextSeq 500/550 High-Output v2 Kit on the Illumina NextSeq 550 platform, to produce 75 base pair paired-end sequencing reads at an average depth of around 100 million reads per sample. Raw sequencing reads data are stored in two FASTQ files per sample for these paired-end reads.	NextSeq 550	12
EGAD00001004066	We generated 42 human whole-exome sequencing data sets from fresh-frozen (FF) and FFPE samples. These samples include normal and tumor tissues from two different organs (liver and colon), that we extracted with three different FFPE extraction kits (QIAamp DNA FFPE Tissue kit and GeneRead DNA FFPE kit from Qiagen, Maxwell™ RSC DNA FFPE Kit from Promega). Variant calling analysis shows a very high rate of concordance between matched FF / FFPE pairs and equivalent performance for the three kits we analyzed. We find a significant variation in the difference of total number of variants called between FF and FFPE samples for the three different FFPE DNA extraction kits. Coverage analysis shows that FFPE samples have less good indicators than FF samples, yet the coverage quality remains above accepted thresholds. We detect limited but significant variations in coverage indicator values between the three FFPE extraction kits. Globally, the GeneRead and QIAamp kits have better variant calling and coverage indicators than the Maxwell kit on the samples used in this study, although this kit performs better on some indicators and has advantages in terms of practical usage. Taken together, our results confirm the potential of FFPE samples analysis for clinical genomic studies, but also indicate that the choice of a FFPE DNA extraction kit should be done with careful testing and analysis beforehand in order to maximize the accuracy of the results.	Illumina HiSeq 2000	42
EGAD00001004067	Custom panel sequencing data from 1714 clear cell renal cell carcinoma samples		1714
EGAD00001004068	Whole-genome, whole-exome and transcriptome sequencing of pancreatic ductal adenocarcinomas from young adults reveals recurrent NRG1-fusions in KRAS wild-type tumors.	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000	36
EGAD00001004069	Identification of tumor-specific effects on gene expression profile of regulatory T cells and conventional T cells in humans, investigation of the clonal origin of regulatory T cells and impact analysis of tumor-specific conversion of conventional T cells into induced regulatory T cells on the peripheral regulatory T cell repertoire in humans.	Illumina HiSeq 2500	2304
EGAD00001004070	RNA sequencing of non-brainstem paediatric high grade glioma from the HERBY phase II randomised trial. RNA from fresh frozen surgical tissue in 20 cases was subjected to Illumina whole transcriptome paired end sequencing. Data is provided as paired-end FASTQ files	Illumina HiSeq 2000	20
EGAD00001004071	We integrated genomic (whole-genome sequencing, WGS) and transcriptome (polyA-enriched RNA-Seq) sequencing from 90 NSCLC cases and comprehensively identified the distinct genomic features of Chinese NSCLC patients.		90
EGAD00001004072	200PT : SNV vcf files. SNV calls generated using SomaticSniper and PhyloWGS, from the CPCGene 200PT Subclonality study		1
EGAD00001004073	200PT : CNA vcf files. Copy Number Aberration calls generated using TITAN and PhyloWGS, from the CPCGene 200PT Subclonality study		1
EGAD00001004074	Genome-wide profiling of DNA methylation levels by RRBS in 150 glioblastoma tumor samples. Patients were selected to represent the general population of glioblastoma patients based on Austrian Brain Tumor Registry. These DNA methylation profiles were created for the validation of the glioblastoma progression study (GBMatch) and consist of 106 profiles from FFPE samples and 44 profiles from fresh-frozen samples. For the 44 fresh-frozen samples also WGS data (43 genomes) and RNA-seq data (37 transcriptomes) have been produced for validation purposes.	Illumina HiSeq 3000	150
EGAD00001004075	For this tissue dataset, we applied low-pass whole genome sequencing to 98 non-advanced and advanced adenomas. As a small number of lesions were sequenced multiple times, this dataset consists of 103 fastq files. These adenomas were classified as lesions with low-risk or high-risk of progression, according to the presence of specific DNA copy number changes (Carvalho et al, CancerPrevRes, 2018).	Illumina HiSeq 2000	103
EGAD00001004076	37 transcriptomes derived from fresh-frozen glioblastoma tumor samples. These transcriptomes have been produced for validation purposes and match the corresponding RRBS and WGS profiles in that DNA and RNA was extracted from the same tumor samples.	Illumina HiSeq 3000	37
EGAD00001004077	43 low-coverage genomes derived from fresh-frozen glioblastoma tumor samples. These genomes have been produced for validation purposes and match the corresponding RRBS and RNA-seq profiles in that DNA and RNA was extracted from the same tumor samples.	Illumina HiSeq 3000	43
EGAD00001004078	For this tissue dataset, we applied low-pass whole genome sequencing to 96 advanced adenomas. Advanced adenomas were classified as lesions with low-risk or high-risk of progression, according to the presence of specific DNA copy number changes (Carvalho et al, CancerPrevRes, 2018).	Illumina HiSeq 2000	96
EGAD00001004079	RNA-seq data from sorted populations from 10 CML samples and 4 normal bone marrow samples.	NextSeq 500	28
EGAD00001004080	This dataset contains four data files relating to the Cambridge Interval SomaLogic pQTL study to go with the corresponding genetic data: (1) Genome-wide pQTL summary associations for each analyte. (2) Mapping table for the genetic variants analysed - containing rsID, position and allele information. (3) Normalised quantitative readouts for each analyte, along with covariates used for the pQTL analysis. (4) Table of SOMAmer analytes mapped to their protein targets. Please see the readme file in the dataset for more information.		3301
EGAD00001004081	Smart-seq2 protocol was used to perform single cell RNA-sequencing on 465 immune cells. The immune cells analysed include 215 HLA-DQ2: gluten-(DQ2.5-glia-α1, -α2, -ω1, and -ω2) tetramer-sorted T cells, 247 transglutaminase 2 (TG2)-positive plasma cells from intestinal biopsy or peripheral blood from celiac disease patients, and 3 unassigned cells in 3 batches.	Illumina HiSeq 4000	1
EGAD00001004082	In this study, we applied an Illumina HiSeq platform-based high-coverage WES technique, which, in addition to the exons, allows the determination of 5′- and 3′-UTRs, promoters to a certain length, along with off-target sequences, such as introns, intergenic regions and infecting viruses. Brains from suicide victims (n = 23; 15 males and eight females) who had suffered from major depressive disorder and from control participants (n = 21; 14 males and seven females) who had died from other causes were used for whole-exome sequencing. Alignment files in bam format were uploaded.	Illumina HiSeq 2000	44
EGAD00001004084	ChIP-Seq - CEBPE - REH. The ETV6/RUNX1 translocated acute lymphoblastic leukaemia cell line REH was used to perform ChIP-Seq using a CEBPE antibody. Cells were fixed in 1% formaldehyde for 10mins, prior to preparation of chromatin using Active Motif Express ChIP-IT. 2ug of antibody (anti CEBPE Atlas Antibodies HPA002928) was added to 25ug of chromatin O/N at 4C with rotation. Duplicate reactions were pooled and purified. 10ng of ChIP’d and input DNA was used for Illumina NGS preparation (NEBNext ChIP-Seq Library kit; New England Biolabs). CEBPE and Input DNA ChIP samples were sequenced on a MiSeq using 150bp Kit v3 paired end and a HiSeq 2500 using 2x101 version 4 paired end (Illumina) respectively. Reactions performed in duplicate. shCEBPE RNA-Seq - REH. REH cells were lentivirally transduced with a pTRIPZ shRNA vector for transcriptional profiling of CEBPE. Two controls (empty and non-targeting) and two CEBPE shRNAs (V3THS_150517(A13), V3THS_404312(G3) Dharmacon, GE) were transduced into REH cells. Cells were treated with 1ug/ml doxycycline for 144hrs and total RNA purified using Qiagen RNeasy. Knock down of CEBPE was validated by qRT. RNA integrity >7.7 for all samples. Libraries were prepared using NEBNext Ultra II Directional RNA Library Prep Kit and sequenced on an Illumina HiSeq 2500 using 2x101 version 4 paired end chemistry. 3 biological replicates of each sample were prepared.	Illumina HiSeq 2500 Illumina MiSeq	1
EGAD00001004085	In this study, we have examined microbial infection in brain tissue from 9 control samples from healthy patients and 10 samples from patients diagnosed with multiple sclerosis, by next-generation sequencing (NGS) using the MiSeq sequencing platform (Illumina).	Illumina MiSeq	19
EGAD00001004086	We will take a bone marrow aspirate and peripheral blood samples from a healthy patient aged around 60, and use flow cytometry to isolate 100 HSCs, 50 MEPs, and 50 GMPs. We will grow these up into colonies, then whole genome sequence each colony. Somatic mutations will act as a unique barcode for each clone. We will then design a panel for targeted resequencing of the mutations that we find. It will then be possible to look for these mutations in the peripheral blood over several years, to see the dynamics of how HSCs contribute to the peripheral blood in health. This dataset contains all the data available for this study on 2018-04-19.	HiSeq X Ten Illumina HiSeq 2500	207
EGAD00001004087	We took a bone marrow aspirate and peripheral blood samples from a healthy patient aged around 60, and used flow cytometry to isolate 100 HSCs, 50 MEPs, and 50 GMPs. We grew these up into colonies, then whole genome sequenced each colony. Somatic mutations act as a unique barcode for each clone. We have designed a panel for targeted resequencing of the mutations that we found. We are now looking for these mutations in the peripheral blood, to see the dynamics of how HSCs contribute to the peripheral blood in health. This dataset contains all the data available for this study on 2018-04-19.	Illumina HiSeq 2500	48
EGAD00001004088	Multiple primary tumors (MPT) affect a substantial proportion of cancer survivors and may result from various causes including inherited predisposition. Currently, germline genetic testing of MPT cases for cancer predisposition gene (CPG) variants is mostly targeted by tumor type. We ascertained pre-assessed MPT cases from genetics centers (defined as ≥2 primaries by age 60 years or ≥3 by 70) and performed whole genome sequencing (WGS) on 460 individuals from 440 families. Despite previous negative genetic assessment/molecular investigations, pathogenic variants in moderate and high-risk CPGs were detected in 67/440 (15.2%) of probands. WGS detected variants that would not be (or were not) detected by targeted resequencing strategies including structural variants at low frequency (6/440 (1.4%) of probands). In most individuals with a germline variant assessed as pathogenic or likely pathogenic (P/LP), at least one of their tumor types was characteristic of variants in the relevant CPG. However, in 29 probands (42.2% of those with a P/LP variant) the tumor phenotype appeared discordant. The frequency of individuals with truncating or splice site CPG variants and at least one discordant tumor type was significantly higher than a control population (χ2=43.642 P=<0.0001). 2/67 (3%) of probands with P/LP variants had evidence of multiple inherited neoplasia allele syndrome (MINAS) with deleterious variants in two CPGs. Summing together variant detection rates from a similarly ascertained previous MPT case series, the present results suggest that first-line comprehensive CPG analysis in a clinical genetics referral-based MPT cohort would detect a deleterious variant in about a third of cases.	Illumina HiSeq 2000	81
EGAD00001004090	This dataset contains the aligned whole genome sequencing data of cell line 380. This cell was established from the peripheral blood of a 15-year-old boy with acute lymphoblastic leukemia at relapse, showing an immature phenotype and carrying an IGH-MYC (t(8;14)) as well as an IGH-BCL2 (t(14;18)) chromosomal translocation. The sequencing was performed on an Illumina X-ten sequencer.	HiSeq X Ten	1
EGAD00001004091	Cancer gene panel (T200.1) sequencing data from tumor/normal pairs for adult type ovarian granulosa cell tumor sequencing project. This data set contains 55 tumor panel sequencing data and 44 matched normal panel sequencing data generated using the MD Anderson Cancer Center T200.1 cancer gene hybrid capture platform, with sequencing performed on an Illumina HiSeq 2000. This dataset contains BAM files generated by aligning paired-end reads to the hg19 reference genome.	Illumina HiSeq 2000	99
EGAD00001004092	The dataset "Low-coverage Whole Genome Sequencing, colorectal adenomas NKI-AvL TGO series NGS-ProToCol" includes 30 fastq files from single-end low-coverage WGS on Illumina HiSeq2500 for 30 snap-frozen colorectal adenomas.	Illumina HiSeq 2500	30
EGAD00001004093	The dataset "Low-coverage Whole Genome Sequencing, colorectal carcinomas NKI-AvL TGO series NGS-ProToCol" includes 30 fastq files from single-end low-coverage WGS on Illumina HiSeq2500 for 30 snap-frozen colorectal carcinomas.	Illumina HiSeq 2500	30
EGAD00001004094	The dataset "Low-coverage Whole Genome Sequencing, normal adjacent colon NKI-AvL TGO series NGS-ProToCol" includes 18 fastq files from single-end low-coverage WGS on Illumina HiSeq2500 for 18 snap-frozen normal adjacent colon tissues.	Illumina HiSeq 2500	18
EGAD00001004095	Whole exome sequencing (fastq files) of 41 pairs (82 samples) of myxofibrosarcoma	Illumina HiSeq 2000	82
EGAD00001004096	We sequenced the coding exons of core genes involved in telomere maintenance using peripheral blood DNA of 192 CRC patients. The primary sequencing data were generated by using Ion Torrent Personal Genome Machine® (PGM™) platform.	Ion Torrent PGM	192
EGAD00001004098	siRNA knockdown of 43 Allelic Imbalance target TFs followed by mRNA-seq done in triplicate in three different colorectal adenocarcinoma cell lines (GP5D, LoVo, COLO320DM).	Illumina HiSeq 2000 Illumina HiSeq 4000	426
EGAD00001004099	Chip-exo and Chip-nexus for FOXA1, HNF4A, KLF5, MYC, and TCF7L2 in colorectal cancer cell lines LoVo, GP5D, COLO320DM	Illumina HiSeq 4000	23
EGAD00001004100	Whole genome sequencing of commercial LoVo, GP5D, COLO320DM, CaCo-2 and RPE1 cell lines and three RPE1-TP53 knock-out cell lines separated by 6 months of culture from their most recent common ancestor.	HiSeq X Ten Illumina HiSeq 2500	8
EGAD00001004101	Target sequencing (fastq files) of 99 pairs (198 samples) of myxofibrosarcoma	Illumina HiSeq 2000 Illumina MiSeq	198
EGAD00001004102	RNA sequencing (fastq files) of 29 samples of myxofibrosarcoma	Illumina HiSeq 2000	29
EGAD00001004104	Clonally expanded human pluripotent and adult (liver + intestine) stem cell clones were subjected to whole genome sequencing to determine the mutational impact of in vitro culture	HiSeq X Ten NextSeq 500	11
EGAD00001004105	Clonally expanded liver adult stem cell clones of healthy liver and cirrhotic liver (due to alcohol abuse, NASH and PSC), as well as biopsies of liver cancers were subjected to whole genome sequencing to determine the mutational impact of precancerous liver disease	HiSeq X Ten NextSeq 500	44
EGAD00001004106	The gut microbiota composition is unique to every individual but is shaped by common factors including diet, lifestyle, medication use, early-life determinants, living environment or genetics. Most of these factors may be influenced by ethnicity. This study explored variations in fecal microbiota composition in 6048 individuals with different ethnic backgrounds living in the same geographical area (Amsterdam, the Netherlands). The HELIUS data are owned by the Amsterdam University Medical Centers, location AMC in Amsterdam, The Netherlands. To allow sharing of microbiome data collected in HELIUS with (inter)national researchers, 16S rRNA sequence analysis has been stored at the European genome-phenome archive (EGA; accession code EGAD00001004106). This requires that access needs to be granted, also because the HELIUS data are stored with relevant phenotypical variables. Access is granted to all researchers affiliated with an internationally recognized research institution who request to use the HELIUS data within the EGA context, after having signed the data transfer agreement. Any researcher can request the data by submitting a proposal to the HELIUS Executive Board as outlined at http://www.heliusstudy.nl/en/researchers/collaboration, by email: heliuscoordinator at amsterdamumc dot nl. The HELIUS Executive Board will check that proposals do not conflict with ethical approvals and informed consent forms of the HELIUS study.	Illumina MiSeq	6056
EGAD00001004108	The whole blood of six female volunteers and sperm from one male volunteer were used to extract genomic DNA using a DNeasy Blood & Tissue Kit (QIAGEN). 500 ng gDNA was fragmented into 300 bp by Covaris. Then, the libraries were constructed using a KAPA Hyper Prep Kit (Kapa Biosystems). In total we have 7 samples, and the uploaded files are paired-end fastq files.	Illumina HiSeq 4000	7
EGAD00001004109	Dataset includes RNA-seq data (two FASTQ files per sample, as paired-end sequencing was performed) from ribosomal-depleted total RNA in 28 Follicular Lymphoma (FL) cryopreserved samples to analyze long non-coding RNA and coding transcript expression profiles. Sample metadata refers to histological groups of FL tumors (FL1-3A versus FL3B/DLBCL), in either purified tumor cell samples (N=12) or unpurified tumor samples including normal cells of the lymph node microenvironment (N=16).	Illumina HiSeq 2000	28
EGAD00001004111	Reverse-stranded paired-end 75 base-pair RNA sequencing libraries of 93 metastatic FFPE samples were constructed using Illumina Total RNA Stranded Kits. Ribosomal RNAs (rRNAs) were depleted by using the Ribo-Zero rRNA Removal Kit (Illumina). Libraries were sequenced on a HiSeq 2500 machine. Five samples were re-sequenced using paired-end 50 base-pair libraries due to the smaller insert sizes.	Illumina HiSeq 2500	93
EGAD00001004112	This data set consists of genomic information for 10 Chordoid Glioma samples: - Exome sequencing: 10 tumors and matched normal DNA for four of them (BAM files) - RNAseq : 10 tumors (fastq files) - CNV array: 9 tumors (IDAT files)	NextSeq 500	10
EGAD00001004113	DNA (n=1281) and RNA (n=767) were extracted from bone marrow aspirates where CD138+ selection had been performed to enrich plasma cells from patients with monoclonal gammopathy of undetermined significance (MGUS), smoldering multiple myeloma (SMM), or multiple myeloma (MM). DNA and/or RNA were sent to Foundation Medicine where targeted sequencing was performed using their Foundation 1 Heme panel. Resulting BAM files were returned along with annotations for somatic events including single nucleotide mutations, indels and structural rearrangements.	Illumina HiSeq 4000	1281
EGAD00001004114	The failure to develop effective therapies for paediatric glioblastoma (pGBM) and diffuse intrinsic pontine glioma (DIPG) is in part due to their intrinsic heterogeneity. Analysis of 142 sequenced cases revealed multiple tumour subclones, spatially and temporally co-existing in a stable manner as observed by multiple sampling strategies. This dataset provides multi region sequencing of high grade gliomas and diffuse intrinsic pontine gliomas from 15 patients. DNA was extracted from FFPE sections in 2-13 regions of each tumour and sequenced with Agilent SureSelect whole exome sequencing. Germline DNA was also sequenced in 14 cases. Data was aligned to hg19 with bwa and is provided as 79 separate BAM files.	Illumina HiSeq 2000	79
EGAD00001004115	Whole genome sequencing reads consisting of paired end Fastq and aligned bam files from pediatric medulloblastoma samples.	HiSeq X Ten	22
EGAD00001004116	RNA sequencing of paediatric high grade gliomas and diffuse intrinsic pontine gliomas. RNA was sequenced from fresh frozen surgical material or from primary cells cultured under stem cell conditions. RNA was subjected to Illumina whole transcriptome paired end sequencing. Data is provided as paired-end FASTQ files	Illumina HiSeq 2000	16
EGAD00001004117	Tumor DNA was extracted from 100 bone marrow aspirate samples where CD138+ selection had been performed to enrich plasma cells from patients with multiple myeloma. Patient matched control DNA from either peripheral blood leukocytes or CD34+ stem cell harvests was also isolated. Both tumor and control DNA underwent library preparation using the Hyperplus kit (KAPA Biosystems) and were hybridized to baits for a targeted SeqCap myeloma panel (Nimblegen) encompassing 129 genes, regions for SNPs for copy number determination, and the IGH, IGK, IGL loci, as well as approximately 5 Mb surrounding the MYC locus. Samples were sequenced on a HiSeq2500 using 100 bp paired end reads. Resulting BAM files were returned along with annotations for somatic events including single nucleotide mutations, indels and structural rearrangements.	Illumina HiSeq 2500	200
EGAD00001004118	We have studied a unique case of astroblastoma arising in a 6 year-old girl, with multiple recurrences over a period of 10 years, with the pathognomonic MN1:BEND2 fusion. 11 surgical samples from either fresh frozen or paraffin embedded material and 1 blood sample were subjected to Illumina short read whole exome sequencing using Agilent SureSelect whole exome v4. Data is provided as 15 BAM files aligned to hg19 with bwa.	Illumina HiSeq 2000	15
EGAD00001004119	Chromatin immunoprecipitation (ChIP) was carried out employing antibodies against H3K36me3 and RNA polymerase II using the HistonePath and TranscriptionPath assays by ActiveMotif. Whole genome sequencing was carried out using an Illumina HiSeq2000 and data is provided as 6 BAM files: H3K36me3 ChIP-seq, RNA polymerase II ChIP-seq and input coverage for each cell line.	Illumina HiSeq 2000	6
EGAD00001004121	A total of 14 samples that have been analyzed with the Spatial Transcriptomics method. H&E stains can be sent on request.	NextSeq 500	14
EGAD00001004122	Set of multi-region sequenced breast cancer primary samples, lymph nodes and ctDNA. We collected samples from 11 breast cancer patients with lymph node involvement but no sign of distant metastasis. We performed a mix of whole-exome sequencing, targeted capture sequencing, and whole-genome sequencing of primary tumour samples and lymph nodes, as well as targeted capture sequencing of circulating tumour DNA.	Illumina HiSeq 2500	183
EGAD00001004123	Whole genome sequencing of 5 paediatric glioma cell lines - KNS42, SF188, UW479, RES186 and RES259. Illumina paired end sequencing is provided as 5 BAM files aligned to hg19 with bwa.	Illumina HiSeq 2000	5
EGAD00001004124	CRISPR-Cas9 genome editing is widely used to study gene function, from basic biology to biomedical research. Structural rearrangements are a ubiquitous feature of cancer cells and their impact on the functional consequences of CRISPR-Cas9 gene-editing has not yet been assessed. Utilizing CRISPR-Cas9 knockout screens for 250 cancer cell lines, we demonstrate that targeting structurally rearranged regions, in particular tandem or interspersed amplifications, is highly detrimental to cellular fitness in a gene independent manner. In contrast, amplifications caused by whole chromosomal duplications have little to no impact on fitness. This effect is cell line specific and dependent on the ploidy status. We devise a copy-number ratio metric that substantially improves the detection of gene-independent cell fitness effects in CRISPR-Cas9 screens. Furthermore, we develop a computational tool, called Crispy, to account for these effects on a single sample basis and provide corrected gene fitness effects. Our analysis demonstrates the importance of structural rearrangements in mediating the effect of CRISPR-Cas9-induced DNA damage, with implications for the use of CRISPR-Cas9 gene-editing in cancer cells.	Illumina HiSeq 2000	12
EGAD00001004125	Data collected as part of the Normal prostatectomy project analysis. Whole genome sequencing (WGS, targeted at 30X for normal tissue and 50X for tumour tissue) was performed on morphologically normal tissue samples from 30 patients with prostate cancer. In addition, seven prostate tissue samples were sequenced from 7 non-cancer patients: two collected after a cystoprostatectomy and five from samples collected at autopsy. Matched blood controls were included for all patients. An extra five samples were sequenced from the stroma of cell cultured fibroblasts. In addition a few tumour samples obtained at prostatectomy and their blood matched controls are included in this dataset from the main study that are not included elsewhere.	Illumina HiSeq 2000	71
EGAD00001004126	Sequence data in fastq format was aligned to the GRCH38 reference genome. Aligned sequence was preprocessed with GATK for Indel Realignment and Base Quality Score Recalibration. Duplicates were marked with Picard Mark Duplicates. Aligned sequence is in bam format. Details of the alignment can be found in the bam header. In total, data generated from 174 tumour samples and 102 matched blood normal controls was aligned. Tumour samples were classified as Anaplastic Thyroid, Poorly-differentiated or well-differentiated cancers.		-
EGAD00001004127	Sequence was aligned to the GRCH38 reference genome. Aligned sequence was analyzed with GATK Haplotype Caller, to generate germline variant calls across the SureSelect All Exon V5+UTR target region. Variant calls are in VCF format. In total there are samples from 173 donors. 101 donors have calls generated from both normal and tumour samples, 94 of which have a matched normal. Details for the call can be found in the vcf headers.		-
EGAD00001004128	Sequence was aligned to the GRCH38 reference genome. Aligned sequence was analyzed with SomaticSniper. Somatic variant calls are in VCF format. In total there are 94 tumour samples, each with a matched normal.		-
EGAD00001004129	Sequence was aligned to the GRCH38 reference genome. Aligned sequence was analyzed with GATK/MuTect, to generate somatic variant calls across the SureSelect All Exon V5+UTR target region. Somatic variant calls are in VCF format. In total there are 166 tumour samples, 94 of which have a matched normal. Somatic variants for tumours without a matched normal, were called against a panel of normals. Details for the mutect call can be found in the vcf header.		-
EGAD00001004130	Whole genome sequencing of cutaneous melanoma skin and brain metastases and matched normal DNA, as well as RNA sequencing of material from the skin and brain metastases. In addition, RNA sequencing was performed for Dabrafenib and Trametinib treated patient-derived xenografts, together with untreated and vehicle treated controls.	HiSeq X Five Illumina HiSeq 2500 Illumina HiSeq 4000	9
EGAD00001004131	Pheno-seq is a new approach that integrates high-throughput imaging and transcriptomic profiling of clonal spheroids/organoids to dissect functional tumor cell heterogeneity in 3D cell culture systems. The method is based on the iCELL8 technology (TakaraBio) that uses barcoded nanowells and a micro-solenoid valve dispenser. The CRC_spheroid dataset contains demultiplexed RNA-sequencing profiles (FASTQ file format, NextSeq 500) of 95 clonal tumor spheroids derived from a patient with colorectal cancer.	NextSeq 500	1
EGAD00001004132	This dataset has two Variants Files in VCF format used in ABB project (https://github.com/Francesc-Muyas/ABB). One has the variants found in a Rare Variant Association Study performed in CLL patients. This has 1217 samples represented. The other variant file has 209 SNPs predicted in 10 samples by GATK HaplotypeCaller and selected for Sanger Sequencing Validation. Raw reads were aligned against the Human Reference genome (Hg19) with BWA mem and variants were obtained using GATK HaplotypeCaller.		1217
EGAD00001004133	Epigenetic profiling of colorectal cancer initiating cells (CC-ICs) to identify bivalently marked genes (H3K4me3 and H3K27me3 ChIP-seq), and investigation of changes in transcriptome following EZH2 inhibition using RNA-seq.	Illumina HiSeq 2500 NextSeq 500	17
EGAD00001004134	The dataset includes sequencing data generated using the TruSight Cancer Panel (TSCP) a targeted NGS assay for analysis of CPGs and orthogonally generated data supporting at least one pathogenic variant in a CPG for a total of 645 pathogenic CPG variants. The set of pathogenic CPG variants includes strong representation of some of the most challenging types of pathogenic variants, with 339 indels, including 16 complex indels and 24 insertions or deletions with length greater than 5bp, and 74 exon CNVs, including 23 single exon CNVs. There are 502 pathogenic variants in BRCA1 or BRCA2, making this an important first-line validation dataset for laboratories performing NGS testing of BRCA1 and BRCA2.	Illumina HiSeq 2500	639
EGAD00001004135	Synovial sarcoma (SS) is defined by a recurrent t(x;18) chromosomal translocation, which produces the hallmark SS18-SSX oncogenic fusion. Incorporation of SS18-SSX into BAF complexes renders BAF complexes aberrant in two distinct manners: the addition of 78aa of SSX onto SS18, and concomitant loss of BAF47 assembly. However, the importance and functional contributions of each of these perturbations on BAF complex targeting and gene expression regulation remain unclear. Here we use an integrative set of genomic approaches in human cancer cell lines and primary tumor samples to define the mechanistic consequences of the SS18-SSX fusion oncoprotein. We find that SS18-SSX hijacks BAF complexes to broad polycomb domains to activate bivalent genes, driving a unique gene expression program distinct from other loss-of-function BAF complex malignancies. Importantly, restoration of BAF47 rescues enhancer activation but is dispensable for proliferative arrest in cell lines. These results demonstrate that gain-of-function SS18-SSX-mediated BAF complex targeting and gene activation is the driving event in SS, and present a mechanism by which distinct functions of BAF complexes can be co-opted to drive oncogenesis.	Illumina HiSeq 2000 NextSeq 500	85
EGAD00001004136	The overall goal of the Identification of recurrent mutations in Cushing’s disease project is to study the impact of whole-exome sequencing (WES) on the clinical care of cancer patients and oncology provider practices. The aims of the project are to implement and establish the feasibility of WES in patients with USP8 wild-type corticotroph adenomas, and to develop a framework for understanding the molecular mechanism of the pathogenesis of corticotroph adenoma.	Illumina HiSeq 2500	44
EGAD00001004137	WGS sequencing for 409 cases (832 samples) from the ICGC ESAD-UK project. Tumours 50x, normals 30x. HiSeq X BAM files. These samples are all available in ICGC release 28.	Illumina HiSeq 2000	1
EGAD00001004138	The dataset is comprised of seven samples, one blood sample (germline control) of the patient, one neuroblastoma metastasis from the bone marrow and five derived cell models. The models include the primary culture, the first xenograft passage, a monolayer culture derived from the first xenograft passage and two samples of the fourth xenograft passage, cells and supernatant. For all these samples, whole-exome sequencing data have been generated. The BAM files contain the alignments against the human genome, assembly GRCh37, and also the unaligned reads.	Illumina HiSeq 4000	7
EGAD00001004139	This dataset consists of 44 compressed paired fastq files, 15 of which are generated from whole exome sequencing, and 29 of which are generated from DNA sequencing using a targeted gene panel capturing the exonic regions of 73 prostate cancer driver genes. Targeted DNA sequencing was performed on an Illumina MiSeq (v3 600 cycle kit), and exome sequencing was done using an Illumina HiSeq 2500 (v4 250 cycle kit) machine. The fastq files are named in accordance with the sample aliases provided, which reflect the pathology of interest to this study (small cell prostatic carcinoma--SCPC), whether it was sequenced using an exome or targeted gene panel, whether the FFPE sample was sourced from tumor or benign tissue (labeled T or B, respectively), and whether there exists multiple samples belonging to a single patient.	Illumina HiSeq 2500 Illumina MiSeq	44
EGAD00001004140	Whole-genome sequencing (WGS) was performed for 13 pairs of tumor-normal samples from patients diagnosed with NKTL. Genomic DNA from tumor tissue was extracted with QIAamp DNA Mini Kit. The DNA for the matching normal was obtained from blood or buccal swabs and purified by Blood and Cell Culture DNA Mini kit or E.Z.N.A. Tissue DNA Kit (Omega Bio-tek) according to manufacturer’s instructions. The quantity and quality were assessed by Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen) and agarose gel electrophoresis. All sequencing libraries were prepared using TruSeq Nano DNA Library Prep Kit (Illumina). Paired-end sequencing was performed on Illumina HiSeq 2000 or HiSeq X Ten as 2x101 bp or 2x151 bp, respectively. 8 NKTL FFPE specimens were screened for somatic mutations using deep targeted capture sequencing (TCS). FFPE rolls or slides were extracted using QIAamp DNA FFPE Tissue kit (QIAGEN). The FFPE genomic DNA was treated with NEBNext FFPE DNA Repair Mix and assessed by Quant-it PicoGreen dsDNA Assay Kit (Invitrogen). The library was generated from 10-200 ng DNA with SureSelectXT Low Input Target Enrichment System for Illumina Paired-End Sequencing Library (Agilent Technologies) according to manufacturer’s instructions. RNA based probe was designed with SureDesign (Agilent Technologies) to target-capture 140 genes. Next, the captured libraries were pooled in equimolar concentration and sequenced on Illumina Novaseq 6000 platform with SP or S1 chip. Reads aligning to 40 selected genes were isolated post-alignment for this submission. Prefix used in filenames: T - Tumor samples N - Matched-Normal samples	HiSeq X Ten Illumina HiSeq 2000 Illumina NovaSeq 6000	34
EGAD00001004141	This study contains the WGS and RNA-seq aligned BAM files for this particular inflammatory hepatocellular adenoma sample.	Illumina HiSeq 2000 Illumina HiSeq 4000	2
EGAD00001004142	146 DNA samples obtained from 73 DLBCL patients (matching tumor and normal) were sequenced with PCR free 1.0 genome shotgun sequencing. All files are in bam format.		146
EGAD00001004143	Tumor exome reads consisting of bam files from jaw samples.	Illumina HiSeq 2500	18
EGAD00001004144	This dataset contains FASTQ files obtained through whole exome sequencing of glioma and matched blood samples.	Illumina HiSeq 2500	117
EGAD00001004145	The saliva microbiota of 972 Finnish children aged 9-14 years was characterized using 16S rRNA (V3-V4) gene sequencing on the Illumina HiSeq platform.	Illumina HiSeq 2500	972
EGAD00001004146	Total RNA-seq of intestinal gluten tetramer+ and tetramer- CD4+ T-cells from celiac disease patients, as well as intestinal CD4+ T-cells from healthy control individuals (paired-end fastq files).	NextSeq 500	14
EGAD00001004147	The dataset contains three BAM files that include SPATC1L variants identified in Italian patients affected by hearing loss (both hereditary and age-related hearing loss). Data have been produced by whole exome sequencing and targeted re-sequencing, using Ion Proton and Ion Torrent PGM platforms respectively.	Ion Torrent PGM Ion Torrent Proton	3
EGAD00001004148	Bulk RNA-seq data of the 5 HCC patients. Single-cell RNA-seq data of these patients are available under the accession number EGAD00001003337.	Illumina HiSeq 4000	5
EGAD00001004149	Bulk exome-seq data of the 5 HCC patients. Single-cell RNA-seq data of these patients are available under the accession number EGAD00001003337.	Illumina HiSeq 4000	10
EGAD00001004150	This data set contains whole exome sequences of individuals with self-stated parental relatedness from the East London Genes & Health cohort. Rare frequency functional variants in these healthy individuals will be studied with respect to the genetic health of the participants and loss-of-function analysis of human genes. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-06-06.	Illumina HiSeq 4000	-
EGAD00001004151	A case-control series of melanoma cases from Leeds, UK has been sequenced on the Fluidigm platform to identify genetic variants associated with sporadic melanoma development. Samples in which potentially contributing variants have been detected are being sequenced on an orthogonal platform for variant confirmation. This dataset contains all the data available for this study on 2018-06-06.	Illumina HiSeq 4000	201
EGAD00001004152	Targeted pulldown of approx. 60 FFPE normal samples to use as normal controls. This dataset contains all the data available for this study on 2018-06-06.	Illumina HiSeq 2500	80
EGAD00001004153	Gastric neuroendocrine tumors (gNETs) occur with an estimated frequency of 2 per 100,000 in the general population. Type I gastric neuroendocrine tumors (NETs) represent 75% of gNETs and arise from gastric enterochromaffin-like (ECL) cells. They have a late age of onset and a usually benign course. Classically, hypergastrinemia in patients who have autoimmune atrophic gastritis causes hyperplasia of gastric ECL cells that progresses into type I gastric NETs and parietal cell (PC) destruction. The genetic bases in families with this disease are unknown. We performed an exome sequencing study of an atypical aggressive familial gNETs case (with early age of onset, nodal infiltrations and gastric adenocarcinomas) that followed a recessive model. We identified a deleterious homozygous mutation in the ATP4A gene, which encodes the proton pump responsible for acid secretion by gastric parietal cells. This mutation led first to achlorhydria, and then to hypergastrinemia and gNET development as a consequence (Calvete et al. 2014). Recently, two more families with gNETs, classical clinical traits and a recessive model have been studied by WES, but we did not find any mutation in the ATP4A gene. However, putative mutations affecting genes that contribute to the development and the integrity of PC have been found, suggesting that genetic alterations associated with this disorder target a unique cell type (parietal cells). To confirm this hypothesis, new genes implicated in gNETs must be sought, and more familial cases need to be studied. We have identified four new familial gNETs cases. Here, we propose their study by WES. The first family is formed by three siblings with gNETs. The other families include two siblings with gNETs. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-06-06.	Illumina HiSeq 2500	7
EGAD00001004154	This data set is comprised of data from seven distinct high grade serous epithelial ovarian cancer (HGS-EOC) patients, from whom multiple biopsies were taken at the time of surgery, from the ovary and from different locations in the peritoneal cavity. This data set contains 28 samples, sequenced with a whole exome sequencing approach.	NextSeq 500	28
EGAD00001004155	Genotype calls for 83 Aboriginal Australian genomes split by chromosomes. In short, genotypes were called individually with samtools. They were subsequently filtered with thresholds related to sequencing depth, location of variants, sequencing error, and strand bias. Once combined, the genotypes were filtered when not in Hardy-Weinberg equilibrium. The genomes were phased with IMPUTE using the 1000 Genomes reference panel. NB: for the Y chromosomes, only the 44 Aboriginal Australian males are included.		83
EGAD00001004156	High-coverage whole genome sequences were collected to study patterns of genomic variation across the broad geography of Indonesia and New Guinea. This region has experienced an extremely complex demographic history, including repeated bouts of admixture with archaic and modern human groups. We have sequenced the genomes of 161 individuals from 14 populations spanning this geographical region, from communities close to mainland Asia through to New Guinea.	HiSeq X Ten	161
EGAD00001004157	Five subjects from a pedigree with co-occurrence of neurofibromatosis type 1 and moyamoya were sequenced in duplicate (0 and 1). Kinship and phenotype: NF025, NF026 and NF027 were siblings, all affected by neurofibromatosis type 1. NF026 also presented moyamoya. NF0262 and NF0263 were siblings, both affected by neurofibromatosis type 1. NF0262 also presented moyamoya. NF026 and NF0262 were first cousins.	Illumina HiSeq 1000	10
EGAD00001004158	The extent to which cells in normal tissues accumulate mutations during life is poorly understood. Some mutant cells expand into clones that can be detected by genome sequencing. We mapped mutant clones in normal esophageal epithelium from nine donors aged 20-75. Somatic mutations accumulate with age and are mainly caused by intrinsic mutational processes. We found strong Darwinian selection of clones carrying mutations in 14 cancer genes, with tens to hundreds of such clones per square centimeter. By middle age, clones with cancer-associated mutations cover most of the epithelium, with NOTCH1 and TP53 mutations affecting 40% and 10% of all cells, respectively. Remarkably, the prevalence of NOTCH1 mutations in normal esophagus is several times higher than in esophageal cancers. The esophagus emerges as an evolving patchwork of mutant clones that colonize the majority of the epithelium, with implications for our understanding of cancer and ageing.	Illumina HiSeq 2500	-
EGAD00001004159	The extent to which cells in normal tissues accumulate mutations during life is poorly understood. Some mutant cells expand into clones that can be detected by genome sequencing. We mapped mutant clones in normal esophageal epithelium from nine donors aged 20-75. Somatic mutations accumulate with age and are mainly caused by intrinsic mutational processes. We found strong Darwinian selection of clones carrying mutations in 14 cancer genes, with tens to hundreds of such clones per square centimeter. By middle age, clones with cancer-associated mutations cover most of the epithelium, with NOTCH1 and TP53 mutations affecting 40% and 10% of all cells, respectively. Remarkably, the prevalence of NOTCH1 mutations in normal esophagus is several times higher than in esophageal cancers. The esophagus emerges as an evolving patchwork of mutant clones that colonize the majority of the epithelium, with implications for our understanding of cancer and ageing.	HiSeq X Ten	-
EGAD00001004160	We compared bacterial communities in breast milk from teen (≤19 yr, n = 26) vs. adult (>19 yr, n = 56) mothers, normal weight (BMI 18.5-24.9, n = 63) vs. overweight (BMI ≥ 25, n = 19) mothers, primiparous (parity = 1, n = 41) vs. multiparous (parity > 1, n = 44), early (5-46d postpartum, n = 39) vs. established lactation (4-6 mo postpartum, n = 45), breastfeeding (EBF: PBF, n = 72) vs. mixed feeding (n = 11) and mothers with (Na/K ratio < 0.6, n = 75) and without SCM (Na/K ratio ≥ 0.6, n = 10).	Illumina MiSeq	86
EGAD00001004161	BAM files from 5 CCND1-negative MCL cases. 4 BAM files corresponded to long insert size Mate Pair-WGS and 3 to WES. In 2 of the cases both technologies were performed.	Illumina HiSeq 2000	7
EGAD00001004162	Undifferentiated sarcomas (USARC) of adults are diverse, rare and aggressive soft tissue cancers. Recent efforts have confirmed that USARC exhibit one of the highest burdens of structural aberrations across human cancer. Here, we sought to unravel the genomic basis of this structural complexity by integrating whole genome sequencing, ploidy analysis and methylation profiling of 53 USARC. We identified whole genome doubling as a prevalent and pernicious force in USARC tumourigenesis. Deconvolution of the complex copy number and rearrangement landscapes show distinct signatures associated with chromothripsis, early-haploidy, and successive whole-genome-doubling events, suggesting four divergent models of sarcoma development. We show similar distinct evolutionary tumourigenic pathways in different sarcoma subtypes from the Cancer Genome Atlas. Thirteen percent of tumours exhibited a hypermutator phenotype, opening new avenues for clinical management such as immunotherapy, whilst the period prior to and between genome doubling events may represent clinically relevant interventional points in USARC.	HiSeq X Ten	56
EGAD00001004163	Cancer genomes are frequently characterized by numerical and structural karyotypic abnormalities. Here we combined an inducible centromere-specific inactivation approach with selection for a conditionally essential gene, a strategy we term ‘CEN-SELECT’, and show that single-chromosome missegregation during cell division can directly drive a broad spectrum of structural rearrangement types. Cytogenetic profiling revealed that missegregated chromosomes are 120-fold more susceptible to developing seven major categories of structural variants, including translocations, insertions, deletions, and reassembly into chromothriptically rearranged chromosomes. Whole-genome sequencing of clones with genetically propagatable derivative chromosomes identified complex rearrangements and copy-number alterations that can result in gene inactivation or extrachromosomal gene amplification. We conclude that chromosome segregation errors are sufficient to drive extensive structural variation that recapitulates those commonly associated with human cancers.	HiSeq X Ten Illumina HiSeq 2000	22
EGAD00001004164	Whole exome and RNA-seq of matched normal gastric mucosa (n=34) and gastric cancer tissues (n=34) from gastric cancer patients (n=34)	Illumina HiSeq 4000	136
EGAD00001004168	The Illumina exome chip genotyping data for 943 PDAC cases and 3,908 controls in the Chinese population. Genotypes were called by the Illumina GenomeStudio software, and the selected variants were re-called by zCall. Standard quality control was performed.		4856
EGAD00001004169	PBMCs were purified from blood samples of 8 HTLV-1 infected individuals, and cryo-preserved in fetal calf serum containing 10% DMSO. DNA from each sample was extracted using Qiagen Blood & Tissue kit according to the manufacturer's protocol. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ .	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina MiSeq NextSeq 500	97
EGAD00001004171	The dataset contains one BAM file that includes a SLC9A3R1 variant identified in two Italian patients affected by age-related hearing loss. Data have been produced by targeted re-sequencing, using Ion Torrent PGM platform.	Ion Torrent PGM	1
EGAD00001004172	This dataset contains targeted amplicon sequencing of germline DNA extracted from 56 blood samples. They were sequenced on Illumina HiSeq 2500 and aligned to human genome assembly GRCh37 (hg19) to produce 127 bam files (2-3 technical replicates per sample).	Illumina HiSeq 2500 Illumina MiSeq	55
EGAD00001004173	This dataset contains targeted amplicon sequencing of DNA extracted from 300 samples of 142 patients (158 methanol-fixed relapse biopsies and 142 FFPE archival diagnostic tissues). Samples were sequenced on Illumina HiSeq 2500 and were aligned to human genome assembly GRCh37 (hg19) to produce 600 bam files (2 technical replicates per sample).	Illumina HiSeq 2500 Illumina MiSeq	300
EGAD00001004174	This dataset contains 319 bam files of shallow WGS data (0.1X) aligned to human genome assembly GRCh37 (hg19) from 300 tumor samples sequenced on HiSeq2500 in SE-50bp mode.	Illumina HiSeq 2500	300
EGAD00001004175	The dataset contains 438 plasma samples and 418 tissues samples from 102 breast cancer patients and 30 benign breast tumor patients. There are two kinds of file types: bam and fastq. Amplicon sequencing and Capture sequencing were used in our experiment.	Ion Torrent PGM NextSeq 500	124
EGAD00001004176	RNA-sequencing data of pediatric B-cell precursor acute lymphoblastic leukemia, including 18 high hyperdiploid cases and 9 ETV6/RUNX1-positive cases. Sequencing libraries were constructed using the Human Ribo-Zero rRNA Removal Kit (Illumina, San Diego, CA) and sequenced on an Illumina NextSeq 500. RNA sequencing data were processed using the TCGA mRNA-seq pipeline (https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/Expression_mRNA_Pipeline/#mrna-analysis-pipeline).	NextSeq 500	27
EGAD00001004179	This dataset contains WES and RNA-Seq fastq files for 65 CML patient samples at various stages of disease progression.	Illumina HiSeq 2000 Illumina HiSeq 2500 NextSeq 500	183
EGAD00001004180	The French ICGC project on liver tumors is coordinated by Pr Jessica Zucman-Rossi and funded by Inca (French Institute for Cancer). The aim of the present project is to identify the catalog of somatic and germline mutations in liver tumors using whole genome (WGS) and whole exome sequencing (WES), integrated with DNA methylation and RNA sequencing (RNA-seq) data. The present series corresponds to 60 whole exome tumor/normal pairs with matched RNA-seq.	Illumina HiSeq 2000 Illumina HiSeq 4000	120
EGAD00001004183	Tumor transcriptome and whole exome sequencing data (matched tumor/normal for somatic mutation calling) along with key phenotypic information are provided for patients enrolled in the phase 2 IMmotion150 trial, assessing efficacy of atezolizumab monotherapy or combination of atezolizumab and bevacizumab versus standard of care (sunitinib) in 1L renal cell carcinoma. This data set accompanies the respective Nature Medicine publication (PMID: 29867230).	Illumina HiSeq 2500	589
EGAD00001004184	Whole exome NGS data of 21 suicide victims and 23 control patients sequenced on Illumina HiSeq 2000 platform using the Agilent SureSelect Human All Exon + UTRs V5 target enrichment kit. The dataset contains the paired-end unfiltered FASTQ files, the GRCh37 (b37) aligned BAM files mapped by the BWA MEM algorithm, and the variant files in VCF 4.1 format called with the GATK HaploType caller (version 3.3).	Illumina HiSeq 2000	44
EGAD00001004185	These files contain the normalized and raw count miRNA abundances for aSAH patients. These abundances were obtained using Next-Generation Sequencing after selection of the miRNA in the RNA biobank. In total, there are 28 VSP- and 28 VSP+ patients for two time points. Normalized data were obtained by applying size factor and VSN normalizations as described in Pulcrano-Nicolas et al. Stroke 2018. Raw count data correspond to the raw abundances of miRNA in aSAH patients.		56
EGAD00001004186	Somatic mutations in epithelial cells from endometriosis and normal uterine endometrium, with a total of 24 samples. Target enrichment was conducted with the Agilent SureSelect Human All Exon V5 + lncRNA kit. Sequencing was conducted on the Illumina HiSeq 2500 platform. Somatic mutation calling was performed with Strelka.		24
EGAD00001004187	One hundred cryopreserved bone marrow and peripheral blood samples from patients with acute myeloid leukemia (AML) with 10-90% blasts were selected from the biobank of the Department of Hematology of Leiden University Medical Center (LUMC). The AML cases cover all subtypes, and specifically include known subtype-defining balanced chromosomal translocations according to the WHO classification. The samples were obtained from 96 patients and include three pairs of de novo and relapsed AML and one pair of de novo and presumed therapy-related AML (tAML). Total RNA was isolated from mononuclear cells without prior enrichment for leukemic blasts. The quality and integrity of total RNA was checked and RNA libraries were prepared using the TruSeq RNA library preparation kit v2 (Illumina, San Diego, CA) in an ISO/IEC 17025-accredited protocol. This workflow started with enrichment of messenger RNA by oligo dT magnetic beads. After fragmentation, cDNA synthesis was performed, followed by adaptor ligation and PCR amplification. Paired-end sequencing with a read length of 126 bp was performed on an Illumina HiSeq 2500 v4 sequencer to at least 12.5 Gbp per sample. Image analysis, base calling, and quality check was performed with Illumina data analysis pipeline RTA v1.18.64 and Bcl2fastq v1.8.4. RNAseq reads are provided in compressed Sanger FASTQ format.	Illumina HiSeq 2500	100
EGAD00001004188	28 Pretreated Ewing sarcoma tumor blood samples were collected from the Hospital for Sick Children (SickKids) and Mount Sinai Hospital in Toronto, Canada in accordance with each institution’s Research Ethical Board (REB) guidelines. Detailed clinical information (age at presentation, gender, tumor site, stage, etc.) were obtained from the corresponding institutional tumor banks. Transcriptome (RNA-Seq) sequencing was performed using established protocols on Illumina instruments.	Illumina HiSeq 2500	28
EGAD00001004189	This dataset includes 111 bam files from WGS sequence data aligned to human genome assembly GRCh37 (hg19) from 56 tumour and matched normal samples. Libraries were constructed with ~350-bp insert length using the TruSeq Nano DNA Library prep kit (Illumina) and sequenced on an Illumina HiSeq X Ten System in paired-end 150-bp reads mode. The average depth was 60× (range 40-101×) in tumours and 40× (range 24-73×) in matched blood samples.	HiSeq X Ten	111
EGAD00001004190	This dataset contains raw sequencing reads for matched MGUS/SMM to MM patient samples, including normal germline controls. FASTQ files were generated on Illumina NextSeq 500 and HiSeq 4000 machines following exome capture using the Agilent Clinical Research Exome kit. DNA was extracted from CD138+CD38++ cells (representing MGUS/SMM/MM cells) and CD138-CD38- (representing normal cells) isolated from bone marrow. 10 patients are included with 3 samples each representing normal, MGUS/SMM, MM stages.	Illumina HiSeq 4000 NextSeq 500	301
EGAD00001004192	The colorectal adenoma-carcinoma sequence has provided a paradigmatic framework for understanding the successive somatic genetic events and consequent clonal expansions leading to cancer. As for most cancer types, however, understanding of the earliest phases of colorectal neoplastic change, which may occur in morphologically normal tissue, is comparatively limited because of the difficulty of detecting somatic mutations in normal cells. Each colorectal crypt is a small clone of cells derived from a single recently-existing stem cell. Here, we sequenced hundreds of normal crypts from 42 individuals. Signatures of multiple mutational processes were revealed, some ubiquitous and continuous, others only found in some individuals, in some crypts or during some phases of the cell lineage from zygote to adult cell. Likely driver mutations were present in ~1% of normal colorectal crypts in middle-aged individuals, indicating that adenomas and carcinomas are rare outcomes of a pervasive process of neoplastic change across morphologically normal colorectal epithelium.	HiSeq X Ten	578
EGAD00001004193	The colorectal adenoma-carcinoma sequence has provided a paradigmatic framework for understanding the successive somatic genetic events and consequent clonal expansions leading to cancer. As for most cancer types, however, understanding of the earliest phases of colorectal neoplastic change, which may occur in morphologically normal tissue, is comparatively limited because of the difficulty of detecting somatic mutations in normal cells. Each colorectal crypt is a small clone of cells derived from a single recently-existing stem cell. Here, we sequenced hundreds of normal crypts from 42 individuals. Signatures of multiple mutational processes were revealed, some ubiquitous and continuous, others only found in some individuals, in some crypts or during some phases of the cell lineage from zygote to adult cell. Likely driver mutations were present in ~1% of normal colorectal crypts in middle-aged individuals, indicating that adenomas and carcinomas are rare outcomes of a pervasive process of neoplastic change across morphologically normal colorectal epithelium.	Illumina HiSeq 2500	1632
EGAD00001004194	Complete Microbiome Metagenomics from feces of 461 IBD patients; The sequencer used was the Illumina HiSeq 2000 with a paired end reads design, reflected in the 2 FastQ format files per sample.	Illumina HiSeq 2000	355
EGAD00001004195	The dataset includes paired end fastq files of whole genome sequencing data on the Illumina platform. Individual samples are multiple annealing and looping based amplified single fibroblasts and multiple displacement amplified single T lymphocytes, including unamplified bulk samples.	HiSeq X Ten Illumina HiSeq 2500	36
EGAD00001004197	We spiked a small number of placental tissue samples with different combinations of Candida albicans, Plasmodium falciparum, Toxoplasma gondii, human cytomegalovirus and Salmonella bongori (various combinations of the equivalents of 1, 10, 100, 1000 and 10000 genome copies). A DNA isolation was performed on these spiked samples and the resulting DNA was subsequently sequenced by MiSeq (18S). These same samples were also analysed by X Ten to allow for a sensitivity comparison of the two methods of the eukaryotic spiked signals (Candida albicans, Plasmodium falciparum and Toxoplasma gondii). In addition, non-spiked placental samples from 50 cases of Fetal Growth Restriction (FGR) (+ matched healthy controls) and 49 cases of Preeclampsia (+ matched healthy controls) and 100 preterm cases were analyzed for their non-human eukaryotic content.	HiSeq X Ten	7
EGAD00001004198	Metagenomics data of 80 placental tissue samples analyzed by X Ten for their possible microbial content. These 80 samples from pre-labor C-section deliveries, representing Cohort 1, were spiked with 1100 CFU Salmonella bongori. These same samples were also analyzed by 16S amplicon sequencing (search for ERP109246 in ENA).	HiSeq X Ten	79
EGAD00001004199	This dataset was made to verify the computational reconstruction of B cell receptors from single-cell RNA-seq using BraCeR. The dataset contains BCR-derived reads from single-cell RNA-seq from 13 cells using the Smart-seq2 protocol, as well as targeted BCR-sequencing data from the same cells.		26
EGAD00001004200	RNA samples of bone marrow and cord blood were sequenced on the 10x Genomics platform.	HiSeq X Ten	4
EGAD00001004201	Multiple signatures of somatic mutations have been identified in human cancer genomes. To investigate whether mutational signatures continue to be generated, and if so their temporal patterns of activity, subsets of cell lines were cultured in vitro for extended periods and subjected to single cell cloning and whole genome or exome sequencing or directly to single cell whole genome sequencing. As expected, signatures of past exogenous exposures, such as tobacco smoke and ultraviolet light, were not generated in vitro. In contrast, signatures of normal and defective DNA repair and replication continued to be generated at essentially constant mutation rates. Signatures of APOBEC cytidine deaminase DNA-editing activity exhibited a distinctive pattern with substantial fluctuations in mutation rate over time and episodic bursts of mutations. The initiating factors for these bursts are unclear although retrotransposon mobilisation may play a role. This cell line set now constitutes a comprehensive resource of live experimental models of mutational processes of both known and unknown aetiologies potentially retaining the patterns of activity and regulatory influences operative in human cells in vivo.	Illumina HiSeq 2000 Illumina HiSeq 2500	75
EGAD00001004202	Multiple signatures of somatic mutations have been identified in human cancer genomes. To investigate whether mutational signatures continue to be generated, and if so their temporal patterns of activity, subsets of cell lines were cultured in vitro for extended periods and subjected to single cell cloning and whole genome or exome sequencing or directly to single cell whole genome sequencing. As expected, signatures of past exogenous exposures, such as tobacco smoke and ultraviolet light, were not generated in vitro. In contrast, signatures of normal and defective DNA repair and replication continued to be generated at essentially constant mutation rates. Signatures of APOBEC cytidine deaminase DNA-editing activity exhibited a distinctive pattern with substantial fluctuations in mutation rate over time and episodic bursts of mutations. The initiating factors for these bursts are unclear although retrotransposon mobilisation may play a role. This cell line set now constitutes a comprehensive resource of live experimental models of mutational processes of both known and unknown aetiologies potentially retaining the patterns of activity and regulatory influences operative in human cells in vivo.	Illumina HiSeq 2500	26
EGAD00001004203	Multiple signatures of somatic mutations have been identified in human cancer genomes. To investigate whether mutational signatures continue to be generated, and if so their temporal patterns of activity, subsets of cell lines were cultured in vitro for extended periods and subjected to single cell cloning and whole genome or exome sequencing or directly to single cell whole genome sequencing. As expected, signatures of past exogenous exposures, such as tobacco smoke and ultraviolet light, were not generated in vitro. In contrast, signatures of normal and defective DNA repair and replication continued to be generated at essentially constant mutation rates. Signatures of APOBEC cytidine deaminase DNA-editing activity exhibited a distinctive pattern with substantial fluctuations in mutation rate over time and episodic bursts of mutations. The initiating factors for these bursts are unclear although retrotransposon mobilisation may play a role. This cell line set now constitutes a comprehensive resource of live experimental models of mutational processes of both known and unknown aetiologies potentially retaining the patterns of activity and regulatory influences operative in human cells in vivo.	HiSeq X Ten	192
EGAD00001004204	We used targeted sequencing to capture and measure the abundance as well as the size profiles of EBV DNA in plasma of subjects with and without NPC	Illumina HiSeq 2500 NextSeq 500	337
EGAD00001004205	Whole-exome sequencing was performed from organoids derived from 10 liver cancer biopsies (7 hepatocellular carcinoma and 3 cholangiocarcinoma), corresponding liver and non-tumoral biopsies. For 3 of the organoids, both early and late passage organoids were sequenced. Whole-exome sequencing was performed using the Agilent Clinical Research Exome capture kit followed by Illumina sequencing. BAM files are provided in this dataset.	Illumina HiSeq 2500	31
EGAD00001004206	This dataset contains 135 H3K27ac ChiP-seq experiments. Monocytes and granulocytes from TB and non-TB samples were obtained, ChIP-seq was performed, and the reads were aligned to hg19.	Illumina HiSeq 2000	161
EGAD00001004207	This dataset includes whole genome sequencing data from 93 Bajau and Saluan individuals that were used in the Ilardo et al 2018 study on adaptation to diving in Sea Nomads. Sequencing libraries were built using the TruSeq Nano DNA Library Preparation Kit on an Illumina NeoPrep instrument. Each pool was sequenced 125 Paired-End over one or two lanes on the Illumina HiSeq2500 (version 4 chemistry). Samples were sequenced to an average depth of 5x.	Illumina HiSeq 2500	93
EGAD00001004208	Dataset contains targeted sequencing data of 712 plasma cell free DNA samples and 428 white blood cell samples collected from 428 men with metastatic prostate cancer. Target capture was performed using a hybridization-based custom Roche SeqCap EZ Choice kit, designed to capture all exons of 72 prostate cancer driver genes. Cell free DNA was extracted from 10 mL blood samples. Libraries were sequenced using Illumina HiSeq 2500 or Illumina MiSeq instruments to a median coverage of 750x. 62% of samples had ctDNA fraction above 2% of total cfDNA. Note that "Dataset type" is erroneously listed as "Amplicon sequencing", because "Captured-based targeted sequencing" or "Hybridization-based targeted sequencing" were not available options in EGA at the time of submission.	Illumina HiSeq 2500	1140
EGAD00001004210	The dataset comprises RNA-seq information of 4 subpopulations sorted from human fetal pancreas of 3 different donors. Low input libraries were generated using the Smart-seq2 protocol after Ampure XP cleanup of the total RNA extracted from the sorted cells. Libraries were multiplexed and sequenced paired-end over 2 lanes of HiSeq4000 each. Raw data was aligned to the human genome reference GRCh37 using STAR v2.5.1b with GENCODE v19 as transcriptome reference, and unaligned reads were folded into the final uploaded BAMs.	Illumina HiSeq 4000	12
EGAD00001004211	RNA extracted from middle temporal gyrus (MG) brain region of healthy elderly controls. Three pairs of samples were generated, each pair consisting of one sample that was enriched for circular RNAs using RNase R, and a second sample that was not enriched (total N=6). Remaining samples in the study from other functionally distinct brain regions are currently under process and will be released soon.	Illumina HiSeq 4000	6
EGAD00001004212	Files from whole exome sequencing of 14 tumors from two cancer patients (endometrial and lung cancer) along with a matched normal tissue per patient.	Illumina HiSeq 2000	16
EGAD00001004213	Sequences from 95 subjects presenting intellectual disability (ID) and 98 subjects presenting intellectual disability and a diagnosis of autism spectrum disorder (ASD). The mtDNA was amplified by long-range PCR with 3 pairs of primers producing overlapping fragments. The three fragments were mixed in equimolar ratios and each sample was sequenced in an Ion Torrent Personal Machine according to manufacturer's user guide (reference genome: NC_012920.1 (rCRS)).		193
EGAD00001004215	NGS-ProToCol RNA-seq dataset contains 41x normal adjacent prostate and 51x prostate cancer samples taken from fresh frozen radical prostatectomies, sequenced using random-hexamer priming. RNA-seq was performed on the Illumina HiSeq 2500 platform, 2 x 126 bp stranded paired-end reads at a depth of 70 mln reads.	Illumina HiSeq 2500	92
EGAD00001004216	The dataset "NKI-AvL OpACIN RNA-seq of stage III melanoma patients" includes 18 FASTQ files from single-end total RNA sequencing on Illumina HiSeq2500 for 18 stage III melanoma patients.	Illumina HiSeq 2500	18
EGAD00001004217	The dataset "NKI-AvL OpACIN DNA-seq of stage III melanoma patients" includes 2 x 18 normal and 2 x 18 tumor FASTQ files from paired-end whole exome sequencing on Illumina HiSeq2500 for 18 stage III melanoma patients.	Illumina HiSeq 2500	36
EGAD00001004218	Tumor DNA was extracted from formalin-fixed and paraffin embedded tumors of a large cohort of bladder cancer patients before treatment with anti-PD-L1. Normal DNA was extracted from matched PBMCs. Whole exome sequencing was performed. This is a subset of patients for which RNA sequencing is also provided (with more detailed phenotypic information).	Illumina HiSeq 2500	488
EGAD00001004220	41 samples from Zambia generated for the H3Africa Chip Design Study. The dataset includes BAM, FASTQ and decompressed gVCF files.	Illumina HiSeq 2500	41
EGAD00001004221	WGS and RNA-Seq data from a GBM patient PT-AB0029	Illumina HiSeq 2000	-
EGAD00001004222	WGS and RNA-Seq data from a GBM patient PT-AB6372	Illumina HiSeq 2000 Illumina HiSeq 2500	2
EGAD00001004223	WGS and RNA-Seq data from a GBM patient PT-AH1410	Illumina HiSeq 2000	-
EGAD00001004224	WGS and RNA-Seq data from a GBM patient PT-AK7565	Illumina HiSeq 2500	-
EGAD00001004225	WGS and RNA-Seq data from a GBM patient PT-AL4257	Illumina HiSeq 2000 Illumina HiSeq 2500	2
EGAD00001004226	Genome sequence data from a GBM patient PT-AR3050	Illumina HiSeq 2500	-
EGAD00001004227	WGS and RNA-Seq data from a GBM patient PT-AR5365	Illumina HiSeq 2000 Illumina HiSeq 2500	1
EGAD00001004228	WGS and RNA-Seq data from a GBM patient PT-BK0248	Illumina HiSeq 2500	-
EGAD00001004229	WGS and RNA-Seq data from a GBM patient PT-BM772	Illumina HiSeq 2000	1
EGAD00001004230	WGS and RNA-Seq data from a GBM patient PT-CA2271	Illumina HiSeq 2000 Illumina HiSeq 2500	-
EGAD00001004231	WGS and RNA-Seq data from a GBM patient PT-CM1209	Illumina HiSeq 2000 Illumina HiSeq 2500	2
EGAD00001004232	WGS and RNA-Seq data from a GBM patient PT-DF5919	Illumina HiSeq 2000 Illumina HiSeq 2500	2
EGAD00001004233	WGS and RNA-Seq data from a GBM patient PT-DS9789	Illumina HiSeq 2000 Illumina HiSeq 2500	1
EGAD00001004234	WGS and RNA-Seq data from a GBM patient PT-EV3071	Illumina HiSeq 2000 Illumina HiSeq 2500	-
EGAD00001004235	WGS and RNA-Seq data from a GBM patient PT-FB6711	Illumina HiSeq 2000	1
EGAD00001004236	Genome sequence data from a GBM patient PT-FR7453		-
EGAD00001004237	WGS and RNA-Seq data from a GBM patient PT-GB9186	Illumina HiSeq 2000	-
EGAD00001004238	WGS and RNA-Seq data from a GBM patient PT-GB9483	Illumina HiSeq 2500	-
EGAD00001004239	WGS and RNA-Seq data from a GBM patient PT-GC1519	Illumina HiSeq 2500	1
EGAD00001004240	WGS and RNA-Seq data from a GBM patient PT-GJ3716	Illumina HiSeq 2500	2
EGAD00001004241	WGS and RNA-Seq data from a GBM patient PT-GR2309	Illumina HiSeq 2500	-
EGAD00001004242	WGS and RNA-Seq data from a GBM patient PT-HN6692	Illumina HiSeq 2500	-
EGAD00001004243	WGS and RNA-Seq data from a GBM patient PT-HO0394	Illumina HiSeq 2000 Illumina HiSeq 2500	-
EGAD00001004244	WGS data from a GBM patient PT-HS9105		-
EGAD00001004245	WGS and RNA-Seq data from a GBM patient PT-JB1730	Illumina HiSeq 2000	-
EGAD00001004246	WGS and RNA-Seq data from a GBM patient PT-JE6375	Illumina HiSeq 2000 Illumina HiSeq 2500	-
EGAD00001004247	WGS and RNA-Seq data from a GBM patient PT-JP2405	Illumina HiSeq 2500	-
EGAD00001004248	WGS and RNA-Seq data from a GBM patient PT-JW6420	Illumina HiSeq 2000 Illumina HiSeq 2500	1
EGAD00001004249	WGS and RNA-Seq data from a GBM patient PT-KM5291	Illumina HiSeq 2000 Illumina HiSeq 2500	1
EGAD00001004250	WGS and RNA-Seq data from a GBM patient PT-LC3356	Illumina HiSeq 2000 Illumina HiSeq 2500	1
EGAD00001004251	WGS and RNA-Seq data from a GBM patient PT-LR9369	Illumina HiSeq 2000	-
EGAD00001004252	WGS and RNA-Seq data from a GBM patient PT-LS4891	Illumina HiSeq 2000	-
EGAD00001004253	WGS and RNA-Seq data from a GBM patient PT-MB9777	Illumina HiSeq 2000 Illumina HiSeq 2500	3
EGAD00001004254	WGS and RNA-Seq data from a GBM patient PT-MD9088	Illumina HiSeq 2500	1
EGAD00001004255	WGS and RNA-Seq data from a GBM patient PT-PD6881	Illumina HiSeq 2000 Illumina HiSeq 2500	1
EGAD00001004256	WGS and RNA-Seq data from a GBM patient PT-RD1291	Illumina HiSeq 2000 Illumina HiSeq 2500	-
EGAD00001004257	WGS and RNA-Seq data from a GBM patient PT-RL5404	Illumina HiSeq 2000	1
EGAD00001004258	WGS and RNA-Seq data from a GBM patient PT-RL7940	Illumina HiSeq 2000	-
EGAD00001004259	WGS and RNA-Seq data from a GBM patient PT-RW9277	Illumina HiSeq 2000 Illumina HiSeq 2500	1
EGAD00001004260	WGS and RNA-Seq data from a GBM patient PT-SK0976	Illumina HiSeq 2000 Illumina HiSeq 2500	1
EGAD00001004261	WGS and RNA-Seq data from a GBM patient PT-SO0258	Illumina HiSeq 2000	1
EGAD00001004262	WGS and RNA-Seq data from a GBM patient PT-TM5196	Illumina HiSeq 2500	1
EGAD00001004263	WGS and RNA-Seq data from a GBM patient PT-VO7089	Illumina HiSeq 2000 Illumina HiSeq 2500	-
EGAD00001004264	WGS data from a GBM patient PT-WP9124	Illumina HiSeq 2500	1
EGAD00001004265		Illumina HiSeq 2000	195
EGAD00001004266	Many studies over the past 10 years, culminating in the recent report of the International Stem Cell Initiative (ISCI, 2011) have shown that hPSC acquire genetic and epigenetic changes during their time in culture. Many of the genetic changes are non-random and recurrent, probably because they provide a selective growth advantage to the undifferentiated cells. Some are shared by embryonal carcinoma cells, the malignant counterparts of ES cells. The origins of these growth advantages are poorly understood, but may come from altered cell cycle dynamics, resistance to apoptosis or altered patterns of differentiation. Less is known about the nature and consequences of epigenetic changes, but it is likely that these similarly affect hPSC behaviour; e.g., enhanced expression of DLK1, an imprinted gene, is associated with altered hPSC growth (Enver et al 2005). Inevitably, these genetic and epigenetic changes will impact on our ability to use hPSC for regenerative medicine, either because malignant transformation of the undifferentiated cells or their differentiated derivatives to be used for transplantation compromises safety, or because they impede the function of those differentiated derivatives, or because they affect the efficiency with which the undifferentiated cells can be expanded and differentiated into desired cell types. Focusing initially upon the existing clinical grade hESC lines, later moving to iPSC, we will consolidate and extend knowledge of the rate, type and functional impact of the genetic variations that occur during hPSC culture. We will use whole genome and exome sequencing as well as SNP arrays, together with clonal analysis and other cytogenetics techniques. Common changes will be compared with those found in the normal human population, at low frequency in the original cell population or observed during iPSC generation in the HIPSCI project currently based at the WTSI.
These studies will provide a better understanding of the range of genetic changes that occur in hPSC beyond the CNVs already identified. In conjunction with cancer genome resources and expertise at WTSI, bioinformatic analyses of these hPSC data will allow us to assess potential impact on hPSC behaviour pertinent to applications in regenerative medicine, notably the likelihood that specific changes arising in undifferentiated PSC cultures may be associated with potential malignant transformation of differentiated progeny. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ .	Illumina HiSeq 2000	72
EGAD00001004268	Samples from 149 trios from the Saguenay-Lac-Saint-Jean asthma familial cohort were all sequenced using a custom capture panel developed by our group, followed by next-generation sequencing. This custom capture panel covers around 3% of the genome, including coding and non-coding immune regulatory regions. We inferred the sequence in the non-sequenced siblings who were part of the same families as the trios and we imputed the sequence using IMPUTE2 in the whole cohort.		1214
EGAD00001004269	This dataset includes 112 head and neck tumour samples with matched normal (blood) samples sequenced using a custom hybrid capture panel.	Illumina HiSeq 2000	224
EGAD00001004270	Genome-wide copy number profiling was performed using low-pass whole genome sequencing on archival non-dysplastic mucosa (n=9), low-grade dysplasia (LGD; n=30), high-grade dysplasia (HGD; n=13), mixed LGD/HGD (n=7) and CA-CRC (n=19).	Illumina NovaSeq 6000	81
EGAD00001004271	The dataset comprises seven samples, described below. 1. Muscle samples from three patients with late-onset PEO caused by compound heterozygous POLG variants: M0305 POLG W748S/R1096C; M1105 POLG A467T/T251I+P587L; M1804 POLG A467T/X1240G+35aa. 2. Muscle sample from a patient with adPEO with heterozygous TWNK variants: M0230 TWNK p.Arg357Pro. 3. Blood control samples from two patients with late-onset PEO caused by compound heterozygous POLG variants: DNA2012-1630_S1 POLG W748S/R1096C; DNA2018-0168_S2 POLG A467T/T251I+P587L. 4. Muscle samples from healthy control individuals: DNA2018-0172_S4 Healthy control 2; DNA2018-0173_S5 Healthy control 1.	NextSeq 500	8
EGAD00001004272	We will sequence at 15X coverage the genomes of 960 IBD patients. These samples are currently onsite at Sanger and made available for sequencing via our collaboration with the UK IBD Genetics consortium. During the next quinquennium we intend to sequence the genomes of many thousand IBD patients and these 960 represent the first stage of this effort. Ultimately we will perform association tests comparing these genomes to similar numbers of control genomes to identify rare and low-frequency variants underlying IBD. This dataset contains all the data available for this study on 2018-08-03.	HiSeq X Ten	1432
EGAD00001004273	In this project we have sequenced the exome of skin moles (melanocytic naevi) and also normal skin from young and old people. We are interested in looking at the clonality of these lesions and the burden of UV mutations. This dataset contains all the data available for this study on 2018-08-03.	Illumina HiSeq 2500	184
EGAD00001004274	1 cm biopsies from patients undergoing bladder cystectomy will be collected. The underlying muscle and stroma will be removed and the remaining epithelia dissected into small sequential areas which will be sent for ultra-deep exome sequencing using a panel of known cancer and viral genes. Sequence analysis using similar methods to Martincorena I et al (Science 2015, 348:880) will provide an idea of the somatic mutational landscape in these patient samples. Individual patient muscle samples will also be sequenced as a reference. This dataset contains all the data available for this study on 2018-08-03.	Illumina HiSeq 2000 Illumina HiSeq 2500	71
EGAD00001004275	Exome sequencing was performed on fresh-frozen multiple regions of carcinoma, adjacent non-cancerous mucosa and blood from 12 CA-CRC patients (n=55 exomes).	Illumina Genome Analyzer II	64
EGAD00001004276	In the present study two large, multiply affected bipolar disorder families from Cuba were investigated using whole exome sequencing (Illumina HiSeq2500 v4). The variant calling files (VCFs) of 15 individuals provided here were generated using the Varbank exome pipeline from the Cologne Center for Genomics (CCG, https://varbank.ccg.uni-koeln.de).		15
EGAD00001004279	Genomic DNA of tumours and matched normal gastric tissues was extracted (QIAGEN). Libraries were constructed with 300-400 bp insert length, and 101 bp or 151 bp paired-end sequencing was performed on Illumina HiSeq instruments.	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500	80
EGAD00001004280	This dataset contains whole genome sequencing BAM files for 78 tumor-normal pairs (a total of 156 samples) used in the St. Jude Clinical Pilot. Mapping was performed using BWA. This dataset accompanies the paper "Clinical Cancer Genomic Profiling by Three-Platform Sequencing of Whole Genome, Whole Exome and Transcriptome"	Illumina HiSeq 2000	78
EGAD00001004281	The dataset contains the somatic point mutation data from the exome-targeted region of 36 exome or whole genome sequenced microsatellite unstable colorectal cancers and the somatic point mutation data from 93 additional MiSeq sequenced microsatellite unstable colorectal cancers.		129
EGAD00001004286	Comprehensive genetic analyses including whole-exome sequencing, targeted sequencing, and whole-genome sequencing of the human genome and the Epstein-Barr virus (EBV) genome were performed to reveal the molecular pathogenesis of EBV-associated hematological malignancy.	Illumina HiSeq 2500	453
EGAD00001004287	This dataset contains whole exome sequencing BAM files for 78 tumor-normal pairs (a total of 156 samples) used in the St. Jude Clinical Pilot. Mapping was performed using BWA. This dataset accompanies the paper "Clinical Cancer Genomic Profiling by Three-Platform Sequencing of Whole Genome, Whole Exome and Transcriptome"	Illumina HiSeq 2000	156
EGAD00001004288	To validate the methylation status of the four candidate tumor suppressor genes (ADHFE1, EOMES, SALL1, TFPI2) in Han Chinese ESCC patients, we recruited 103 patients and obtained the paired tumors (labelled T) and adjacent normal tissues (labelled N). Targeted bisulfite sequencing was conducted to detect the methylation profiles of these four genes in these 103 paired tissues. Furthermore, the raw sequence data (FASTQ files) were aligned using BSseeker2, and this dataset includes all of the BAM files after alignment.	Illumina HiSeq 2000	205
EGAD00001004289	Data supporting: "Low-cost and clinically applicable copy number profiling using repeat DNA." Abujudeh et al. DNA WGS (BAM files) DNA fastSeq (fastq files) Tumours, Barrett's, normals.	Illumina HiSeq 2000 Illumina MiSeq	60
EGAD00001004290	This dataset contains whole genome sequencing BAM files for 78 tumor-normal pairs (a total of 156 samples) used in the St. Jude Clinical Pilot. Mapping was performed using BWA. This dataset accompanies the paper "Clinical Cancer Genomic Profiling by Three-Platform Sequencing of Whole Genome, Whole Exome and Transcriptome"	Illumina HiSeq 2000	156
EGAD00001004291	We performed ATAC-seq experiments using 2 placental samples and 2 buffycoat samples.	Illumina HiSeq 2500	4
EGAD00001004292	Targeted capture of cancer gene panel bait set in single cell derived organoids from colon tissue and colorectal cancer from 1 patient. This dataset contains all the data available for this study on 2018-08-13.	Illumina HiSeq 2000 Illumina HiSeq 2500	112
EGAD00001004293	Whole-exome sequencing of a cohort of families (probands and affected/unaffected relatives) suffering from one of two rare thyroid disorders: congenital hypothyroidism (CH) and resistance to thyroid hormone (RTH). This dataset contains all the data available for this study on 2018-08-13.	Illumina HiSeq 2000 Illumina HiSeq 2500	110
EGAD00001004294	This study will analyse the guide sequences that were used for making mutations in the Cas9-expressing cells. We used the GeCKO v2 library, which was released by Feng Zhang, 2014. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-08-13.	Illumina HiSeq 2500 Illumina MiSeq	92
EGAD00001004295	Many studies over the past 10 years, culminating in the recent report of the International Stem Cell Initiative (ISCI, 2011) have shown that hPSC acquire genetic and epigenetic changes during their time in culture. Many of the genetic changes are non-random and recurrent, probably because they provide a selective growth advantage to the undifferentiated cells. Some are shared by embryonal carcinoma cells, the malignant counterparts of ES cells. The origins of these growth advantages are poorly understood, but may come from altered cell cycle dynamics, resistance to apoptosis or altered patterns of differentiation. Less is known about the nature and consequences of epigenetic changes, but it is likely that these similarly affect hPSC behaviour; e.g., enhanced expression of DLK1, an imprinted gene, is associated with altered hPSC growth (Enver et al 2005). Inevitably, these genetic and epigenetic changes will impact on our ability to use hPSC for regenerative medicine, either because malignant transformation of the undifferentiated cells or their differentiated derivatives to be used for transplantation compromises safety, or because they impede the function of those differentiated derivatives, or because they affect the efficiency with which the undifferentiated cells can be expanded and differentiated into desired cell types. Focusing initially upon the existing clinical grade hESC lines, later moving to iPSC, we will consolidate and extend knowledge of the rate, type and functional impact of the genetic variations that occur during hPSC culture. We will use whole genome and exome sequencing as well as SNP arrays, together with clonal analysis and other cytogenetics techniques. Common changes will be compared with those found in the normal human population, at low frequency in the original cell population or observed during iPSC generation in the HIPSCI project currently based at the WTSI.
These studies will provide a better understanding of the range of genetic changes that occur in hPSC beyond the CNVs already identified. In conjunction with cancer genome resources and expertise at WTSI, bioinformatic analyses of these hPSC data will allow us to assess potential impact on hPSC behaviour pertinent to applications in regenerative medicine, notably the likelihood that specific changes arising in undifferentiated PSC cultures may be associated with potential malignant transformation of differentiated progeny. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-08-13.	Illumina HiSeq 2000 Illumina HiSeq 2500	105
EGAD00001004297	Lymphoblastoid cell lines established using either wildtype or BALF5-deficient Epstein-Barr virus were analyzed by RNA sequencing.	Illumina HiSeq 2500	2
EGAD00001004298	Capture-based whole-genome sequencing of Epstein-Barr virus (EBV) was performed in hematological malignancies such as EBV-positive diffuse large B-cell lymphoma, extranodal NK/T-cell lymphoma, and chronic active EBV infection.	Illumina HiSeq 2500	264
EGAD00001004299	Comprehensive genetic analyses including whole-exome sequencing, targeted sequencing, and whole-genome sequencing were performed to reveal the molecular pathogenesis of chronic active Epstein-Barr virus infection.	Illumina HiSeq 2500	187
EGAD00001004300	June 2018 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	HiSeq X Ten Illumina HiSeq 2500	4
EGAD00001004301	Whole exome sequencing data generated from organoid cultures established from gastric cancers, paired gastric tumor frozen tissues and blood leukocyte DNA.	HiSeq X Ten Illumina HiSeq 1500 unspecified	130
EGAD00001004302	RNASeq data generated from organoid cultures established from gastric cancers and normal mucosae, paired tumor frozen tissues, and cultured fibroblast.	HiSeq X Ten Illumina HiSeq 1500	131
EGAD00001004303	Sequence data (bam files) of two RRBS samples for paper "A comprehensive analysis of 195 DNA methylomes reveals shared and cell specific features of partially methylated domains". Short Description: CD4+ T memory cells (CD3+ CD4+ CD45RA- CD45RO+ CD25-) from donors were sorted by flow-cytometry either as a bulk culture ('ex vivo' sample) or in a single-cell format into 96 well-plates ('clone' sample) in the presence of a TCR stimulus.	Illumina HiSeq 2500	2
EGAD00001004304	We intend to use single cell transcriptome analysis to explore the heterogeneity of different cell types within the kidney. This dataset contains all the data available for this study on 2018-08-20.	Illumina HiSeq 2500	1
EGAD00001004305	As part of the Human Cell Atlas we will study fetal tissue. This dataset contains all the data available for this study on 2018-08-20.	Illumina HiSeq 2500 Illumina HiSeq 4000	27
EGAD00001004306	We performed whole-exome sequencing on multiple regions (n=2-3) from four primary untreated breast tumors (n=1 HER2+, n=2 ER+/HER2-, n=1 triple-negative), as well as matched normal. We also performed whole-exome sequencing on one region from the pre-treatment diagnostic core biopsy and multiple regions (n=2-6) from the post-treatment surgical specimen for five HER2+ primary breast tumors, as well as matched normal; all were treated with combination chemotherapy and trastuzumab. Analysis of these specimens allows characterization of breast tumor heterogeneity and clonal evolution.	Illumina HiSeq 2500	42
EGAD00001004307	Exome sequencing from cfDNA blood samples. 30 sets of 2x76 Illumina reads in Fastq format.	NextSeq 500	30
EGAD00001004308	The Central Asian Kyrgyz highland population provides a unique opportunity to address genetic diversity and understand the genetic mechanisms underlying hypoxia-induced high altitude pulmonary hypertension (HAPH). While a significant fraction of the population is unaffected, there are susceptible individuals who display HAPH in the absence of any lung, cardiac or hematologic disease. We report herein the analysis of the whole genome sequencing of healthy individuals compared with HAPH patients and other controls. In this study, 34 male individuals from Central Asian Kyrgyz highland are sequenced with Illumina HiSeq 2000 with mean-coverage of 30X.	Illumina HiSeq 2000	34
EGAD00001004309	Targeted next-generation-sequencing of 494 cancer-associated genes was done in a series of 14 frozen pairs of matched primary breast cancers and brain metastases (28 samples). DNA libraries of all coding exons were prepared using the Haloplex Target Enrichment System. Sequencing was done using the 2*150bp paired-end technology on the Illumina NextSeq500 platform.	Illumina MiSeq NextSeq 500	28
EGAD00001004310	Whole exome sequencing data of 17 SPTCL cases, including 7 matched-normal samples.	Illumina HiSeq 2500 Illumina HiSeq 4000	24
EGAD00001004311	The GoDARTS T2D-GENES exome sequencing study includes 1924 samples, 965 T2D cases and 959 T2D controls, of European ancestry. This cohort is part of a larger exome sequencing effort from the T2D-GENES project and contains the exome sequencing VCF from the GoDARTS samples. The other data generated from the T2D-GENES project can be found in dbGaP. Samples underwent deep exome sequencing, with SNVs and indels called according to GATK best practices.		1924
EGAD00001004312	ChIP-Seq files accompanying the paper titled "Identification of Therapeutic Targets in Rhabdomyosarcoma Through Integrated Genomic, Epigenomic, and Proteomic Analyses".	Illumina HiSeq 2000	158
EGAD00001004313	The dataset includes 13 bam files. Each bam file is a different colorectal cancer patient organoid.	Illumina MiSeq	13
EGAD00001004314	The dataset contains data from a single patient sample with partial lipodystrophy. The data is supplied in the form of 2 files: a BAM file containing the (raw) sequencing data and a VCF file containing the called variants. The data is limited to a region consisting of the AGPAT2 gene on chromosome 9 and 1MB on both sides.	HiSeq X Ten	1
EGAD00001004315	WGBS files accompanying the paper titled "Identification of Therapeutic Targets in Rhabdomyosarcoma Through Integrated Genomic, Epigenomic, and Proteomic Analyses".	Illumina HiSeq 2000	37
EGAD00001004316	24 samples from Cameroon generated for the H3Africa Chip Design Study. The dataset includes BAM, FASTQ and decompressed gVCF files.	Illumina HiSeq 2500	24
EGAD00001004317	Blood samples from four liver cancer patients and four healthy people, and solid liver tumor samples from two liver cancer patients, were collected for this dataset. Blood samples were centrifuged first at 1,600 × g for 10 minutes, and then the plasma was transferred into new micro tubes and centrifuged at 16,000 × g for another 10 minutes. The plasma was collected and stored at -80 °C. cfDNA was extracted from 5 ml plasma using the Qiagen QIAamp Circulating Nucleic Acids Kit and quantified by Qubit 3.0 Fluorometer (Thermo Fisher Scientific). Bisulfite conversion of cfDNA was performed by using EZ-DNA-Methylation-GOLD kit (Zymo Research). After that, Accel-NGS Methy-Seq DNA library kit (Swift Bioscience) was used to prepare the sequencing libraries. The DNA libraries were then sequenced with 150bp paired-end reads.	HiSeq X Ten	10
EGAD00001004318	We used single-cell transcriptomics to study >60,000 cells from the developing murine cerebellum, and show that different molecular subgroups of childhood cerebellar tumors mirror the transcription of cells from distinct, temporally restricted cerebellar lineages.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	47
EGAD00001004319	This dataset contains Linked-Read Whole Exome Sequencing (lrWES) from individuals with known disease-causing variants. The dataset comprises 30 samples from 10 donors, where multiple samples from the same donor reflect experimental differences assaying the effect of input DNA length on coverage and phasing. Raw data (i.e. BAM files) and variant analysis (i.e. VCF files) for each sample are included in this dataset.	Illumina HiSeq 4000	30
EGAD00001004320	RNA-Seq data from 6 Giant Cell Lesions of the Jaw (GCLJ) samples.	Illumina HiSeq 2000	6
EGAD00001004321	In situ promoter capture Hi-C on Hodgkin lymphoma cell line L-428 in experimental triplicates. Hi-C libraries were prepared as previously described (Orlando et al., 2018, https://currentprotocols.onlinelibrary.wiley.com/doi/pdf/10.1002/cphg.63). Promoter capture was based on 32,313 biotinylated 120-mer RNA baits (Agilent). Hi-C libraries were sequenced using Illumina HiSeq 2000 technology. The files are in FASTQ format.	Illumina HiSeq 2000	1
EGAD00001004322	ChIP-seq data (H3K4Me3, H3K27Ac histone modifications) of Hodgkin lymphoma cell line L-428. Samples were processed as previously described (Sud et al., 2018). The files are in bam format, aligned to build 37 of the human genome.	Illumina HiSeq 2000	1
EGAD00001004323	Primary plasma cell leukemia (pPCL) samples were sequenced using the Nimblegen MedExome Plus hybridization capture to detect translocations, copy number changes, and mutations in 20 pPCL samples and patient matched controls. Sequencing was performed on a NextSeq500 using 75 bp paired end reads.	NextSeq 500	40
EGAD00001004324	This dataset consists of 70 maternal plasma samples (BAM files) used in the FetalQuantSD. The maternal plasma DNA samples were sequenced using the HiSeq 2000 platform (Illumina) with a 50-cycle paired-end mode.	Illumina HiSeq 2000	70
EGAD00001004325	RNA sequencing data from Vγ9Vδ2-T cells from chronic lymphocytic leukemia patients and age-matched healthy controls. Matched Vγ9Vδ2-T cell samples before and after expansion with autologous monocyte-derived dendritic cells for each donor are included.	NextSeq 500	16
EGAD00001004326	This dataset includes transcriptome sequencing of 17 paired NAFLD-HCC samples and adjacent normal tissues. All the experiments were performed on Illumina HiSeq 2000 platform with raw reads stored in fastq format.	Illumina HiSeq 2000	34
EGAD00001004327	Paired-end, ribosome depleted, total RNA Sequencing	Illumina HiSeq 2500	36
EGAD00001004328	DNA was obtained from either CD138+ cells from the bone marrow of multiple myeloma patients (tumor) or from stem cell harvests from the same patient (control). 100 ng of DNA was fragmented, end-repaired, and adapters ligated using the HyperPlus kit (KAPA Biosystems). After PCR amplification the libraries were hybridized with probes against either the entire exome (MedExome, Nimblegen) or a targeted panel of 140 genes using SeqCap reagents (Nimblegen). Hybridized libraries underwent further amplification before being sequenced on a NextSeq500 (Illumina) using 75 bp paired end reads.	NextSeq 500	12
EGAD00001004329	Somatic mutations of 256 whole-genome sequenced colorectal tumors. 234 MSS, 19 MSI and 3 POLE mutants. See Katainen R. et al. CTCF/cohesin-binding sites are frequently mutated in cancer, Nature Genetics 2015. doi:10.1038/ng.3335		256
EGAD00001004330	Targeted sequencing with the TruSight Cardio Sequencing Kit. 395 early onset lone AF cases and 375 controls. Sequencing was performed on Illumina NextSeq and HiSeq 2500 systems.	unspecified	1131
EGAD00001004331	RNA-seq data (ribodepleted, directional, 75 PE, HiSeq 4000) of purified and expanded iNKT and T cells from a normal donor, and RNA-seq data (poly-A, 100 PE, HiSeq 2500) from the C1R cell line. The dataset consists of 3 pairs of fastq files, one pair per sample.	Illumina HiSeq 2500 Illumina HiSeq 4000	3
EGAD00001004332	Familial adenomatous polyposis (FAP) and MUTYH‐associated polyposis (MAP) are inherited disorders associated with multiple colorectal adenomas that lead to a very high risk of colorectal cancer. The somatic mutations that drive adenoma development in these conditions have not been investigated comprehensively. In this study we performed analysis of paired colorectal adenoma and normal tissue DNA from individuals with FAP or MAP, sequencing 14 adenoma whole exomes (eight MAP, six FAP), 55 adenoma targeted exomes (33 MAP, 22 FAP) and germline DNA from each patient.	Illumina Genome Analyzer II Illumina HiSeq 2000	121
EGAD00001004333	Whole exome sequencing of 76 individuals with familial atrial fibrillation. BAM files have been aligned with the BWA-MEM algorithm. Fastq files were filtered and trimmed using cutadapt. Samples have been sequenced on an Illumina HiSeq 2500 machine.	Illumina HiSeq 2500	1131
EGAD00001004334	50 samples from Mali generated for the H3Africa Chip Design Study. The dataset includes BAM, FASTQ and decompressed gVCF files.	Illumina HiSeq 2500	50
EGAD00001004335	Histone ChIP-seq of 13 human embryonic tissues from weeks 6-8 of gestation. H3K4me3, H3K27me3 and H3K27ac. Biological replicates (n=2) for 11 tissues. Tissues (n): Brain (2); Retinal Pigmented Epithelium (eye)(2); Palate(2); Tongue (1); Left ventricle (heart)(2); Lung(2); Liver(2); Pancreas (2*); Stomach(1); Upper limb (2); Lower limb (2); Adrenal gland (2); Kidney (2)	Illumina HiSeq 2500 Illumina HiSeq 4000	77
EGAD00001004336	The dataset for Evolution of neoantigen landscape during immune checkpoint blockade in non-small cell lung cancer includes 17 bam files from next-generation sequencing on the Illumina HiSeq2500. The biospecimens analyzed include matched tumor pre-treatment, post-progression and normal samples.	Illumina HiSeq 2500	17
EGAD00001004337	Whole Genome Sequencing files accompanying the paper titled "Structure and evolution of double minutes in diagnosis and relapse brain tumors". Please read the paper for more details.	Illumina HiSeq 2000	2
EGAD00001004339	Dataset for "Genomic landscape of oral cancers" (CGI WGS)	Complete Genomics	59
EGAD00001004340	The dataset "NKI-AvL CRC-OVC DNA-seq" includes 4 normal and 4 tumor BAM files from paired-end whole exome sequencing on Illumina HiSeq2500 and Illumina NovaSeq6000 for 2 colorectal cancer and 2 ovarian cancer patients.	Illumina HiSeq 2500 Illumina NovaSeq 6000	8
EGAD00001004341	The dataset "NKI-AvL CRC-OVC RNA-seq" includes 4 FASTQ files from single-end total RNA sequencing on Illumina HiSeq2500 for 2 colorectal cancer and 2 ovarian cancer patients.	Illumina HiSeq 2500	4
EGAD00001004342	The dataset "NKI-AvL CRC-OVC scTCR RNA-seq" includes 368 BAM files from paired-end RNA sequencing on Illumina MiSeq for 2 colorectal cancer and 2 ovarian cancer patients.	Illumina MiSeq	368
EGAD00001004344	Data consists of 4,640 RNA-sequencing sample libraries. These libraries were sequenced from four sites of the upper gastro-intestinal tract (Barrett’s oesophagus, proximal normal oesophagus, proximal normal stomach, and duodenum) in two experiments. 4,587 libraries were produced in the first experiment, in which whole transcriptomes were isolated from single cells dissociated from endoscopic biopsy tissue obtained from the four previously mentioned tissues. The other 53 libraries were produced in the second experiment, in which whole transcriptomes were isolated from whole tissue from endoscopic biopsies of the four previously mentioned tissues. The data found here are stored in the raw fastq file format from paired end sequencing.	Illumina HiSeq 4000	4640
EGAD00001004345	This dataset contains variant call format files generated from whole exome sequencing of germline DNA from individuals diagnosed with testicular germ cell cancer.		960
EGAD00001004346	This is a bulk DNA and RNA sequencing study of human renal tumours. This dataset contains all the data available for this study on 2018-09-19.	HiSeq X Ten	37
EGAD00001004347	We analyzed alternative splicing in Shh medulloblastoma. This dataset contains bam files of whole genome sequencing from 4 cases. Genomic DNA was isolated from both tumor and matched control specimens. We performed whole genome sequencing on Illumina HiSeq.	Illumina HiSeq 2000	8
EGAD00001004348	This dataset includes microRNA sequencing data from 198 human serum samples, representing a subset of 66 women with no history of cancer who participated in the UKCTOCS study and with serum samples collected at three timepoints over a period of up to 5 years. Small RNA libraries prepared from the serum samples were sequenced with 50-bp single end reads on an Illumina HiSeq 2000 instrument. Data is provided as FASTQ files.	Illumina HiSeq 2000	198
EGAD00001004351	ERBB2/HER2 transmembrane and juxtamembrane domain mutations in cancer. Exome sequencing of tumor and matched blood and 2 blood samples from relatives.	Illumina HiSeq 2500	4
EGAD00001004352	The Whole Exome Sequencing dataset contains 30 whole exome sequencing files (tumor, germ line DNA) and phenotype metadata for 15 patients on the phase II clinical trial of neoadjuvant immune checkpoint blockade in high-risk resectable melanoma at MD Anderson Cancer Center (NCT02519322). Included are data from baseline samples.	Illumina HiSeq 2500	30
EGAD00001004353	The aim of this study was to compare the mutational landscape of breast cancer diagnosed during pregnancy (BCP) and breast cancer from age/stage-matched non-pregnant patients (controls). We present whole genome sequencing data (Illumina HiSeq X Ten platform) of tumor and matched normal tissues from 35 BCP patients and 20 controls. This work provides important novel biological insights and a unique resource to study the biology of breast cancer in young women and how pregnancy could modulate tumor biology.	HiSeq X Ten	106
EGAD00001004355	This dataset consists of 22 samples linked to 22 bam files from whole genome and whole exome sequencing of esthesioneuroblastomas.	Illumina HiSeq 2500	22
EGAD00001004356	Dataset for "Genomic landscape of oral cancers" (Illumina WGS)		106
EGAD00001004357	Whole genome sequencing of sick children in neonatal and paediatric intensive care units. Datasets EGAD00001007780 (GRCh37) and EGAD00001007868 (GRCh38) are extensions of this dataset.	Illumina HiSeq 2000	219
EGAD00001004358	EGAS00001002317 - Whole exome sequencing data of 18 RIMs with matched bloods. Median depth of 112x (range of 110-120). Performed on Illumina HiSeq Platform. EGAS00001002318 - RNA sequencing data of 18 RIMs on the Illumina HiSeq Platform.	Illumina HiSeq 2000 Illumina HiSeq 2500	54
EGAD00001004359	4 WGS bam files for 4 cases with fusion	Illumina HiSeq 2000	4
EGAD00001004360	10 RNA-Seq bam files including 4 cases with fusion and 6 controls without fusion.	Illumina HiSeq 2000	10
EGAD00001004361	Summary statistics from GWAS meta-analysis of cervical cancer		2
EGAD00001004362	Exemplar asymptomatic controls (n=10, 6 males) and exemplar cases with chronic Achilles tendinopathy (n=10, 6 males), representing divergent extremes of the phenotype spectrum, were selected for WES. Individual samples underwent paired-end sequencing on the Illumina HiSeq 2000/2500 platform at 30X coverage using the Agilent V5+UTR (71Mbp) capture kit.	Illumina HiSeq 2500	20
EGAD00001004363	FastQ files with paired-end RNAseq data for human fetal brain homogenate from 120 samples (12-19 post-conception weeks).	Illumina HiSeq 2500 Illumina HiSeq 4000	120
EGAD00001004364	Whole-exome sequencing (WES) was performed on a total of 34 PC specimens, with 15 cases having matched gDNAs extracted from blood.	Illumina HiSeq 4000	49
EGAD00001004365	Whole transcriptome sequencing (RNA-seq) was performed on 39 PC specimens. Among them, 21 specimens also had WES data.	Illumina HiSeq 4000	39
EGAD00001004366	Dataset for "Genomic landscape of oral cancers" (Illumina RNA)		110
EGAD00001004367	Single cell RNA-seq analysis of human skin.	Illumina NovaSeq 6000	12
EGAD00001004368	Targeted gene sequencing of cancer driver genes to determine the driver mutations present in newly-derived cancer organoid models	Illumina HiSeq 4000	10
EGAD00001004370	Illumina whole genome sequencing to high depth (x50) of four Tanzanian individuals. Genomic DNA derived from peripheral whole blood.	HiSeq X Ten	4
EGAD00001004371	Microbiome analysis was performed on the patient samples collected pre-FMT and on days after FMT, and on samples collected from the FMT donor. Genomic bacterial DNA was extracted from fecal samples using the QIAamp DNA Stool kit (Qiagen, Hilden, Germany), with the addition of a bead-beating lysis step. Genomic 16S ribosomal-RNA V4 variable regions were amplified and sequenced on the Illumina MiSeq platform.	Illumina MiSeq	11
EGAD00001004372	This data set consists of DQ2.5-glia-a1a- and DQ2.5-glia-w1- specific T-cell receptor sequences from single cells isolated from blood or biopsies of celiac disease patients.	Illumina MiSeq	53
EGAD00001004373	DNA was obtained from either CD138+ cells from the bone marrow of multiple myeloma patients (tumor) or from stem cell harvests or peripheral blood cells from the same patient (control). 100 ng of DNA was fragmented, end-repaired, and adapters ligated using the HyperPlus kit (KAPA Biosystems). After PCR amplification the libraries were hybridized with probes against a targeted panel consisting of 140 genes and chromosomal regions (Nimblegen) using SeqCap reagents (Nimblegen). Hybridized libraries underwent further amplification before being sequenced on a NextSeq500 (Illumina) using 75 bp paired end reads.	NextSeq 500	263
EGAD00001004374	The dataset includes 43 matched normal samples from 43 NF1-glioma patients profiled by Whole Exome Sequencing.	Illumina HiSeq 2500	43
EGAD00001004375	The dataset includes 59 tumor samples from 56 NF1-glioma patients profiled by Whole Exome Sequencing.	Illumina HiSeq 2500	59
EGAD00001004376	The dataset includes 29 tumor samples from NF1-glioma patients profiled by RNA sequencing.	Illumina HiSeq 2500	29
EGAD00001004378	Fastq files from exome sequencing of paired normal/tumor (pre and post-nCRT) samples from 7 patients with rectal tumors. All samples were sequenced on a 5500xl SOLiD sequencing platform (Thermo Fisher Scientific).	AB 5500xl Genetic Analyzer	22
EGAD00001004379	Shallow whole‐genome sequencing dataset on samples from three patients who underwent histological transformation to small‐cell lung cancer. Samples included in this dataset include normal buffy coat samples, plasma samples collected at diagnosis of NSCLC as well as prior to small‐cell transformation and after SCLC transformation and progression on cisplatin and irinotecan.	Illumina HiSeq 2500	17
EGAD00001004380	Glioblastoma patient derived Fast/Slow cycling cancer stem cell RNA sequencing. Consists of 3 patient cell lines and 6 files	Illumina HiSeq 2500	6
EGAD00001004384	Whole Genome Sequencing of 44 patients with Chronic Lymphocytic Leukemia. This dataset comprises 44 .bam files aligned to the hg19 build of the human genome from sequencing reads generated on an Illumina HiSeq instrument.	Illumina HiSeq 2500	44
EGAD00001004385		454 GS FLX Titanium AB 3730xL Genetic Analyzer Illumina MiSeq	171
EGAD00001004386	Whole Exome Sequencing reads consisting of BAM paired end reads from Follicular Lymphoma samples.	Illumina HiSeq 2500	7
EGAD00001004387	WGS of ovarian cancer organoids, tumor samples and blood references. Ovarian cancer (OC) is a heterogeneous disease usually diagnosed at a late stage. Experimental in vitro models that faithfully capture the hallmarks and tumor heterogeneity of OC are limited and hard to establish. We present a novel protocol that enables efficient derivation and long-term expansion of OC organoids. Utilizing this protocol, we have established 56 organoid lines from 32 patients, representing the spectrum of ovarian neoplasms, including non-malignant borderline tumors, as well as mucinous, clear-cell, endometrioid, low- and high-grade serous carcinomas. OC organoids recapitulate histological and genomic features of the pertinent lesion from which they were derived, illustrating intra- and inter-patient heterogeneity, and can be genetically modified. We show that OC organoids can be used for drug screening assays and capture different tumor subtype responses to the gold standard platinum-based chemotherapy, including acquisition of chemoresistance in recurrent disease. Finally, OC organoids can be xenografted, enabling in vivo drug sensitivity assays. Taken together, this demonstrates their potential application for research and personalized medicine.	HiSeq X Ten	111
EGAD00001004388	DDD DATAFREEZE 2017-12-15: 13,462 trios and probands only - phenotypic and family descriptions		1
EGAD00001004389	DDD DATAFREEZE 2017-12-15: 13,462 trios and probands only - exome sequence VCF files		1
EGAD00001004390	DDD DATAFREEZE 2017-12-15: 13,462 trios and probands only - exome sequence CRAM files		1
EGAD00001004391	This is a prospective, single arm phase IIa trial in which patients with early breast cancer will receive pre-operatively two doses of denosumab 120mg subcutaneously one week apart (maximum 12 days) followed by surgery. Tumor, normal breast tissue and blood samples will be collected at baseline and at surgery. Post-operative treatment will be at the discretion of the investigator. Primary objective: to determine if a short course of RANKL inhibition with denosumab can induce a decrease in tumor proliferation rates as determined by Ki67 immunohistochemistry (IHC) in newly diagnosed, early stage breast cancer in pre-menopausal women.	Illumina NovaSeq 6000	72
EGAD00001004393	26 samples from Cameroon generated for the H3Africa Chip Design Study. The dataset includes BAM, FASTQ and decompressed gVCF files.	Illumina HiSeq 2500	26
EGAD00001004394	Dataset consists of fastq files of Ribo-seq, polyA-RNA and total RNA sequencing of 80 samples (65 DCM cases and 15 controls)	Illumina HiSeq 2500 Illumina HiSeq 4000	80
EGAD00001004396	BAM files of individuals from the 1958BC aligned to hg17	Illumina HiSeq 2500	648
EGAD00001004397	We profiled the transcriptomes (RNA-sequencing) of 40 clinically significant invisible and visible tumors, all with ISUP Grade 2 disease and treated by radical prostatectomy. Twenty tumors were mpMRI invisible (PI-RADSv2: 1-2), while 20 tumors were visible (PI-RADSv2: 5).	Illumina HiSeq 3000	40
EGAD00001004398	Hotspot mutations in the spliceosome gene SF3B1 are reported in 20% of uveal melanomas. SF3B1 is involved in 3'-splice site (3'ss) recognition during RNA splicing; however, the molecular mechanisms of its mutation have remained unclear. Here we show, using RNA-Seq analyses of uveal melanoma, that the SF3B1 R625/K666 mutation results in deregulated splicing at a subset of junctions, mostly by the use of alternative 3'ss.	Illumina HiSeq 2500	76
EGAD00001004399	ctDNA and protein markers for earlier detection of pancreatic cancers	Illumina HiSeq 4000	14
EGAD00001004400	SNV calls generated using the MuTect-Battenberg-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study		1
EGAD00001004401	SNV calls generated using the MuTect-TITAN-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study		1
EGAD00001004402	SNV calls generated using the SomaticSniper-Battenberg-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study		1
EGAD00001004403	CNA calls generated using the MuTect-Battenberg-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study		1
EGAD00001004404	CNA calls generated using the MuTect-TITAN-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study		1
EGAD00001004405	CNA calls generated using the SomaticSniper-Battenberg-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study		1
EGAD00001004406	16S rRNA gene sequencing with Illumina MiSeq (V4 hypervariable region)	Illumina MiSeq	188
EGAD00001004408		Illumina HiSeq 2500 Illumina HiSeq 4000	117
EGAD00001004409	Aligned, merged and deduplicated BAM files from HiSeq whole genome sequencing of 134 samples: matched tumour-normal pairs from 67 mucosal melanoma cases		-
EGAD00001004410	DNA extracted from sorted CD19+ tumor cells (16 patients) was used for exome capture with the SureSelect V5 All Exon Kit following the standard protocols. Paired-end sequencing (2 x 100 bp) was performed using HiSeq2000 sequencing instruments. The files are in FASTQ format.	Illumina HiSeq 2000	16
EGAD00001004411	DNA extracted from sorted CD3+ cells (16 patients) was used for exome capture with the SureSelect V5 Mb All Exon Kit following the standard protocols. Paired-end sequencing (2 x 100 bp) was performed using HiSeq2000 sequencing instruments. The files are in FASTQ format.	Illumina HiSeq 2000	16
EGAD00001004412	RNA-Seq was performed on 12 samples of sorted CD19+ tumor cells. RNA-Seq libraries were prepared using the SureSelect Automated Strand Specific RNA Library Preparation Kit as per manufacturer’s instructions (Agilent technologies) and subjected to paired-end (2 x 100 bp) sequencing on HiSeq2000 (Illumina). The files are in FASTQ format.	Illumina HiSeq 2000	12
EGAD00001004413	There is currently a drive to establish cell based assay systems of greater human biological and disease relevance through the use of well characterised transformed cell lines, primary cells and complex cellular models (e.g. co-culture, 3D models). However, although the field is gaining valuable experience in running more non-standard & complex cell assays for target validation and compound pharmacology studies, there is the lack of a systematic approach to determine if this expansion in cell assay models is reflected in increased human biological and disease relevance. The increasing wealth of publically available transcriptomic, and epigenome (ENCODE and Epigenome Roadmap) data represents an ideal reference mechanism for determining the relationship between cell types used for target & compound studies to primary human cells and tissues from both healthy volunteers & patients. The CTTV020 epigenomes of cell line project aims to generate epigenetic and transcriptomic profiles of cell lines and compare these with existing and newly generated reference data sets from human tissue and cell types. The aim is to identify assay systems which will provide greater confidence in translating target biology and compound pharmacology to patients. Multiple cell types commonly used within research have been grouped according to biology. Examples include erythroid, lung epithelial, hepatocyte cell types and immortalised models of monocyte / macrophage biology. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-10-23.	Illumina HiSeq 2500	18
EGAD00001004414	There is currently a drive to establish cell based assay systems of greater human biological and disease relevance through the use of well characterised transformed cell lines, primary cells and complex cellular models (e.g. co-culture, 3D models). However, although the field is gaining valuable experience in running more non-standard & complex cell assays for target validation and compound pharmacology studies, there is the lack of a systematic approach to determine if this expansion in cell assay models is reflected in increased human biological and disease relevance. The increasing wealth of publically available transcriptomic, and epigenome (ENCODE and Epigenome Roadmap) data represents an ideal reference mechanism for determining the relationship between cell types used for target & compound studies to primary human cells and tissues from both healthy volunteers & patients. The CTTV020 epigenomes of cell line project aims to generate epigenetic and transcriptomic profiles of cell lines and compare these with existing and newly generated reference data sets from human tissue and cell types. The aim is to identify assay systems which will provide greater confidence in translating target biology and compound pharmacology to patients. Multiple cell types commonly used within research have been grouped according to biology. Examples include erythroid, lung epithelial, hepatocyte cell types and immortalised models of monocyte / macrophage biology. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-10-23.	Illumina HiSeq 2500	12
EGAD00001004415	There is currently a drive to establish cell based assay systems of greater human biological and disease relevance through the use of well characterised transformed cell lines, primary cells and complex cellular models (e.g. co-culture, 3D models). However, although the field is gaining valuable experience in running more non-standard & complex cell assays for target validation and compound pharmacology studies, there is the lack of a systematic approach to determine if this expansion in cell assay models is reflected in increased human biological and disease relevance. The increasing wealth of publically available transcriptomic, and epigenome (ENCODE and Epigenome Roadmap) data represents an ideal reference mechanism for determining the relationship between cell types used for target & compound studies to primary human cells and tissues from both healthy volunteers & patients. The CTTV020 epigenomes of cell line project aims to generate epigenetic and transcriptomic profiles of cell lines and compare these with existing and newly generated reference data sets from human tissue and cell types. The aim is to identify assay systems which will provide greater confidence in translating target biology and compound pharmacology to patients. Multiple cell types commonly used within research have been grouped according to biology. Examples include erythroid, lung epithelial, hepatocyte cell types and immortalised models of monocyte / macrophage biology. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-10-23.	Illumina HiSeq 2500	9
EGAD00001004416	There is currently a drive to establish cell based assay systems of greater human biological and disease relevance through the use of well characterised transformed cell lines, primary cells and complex cellular models (e.g. co-culture, 3D models). However, although the field is gaining valuable experience in running more non-standard & complex cell assays for target validation and compound pharmacology studies, there is the lack of a systematic approach to determine if this expansion in cell assay models is reflected in increased human biological and disease relevance. The increasing wealth of publically available transcriptomic, and epigenome (ENCODE and Epigenome Roadmap) data represents an ideal reference mechanism for determining the relationship between cell types used for target & compound studies to primary human cells and tissues from both healthy volunteers & patients. The CTTV020 epigenomes of cell line project aims to generate epigenetic and transcriptomic profiles of cell lines and compare these with existing and newly generated reference data sets from human tissue and cell types. The aim is to identify assay systems which will provide greater confidence in translating target biology and compound pharmacology to patients. Multiple cell types commonly used within research have been grouped according to biology. Examples include erythroid, lung epithelial, hepatocyte cell types and immortalised models of monocyte / macrophage biology. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-10-23.	Illumina HiSeq 2500	9
EGAD00001004417	Data supporting: "The landscape of selection in 551 Esophageal Adenocarcinomas defines genomic biomarkers for the clinic." Frankell et al. WGS (BAM files) 379 matched tumour-normal pairs	Illumina HiSeq 2000	1
EGAD00001004419	Summary statistics of a GWAS meta-analysis for severe acne. A total of 7,441,713 genotyped and imputed variants were used for 5,602 European severe acne cases and 21,120 matched population controls.		1
EGAD00001004420	Whole exome and targeted sequencing data from 11 glioblastoma multiforme patients. A total of 70 tumour specimens and 11 blood samples were used for whole exome sequencing (WES) using the Agilent SureSelectXT Human All Exon V5 Kit. Two custom targeted sequencing panels were designed using Agilent’s Haloplex (TES1) or Agilent SureSelect XT2 technology (TES2). Libraries were sequenced on an Illumina HiSeq2500.	Illumina HiSeq 2500	194
EGAD00001004421	FASTQ files of the RNA-Seq data for both the normal and tumor samples for the study "Genomic landscape of lung adenocarcinoma in East Asians". For raw read count data as well as other metadata, please download from https://src.gisapps.org/OncoSG_public/study/summary?id=GIS031 by clicking the download icon next to the dataset title.	Illumina HiSeq 4000	260
EGAD00001004422	FASTQ files of the Exome-Seq data for both the normal and tumor samples for the study "Genomic landscape of lung adenocarcinoma in East Asians". For mutations and copy number variants called by this study, please download from https://src.gisapps.org/OncoSG_public/study/summary?id=GIS031 by clicking the download icon next to the dataset title.	Illumina HiSeq 4000	418
EGAD00001004423	Data supporting: "The landscape of selection in 551 Esophageal Adenocarcinomas defines genomic biomarkers for the clinic." Frankell et al. RNAseq (BAM files) 116 tumours	Illumina HiSeq 2000	1
EGAD00001004424	Prostate Cancer - RNA-Seq unmapped reads	Illumina HiSeq 2000	23
EGAD00001004425	FASTQ files from sequencing to < 0.4x depth of coverage of thirteen glioma patients. Indexed sequencing libraries were prepared using a commercially available kit (ThruPLEX-Plasma Seq, Rubicon Genomics). Libraries were pooled in equimolar amounts and sequenced on a HiSeq 4000 (Illumina) generating 150-bp paired-end reads.	Illumina HiSeq 4000	13
EGAD00001004426	Spiradenocarcinoma is a rare cutaneous sweat gland adnexal cancer with potential for aggressive behaviour. They are classified histologically into low- and high-grade tumours, with morphologically low-grade tumours thought to behave more favourably. However, limited information is available, with only 18 published cases. We have collected morphologically low-grade spiradenocarcinomas (one with a lung metastasis) and high-grade spiradenocarcinomas, as well as some spiradenomas (benign lesions), cylindromas (another type of malignant cutaneous sweat gland adnexal tumour) and hybrid spiradenoma-cylindromas. H&E-stained sections were reviewed, follow-up was obtained, and immunohistochemistry for Ki-67, p53 and MYB has been performed. The tumours were solitary, measuring 0.8-7 cm (median: 2.7 cm), with a predilection for the head and neck of elderly patients (median age: 72 years; range 53-92) without gender bias. Histologically, the tumours were multinodular and located in deep dermis and subcutis. A pre-existing spiradenoma was present in all cases. The malignant component was characterized by expansile growth with loss of the dual cell population, up to moderate cytological atypia and increased mitotic activity (median: 10/10 HPF; range 1-28). Additional findings included squamoid differentiation (n=9), necrosis (n=7), and ulceration (n=5). P53 expression was variable and no significant differences were noted in the benign compared with the malignant parts of the tumours. In contrast, in the malignant components the Ki-67 proliferative index was slightly increased, and MYB expression was lost. Follow-up (median: 67 months; range: 13-132) available for 16 patients (84%) revealed a local recurrence rate of 19% but no metastases or disease-related mortality. Here we wish to exome sequence these cases to define the first genomic landscape for this malignancy. This dataset contains all the data available for this study on 2018-10-29.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	164
EGAD00001004427	55 single read fastq files of low-pass WGS sequencing used to determine copy number aberrations and 34 paired read fastq files of 48 cancer gene exon sequencing	Illumina HiSeq 2000	58
EGAD00001004428	Peripheral T-cell lymphomas not otherwise specified (PTCL-NOS) represent a heterogeneous group of nodal and extra-nodal mature T-cell lymphomas, with a low prevalence in Western countries. PTCL-NOSs account for about 25% of all PTCLs and are currently diagnosed based on exclusion criteria, as these lymphomas lack unifying morphological, phenotypic and genomic features. Cytogenetic and FISH analysis of PTCL-NOS samples have not revealed recurrent pathogenetic abnormalities, while gene expression profiling has shown only partial ability to segregate cases representing homogeneous clinic-pathological entities. This underscores the need to look at PTCL-NOS with innovative and high-throughput approaches to identify recurrent genetic lesions that could further our understanding of the biology of this heterogeneous group of diseases, provide better diagnostic tools and perhaps new targets for innovative treatments. Our aim is to study ~15 patients affected by PTCL-NOS. Our study will be funded by a private, non-profit Italian cancer research fund (Associazione Italiana per la Ricerca sul Cancro, www.airc.it) based on a grant owned by Anna Dodero and Cristiana Carniti, hematologists at INT. Samples will be analysed by whole genome sequencing using Illumina X10 machines, on a 150bp-PE protocol. Data will be analysed using the pipeline available in Team 78, under the supervision of Peter Campbell, the WTSI faculty who will oversee the project, and by Francesco Maura, visiting scientist at the WTSI. This dataset contains all the data available for this study on 2018-10-30.	HiSeq X Ten	27
EGAD00001004429	ChIP-Seq files for PCGP ATRX study paper titled "MYCN Amplification and ATRX Mutations are Incompatible in Neuroblastoma"	Illumina HiSeq 2000	121
EGAD00001004430		Illumina HiSeq 2500	56
EGAD00001004431	SNV calls generated using the SomaticSniper-TITAN-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study using single and multiple regions		1
EGAD00001004432	SNV calls generated using the SomaticSniper-Battenberg-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study using single and multiple regions		1
EGAD00001004433	SNV calls generated using the MuTect-TITAN-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study using single and multiple regions		1
EGAD00001004434	SNV calls generated using the MuTect-Battenberg-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study using single and multiple regions		1
EGAD00001004435	Childhood cerebellar tumours mirror conserved fetal transcriptional programs	Illumina HiSeq 2000	145
EGAD00001004436	Exome sequencing was performed on samples from patients 064, 105, and 8760, including the remission sample for 105. Exomes were captured using the Agilent SureSelect All Exon v5 kit and libraries were sequenced on a HiSeq 2000, 2500 or 4000.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	6
EGAD00001004437	RNA-seq data from TEX cells transduced with HIST1H3H WT, HIST1H3H K27M, HIST1H3F WT, HIST1H3F K27I and Luc2 control in triplicate. In addition, RNA-Seq was performed on untransduced TEX cells. Libraries [rRNA-depleted stranded (HMR)] were sequenced on an Illumina Hiseq 4000 platform to generate 100 bp paired-end reads.	Illumina HiSeq 4000	18
EGAD00001004439	mRNA sequencing of 50 undifferentiated sarcoma tumour samples and 5 adjacent muscle tissue samples. BAM files are provided, with metadata specifying which samples are tumour/normal.	Illumina HiSeq 2500	53
EGAD00001004440	CNA calls generated using the SomaticSniper-TITAN-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study using single and multiple regions		1
EGAD00001004441	CNA calls generated using the SomaticSniper-TITAN-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study using single and multiple regions		1
EGAD00001004442	CNA calls generated using the MuTect-TITAN-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study using single and multiple regions		1
EGAD00001004443	CNA calls generated using the MuTect-Battenberg-PhyloWGS from the CPC-GENE Subclonal Heterogeneity study using single and multiple regions		1
EGAD00001004446	WGS files for Mullighan PAX5_B-ALL paper titled "PAX5-driven Subtypes of B-cell Acute Lymphoblastic Leukemia"	Illumina HiSeq 2000	16
EGAD00001004447	WES files for Mullighan PAX5_B-ALL paper titled "PAX5-driven Subtypes of B-cell Acute Lymphoblastic Leukemia"	Illumina HiSeq 2000	128
EGAD00001004448	60 samples from Burkina Faso and Ghana generated for the H3Africa Chip Design Study. The dataset includes BAM, FASTQ and decompressed gVCF files.	Illumina HiSeq 2500	60
EGAD00001004449	16S sequencing data (dual-index) from 1054 Flemish Gut Flora Project (FGFP) samples	Illumina HiSeq 2500 unspecified	1054
EGAD00001004450	Mapped BAM files of 162 tumor/normal WES experiments.	unspecified	324
EGAD00001004451	Whole genome sequencing data of organoid cultures derived from human bone marrow-derived and cord blood-derived hematopoietic stem and multipotent progenitor cells to study the mutation accumulation.	HiSeq X Ten	30
EGAD00001004452	Tumor (CD138+ plasma cells) and non-tumor (peripheral blood white cells or stem cell harvest) DNA from patients with a plasma cell dyscrasia were sequenced. Whole genome sequencing using high molecular weight DNA was performed using the 10X Genomics Chromium platform on either a HiSeq4000 or NovaSeq (Illumina) using 100 to 150 bp paired-end reads. The dataset consists of 111 patients, and 223 samples in total (matched tumor and control per patient; one patient had 2 tumor samples sequenced). Diseases consisted of 2 MGUS patients, 8 SMM patients, 91 newly diagnosed myeloma patients, 1 previously treated patient, 4 relapsed MM patients, and 5 PCL patients. Paired RNA-seq data are available for 81 of the samples under study EGAS00001003411.	HiSeq X Ten	231
EGAD00001004453	We performed targeted DNA sequencing of primary uveal melanomas and their matched metastases from 35 patients, analyzing a total of 124 tissues. Sequencing was performed on an Illumina HiSeq 2500 instrument using a panel of 538 genes commonly involved in cancer. 124 BAM files were generated.	Illumina HiSeq 2500	124
EGAD00001004454	FGFP (Flemish Gut Flora Project, N=100) and TR-MDD (Treatment-Resistant Major Depression Disorder, N=7) shotgun sequencing samples	Illumina HiSeq 2500	157
EGAD00001004455	Whole exome and transcriptome sequencing of 12 melanoma patients (including technical replicates). ClinicalTrials.gov Identifier: NCT02035956. Paper: Personalized RNA mutanome vaccines mobilize poly-specific therapeutic immunity against cancer - Nature volume 547, pages 222–226	Illumina HiSeq 2500	72
EGAD00001004456	This dataset contains short-read whole-genome sequencing data for individuals with neurodevelopmental disorders and their relatives from the NIHR-BioResource Rare Disease Consortium.	Illumina HiSeq 2000	4
EGAD00001004457	Datasets Galaxy 929/938 describe the amplified single chromosome sequencing data.		2
EGAD00001004458	The study aims to find bacteria in neural tissue from patients with amyotrophic lateral sclerosis	Illumina MiSeq	34
EGAD00001004459	Each dataset consists of WES data from 5 samples (1 patient): original leukemia initial diagnosis T-ALL, original leukemia relapse T-ALL, PDX derived of initial diagnosis T-ALL, PDX derived of relapse T-ALL, remission (normal control)	Illumina HiSeq 2000 Illumina HiSeq 2500 NextSeq 500	164
EGAD00001004461	RNAseq files (dataset 1 of 2) for Mullighan PAX5_B-ALL paper titled "PAX5-driven Subtypes of B-cell Acute Lymphoblastic Leukemia"	Illumina HiSeq 2000	1083
EGAD00001004462	20 whole genome seq	Illumina NovaSeq 6000	20
EGAD00001004463	RNAseq files (dataset 2 of 2) for Mullighan PAX5_B-ALL paper titled "PAX5-driven Subtypes of B-cell Acute Lymphoblastic Leukemia"	Illumina HiSeq 2000	204
EGAD00001004464	whole exome sequencing data	Illumina HiSeq 2000	523
EGAD00001004465	Gene expression comparison between human colonic epithelial cells cultured with Klebsiella pneumoniae (KP) derived from PSC patients versus KP JCM1662.	Illumina HiSeq 2500	4
EGAD00001004466	Low pass WGS: 48 samples (5 blood samples from 6 patient data): 22 Tumour cores and 26 normal/benign cores (NextSeq)	NextSeq 500	48
EGAD00001004467	WES: 48 samples (5 blood samples from 6 patient data): 22 Tumour cores and 26 normal/benign cores (HiSeq)	Illumina HiSeq 2500	48
EGAD00001004468	Total RNA Seq: 15 Samples (2 patients (MF1 and MF3)) (HiSeq) and Poly A RNA Seq: 27 Samples (4 patients Normal and Tumour ) (HiSeq)	Illumina HiSeq 2500	42
EGAD00001004469	829 BAM files from exome sequencing of human tetralogy of Fallot patients	Illumina HiSeq 2000	829
EGAD00001004470	Exome sequence data from microcephalic dwarfism patients with de novo DNMT3A variants	Illumina HiSeq 2500	6
EGAD00001004471	RNA-seq data generated from cells from control individuals and individuals with de novo DNMT3A variants causing microcephalic dwarfism.	Illumina HiSeq 2500	4
EGAD00001004472	RRBS sequence data from one control and one patient with de novo DNMT3A mutations resulting in microcephalic primordial dwarfism.	NextSeq 550	2
EGAD00001004473	ChIP-seq data from controls and patients with de novo DNMT3A mutations resulting in microcephalic primordial dwarfism.	Illumina HiSeq 2500 NextSeq 550	28
EGAD00001004474	Aligned, merged and deduplicated BAM files from HiSeq whole genome sequencing of 28 samples: matched tumour-normal pairs from 14 melanocytic nevi cases		-
EGAD00001004475	PacBio sequencing data of HKCI-2, HKCI-C1, HKCI-C2, HKCI-C3, HKCI-4, HKCI-9, HKCI-11, HKCI-5A and MIHA. All the data (9 samples) were saved in pacbio hdf5 format.	PacBio RS	9
EGAD00001004476	RNA sequence data for HKCI-2, HKCI-C1, HKCI-C2, HKCI-C3 and MIHA. RNA-seq data of all the 5 samples were stored in compressed fastq format.	Illumina HiSeq 2000	5
EGAD00001004478		Illumina HiSeq 2500	24
EGAD00001004479	A total of 4 samples (2 WGS and 2 RNAseq) belonging to 2 patients are deposited in this study.		4
EGAD00001004480	In this work, we established and characterized a low-passage cervix cancer cell line from a Brazilian patient with squamous cell carcinoma. The dataset contains three samples from the same patient (blood, tumor tissue and the primary cell line). The technology used was exome sequencing and the files are available in fastq format.	Illumina HiSeq 2500	3
EGAD00001004481	Transcriptomic sequences of small intestinal plasma cells (PCs) from Celiac disease patients. RNAseq data produced using Illumina paired-end (75bp) reads. Includes only raw data of sequences (fastq format). Samples from seven Celiac disease patients and four healthy controls. Samples from Celiac disease patients contain sub-groups of PCs that are either specific or not specific to autoantigen of the disease (TG2).	NextSeq 500	18
EGAD00001004482	Whole genome sequencing and whole exome sequencing of 13 pediatric osteosarcoma patients including 13 primary, 10 metastatic, and 3 relapsed tumors.	Illumina HiSeq 2000 Illumina HiSeq 2500	78
EGAD00001004483	Microarray analysis of mtDNA		5800
EGAD00001004484	Adeno-associated virus (AAV) is a defective mono-stranded DNA virus, endemic in human population (35-80%). Recurrent clonal AAV2 insertions are associated with the pathogenesis of rare human hepatocellular carcinoma (HCC) developed on normal liver. This study aimed to characterize the natural history of AAV infection in the liver and its consequence in tumor development. In silico analyses using viral capture data explored viral variants and new clonal insertions. Clonal AAV insertions were positive selected during HCC development on non-cirrhotic liver challenging the notion of AAV as a non-pathogenic virus.	Illumina HiSeq 2000	20
EGAD00001004485	This project focused on identifying rare coding variation that substantially increases risk of VEOIBD by exome sequencing of VEOIBD patients and some of their family members. Here you can find BAM files from an affected proband (P2) and his unaffected parents. In this study ALPI mutations were identified as a likely cause of the disease.	Illumina HiSeq 2500	3
EGAD00001004486	This dataset consists of aligned DNA sequencing data in BAM file format from cell-free DNA and white blood cells from 24 men with metastatic prostate cancer. One cell-free DNA sample and one white blood cell sample is available for each patient, resulting in 48 total BAM files in this dataset. The sequencing was performed using a hybrid capture-based targeted panel of 73 prostate cancer driver genes.	Illumina HiSeq 2500	48
EGAD00001004487	Whole transcriptome sequencing (WTS) of a longitudinal breast cancer (BC) cohort consisting of 146 cases (281 tumors, 109 pairs), including 52 (38%) that achieved pathologic complete responses (pCR) and 85 (62%) that harbored residual diseases at time of surgery.	Illumina HiSeq 2500	235
EGAD00001004488	Dataset of BAM files from patients with proven bacterial meningitis in Malawi. Samples consist of BAM files from PolyA RNA seq runs from patients classified as admission whole blood or CSF on admission (pre-antibiotics) in the Emergency Department. Subsequent BAM files are identical runs from whole blood from the same patients, taken at either day 10 or day 40 post admission to hospital. All patients have disease, they are divided into survivors and non-survivors at the day 40 time point.	NextSeq 500	45
EGAD00001004489	This dataset contains matched RNA-Seq and miRNA-Seq fastq files from 109 matched samples of 34 human Papillomavirus-negative Head and Neck cancer patients, including 72 lymph nodes, 29 tumor and 8 normal samples.	Illumina HiSeq 2500	218
EGAD00001004490	The dataset consists of samples from papillary thyroid cancer patients. A total of 11 DNA samples from blood/normal and cancer tissue are subjected to whole exome sequencing using Illumina. The fastq files generated were aligned with reference genome ‘hg19’, duplicates were marked, realignment around indels and quality recalibration were performed to produce good quality variants. The recalibrated “.bam” files are included with this dataset.		11
EGAD00001004491	RNA-seq of seven small intestinal neuroendocrine tumors, sequenced with Illumina NextSeq 500.	NextSeq 500	7
EGAD00001004492	BAM files for high-throughput whole genome sequence data of 17 modern Aboriginal Australians	HiSeq X Ten	17
EGAD00001004493	Transcriptome of Ewing sarcoma tumors (ICGC project). Fastq files of 57 RNA-seq are available (2x101bp).	Illumina HiSeq 2500	57
EGAD00001004494	Illumina HiSeqXTen platform sequencing data of whole genome libraries prepared from 156 matched tumour-normal samples from 78 donors		156
EGAD00001004495	Whole genome sequencing data of tumor tissues, adjacent normal tissues, and peripheral blood from CRC patients.	Illumina HiSeq 4000	107
EGAD00001004496	Tumour and control from a patient with uveal melanoma with a MBD4 germline mutation. Samples were whole exome sequenced.	Illumina HiSeq 2000	2
EGAD00001004497	Single-cell RNA-seq profiling of immune cells sorted from human Melanoma tumors (and several matching PBMC samples). Contains de-multiplexed FASTQ files per plate (MARS-seq amplification batch, total 204 samples) and also de-multiplexed FASTQ files of single-cell TCRb-seq.	Illumina MiSeq NextSeq 500	204
EGAD00001004498		Illumina HiSeq 2500	9
EGAD00001004499	In this study, we aimed to identify somatic structural variation of T-cell acute lymphoblastic leukemias (T-ALLs) from patient-derived xenografts (PDX) at the single-cell level. For this purpose, we performed strand-specific single-cell sequencing of PDX-derived T-ALL relapse samples from two juvenile patients (P1, P33). To validate structural variation detected via scTRIP, we profiled whole exome sequencing (WES) data from P33 (samples taken during initial disease, remission, relapse), and mate-pair sequencing data from P1 (relapse).	Illumina HiSeq 2000 NextSeq 500	124
EGAD00001004500	Mapped data for 10 Colon MSI cancer samples	Illumina HiSeq 2500	10
EGAD00001004501	Genomic and transcriptomic data from a cohort of 35 RAS wild-type colorectal cancers. All 35 cases were DNA sequenced at baseline (BL) before treatment with single agent cetuximab. Progressive disease (PD)-biopsies were taken shortly after radiological progression and successfully exome sequenced from 24/35 cases. mRNA sequencing is available for 25 Baseline and 15 PD samples. ctDNA from 9 cases that progressed after prolonged cetuximab benefit were also deep sequenced.	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500	155
EGAD00001004503	Whole exome sequencing data of 19 snap-frozen peritoneal mesothelioma (tumor) samples and 16 matched normal samples. Sequencing library was prepared using Ion AmpliSeq Exome RDY Library Preparation. Samples were sequenced on the Ion Proton System using the Ion PI Hi-Q Sequencing 200 Kit and Ion PI v3 chip.	Ion Torrent Proton	35
EGAD00001004504	RNA-seq data of 15 snap-frozen tissue samples of peritoneal mesothelioma. The strand-specific RNA library was prepared using TruSeq (Illumina) and paired-end sequencing was performed on an Illumina HiSeq 4000. The dataset contains paired fastq files for each of the 15 tumor samples.	Illumina HiSeq 4000	15
EGAD00001004505	49 samples from Nigeria generated for the H3Africa Chip Design Study. The dataset includes BAM, FASTQ and decompressed gVCF files.	Illumina HiSeq 2500	49
EGAD00001004506	WES files for CHEN WTPDX paper titled "Forty-Five patient-derived xenografts capture the clinical and biological heterogeneity of Wilms tumor"	Illumina HiSeq 2000	107
EGAD00001004507	RNAseq files for CHEN WTPDX RNASEQ paper titled "Forty-Five patient-derived xenografts capture the clinical and biological heterogeneity of Wilms tumor"	Illumina HiSeq 2000	88
EGAD00001004509	Ovarian cancer (OC) is a heterogeneous disease usually diagnosed at a late stage. Experimental in vitro models that faithfully capture the hallmarks and tumor heterogeneity of OC are limited and hard to establish. We present a novel protocol that enables efficient derivation and long-term expansion of OC organoids. Utilizing this protocol, we have established 56 organoid lines from 32 patients, representing the spectrum of ovarian neoplasms, including non-malignant borderline tumors, as well as mucinous, clear-cell, endometrioid, low- and high-grade serous carcinomas. OC organoids recapitulate histological and genomic features of the pertinent lesion from which they were derived, illustrating intra- and inter-patient heterogeneity, and can be genetically modified. We show that OC organoids can be used for drug screening assays and capture different tumor subtype responses to the gold standard platinum-based chemotherapy, including acquisition of chemoresistance in recurrent disease. Finally, OC organoids can be xenografted, enabling in vivo drug sensitivity assays. Taken together, this demonstrates their potential application for research and personalized medicine.	NextSeq 500	50
EGAD00001004512	High throughput sequencing dataset of antibody repertoires from naive B-cells, taken from blood samples of 100 individuals from Norway. 48 healthy controls and 52 patients with celiac disease. Sequencing was performed using a 300*2 paired-end kit by Illumina MiSeq. The sequences were processed using pRESTO. Each fastq file in the dataset is the repertoire of a single individual.	Illumina MiSeq	100
EGAD00001004513	Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Cerebral Small Vessel Disease (CSVD) Rare Disease domain	Illumina HiSeq 2000	-
EGAD00001004514	Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Hypertrophic Cardiomyopathy (HCM) Rare Disease domain	Illumina HiSeq 2000	-
EGAD00001004515	Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Intrahepatic Cholestasis of Pregnancy (ICP) Rare Disease domain	Illumina HiSeq 2000	2
EGAD00001004516	Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Neuropathic Pain Disorders (NPD) Rare Disease domain	Illumina HiSeq 2000	-
EGAD00001004517	Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Primary Membranoproliferative Glomerulonephritis (PMG) Rare Disease domain	Illumina HiSeq 2000	-
EGAD00001004518	Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Steroid Resistant Nephrotic Syndrome (SRNS) Rare Disease domain	Illumina HiSeq 2000	-
EGAD00001004519	Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Bleeding, Thrombotic and Platelet Disorders (BPD) Rare Disease domain	Illumina HiSeq 2000	1
EGAD00001004520	Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Inherited Retinal Disorders (IRD) Rare Disease domain	Illumina HiSeq 2000	-
EGAD00001004521	Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Multiple Primary Malignant Tumours (MPMT) Rare Disease domain	Illumina HiSeq 2000	-
EGAD00001004522	Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Neurological and Developmental Disorders (NDD) Rare Disease domain	Illumina HiSeq 2000	-
EGAD00001004523	Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Primary Immune Disorders (PID) Rare Disease domain	Illumina HiSeq 2000	-
EGAD00001004524	Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Stem cell and Myeloid Disorders (SMD) Rare Disease domain	Illumina HiSeq 2000	-
EGAD00001004525	Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Pulmonary Arterial Hypertension (PAH) Rare Disease domain	Illumina HiSeq 2000	-
EGAD00001004526	We set out to determine ctDNA abundance at de novo mCSPC diagnosis and whether ctDNA provides complementary clinically relevant information to a prostate biopsy. We collected and sequenced 77 plasma cell-free DNA samples from 53 newly diagnosed patients with mCSPC. Targeted sequencing was also performed on DNA from 48 diagnostic prostate tissue samples.	Illumina HiSeq 2500	178
EGAD00001004528	This dataset contains 31 pancreatic organoid samples used in the 'organoid data of pancreatic cancers ' study	HiSeq X Ten	31
EGAD00001004529	This dataset contains 53 blood samples used as controls in study EGAXXXXX and EGAXXXXX	HiSeq X Ten	53
EGAD00001004530	This dataset includes somatic small variant calling files derived from fifteen metastatic samples from cutaneous squamous cell carcinoma matched to normal blood samples. These samples were whole-genome sequenced by HiSeq X Ten and the resulting reads were mapped against the human genome (hg37) using BWA-MEM 0.7.10-r789. Somatic variant calling was then performed using strelka 1 (version 2.0.17).		13
EGAD00001004532	In this dataset there are 55 whole genome sequencing samples of epithelial ovarian carcinoma (bam files).	Illumina HiSeq 2500	55
EGAD00001004533	48 samples from Botswana generated for the H3Africa Chip Design Study. The dataset includes BAM, FASTQ and decompressed gVCF files.	Illumina HiSeq 2500	48
EGAD00001004534	Whole exome sequencing of 17 tumors from 12 different individuals with biallelic germline NTHL1 mutations from 9 different tissue types. Provided are 17 bam files which are mapped to human genome version GRCh37.	Illumina HiSeq 4000 NextSeq 500	17
EGAD00001004535	Dataset of 27 whole genome sequencing files (BAM), which cover 12 individuals out of which four suffer from COPD. From 9 individuals, there are sequencing data from blood and from lung brushings at one single site available, from another 3 there is additionally a lung brushing sequencing file from a second site available. Comparison of blood with lung brushings allows the calling of somatic mutations within a tissue.	Illumina MiSeq	27
EGAD00001004537	Primary pediatric osteosarcoma samples were collected and profiled using WGS. When possible, germline and tumor samples were collected. For some patients, multiple tumor tissues were collected and sequenced. Some of the samples were used to derive PDTX models, which were also profiled with WGS. Paired end sequencing was performed on Illumina HiSeq instruments and FASTQ files reported.	unspecified	75
EGAD00001004538	Primary pediatric osteosarcoma samples were collected and profiled using RNAseq. When possible, germline and tumor samples were collected. For some patients, multiple tumor tissues were collected and sequenced. Some of the samples were used to derive PDTX models, which were also profiled with RNAseq. Paired end sequencing was performed on Illumina HiSeq instruments and FASTQ files reported.	unspecified	30
EGAD00001004539	Whole exome sequencing (WES) libraries were prepared from 200ng of genomic DNA using the Agilent SureSelect XT Target Enrichment System for Illumina Paired-End Multiplexed Sequencing Library coupled with the Agilent SureSelect XT Human all exon v6 capture reagent. Libraries were sequenced on a NextSeq 550 sequencer using the High output 300 cycles kit generating 150bp paired end single-indexed reads. Alignment against b37 using Novoalign (version 3.02.08).	Illumina HiSeq 2000	48
EGAD00001004541	RNA-sequencing of human hepatocellular carcinoma biopsies (n=14), 44 HCC xenografts derived from 11 HCC biopsies and 3 lymphoma xenografts derived from 3 HCC biopsies. RNA-sequencing was performed using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Gold (Illumina). SR126 sequencing was performed on an Illumina HiSeq 2500 using v4 SBS chemistry according to the manufacturer’s guidelines.	Illumina HiSeq 2500	61
EGAD00001004542	The dataset includes whole exome sequencing (WES) data on 57 matched esophageal tumor-normal pairs. The Agilent Sure-Select Human All Exon V4 plus UTRs reagent was used to capture the target exons and UTRs, and an Illumina HiSeq 2000 instrument was used to sequence the target region with approximately 72-fold coverage.	Illumina HiSeq 2000	114
EGAD00001004543	RNA from CD138+ plasma cells from patients with a plasma cell dyscrasia was sequenced using the TruSeq Stranded Total RNA Ribo-zero Gold kit (Illumina). Paired-end 75bp reads were generated on a NextSeq500 or HiSeq4000 (Illumina). The dataset consists of 1 MGUS sample, 5 SMM samples, 69 newly diagnosed myeloma samples, 1 relapsed myeloma sample, 1 previously treated myeloma sample, and 4 PCL samples. Matching whole genome sequencing data are available for these samples under study EGAS00001003164.	NextSeq 500	83
EGAD00001004544	This Dataset is currently hosted by the European Nucleotide Archive. To access the data contained within the Dataset please follow the link below: https://www.ebi.ac.uk/ena/browser/view/PRJEB39323 Dataset consists of 20 snRNA-seq bam files from 10X v2. 5 samples from postmortem white matter tissue from non-neurological controls and 15 samples from different MS lesions from the white matter tissue of 4 postmortem progressive MS patients.	Illumina HiSeq 2500	20
EGAD00001004545	65 paired tumor and normal whole-genome sequencing samples from urothelial bladder carcinomas (UBC, the most common type of bladder cancer) are used to uncover the whole-genome mutational landscape of UBC. Recurrent mutations in noncoding regions affecting gene regulatory elements and structural variations leading to gene disruptions are prevalent in this type of cancer.		65
EGAD00001004547	All normal somatic cells are thought to acquire mutations. However, characterisation of the patterns and consequences of somatic mutation in normal tissues is limited. Uterine endometrium is a dynamic tissue undergoing cyclical shedding and reconstitution lined by a gland-forming epithelium. Whole genome sequencing of normal endometrial glands showed that most are clonal cell populations derived from a recent common ancestor, with mutation burdens differing from other normal cell types and many fold lower than endometrial cancers. Mutational signatures found ubiquitously account for most mutations. Many, in some women all, endometrial glands are colonised by cell clones carrying driver mutations in cancer genes, often with multiple drivers. Total and driver mutation burdens increase with age, but are also influenced by other factors, including body mass index and parity, and clones with drivers often originate during early decades of life. The somatic mutational landscapes of normal cells differ between cell types and are revealing the procession of neoplastic change leading to cancer.	HiSeq X Ten	6
EGAD00001004548	Integration of Genomic and Transcriptional Features in Pancreatic Cancer Reveals Increased Cell Cycle Progression in Metastases - RNA-Seq mapped and unmapped reads	Illumina HiSeq 2500	75
EGAD00001004550	This dataset contains Whole Exome Sequencing of 47 MSI colorectal cancers (CRCs) and paired adjacent normal mucosa	Illumina HiSeq 2000	94
EGAD00001004551	Integration of Genomic and Transcriptional Features in Pancreatic Cancer Reveals Increased Cell Cycle Progression in Metastases - WGS mapped reads		-
EGAD00001004552	10X genomics chromium single-cell RNA-sequencing of (i) patient derived triple negative breast cancer xenograft (ii) primary tumour and ascites ovarian cancer cell lines at tumour recurrence.	NextSeq 550	3
EGAD00001004553	Direct library preparation+ single-cell DNA-sequencing of (i) patient derived triple negative breast cancer xenograft (ii) primary tumour and ascites ovarian cancer cell lines at tumour recurrence.	Illumina HiSeq 2500	980
EGAD00001004554	Uveal melanoma (UM) is the most common primary intraocular malignancy in adults. Despite improvement of diagnosis and treatment of the primary tumor, there is no effective treatment of metastatic disease and approximately half of patients will die within one year or less following metastases detection. Tumor heterogeneity has been proposed as a key factor of drug resistance. However, it has been scarcely studied in UM. The present project aims to search for specific drivers of the metastatic progression, describe the genomic and transcriptomic landscape of metastatic UM, explore tumor heterogeneity and investigate its role in drug resistance. Thus whole exome sequencing and transcriptomics have been performed on constitutional, primary tumor and metastatic samples from 28 UM patients.	Illumina HiSeq 2500	110
EGAD00001004555	The aim of our project is to decipher the genomics of advanced hepatocellular carcinoma using whole exome sequencing. To this end, we aim to compare the genetic landscape of advanced hepatocellular carcinoma with that of early tumors in order to understand the mechanisms of tumor progression. This work will also help to identify new therapeutic targets potentially useful to treat patients at advanced stage. This dataset contains whole exome sequencing aligned reads for 41 tumors with matched normal samples	Illumina HiSeq 2000 Illumina HiSeq 4000	39
EGAD00001004556	In total, 186 FH+ ESCC cases were sequenced using whole-exome sequencing, then 1935 ESCC cases and 1186 geographically-matched healthy controls were sequenced using 7 Mb custom designed Roche SeqCap kit which targeted about 600 genes. The libraries were constructed and then sequenced in Illumina platform.	Illumina HiSeq 2500	3289
EGAD00001004557	The dataset includes BAM, FASTQ and decompressed gVCF files for 50 samples from Benin generated for the H3Africa Chip Design Study.	Illumina HiSeq 2500	50
EGAD00001004558	Genotyping of 43 cases of invasive GAS infection by Illumina HumanCore-24 array and Illumina Global Screening Array.		43
EGAD00001004559	WGBS files for PCGP NBL_MYCN_ATRX paper titled "MYCN Amplification and ATRX Mutations are Incompatible in Neuroblastoma"	Illumina HiSeq 2000	24
EGAD00001004561	Plasma DNA libraries were constructed from 4 mL of plasma without library enrichment, namely without PCR amplification. Paired-end massively parallel sequencing was performed	Illumina HiSeq 2000	169
EGAD00001004563	This dataset contains whole genome sequencing data from 21 primary and relapsed IDH-wt glioblastomas and matched blood controls. Tumors were sequenced at a target coverage of 150x, blood controls at 80x.	HiSeq X Ten	63
EGAD00001004564	This dataset contains strand-specific RNA sequencing data from 16 primary/relapsed sample pairs of IDH-wt glioblastomas	Illumina HiSeq 2000	32
EGAD00001004565	This dataset contains gene panel sequencing data from 43 sample pairs of primary and relapsed IDH-wt glioblastomas. The gene panel covers 50 glioma-associated genes. 14 of the sequenced sample pairs were sequenced with whole genome sequencing also and are accessible under EGAD00001004563.	Ion Torrent Proton	86
EGAD00001004566	WES files for Newman MAP3K8 melanoma paper titled "Clinical genome sequencing uncovers potentially targetable truncations and fusions of MAP3K8 in spitzoid and other melanomas"	Illumina HiSeq 2000	2
EGAD00001004567	RNASeq files for MAP3K8 melanoma paper titled "Clinical genome sequencing uncovers potentially targetable truncations and fusions of MAP3K8 in spitzoid and other melanomas"	Illumina HiSeq 2000	2
EGAD00001004568	We performed whole genome bisulfite sequencing of plasma DNA in 11 colorectal cancer patients to study the colonic DNA in plasma.	Illumina HiSeq 2500	17
EGAD00001004569	Exome sequencing of 317 rainforest hunter-gatherers (RHG) and neighbouring farmers (AGR) from Central Africa was performed based on the Nextera Rapid Capture Expanded Exome Kit (62-Mb content) with the Illumina HiSeq 2500. The population sample includes the Baka of south-eastern Cameroon and northern Gabon (wRHG), the Bantu-speaking Nzebi and Bapunu sedentary agriculturalists of Gabon (wAGR), and the BaTwa (eRHG) and BaKiga (eAGR) from Uganda. After QC filters, exomes of 300 unrelated individuals were obtained at high coverage (mean depth 68x), including 406,270 variants.	Illumina HiSeq 2500	317
EGAD00001004570	Repeated clinical malaria episodes are associated with modification of the immune system in children. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/. This dataset contains all the data available for this study on 2019-01-17.	Illumina HiSeq 2500	113
EGAD00001004571	The BLUEPRINT project is a large-scale project investigating epigenetic mechanisms involved in blood formation, in health and disease. The human variation workpackage (WP10) of the project seeks to characterize the effect of common sequence variation on the epigenome status of a cell. To do this, the project will use highly purified blood cells to minimise "experimental noise" and therefore enhance the power to discover modest effects. Two peripheral blood cell types, the CD14+CD16- monocyte (an important central orchestrator of adaptive immunity and a bridge between innate and adaptive immunity) and the CD65+CD9- neutrophilic granulocyte (the frontline cell for innate immunity) have been selected for this purpose. The two types of cells will be obtained at high purity from adult blood (AB) of 200 healthy males and females, respectively. Cells will be purified by using already validated and fully operational protocols that are based on density gradient centrifugation of the buffy coat obtained from whole blood, followed by magnetic bead-based purification using monoclonal antibodies against Cluster of Differentiation (CD) lineage-specific cell surface markers. This data set contains functional genomics data for gene expression and chromatin state.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina MiSeq	172
EGAD00001004572			29
EGAD00001004573	Patients with germline mutations in CYLD can develop hundreds of benign skin tumours called cylindromas. The development of multiple tumours within a single patient at sun-protected and sun-exposed sites, varying tumour histological patterns and grades of malignancy allow for the testing of several genetic hypotheses in relation to cutaneous carcinogenesis. By adopting the unprecedented approach of whole genome sequencing of multiple benign skin tumours within individuals in multigenerational families, we set out to study the impact of mutational diversity on models such as multistep carcinogenesis as well as non-sequential carcinogenesis. Using non-negative matrix factorisation (NMF) to discover mutational signatures, we found distinct mutational signatures in identical benign tumours (N=2 patients; n=11 tumours) at sun exposed and sun protected skin tumours within a mother and her daughter. We found recurrent mutations in epigenetic modifying genes in CCS tumours which are known to have an oncogenic dependency on Wnt signalling. We also demonstrate that cutaneous tumours that metastasize to the lung carry a UV signature, supporting the origin from the skin. Distinct malignant tumours, such as BCC and malignant spiradenocarcinoma carried unique driver mutations. These findings add new dimensions to the existing paradigms of UV-induced skin cancer and highlight the utility of studying rare disease to gain novel insights into genetic mechanisms of tumour formation.	HiSeq X Ten	13
EGAD00001004574	The dataset consists of sequenced cell free DNA (cfDNA) samples from colorectal cancer patients. The samples were sequenced on an Illumina MiSeq machine using a custom amplicon sequencing approach. These amplicons were designed to cover the most common mutation hotspots in colorectal cancer. The data include 138 cfDNA samples from 34 different patients. For each patient several samples are available derived from blood drawn at different time points during treatment. In addition the data include samples from 22 histology slides and 30 samples derived from HT29/HCT116 cell lines that were used as controls.	Illumina MiSeq	189
EGAD00001004575	Whole genome sequencing data of original/SHANK2 modified/SHANK2 knockout. Note that the SHANK2 knockout sample is a different sample from 1_0441_003. Please refer to other paper for the data.	Illumina Genome Analyzer IIx Illumina HiSeq 2500	3
EGAD00001004576	ATRT whole exome sequencing	Illumina HiSeq 2000 unspecified	32
EGAD00001004577	Metabolic reprogramming is linked to cancer cell growth and proliferation, metastasis, and therapeutic resistance in a multitude of cancers. Targeting dysregulated metabolic pathways to overcome resistance, an urgent clinical need in all relapsed/refractory cancers, remains difficult. Through genomics analysis of clinical specimens, we show that metabolic reprogramming towards oxidative phosphorylation (OXPHOS) and glutaminolysis is associated with therapeutic resistance to the Bruton’s tyrosine kinase inhibitor ibrutinib in mantle cell lymphoma (MCL), an incurable B-cell lymphoma with poor clinical outcomes. Inhibition of OXPHOS with a novel, clinically applicable small molecule, IACS-010759, which targets complex I of the mitochondrial electron transport chain, results in significant growth inhibition in vitro and in vivo in ibrutinib-resistant patient-derived cancer models. This work suggests that targeting metabolic pathways to subvert therapeutic resistance is a clinically viable approach to treat highly refractory malignancies.	Illumina HiSeq 4000	26
EGAD00001004578	Chronic liver injury predisposes to cirrhosis and hepatocellular carcinoma, but how somatic mutations accumulate in liver disease is unexplored. We sequenced whole genomes of 400 microdissections of 100-500 hepatocytes from 5 normal and 6 cirrhotic livers. Compared to normal liver, cirrhotic liver had higher mutation burden, especially structural variants, including chromothripsis. Cirrhotic nodules were oligoclonal; sometimes entirely derived from a single, recent common ancestor. Clonal expansions millimeters in diameter occurred in cirrhosis in the absence of known driver mutations. Endogenous mutational processes predominated, although signatures of polycyclic aromatic hydrocarbon and aristolochic acid exposure occurred in some samples. Up to 10-fold within-patient variation in activity of exogenous signatures existed between adjacent cirrhotic nodules, with both clone-specific and microenvironmental forces shaping this heterogeneity. Synchronous hepatocellular carcinomas drew from the same repertoire of mutational signatures as background cirrhotic liver, but with higher burden. Somatic mutations chronicle the exposures, toxicity, regeneration and clonal structure of liver tissue as it progresses from health to disease.	HiSeq X Ten	577
EGAD00001004579	WGS files for Newman MAP3K8 melanoma paper titled "Clinical genome sequencing uncovers potentially targetable truncations and fusions of MAP3K8 in spitzoid and other melanomas"	Illumina HiSeq 2000	2
EGAD00001004580	Whole Exome Sequencing on PDAC PDX1 parental sample and 12 clones.	Illumina HiSeq 2000	13
EGAD00001004581	This includes variant calls (single nucleotide variants and small insertions/deletions) from 8086 (mostly British Pakistani/British Bangladeshi) individuals from the following studies: 1. 3781 British Pakistani/British Bangladeshi adults from East London Genes and Health 2. 2791 British South Asian mothers from Born in Bradford 3. 1428 British South Asian adults from Birmingham 4. 86 individuals (mixed ancestries) from families with rare diseases, from Queen Mary University London. All of the Birmingham and most of the Born in Bradford samples were previously sequenced as part of PMID: 26940866. Mapping was done with bwa-mem and variant calling was carried out with GATK HaplotypeCaller. We removed variant sites for which the following was true: SNPs: "QD < 2.0 || FS > 30 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0" Indels: "QD < 2.0 || FS > 30 || ReadPosRankSum < -20.0"		-
EGAD00001004582	This dataset contains DNA sequencing data from 95 colorectal cancer and matched-normal samples. The dataset contains targeted deep sequencing of selected regulatory elements in 95 cancer and matched-normal samples, and data for one sample that was additionally whole-genome sequenced (cancer and matched-normal).	Illumina HiSeq 2000	192
EGAD00001004583	Whole-genome-sequencing (WGS) of human tumours has revealed distinct mutation patterns that hint at the causative origins of cancer. We examined mutational-signatures in 324 WGS of human induced pluripotent stem cells (iPSCs) following exposure to known or suspected environmental carcinogens. 79 agents were tested with or without metabolic activation at concentrations that produced measurable cytotoxicity; 41 yielded characteristic substitution mutational signatures. Some exhibit similarity with signatures found in human tumours. Additionally, 6 agents produced double-substitution signatures and 8 produced indel signatures. Investigating mutation asymmetries across genome topography reveals fully functional mismatch and transcription-coupled repair pathways in iPSCs. Primary adducts induced by environmental carcinogens can be resolved by disparate repair/replicative pathways, resulting in an assortment of signature outcomes even for a single mutagen. This compendium of experimentally-induced mutational-signatures permits further exploration of roles of environmental agents in cancer aetiology, and underscores how human stem cell DNA is directly vulnerable to environmental agents.	HiSeq X Ten	324
EGAD00001004584	RNA-seq of 24 M-CSF differentiated human peripheral monocyte-derived macrophages (MDMs) activated with short exposure (3hours) to LPS, or long exposure (24 hours) to LPS, LPS with IFNγ, IFNγ, IL-4, IL-10, and dexamethasone.	Illumina HiSeq 4000	24
EGAD00001004585		Illumina HiSeq 2500 NextSeq 500	40
EGAD00001004586	Variants and WGS data for Gardner et al. 2018 (biorxiv 471375). One VCF each for Alu, L1, and SVA. Flat text file and WGS for processed pseudogenes.	HiSeq X Ten	1
EGAD00001004588	Whole genome sequencing data of ccRCCs were used for somatic variant calling.		82
EGAD00001004589	54 WGS Ewing's sarcoma samples sequenced at The Hospital for Sick Children Toronto (Adam Shlien's lab) and published in Science (2018). Reference: Anderson et al. "Rearrangement bursts generate canonical gene fusions in bone and soft tissue tumors"	Illumina HiSeq 2500	54
EGAD00001004590	DNA-seq from plasma of 14 liver transplantation patients	Illumina HiSeq 4000	14
EGAD00001004591	TRACERx 100: RNAseq data from the first 100 TRACERx tumours (164 tumor regions from 64 patients)	Illumina HiSeq 4000	164
EGAD00001004592	Raw data used in the analysis of chromosomally integrated HHV6 genomes in parent-infant pairs, generated by means of full viral genome sequencing by SureSelect target enrichment, in the context of a larger study investigating the relationship between HHV6 and adverse pregnancy outcome.	Illumina MiSeq	24
EGAD00001004593	Precision medicine trials in glioblastoma should be conducted at tumor recurrence. However, second surgery for recurrent GBMs is not routinely performed and therefore molecular data is predominantly derived from primary samples. This study aims to establish the frequency of driver changes at tumor recurrence.	Illumina HiSeq 2500	377
EGAD00001004594	The dataset includes multi-region exome sequencing (MSeq) of four resected treatment naïve mismatch repair deficient gastro-esophageal cancers. Paired-end sequencing was performed on the Illumina HiSeq 2500 or NovaSeq 6000 with a target depth of 200X. Seven primary tumor regions along with tumor-adjacent non malignant tissue were subjected to MSeq. An additional two lymph node metastases were also included from each of two cases.	Illumina HiSeq 2500 Illumina NovaSeq 6000	35
EGAD00001004595	VALCAP files for Ma et al. (2019) Genome Biology (accepted) titled “Analysis of error profiles in deep next-generation sequencing data”	HiSeq X Ten	47
EGAD00001004596	Genome and transcriptome sequence data from a non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004597	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004598	Genome and transcriptome sequence data from a mucinous colloid carcinoma of the pancreas patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004599	Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004600	Genome and transcriptome sequence data from a non-small cell adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004601	Genome and transcriptome sequence data from a metastatic leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004602	Genome and transcriptome sequence data from a metastatic synovial sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004603	Genome and transcriptome sequence data from a metastatic squamous cell carcinoma of the cheek patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004604	Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004605	Genome and transcriptome sequence data from a metastatic choroidal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004606	Genome and transcriptome sequence data from a metastatic colorectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study	MinION PromethION	1
EGAD00001004607	Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004608	Genome and transcriptome sequence data from a clear cell carcinoma of the left ovary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004609	Genome and transcriptome sequence data from a low grade chondrosarcoma (bronchus) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004610	Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004611	Genome and transcriptome sequence data from a follicular lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004612	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004613	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004614	Genome and transcriptome sequence data from a metastatic follicular thyroid carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004615	Genome and transcriptome sequence data from a ganglioglioma of the left temporal lobe patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004616	Genome and transcriptome sequence data from a metastatic squamous cell carcinoma of the cervix patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004617	Genome and transcriptome sequence data from an anorectal gastrointestinal stromal tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004618	Genome and transcriptome sequence data from a metastatic MMMT of the endometrium patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004619	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the stomach patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004620	Genome and transcriptome sequence data from a metastatic non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004621	Genome and transcriptome sequence data from a metastatic basal cell carcinoma of frontal scalp patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004622	Genome and transcriptome sequence data from a metastatic rectosigmoid cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004623	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004624	Genome and transcriptome sequence data from a double-hit lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004625	Genome and transcriptome sequence data from a metastatic gastric cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004626	Genome and transcriptome sequence data from a metastatic non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004627	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the rectosigmoid patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004628	Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004629	Genome and transcriptome sequence data from a leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004630	Genome and transcriptome sequence data from a metastatic basal cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004631	Genome and transcriptome sequence data from a metastatic breast cancer ER+ patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004632	Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004633	Genome and transcriptome sequence data from a metastatic renal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004634	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004635	Genome and transcriptome sequence data from an extramedullary spinal ependymoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004636	Genome and transcriptome sequence data from a metastatic endometrioid/mucinous ovarian carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004637	Genome and transcriptome sequence data from a metastatic clear cell carcinoma of gynecological origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004638	Genome and transcriptome sequence data from a high grade serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004639	Genome and transcriptome sequence data from a metastatic leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004640	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the pancreas patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004641	Genome and transcriptome sequence data from a metastatic alveolar soft part sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004642	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004643	Genome and transcriptome sequence data from a lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004644	Genome and transcriptome sequence data from a metastatic lung adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004645	Genome and transcriptome sequence data from a metastatic fibrosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004646	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004647	Genome and transcriptome sequence data from a metastatic uveal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004648	Genome and transcriptome sequence data from a high grade sarcoma of the epithelioid/spindle cell patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004649	Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004650	Genome and transcriptome sequence data from a lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004651	Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004652	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the stomach patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004653	Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004654	Genome and transcriptome sequence data from a breast invasive ductal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004655	Genome and transcriptome sequence data from a locally advanced oropharyngeal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004656	Genome and transcriptome sequence data from a primary unknown patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004657	Genome and transcriptome sequence data from a metastatic ampullar carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004658	Genome and transcriptome sequence data from a metastatic osteosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004659	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004660	Genome and transcriptome sequence data from a metastatic primitive neuro-ectodermal tumor of the testicle patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004661	Genome and transcriptome sequence data from a metastatic clear cell carcinoma of the ovary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004662	Genome and transcriptome sequence data from a metastatic leiomyosarcoma of pancreas patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004663	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the pancreas patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004664	Genome and transcriptome sequence data from a neuroendocrine carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004665	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004666	Genome and transcriptome sequence data from a metastatic pancreatic neck adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004667	Genome and transcriptome sequence data from a metastatic uveal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004668	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the esophagus patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004669	Genome and transcriptome sequence data from a non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004670	Genome and transcriptome sequence data from a metastatic spindle cell sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004671	Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004672	Genome and transcriptome sequence data from a non-small cell lung cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004673	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004674	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004675	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004676	Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004677	Genome and transcriptome sequence data from a metastatic colon adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004678	Genome and transcriptome sequence data from a metastatic thymic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004679	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004680	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004681	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004682	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004683	Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004684	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the GE junction patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004685	Genome and transcriptome sequence data from a metastatic rectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004686	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004687	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004688	Genome and transcriptome sequence data from a metastatic breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004689	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004690	Genome and transcriptome sequence data from an angiosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004691	Genome and transcriptome sequence data from a metastatic transverse colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004692	Genome and transcriptome sequence data from a metastatic non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004693	Genome and transcriptome sequence data from a metastatic carcinoma to paraspinal mass with primary unknown patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004694	Genome and transcriptome sequence data from a metastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004695	Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study	MinION PromethION	1
EGAD00001004696	Genome and transcriptome sequence data from a peripheral T-cell lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004697	Genome and transcriptome sequence data from a low-grade serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004698	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004699	Genome and transcriptome sequence data from a lung adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004700	Genome and transcriptome sequence data from a metastatic lung adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004701	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004702	Genome and transcriptome sequence data from a high grade serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004703	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004704	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004705	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004706	Genome and transcriptome sequence data from an adenocarcinoma of the liver patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004707	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004708	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004709	Genome and transcriptome sequence data from a metastatic high-grade adenocarcinoma of the fallopian tubes patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004710	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004711	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004713	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004714	Genome and transcriptome sequence data from a metastatic non-small cell lung adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004715	Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004716	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004717	Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004718	Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004719	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1340 samples; filetype=bam	Illumina HiSeq 2500	1340
EGAD00001004720	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 2057 samples; filetype=bam	Illumina HiSeq 2500	2057
EGAD00001004721	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1970 samples; filetype=bam	Illumina HiSeq 2500	1970
EGAD00001004722	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 2091 samples; filetype=bam	Illumina HiSeq 2500	2091
EGAD00001004723	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1267 samples; filetype=bam	Illumina HiSeq 2500	1267
EGAD00001004724	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 230 samples; filetype=bam	Illumina HiSeq 2500	230
EGAD00001004725	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 232 samples; filetype=bam	Illumina HiSeq 2500	232
EGAD00001004726	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 239 samples; filetype=bam	Illumina HiSeq 2500	239
EGAD00001004727	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 692 samples; filetype=bam	Illumina HiSeq 2500	692
EGAD00001004728	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 612 samples; filetype=bam	Illumina HiSeq 2500	612
EGAD00001004729	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1700 samples; filetype=bam	Illumina HiSeq 2500	1700
EGAD00001004730	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 628 samples; filetype=bam	Illumina HiSeq 2500	628
EGAD00001004731	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 596 samples; filetype=bam	Illumina HiSeq 2500	596
EGAD00001004732	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1735 samples; filetype=bam	Illumina HiSeq 2500	1735
EGAD00001004733	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 585 samples; filetype=bam	NextSeq 550	585
EGAD00001004734	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 766 samples; filetype=bam	NextSeq 550	766
EGAD00001004735	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 2055 samples; filetype=bam	Illumina HiSeq 2500	2055
EGAD00001004736	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 620 samples; filetype=bam	Illumina HiSeq 2500	620
EGAD00001004737	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 624 samples; filetype=bam	NextSeq 550	624
EGAD00001004738	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 481 samples; filetype=bam	NextSeq 550	481
EGAD00001004739	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 378 samples; filetype=bam	Illumina HiSeq 2500	378
EGAD00001004740	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 735 samples; filetype=bam	Illumina HiSeq 2500	735
EGAD00001004741	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 718 samples; filetype=bam	Illumina HiSeq 2500	718
EGAD00001004742	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 493 samples; filetype=bam	NextSeq 550	493
EGAD00001004743	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1222 samples; filetype=bam	Illumina HiSeq 2500	1222
EGAD00001004744	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 522 samples; filetype=bam	Illumina HiSeq 2500	522
EGAD00001004745	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 488 samples; filetype=bam	Illumina HiSeq 2500	488
EGAD00001004746	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 509 samples; filetype=bam	NextSeq 550	509
EGAD00001004747	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 604 samples; filetype=bam	NextSeq 550	604
EGAD00001004748	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 626 samples; filetype=bam	NextSeq 550	626
EGAD00001004749	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 635 samples; filetype=bam	Illumina HiSeq 2500	635
EGAD00001004750	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1522 samples; filetype=bam	HiSeq X Five	1522
EGAD00001004751	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 465 samples; filetype=bam	NextSeq 550	465
EGAD00001004752	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 606 samples; filetype=bam	Illumina HiSeq 2500	606
EGAD00001004753	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 615 samples; filetype=bam	NextSeq 550	615
EGAD00001004754	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 636 samples; filetype=bam	NextSeq 550	636
EGAD00001004755	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 968 samples; filetype=bam	HiSeq X Five	968
EGAD00001004756	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 480 samples; filetype=bam	NextSeq 550	480
EGAD00001004757	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 561 samples; filetype=bam	NextSeq 550	561
EGAD00001004758	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 844 samples; filetype=bam	HiSeq X Five	844
EGAD00001004759	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 928 samples; filetype=bam	HiSeq X Five	928
EGAD00001004760	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 635 samples	HiSeq X Five	635
EGAD00001004761	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1072 samples	HiSeq X Five	1072
EGAD00001004762	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1436 samples; filetype=bam	HiSeq X Five	1436
EGAD00001004763	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 589 samples; filetype=bam	HiSeq X Five	589
EGAD00001004764	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 656 samples; filetype=bam	HiSeq X Five	656
EGAD00001004765	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 648 samples; filetype=bam	HiSeq X Five	648
EGAD00001004766	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 375 samples; filetype=bam	HiSeq X Five	375
EGAD00001004767	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 755 samples; filetype=bam	HiSeq X Five	755
EGAD00001004768	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 492 samples; filetype=bam	HiSeq X Five	492
EGAD00001004769	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 531 samples; filetype=bam	HiSeq X Five	531
EGAD00001004770	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 1222 samples; filetype=bam	HiSeq X Five	1063
EGAD00001004771	Transposase-based amplification-free single-cell genome direct library preparation in nanowell chips; 522 samples; filetype=bam	HiSeq X Five Illumina HiSeq 2500	742
EGAD00001004772	This dataset contains all the .bam files used for the study.	Illumina NovaSeq 6000	1238
EGAD00001004773	10 bams of WGS data from HiSeqXTen platform; 10 bams of RNA-seq data from HiSeq2500 platform; 7 bams of TruSeq Methyl Capture EPIC sequencing data from HiSeq4000 platform		20
EGAD00001004774	We investigated the somatic genetic basis of Wilms’ tumour and found complex phylogenetic relations between tumours.	HiSeq X Ten	203
EGAD00001004775	Data supporting: "Patient-specific detection of cancer genes reveals recurrently perturbed processes in esophageal adenocarcinoma." Mourikis et al. WGS (BAM files) 521 samples	Illumina HiSeq 2000	1
EGAD00001004776	Data supporting: "Patient-specific detection of cancer genes reveals recurrently perturbed processes in esophageal adenocarcinoma." Mourikis et al. RNAseq (BAM files) 137 samples	Illumina HiSeq 2000	1
EGAD00001004777	RNA sequencing of peripheral immune cells from patients +/- an IBD risk variant. Peripheral immune cells +/- in vitro test compound treatment. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2019-02-15.	Illumina HiSeq 2000 Illumina HiSeq 2500	71
EGAD00001004778	Raw reads from single-cell RNA-sequencing of peripheral blood of five TET2 mutation carriers as well as three non-carrier family members. Single-cells were captured into 10x barcoded gel beads and RNA-sequencing library preparation was done using Chromium Single Cell 3' v2 chemistry (10x Genomics, Pleasanton, CA, USA). Sequencing was performed as recommended with 98bp length of read 2 using HiSeq4000 sequencer.	Illumina HiSeq 4000	8
EGAD00001004779	Raw reads from whole-genome bisulfite sequencing. Whole-genome bisulfite sequencing library preparations and Illumina sequencing of DNA samples from TET2 mutation carriers (Ly9, Ly11, Ly14, Id1) and their age-matched controls (Ly8, Ly10, Ly13, Id2, Id3) was done as a service at BGI (BGI Tech Solutions Co., Ltd., China). Bisulfite treatment was done with EZ DNA Methylation-Gold Kit (Zymo Research, CA, USA) for 300-400bp size-range fragments with methylated adapters in 5' and 3' ends. Sequencing was done with the HiSeq X-Ten platform using paired-end 150 base-pair read length.	HiSeq X Ten	9
EGAD00001004780	Bam files from deep exome sequencing of blood DNA samples from five TET2 mutation carriers (Ly1, Ly2, Ly9, Ly11, Ly14) and three wild-type family members (Ly8, Ly10, Ly13) extracted at multiple time points. Library preparations were performed with SeqCap EZ Exome v3 (Roche, Switzerland) using six different index primers per sample for which paired-end Illumina sequencing was done with 75bp read length and HiSeq4000 sequencer. After alignment (bwa version 0.7.12), base recalibration (GATK 3.5), realignment around indels (GATK 3.5) and duplicate removal (MarkDuplicates; Picard Tools version 1.79), data from libraries with six different indexes were merged.	Illumina HiSeq 4000	15
EGAD00001004781	Raw reads from ChIP- [Anti-Histone H3 (acetyl K27) (Abcam, ab4729)] and input sequencing of EBV transformed lymphoblastoid cells from three carriers of TET2 mutation (Ly9, Ly11 and Ly14) and two wild-type (Ly8 and Ly10) family members using Illumina HiSeq Rapid paired-end 60 bp sequencing.	Illumina HiSeq 2500	5
EGAD00001004782	Bam file from exome sequencing of FFPE sample from Ly3 using SeqCap EZ Human Exome Library (Roche Nimblegen, Inc., WI, USA) and Illumina HiSeq2000 sequencer.	Illumina HiSeq 2000	1
EGAD00001004783	Variant calls from whole-genome sequencing of Ly1-07 using Complete Genomics paired-end sequencing service.	Complete Genomics	1
EGAD00001004784	Raw reads from RNA-sequencing of monocyte-derived macrophages from three individuals with heterozygous TET2 loss (Ly9, Ly11, Ly14) and two wild-type controls (Ly8 and an unrelated control). Libraries were prepared using ScriptSeq RNA-Seq Library Preparation Kit and Illumina sequenced with paired-end 75bp reads.	Illumina HiSeq 2000	15
EGAD00001004785	Raw reads from targeted bisulfite sequencing. The SureSelect Methyl-Seq target enrichment system (Agilent Technologies, Inc., CA, USA) was used to prepare bisulfite sequencing libraries from blood DNA samples of lymphoma patients (Ly1, Ly2), healthy family members (Ly8, Ly9, Ly10, Ly11, Ly12, Ly13 and Ly14), baseline controls (Control1-5), DNMT3A mutation carriers (Id5, Id7, Id9, Id11) and their age-matched controls (Id6, Id8, Id10, Id12). In addition, blood DNA sample of a patient (HLRCC_N7) with germline fumarate hydratase (FH) mutation is included. Illumina paired-end sequencing for targeted libraries from Ly1, Ly2, Ly8, Ly9, Ly10 and Ly11 was done at Karolinska Institutet using 100 base-pair read length and the HiSeq2000 platform. Illumina paired-end sequencing for targeted libraries from Ly12, Ly13, Ly14, and DNMT3A mutation carriers and their age-matched controls was done as a service at BGI (BGI Tech Solutions Co., Ltd., China) using 126 base-pair read length and the HiSeq2500 platform.	Illumina HiSeq 2000 Illumina HiSeq 2500	25
EGAD00001004786	Whole genome sequencing (WGS) detects all mutations in a cancer. “Mutational signatures” are patterns of mutations that report the DNA damage and subsequent DNA repair processes that have occurred in cancers. We present a patient with Xeroderma Pigmentosum who developed metastatic angiosarcoma, unresponsive to all lines of sarcoma therapy. Primary tumour WGS revealed a hypermutated tumour, including clonal ultraviolet light-induced mutational patterns (Signature 7) and subclonal signatures of activating mutations of DNA Polymerase-epsilon (POLE) (Signature 10). These signatures are associated with response to immune-checkpoint blockade. Immunohistochemistry confirmed high PD-L1 expression in metastatic deposits. The patient was commenced on anti-PD-L1 therapy and has responded.	HiSeq X Ten	2
EGAD00001004787	The study includes NGS-based methylC-capture sequencing (MCC-Seq) on 199 visceral adipose tissue and 206 whole-blood DNA samples derived from obese individuals (BMI >40 kg m-2) in the IUCPQ cohort. We generated 100bp paired-end reads using the Illumina HiSeq2000 or 2500 systems.	Illumina HiSeq 2500	345
EGAD00001004788	This dataset contains whole-transcriptome sequencing data of 113 myeloproliferative neoplasms (MPN) patients and 15 controls. Patients were diagnosed with essential thrombocythemia, polycythemia vera, primary myelofibrosis, and secondary acute myeloid leukemia. The data were pooled from 5 different sequencing experiments as indicated using an Illumina HiSeq2000 machine. All samples were sequenced paired-end. Sequenced samples were processed with custom workflows for discovery of fusion genes, SNVs and Indels calling, and identification of splicing abnormalities.	Illumina HiSeq 2000	145
EGAD00001004790	This dataset contains the imputed genotypes for the gencord samples. Genotyping was done using Illumina OMNI2.5M. Imputation was done using SHAPEIT2/IMPUTE2 with 1000 genomes project phase 3 reference panel.		251
EGAD00001004791	The dataset contains RNA-seq data of 96 EOPC patients and 9 controls. For some patients multiple tissue samples were sequenced ("multi-area" samples). The RNA extraction and sequencing protocol was earlier described in Weischenfeldt et al, Cancer Cell, 2013.	Illumina HiSeq 2000	1
EGAD00001004792	1. Ultra-deep exome sequencing data, Illumina paired-end reads, fastq, 190 samples 2. WES data, Illumina paired-end reads, fastq, 120 samples 3. RNA-seq data, Illumina paired-end reads, fastq, 20 samples	Illumina HiSeq 2000 unspecified	330
EGAD00001004793	This dataset includes the whole-genome sequencing data from a study entitled "Tracing Oncogene Rearrangements in the Mutational History of Lung Adenocarcinoma". Whole-genome sequencing libraries were generated by PCR-free methods, and sequencing was performed on HiSeq X Ten machines. PCR duplicate-marked, indel-realigned, and base-recalibrated BAM files are provided in our dataset.	HiSeq X Ten	98
EGAD00001004794	This dataset includes the RNA-seq data from a study entitled "Tracing Oncogene Rearrangements in the Mutational History of Lung Adenocarcinoma". PolyA tails were captured by Oligo-dT beads, and sequencing was performed on HiSeq 2500 machines. Paired-end FASTQ files are provided in our dataset.	Illumina HiSeq 2500	34
EGAD00001004795	Subcutaneous panniculitis-like T-cell lymphoma (SPTCL) is a rare subtype of peripheral T-cell lymphoma affecting younger cases and associated with hemophagocytic lymphohistiocytosis. To clarify the molecular pathogenesis of SPTCL, we analyzed paired tumor and germline DNAs from 13 patients by whole exome sequencing.	Illumina HiSeq 2500	26
EGAD00001004796	The TARGET Study consists of 200 BAM files, for the 100 patients discussed in the publication. Each patient has a normal control BAM file and a ctDNA BAM file.	NextSeq 500	200
EGAD00001004797	Twenty-seven Tibetan samples from China were whole-genome sequenced to investigate high-altitude adaptation, population genetics and demographic history.	HiSeq X Ten	43
EGAD00001004798	TRACERx 100: RRBS data from a subset of the first 100 TRACERx tumours	Illumina HiSeq 2500	98
EGAD00001004799	WES sequence data from 15 samples and RNA-seq sequence data from 19 samples. All sequence data are raw paired-end reads in fastq format, sequenced on an Illumina platform.	unspecified	34
EGAD00001004800	Stage-1 meta-analysis with GC correction		5
EGAD00001004802	ETMR RIPSeq	Illumina HiSeq 2000	2
EGAD00001004803	RNA was prepared using the Illumina TruSeq RNA sample preparation kit for poly-adenylated mRNA, with an average of 97.64 million reads per sample	Illumina HiSeq 2000	5
EGAD00001004805	ATAC-seq library amplification was performed on 5 ETMR tumour samples using the NEBNext High Fidelity 2xPCR Master Mix (New England Biolabs, Cat#M0541S) according to the manufacturer’s protocol. ATAC-seq libraries were sequenced using single-end 50 bp reads on the Illumina HiSeq 2000 platform. ATAC-seq peak analysis was conducted as published previously (Torchia et al. 2016).	Illumina HiSeq 2000	8
EGAD00001004806	Exome sequencing data for two sibs with juvenile idiopathic arthritis and one unaffected sib	Illumina Genome Analyzer II	3
EGAD00001004808	Exome sequencing performed on DNA from 94 anterior ischemic stroke cases	Illumina HiSeq 2500	94
EGAD00001004809	H3K27Ac ChIP-seq DNA libraries for 5 ETMR samples were prepared using NEBNext ChIP-seq Illumina Sequencing library preparation kit. ChIP-seq libraries were sequenced using single-end 50 bp reads on the Illumina HiSeq 2000 platform. ChIP-seq peak analysis was conducted as published previously (Torchia et al. 2016).	Illumina HiSeq 2000	10
EGAD00001004810	This dataset is related to publications Costa et al. Cancer Cell 2018 and Givel et al. Nat. Commun. 2018 which describe the identification of 4 Cancer Associated Fibroblasts (CAF) in breast and ovarian cancer. This dataset contains transcriptomic profiles obtained by RNA-Seq of 34 CAF-S3 samples from breast and ovarian Tumors.	Illumina HiSeq 2500	34
EGAD00001004811	Whole-genome sequencing of 168 GCs identifies hot-spot tandem duplications	HiSeq X Ten	1
EGAD00001004812	40 samples; filetype=bam	Illumina HiSeq 2500	40
EGAD00001004813	48 samples; filetype=bam	Illumina HiSeq 2500	48
EGAD00001004814	50 samples; filetype=bam	Illumina HiSeq 2500	50
EGAD00001004815	45 samples; filetype=bam	Illumina HiSeq 2500	45
EGAD00001004816	61 samples; filetype=bam	Illumina HiSeq 2500	61
EGAD00001004817	68 samples; filetype=bam	Illumina HiSeq 2500	68
EGAD00001004818	71 samples; filetype=bam	Illumina HiSeq 2500	71
EGAD00001004819	48 samples; filetype=bam	Illumina HiSeq 2500	48
EGAD00001004820	52 samples; filetype=bam	Illumina HiSeq 2500	52
EGAD00001004821	60 samples; filetype=bam	Illumina HiSeq 2500	60
EGAD00001004822	55 samples; filetype=bam	Illumina HiSeq 2500	55
EGAD00001004823	This dataset contains 26 mapped bam files. The samples were generated with 3 different protocols for deriving pancreatic progenitors from hPSC. Three parallel differentiations were performed, all done in a hPSC NKX6.1-GFP reporter line. For each protocol there are three cellular populations: total (presort), GFP+ and GFP- . In summary: 3 differentiations x 3 protocols x 3 cellular populations. We prepared Smart-Seq2 RNA-seq libraries for the 27 samples, 1 sample failed library preparation and is not therefore included in this dataset.	Illumina HiSeq 4000	26
EGAD00001004824	This dataset contains 27 mapped bam files. The samples were generated with 3 different protocols for deriving pancreatic progenitors from hPSC. Three parallel differentiations were performed, all done in a hPSC NKX6.1-GFP reporter line. For each protocol there are three cellular populations: total (presort), GFP+ and GFP- . In summary: 3 differentiations x 3 protocols x 3 cellular populations. We prepared ATAC-seq libraries for the 27 samples, and sequenced them on Illumina HiSeq4000.	Illumina HiSeq 4000	27
EGAD00001004825	Case series of the rare tumor entity chordoma. 9 cases sequenced with Whole Exome Sequencing (WES) and 2 cases sequenced with Whole Genome Sequencing (WGS) were recruited from the personalized oncology program NCT-MASTER/DKTK-MASTER at the German Cancer Research Center. One of the WES patients was re-sequenced at a later time point when he relapsed, this resequencing was done by WGS. Therefore there are 11 patients, one of which with two samples, all of which were sequenced with matched normal controls, amounting to a total number of 24 NGS samples.	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000	24
EGAD00001004826	This dataset consists of Illumina HiSeq 2000 Whole Exome Sequencing of 84 colorectal samples: 42 Tumor tissue samples and 42 Normal tissue samples (adjacent to tumor sites). 2x75bp paired-end sequencing reads: 2 fastq files per sample	Illumina HiSeq 2000	84
EGAD00001004827	This dataset consists of SOLiD small RNA-seq of 250 colorectal samples: 100 tumor tissue samples, 100 normal tissue samples (adjacent to tumor sites) and 50 matched control samples of healthy individuals. CSfasta and qual files converted to single fastq files prior to uploading.	AB SOLiD System	250
EGAD00001004828	This dataset maps gene expression regulation in human primary regulatory CD4+ T cells (Tregs). It includes whole genome sequence data for ChM-seq (118 H3K4me3, 118 H3K27ac and 6 inputs). The final quality filtered set included 91 individuals with H3K27ac ChM-seq and 88 with H3K4me3 ChM-seq.	Illumina HiSeq 2500 Illumina MiSeq	242
EGAD00001004829	This dataset includes whole genome sequence data for ATAC-seq (42 samples) of human stimulated and cultured CD4+ Treg cells.	Illumina HiSeq 2500	49
EGAD00001004830	This dataset maps gene expression regulation in human primary regulatory CD4+ T cells (Tregs). It includes whole transcriptome data for 141 samples. The final quality filtered set included 123 individuals with RNA-seq data.	Illumina HiSeq 2500	141
EGAD00001004831	We isolated T cells and monocytes from healthy platelet donors and cultured them in resting and stimulated conditions with addition of a range of cytokines. We performed ATAC sequencing to assess the chromatin accessibility in cells treated with different cytokines. These cellular profiles were used to map risk variants to the cytokine-induced cell states relevant for autoimmune diseases. This dataset contains all the data available for this study on 2019-03-11.	Illumina HiSeq 2500 Illumina MiSeq	183
EGAD00001004832	The dataset includes whole genome sequencing (WGS) data on ten matched esophageal tumor-normal pairs. WGS was performed by the cancer sequencing service of CG with an average read coverage of approximately 50-fold. Illumina HiSeq2000 instrument was used to perform the sequencing.	Illumina HiSeq 2000	18
EGAD00001004833	Dataset consisting of three WGS BAM files, representing the two dual lung metastases with matched germline control, for a 37 year old female patient, with primary adrenocortical carcinoma. Work conducted at Garvan Institute of Medical Research, Sydney, Australia.	HiSeq X Ten	3
EGAD00001004834	This dataset includes cram files from 3,001 samples. These cram files include all read pairs where at least one of the reads aligns within 1kb of the C9orf72 repeat expansion. Additionally, these cram files also contain reads that are aligned to any of 29 pre-determined off target locations where the aligners are known to mis-align reads associated with this repeat expansion. These samples were sequenced using a combination of 2x100bp reads on an Illumina HiSeq2000 and 2x150bp reads on an Illumina HiSeqX sequencer and aligned using the Isaac aligner.	HiSeq X Ten Illumina HiSeq 2000	3001
EGAD00001004836	ATRT ChIPSeq H3K27ac	Illumina HiSeq 2500	3
EGAD00001004837	ATRT RNASeq	Illumina HiSeq 2500	4
EGAD00001004838	We have sequenced four samples to identify variants in KMT2A gene. Three samples were sequenced with WGS, one was sequenced by WES (Patient2).	HiSeq X Five Illumina HiSeq 2500	4
EGAD00001004839	PAGE Dataset Apr 2018 (Ref: PAGE Lancet 2019)		1812
EGAD00001004840	PAGE Dataset Apr 2018 (Ref: PAGE Lancet 2019)		1813
EGAD00001004841	PAGE Dataset Apr 2018 (Ref: PAGE Lancet 2019)		610
EGAD00001004842	PAGE2 Dataset Nov 2017 (Ref: PAGE2 GIM 2018)		81
EGAD00001004843	PAGE2 Dataset Nov 2017 (Ref: PAGE2 GIM 2018)		81
EGAD00001004844	PAGE2 Dataset Nov 2017 (Ref: PAGE2 GIM 2018)		27
EGAD00001004845	Massively-parallel DNA sequencing of 113 advanced thyroid cancers and Massively-parallel RNA sequencing of 25 advanced thyroid cancers	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500	187
EGAD00001004846	Series of 56 paired presentation, relapse and control samples from newly diagnosed, uniformly treated myeloma patients. Depth of treatment response and maintenance allocation (active observation or lenalidomide) were determined for all. All samples underwent whole exome sequencing with additional baits to cover the myc and immunoglobulin loci. There are 168 (56 presentation, 56 relapse and 56 control) samples in this study. 131 are available as part of this dataset. The remaining 37 are available with dataset accession id EGAD00001001358.	Illumina HiSeq 2500	131
EGAD00001004847	ATRT ATACSeq	Illumina HiSeq 2000	18
EGAD00001004848	Colorectal cancer panel sequencing of 100 adenoma and carcinoma samples. Cancer Hot Spot panel sequencing of 7 samples from one polyp.	Ion Torrent PGM	100
EGAD00001004849	Shotgun sequencing data from subjects at baseline and after 4 weeks of daily doses of low, medium or high CFU Eubacterium hallii L2-7. The strain contained in the drink "Ela" was also sequenced.	Illumina HiSeq 4000	53
EGAD00001004850	The dataset contains RNAseq data from 5 subsets of NK cells isolated from human lung: 1) CD69+CD49a+CD103+CD16-CD56bright NK cells 2) CD69+CD49a+CD103-CD16-CD56bright NK cells 3) CD69+CD49a-CD103-CD16-CD56bright NK cells 4) CD69-CD49a-CD103-CD16-CD56bright NK cells 5) CD56dimCD16+NKG2A+CD57- NK cells The dataset contains paired data for the subsets from 2 donors with 2 biological replicates/donor and subset.	Illumina HiSeq 2500	19
EGAD00001004851	Sequencing data from primary mucinous ovarian carcinomas, benign and borderline mucinous tumours and extra-ovarian mucinous metastases.	HiSeq X Ten Illumina HiSeq 2000	130
EGAD00001004852	We isolated T cells and monocytes from healthy platelet donors and cultured them in resting and stimulated conditions with addition of a range of cytokines. We performed K27Ac ChM sequencing to assess the chromatin activity in cells treated with different cytokines. These cellular profiles were used to map risk variants to the cytokine-induced cell states relevant for autoimmune diseases. This dataset contains all the data available for this study on 2019-03-19.	Illumina HiSeq 2500 Illumina MiSeq	192
EGAD00001004853	Genentech gallbladder cancer study - exome	Illumina HiSeq 2500	392
EGAD00001004854	Genentech gallbladder cancer study - RNA-seq	Illumina HiSeq 2500	120
EGAD00001004855	Genentech gallbladder cancer study - whole genome sequencing	Illumina HiSeq 2500	361
EGAD00001004856	Paired end Illumina whole exome sequencing of 9 GBM trios (blood, primary and recurrent tumour).	Illumina HiSeq 2500	27
EGAD00001004857	Paired-end whole exome sequencing of 50 TNBC breast cancer metastasis samples and matched normal samples obtained from 50 unique patients assayed at study baseline (directly after patient randomization). The included raw sequencing data (fastq-format) were generated using Illumina HiSeq2500 instruments.	Illumina HiSeq 2500	65
EGAD00001004858	Transcriptome sequencing of 97 matched TNBC breast cancer metastasis samples obtained from 50 unique patients assayed at two timepoints: at baseline (directly after patient randomization), and post-induction treatment or control waiting period. The included raw transcriptome sequencing data (fastq-format) were generated using Illumina HiSeq2500 instruments.	Illumina HiSeq 2500	97
EGAD00001004859	Set of 133 bam files from patients affected with Lupus. BAM alignments for exonic variants present in 76 Lupus-related genes. VCF file describing the variants.	Illumina HiSeq 2000	133
EGAD00001004860	Placental biopsies were collected within 30 minutes of birth and flash frozen in RNAlater (ThermoFisher). For each biopsy, total placental RNA was extracted from approximately 5 mg of tissue using the “mirVana miRNA Isolation Kit” (Ambion) followed by DNase treatment (“DNA-free DNA Removal Kit”, Ambion). RNA quality was assessed with the Agilent Bioanalyzer and all the samples with RIN values ≥ 7.0 were used in the downstream experiments. Total RNA-libraries were prepared from 300-500 ng of total placental RNA with the “TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Human/Mouse/Rat” (Illumina). Small RNA-libraries were prepared from 150 ng of total placental RNA with the “NEBNext Multiplex Small RNA Library Prep Kit for Illumina” (New England Biolabs) and concentrated using the “QIAquick PCR purification kit” (Qiagen). Paired libraries were combined and size selected using the Pippin Prep and 3% Agarose Gel Cassette with marker F (Sage Science), pooled and sequenced (single-end, 50 bp) using a Single End V4 cluster kit and HiSeq4000 instrument.	Illumina HiSeq 4000	1
EGAD00001004861	Somatic mutation frequencies in patients with therapy-related myeloid neoplasms (129 patients, 181 samples including bone marrow, mesenchymal stromal cells and hair DNA) and primary myelodysplastic syndrome (108 patients, 215 samples including bone marrow, mesenchymal stromal cells and hair DNA) are assessed by deep sequencing of selected genes, using a Fluidigm Access Array, a Nimblegen capture panel and an Ion AmpliSeq panel. The dataset consists of paired fastq files obtained by either HiSeq (2x101bp) or NextSeq (2x150bp) Illumina sequencing. The mutational burden is found to be similar in both; however, the distribution of variants is different. Correlation of the mutational spectrum with prognosis is also observed.	Illumina HiSeq 2000 NextSeq 500	396
EGAD00001004862	Glioblastoma multiforme (GBM) is clinically highly aggressive as a result of evolutionary dynamics induced by cross-talk between cancer cells and a heterogeneous group of immune cells in the tumor microenvironment. The brain harbors limited numbers of immune cells with few lymphocytes and macrophages; thus, innate‐like lymphocytes, such as γδ T cells, have important roles in antitumor immunity. Here, we characterized GBM‐infiltrating γδ T cells, which may have roles in regulating the GBM tumor microenvironment and cancer cell gene expression. V(D)J repertoires of tumor‐infiltrating and blood‐circulating γδ T cells from four patients were analyzed by next-generation sequencing-based T-cell receptor (TCR) sequencing in addition to mutation and immune profiles in four GBM cases. In all tumor tissues, abundant innate and effector/memory lymphocytes were detected, accompanied by large numbers of tumor‐associated macrophages and closely located tumor‐infiltrating γδ T cells, which appear to have anti-tumor activity. The immune-related gene expression analysis using the TCGA database showed that the extent of signature gene expression of γδ T cells was more strongly associated with that of cytotoxic T and Th1 cells and M1 macrophages than with that of Th2 cells and M2 macrophages. Although the most abundant γδ T cells were Vγ9Vδ2 T cells in both tumor tissues and blood, the repertoire of intratumoral Vγ9Vδ2 T cells was distinct from that of peripheral blood Vγ9Vδ2 T cells and was dominated by Vγ9Jγ2 sequences, not by canonical Vγ9JγP sequences that are most commonly found in blood γδ T cells. Collectively, unique GBM‐specific TCR clonotypes were identified by comparing TCR repertoires of peripheral blood and intra‐tumoral γδ T cells. These findings will be helpful for the elucidation of tumor-specific antigens and development of anticancer immunotherapies using tumor-infiltrating γδ T cells.	Illumina HiSeq 2500	18
EGAD00001004863	This dataset contains Whole Genome Sequencing, RNA-sequencing and ATAC-sequencing data obtained from PBMCs derived from blood samples of one patient with complex genomic rearrangements and the biological parents. The patient has multiple congenital anomalies and delayed development. Data access is closed.	HiSeq X Ten NextSeq 500	9
EGAD00001004864	This dataset contains Whole Genome Sequencing and, if available, RNA-sequencing and/or ATAC-sequencing data obtained from PBMCs derived from blood samples of two patients with intellectual disability and/or multiple congenital anomalies and eight parents included in the University Medical Center Utrecht (The Netherlands). Data access is closed.	HiSeq X Ten NextSeq 500	15
EGAD00001004865	This dataset contains Whole Genome Sequencing and, if available, RNA-sequencing and/or ATAC-sequencing data obtained from PBMCs derived from blood samples of 34 patients with intellectual disability and/or multiple congenital anomalies and their biological parents (58) included in the University Medical Center Utrecht (The Netherlands).	HiSeq X Ten	15
EGAD00001004866	This dataset contains Whole Genome Sequencing and, if available, RNA-sequencing and/or ATAC-sequencing data of 17 lymphoblastoid cell lines derived from patients with complex genomic rearrangements. Patients have phenotypes in category of intellectual disability and/or multiple congenital anomalies. Data access is closed.	HiSeq X Ten	15
EGAD00001004867	This dataset contains all the data available for this study on 2019-03-26.	HiSeq X Ten	60
EGAD00001004869	Illumina platform sequencing data for matched tumour-normal DNA samples from 77 melanoma patients participating in a study investigating response to immunotherapy. Selected cases also have RNA sequencing of the tumour.		-
EGAD00001004871	ChIP-seq data for Lymphoblastoid Cell Lines (LCL) and Fibroblasts (FIB) from the Gencord Cohort: 160 LCLs assayed for H3K27ac, H3K4me1 and H3K4me3; 78 FIB assayed for H3K4me3; and 79 FIB assayed for H3K27ac and H3K4me1. This dataset was generated as part of the following study: Delaneau et al (2019). Chromatin 3D interactions mediate genetic effects on gene expression.	Illumina HiSeq 2000	239
EGAD00001004872	RNA-seq data for 168 Lymphoblastoid Cell Lines (LCL) and 78 Fibroblasts (FIB) from the Gencord Cohort. This dataset was generated as part of the following study: Delaneau et al (2019). Chromatin 3D interactions mediate genetic effects on gene expression.	Illumina HiSeq 2000	246
EGAD00001004873	Whole exome sequencing BAM files of 50 metastatic solid tumour and matched blood germline DNA prior to pembrolizumab treatment.	Illumina HiSeq 2500	100
EGAD00001004874	This dataset consists of amplicon sequencing of fibrocystic breast tissues, subsequent cancer tissue and germline control of 17 patients. The target genes include all exons of 27 protein-coding genes and 2 non-coding genes, as well as mutation hotspots in three cancer genes, frequently mutated in breast cancer. Ion AmpliSeq libraries were generated and sequenced on the Ion S5 XL system.	Ion Torrent S5 XL	51
EGAD00001004875	Aligned RNA-seq sequences in this dataset are from the Proteogenomic Landscape of Curable Prostate Cancer study		1
EGAD00001004876	In this project we have sequenced the exome of skin moles (melanocytic naevi) and also normal skin from young and old people. We are interested in looking at the clonality of these lesions and the burden of UV mutations. This dataset contains all the data available for this study on 2019-04-01.	Illumina HiSeq 2000	14
EGAD00001004877	Targeted analysis of chondrosarcoma cancer genes. This dataset contains all the data available for this study on 2019-04-01.	Illumina HiSeq 2500	445
EGAD00001004878	R&D project to develop low input library construction methods. This dataset contains all the data available for this study on 2019-04-01.	HiSeq X Ten Illumina HiSeq 2500	-
EGAD00001004879	Evolution of the cancer epigenome in myeloproliferative neoplasms. This dataset contains all the data available for this study on 2019-04-01.	HiSeq X Ten	17
EGAD00001004880	We will sequence at 15X coverage the genomes of 1536 IBD patients. These samples are currently onsite at Sanger and made available for sequencing via our collaboration with the UK IBD Genetics consortium. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/. This dataset contains all the data available for this study on 2019-04-01.	HiSeq X Ten	3124
EGAD00001004881	Genome-wide CRISPR/Cas9 library screen was performed in isogenic cell lines. Three biological replicates were used. Cells were harvested at initiation of screen and at specific time points following cell culture. Using this approach we aim to identify genes specifically important in cell survival in engineered cell lines. Please perform 72, instead of 36, PCR reactions as the screen was performed at 200x coverage. This dataset contains all the data available for this study on 2019-04-01.	Illumina HiSeq 2500	18
EGAD00001004882	This dataset comprises over 850 individuals recruited in Uttar Pradesh, India, including cases of rheumatic heart disease based on echocardiographic diagnosis and controls recruited on the basis of normal echocardiograms. For this analysis all available samples were genotyped using the Illumina HumanCore-24 BeadChip platform.		940
EGAD00001004884	The data consists of 47 exome-sequenced synchronous colorectal cancers from 23 patients. The exomes of corresponding normal samples were used to remove germline variants. All patients are Finnish (white Caucasian). All except one patient (sync_11 who belongs to a LS family) were assumed sporadic. The sequence data was produced with Illumina HiSeq 4000.		47
EGAD00001004885	Whole exome sequencing of human and mouse sarcoma samples for creation of personalized therapy options. Tissues were sequenced directly; no interventions or alterations were made to the tissue samples	Illumina HiSeq 4000	4
EGAD00001004886	Whole Genome Sequencing of Normal Singaporean Volunteers	HiSeq X Ten	175
EGAD00001004887	We performed multiregion whole exome sequencing of a total of 37 samples from five consecutive patients (normal tissue, n=5; primary tumors, n=16; tumor thrombi, n=16) to >35-fold target coverage. Matching primary tumor and venous tumor thrombus samples were analyzed. Four patients had a clear cell RCC, one patient had a poorly differentiated type II papillary RCC (RCC-VTT-04). The latter patient had a friable thrombus, the others were of solid consistency.	Illumina HiSeq 2500	37
EGAD00001004888	We will use targeted exome sequencing to examine normal appearing epithelium and whole exome and whole genome sequencing of microdissected clones identified by immunostaining. Some of the samples will be of low DNA concentration and therefore may require extra rounds of amplification during library prep. This dataset contains all the data available for this study on 2019-04-03.	Illumina HiSeq 2500	5
EGAD00001004889	Mutational signatures have been shown to be attributable to specific genetic contexts, such as mutations in DNA repair genes. DNMT3A is a DNA methyltransferase that helps maintain the DNA methylation pattern in a site-specific manner and may participate in DNA repair or the stress response. We have identified an adult individual who is a germline mosaic for a DNMT3A mutation. We have obtained clonal lymphoblastoid cells (LCLs) from the subject representing both WT and mutant lines grown in the same individual for >50 years. These clones represent a unique opportunity to examine the mutational impact of the DNMT3A mutation in a well-controlled setting. Our goal is to perform WGS on whole blood, representing the pool, as well as several WT and several mutant clones, in order to investigate the contribution of DNMT3A to mutation rates and signatures. This dataset contains all the data available for this study on 2019-04-03.	HiSeq X Ten	9
EGAD00001004890	The aim of this study is to investigate the somatic mutations in twins with BRCA1/2 negative breast cancer with no strong family history. This dataset contains all the data available for this study on 2019-04-03.	HiSeq X Ten Illumina HiSeq 4000	26
EGAD00001004891	Drug-resistant populations of PC9 (human non-small cell lung cancer) or A375 (human melanoma) cell lines were used for this study. By exome sequencing, we will analyse mutations of cells in the drug-tolerant state and after drug holiday. This dataset contains all the data available for this study on 2019-04-03.	Illumina HiSeq 2500	18
EGAD00001004892	High-coverage whole genome sequences using HiSeq X for 4 individuals to investigate their Y chromosomes' relationship to the known phylogeny. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/. This dataset contains all the data available for this study on 2019-04-03.	HiSeq X Ten	5
EGAD00001004893	The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, the International Agency for Research on Cancer is coordinating the recruitment of 5000 individuals with cancer (colorectal, renal, pancreatic, oesophageal adenocarcinoma or oesophageal squamous cancers) across 5 continents to explore whether different mutational signatures explain marked variation in incidence. In brief, through an international network of collaborators around the world, biological materials are collected, along with demographic, histological, clinical and questionnaire data. Whole genome sequences of tumour-germline DNA pairs are generated at the Wellcome Trust Sanger Institute (Illumina HiSeqX, 40X and 20X depth respectively). Somatic mutational signatures are subsequently extracted by non-negative matrix factorisation methods and correlated with risk factor data. Through an enhanced understanding of cancer aetiology, the unprecedented Mutographs effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. This dataset contains all the data available for this study on 2019-04-03.	HiSeq X Ten	36
EGAD00001004894	In this study, samples from a window study of the PARP inhibitor rucaparib in patients with primary triple negative or BRCA1/2 related breast cancer (RIO trial) will be investigated. Samples will undergo whole genome sequence and analysis, including use of HR Predict. This dataset contains all the data available for this study on 2019-04-03.	HiSeq X Ten	60
EGAD00001004895	Recent advances in genomics have demonstrated that clonal haemopoiesis driven by leukaemia associated somatic mutations is a relatively common phenomenon that increases in frequency with advancing age. Whilst individuals with clonal haemopoiesis have an increased risk of developing haematological malignancies, they also have an increased mortality from other causes. Additionally, certain mutations are almost exclusively seen in individuals aged 70 years or older, whilst others are seen in individuals with non-haematological cancers including breast and ovarian. Recently, clonal haemopoiesis was found to be associated with a significantly increased risk of atherosclerotic cardiovascular disease. This association is thought to be causative with clonally-derived macrophages showing elevated expression of several chemokine and cytokine genes that contribute to atherosclerosis. Another vascular pathology, abdominal aortic aneurysm (AAA), increases with age and shares risk factors with atherosclerosis (including smoking, male sex, high cholesterol). However, the impact of these risk factors and the overlap between AAA and atherosclerosis is poorly understood. To investigate a possible link between clonal haemopoiesis and AAA, we will study DNA samples from 300 patients with AAA and up to 200 controls for evidence of clonal haemopoiesis. This will be done using target DNA enrichment with biotinylated RNA baits followed by high throughput sequencing. This dataset contains all the data available for this study on 2019-04-03.	Illumina HiSeq 2500	472
EGAD00001004896	In order to reconstruct the evolutionary history of metastatic colorectal cancer, we performed whole-exome sequencing of 12 metastatic colorectal cancer patients for whom the primary tumor and matched distant metastases to the brain (n=10) and liver (n=2) were available. For 8 of the 12 patients, multiple regions (n=3-7) of the primary tumor and distant metastases were sequenced.	Illumina HiSeq 2000 Illumina HiSeq 2500	163
EGAD00001004897	Genome and transcriptome sequence data from a locally advanced breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004898	Genome and transcriptome sequence data from a metastatic rectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004899	Genome and transcriptome sequence data from an invasive ductal carcinoma of right breast patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004900	Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004901	Genome and transcriptome sequence data from a metastatic leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004902	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004903	Genome and transcriptome sequence data from a squamous cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004904	Genome and transcriptome sequence data from a metastatic gastric cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study	PromethION	1
EGAD00001004905	Genome and transcriptome sequence data from a metastatic adrenocortical carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004906	Genome and transcriptome sequence data from a diffuse large B-cell lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004907	Genome and transcriptome sequence data from a T-cell prolymphocytic leukemia patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004908	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004909	Genome and transcriptome sequence data from a metastatic serous ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004910	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004911	Genome and transcriptome sequence data from a metastatic colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004912	Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004913	Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004914	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004915	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004916	Genome and transcriptome sequence data from a metastatic leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004917	Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004918	Genome and transcriptome sequence data from a metastatic rectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004919	Genome and transcriptome sequence data from an abdominal sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004920	Genome and transcriptome sequence data from a meningioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004921	Genome and transcriptome sequence data from a metastatic adenoid cystic carcinoma of the breast patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004922	Genome and transcriptome sequence data from a leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004923	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study	PromethION	1
EGAD00001004924	Genome and transcriptome sequence data from a metastatic neuroendocrine tumor, lung primary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004925	Genome and transcriptome sequence data from a metastatic choroidal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004926	Genome and transcriptome sequence data from a metastatic pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004927	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004928	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004929	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004930	Genome and transcriptome sequence data from a metastatic ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004931	Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004932	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004933	Genome and transcriptome sequence data from a metastatic sarcomatoid carcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004934	Genome and transcriptome sequence data from a squamous cell carcinoma of the alveolar ridge patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004935	Genome and transcriptome sequence data from an oligometastatic colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004936	Genome and transcriptome sequence data from a metastatic rectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001004937	This dataset includes RNA-sequencing (RNA-seq) data for 10 samples (9 primary tumors and 1 cell line) from adult T-cell leukemia/lymphoma (ATL).	Illumina HiSeq 4000	10
EGAD00001004938	Hepatocellular carcinoma (HCC) is a heterogeneous aggressive malignancy with low efficacy of current therapies at advanced stages. We integrated molecular and pharmacological profiling of a large panel of liver cancer cell lines (LCCL) to assess their clinical relevance as HCC preclinical models and identify new effective therapies and biomarkers of response. Here, we performed multi-omic analysis including whole-exome, RNA and microRNA sequencing in a series of 34 LCCL. Molecular profiles of LCCL and primary HCC were compared and we searched for molecular features associated with drug response. Our panel of LCCL faithfully recapitulated the most aggressive molecular “proliferation class” of HCC.	Illumina HiSeq 2000 Illumina HiSeq 4000	34
EGAD00001004939	Sequencing data from patients with Ovarian cancer. Data utilised in the 'Enhanced detection of circulating tumor DNA by fragment size analysis' manuscript (Mouliere et al, 2018)	Illumina HiSeq 4000	118
EGAD00001004940	This dataset comprises 2570 whole genome sequenced samples from the Medical Genome Reference Bank (https://sgc.garvan.org.au/initiatives/mgrb). The files are provided in cram format, aligned to hs37d5 with decoys, with no further processing applied. The dataset also contains phenotype information for each sample.		2570
EGAD00001004941	Recent work in the Campbell group has revealed somatic mutations present in normal, non-cancerous human skin. A subset of the mutations conferred selective advantages to the host cells, leading to clonal expansions and raising the risk for future cancer development. Capturing such somatic mutations in normal tissue is important to advance our understanding about carcinogenesis and could provide prospective medical insights. In this project, our goal is to detect somatic mutations in normal (pre-cancerous) liver tissue. Using Laser Microdissection technology, we will dissect individual liver lobules from patient samples and submit these to sequencing. For each patient sample, we aim to sequence multiple lobules to characterise the mutagenic burden. Samples will be taken from patients with different liver disease aetiologies, including alcoholism and obesity, with a view to distinguishing the prevalent mutation types occurring in each disease context. We will perform targeted sequencing, initially using the WTSI cancer panel. Later we aim to use a novel bait set that captures both cancer genes and genes relevant to the non-cancerous samples (i.e. genes implicated in hereditary disorders, immune sequences). This dataset contains all the data available for this study on 2019-04-08.	Illumina HiSeq 2500	63
EGAD00001004942	Gastroschisis (MIM 230750) is a herniation of the intestines through a defect of the abdominal wall lateral to the umbilicus (usually on the right side), and it is not covered by a membrane [Ledbetter, 2012]. Gastroschisis is a congenital anomaly with increasing incidence, easy prenatal diagnosis and extremely variable postnatal outcomes. On the basis of clinical manifestations, epidemiologic characteristics, and the presence and type of additional malformations, gastroschisis could be considered a heterogeneous condition with no gene/s discovered yet. This congenital anomaly affects approximately 1-3 infants per 10,000 live births [Calzolari et al., 1995; Parker et al., 2010]. Current knowledge about causative mutations/variants: to date, no single gene has been linked to gastroschisis. Some publications have tried to link this malformation to variants in genes such as the AEBP1 (adipocyte enhancer binding protein) gene [Feldkamp et al., 2012] or the VEGF-NOS3 pathway [Lammer et al., 2008]. Previously, a Scribble mutant mouse model (circletail) was reported to exhibit gastroschisis, however recent studies demonstrated that the Scribble knockout fetus exhibits exomphalos phenotype of gastroschisis [Carnagham et al., 2013]. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/. This dataset contains all the data available for this study on 2019-04-08.	Illumina HiSeq 2500	30
EGAD00001004943	Organoids are self-organizing 3D structures grown from stem cells that recapitulate essential aspects of organ structure and function. Here we describe a method to establish long-term culture conditions of human airway epithelial organoids (AOs) that contain all major cell populations and allow personalized human disease modelling. We collected macroscopically inconspicuous lung tissue from non-small-cell lung cancer (NSCLC) patients undergoing medically indicated surgery and isolated epithelial cells to engineer 3D organoids. We exploit the potential to derive sub-clones from AOs to demonstrate the feasibility of CRISPR gene editing. Finally, we show that AOs readily allow modelling of viral infections such as RSV and for the first time demonstrate the possibility to study neutrophil-epithelium interaction in an organoid model. Taken together, we anticipate that human AOs will find broad applications in the study of adult human airway epithelium in health and disease.	NextSeq 500	4
EGAD00001004944	The dataset is composed of 62 samples (31 subjects before and after probiotic-like bacteria treatment). Sequencing was performed using Illumina HiSeq 2500. Fastq files are provided.	Illumina MiSeq	61
EGAD00001004945	This dataset contains 70 human LV H3K27ac ChIP-seq paired-end FASTQ files. The sequencing was performed using Illumina HiSeq 4000.	Illumina HiSeq 4000	70
EGAD00001004946	Whole exome sequencing data for 18 mucoepidermoid carcinoma samples. The samples were used for Illumina TruSeq library construction and captured using Agilent V4 exome panel. The PE fastq files are provided.	Illumina HiSeq 2000	18
EGAD00001004948	The collection and use of tissue for this study had Melbourne Health institutional review board approval and patients provided written informed consent (Melbourne Health Local Project Number: 2016.087). Following the prostatectomy of 13 patients, ranging from 52 to 78 years of age and from CAPRA-S risk score of 0 (attributed to benign tissue samples, harvested from a site far from a low grade, low volume cancer) to 7 (Supplementary file 2), a four millimeter tissue core was collected from the prostate tumour site, conditional on histopathological verification. If not otherwise specified, all procedures were carried out at 4 °C. Tissue blocks were washed in Phosphate-buffered saline (PBS) solution for 2 minutes and minced for 2 minutes with a scalpel. Homogenised tissue was added to a solution (total volume of 7 ml) composed of 1 mg/ml collagenase IV (Worthington Biochemical Corp, USA), 0.02 mg/ml DNase 1 (New England Biolabs, USA) and 0.2 mg/ml dispase (Merck, USA). The homogenised tissue was serially digested at 37 °C at 180 rpm, through three steps of 5, 10 and 10 minutes of duration, with the final 3 minutes dedicated to sedimentation at 0 rpm. After each digestion step, the supernatant was aspirated and filtered through a 70 μm strainer into a pre-chilled tube, diluting the solution with 15 ml of 2% bovine serum PBS to quench the enzymatic reaction. The resulting cumulative solution was then centrifuged at 1500 rpm for five minutes, with the supernatant collected and the cell pellet resuspended into 1 ml 2% PBS-serum prior to labelling (Fig. S1).	NextSeq 500	52
EGAD00001004949	The dataset contains exome sequencing data of seven healthy family members, all in FASTQ format. The samples were taken from peripheral blood mononuclear cells. Furthermore, corresponding proteomics data are available as well.	Illumina HiSeq 2500	7
EGAD00001004950	March 2019 data update for cord blood CD34+CD38-, CMP, GMP, MEP, monocyte, erythroid precursor, B cell and primary AML total blasts reference epigenomes generated at Centre for Epigenome Mapping Technologies, Genome Sciences Center, B.C. Cancer Agency as part of the International Human Epigenome Consortium. This dataset contains data for samples: CEMT0158 CEMT0159 CEMT0160 CEMT0161 CEMT0162 CEMT0163 CEMT0164 CEMT0165 CEMT0166 CEMT0167 CEMT0168 CEMT0169 CEMT0170 CEMT0171 CEMT0172 CEMT0189	HiSeq X Ten Illumina HiSeq 2500	16
EGAD00001004951	Whole-genome sequencing of human individuals from Polynesian and Native American populations, as well as 10x Genomics Chromium data from Polynesian, Native American and Aboriginal Australian populations, allowing for experimental phasing of haplotypes. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2019-04-11.	HiSeq X Ten	68
EGAD00001004952	Small-molecule inhibitors targeting the MAPK pathway, the most commonly activated pathway in melanoma, have been given to melanoma patients (either alone or in combination) for a few years. Although they initially reduce tumour burden dramatically, melanomas eventually become resistant and tumours progress while on treatment. Resistance to this treatment occurs by acquisition of additional mutations or other alterations that affect the mitogen-activated protein kinase (MAPK) pathway by either direct or indirect signalling. Many resistance mechanisms somehow lead to reactivation of extracellular signal-regulated kinase (ERK), thereby restoring signalling of the oncogenic BRAF/MEK/ERK pathway. In addition, PI3K pathway activation contributes to resistance to BRAF inhibition. Less frequent but equally important to the phenomenon of targeted drug resistance is the observation that ~15-20% of BRAF mutant melanoma patients fail to respond to BRAF inhibition already early on treatment, owing to intrinsic resistance. These patients have few therapeutic options, unless immunotherapy can be given. To better understand the resistance mechanisms in MAPK inhibitor-treated melanoma patients and melanoma biology, our lab generated a large panel of MAPK inhibitor resistant melanoma cell lines by continuous drug exposure. The understanding of the genetic landscape and gene expression as well as cross resistance to other treatment regimens, and other aspects of melanoma biology such as phenotype switch, will allow us to better exploit new therapeutic strategies for melanoma patients. This dataset contains all the data available for this study on 2019-04-11.	Illumina HiSeq 4000	34
EGAD00001004953	Single cell + bulk genomics study for immune and hematopoietic organs during human fetal development. This dataset contains all the data available for this study on 2019-04-11.	HiSeq X Ten	5
EGAD00001004954	We aim to describe the transcriptomic landscape of infant spindle cell tumours. This dataset contains all the data available for this study on 2019-04-11.	Illumina HiSeq 4000	38
EGAD00001004955	The aim of this project is to test whether HPV oncogenes and/or interferon induce the APOBEC mutational signature in vitro, and to test the role of APOBEC3A in this process. This dataset contains all the data available for this study on 2019-04-11.	HiSeq X Ten	20
EGAD00001004956	16S rDNA amplicon sequencing of 196 human fecal samples of an Inulin cross-over trial in healthy, mildly constipated individuals.	Illumina MiSeq	196
EGAD00001004957	Aligned whole genome sequences in this dataset are from CPCG-GENE Normal/Tumor pairs used in the 666PG study. Raw sequence data were aligned against the human reference (GRCh37 + decoy) using bwa-mem, with GATK used for indel realignment and recalibration.		608
EGAD00001004958	Highly recurrent U1 snRNA mutations drive alternative splicing in HH medulloblastoma	Illumina HiSeq 2000 Illumina HiSeq 2500	234
EGAD00001004959	This dataset contains all samples used in study 'Whole genome characterisation of 5-FU treated organoids'	HiSeq X Ten	3
EGAD00001004960	Multi-region exome sequencing of 116 pulmonary nodules including lung preneoplasia atypical adenomatous hyperplasia (AAH, N=22), adenocarcinoma in situ (AIS, N=27), minimally invasive adenocarcinoma (MIA, N=54) and invasive lung adenocarcinoma (ADC, N=13).	Illumina HiSeq 2500	320
EGAD00001004961	We sequenced a total of 2 H3.3K27WT (pcGBM2, G477; 3 replicates total) and 2 H3.3K27M (DIPGVI, DIPGXIII; 6 replicates total) patient-derived cell lines as well as CRISPR/Cas9 H3.3K27M-KO clones for one of the cell lines (3 replicates total; DIPGXIII-KO) using ATAC-Seq. P-XX designates passages of replicates. These samples can be found at GEO under accession number GSE128744. This repository contains 1 replicate of G477 to be released under controlled access.	Illumina HiSeq 2000	1
EGAD00001004962	Amplicon sequencing of 10 patients		95
EGAD00001004963	Whole exome paired-end sequencing was performed on a trio (patient + parents), in which the patient has primary immunodeficiency, to identify the genetic cause of the immunodeficiency. Analysis revealed a novel homozygous mutation in IL2RB.	unspecified	3
EGAD00001004964	Whole exome sequencing data for matched normal and endometriosis samples. Samples were prepared using Agilent Sureselect capture kit, and sequenced on an Illumina HiSeq 2500. The submitted files are in BAM format.	Illumina HiSeq 2500	50
EGAD00001004965	We performed RNA sequencing in whole blood from the same 65 individuals from the PIVUS study at ages 70 and 80 (130 samples) to quantify how gene expression, alternative splicing, and their genetic regulation are altered during this 10-year period of advanced aging. Each individual has four fastq files, two for each age. Consecutive sample IDs refer to the same individual, e.g. PIVUS003 and PIVUS004 are the age 70 and age 80 samples of the first individual.	NextSeq 500	130
EGAD00001004966	Dataset of adenoma and colon cancer multi-region sequencing. Publication: Nat Ecol Evol. 2018 Oct;2(10):1661-1672. doi: 10.1038/s41559-018-0642-z. Epub 2018 Aug 31.	Illumina HiSeq 2500 unspecified	139
EGAD00001004967	The dataset is referenced by EGA Study ID EGAS00001003605, which includes the short-reads data for 59 samples. All short-reads data files are in fastq format.	unspecified	59
EGAD00001004968	High-resolution (Sub-5 kbp resolution) Hi-C datasets generated using glioblastoma primary cultures from 3 different adult patients.	Illumina HiSeq 2500 NextSeq 500	6
EGAD00001004969		Illumina MiSeq	172
EGAD00001004971	Androgen deprivation therapy treated patients (n=11) were recruited from an open label neoadjuvant phase II study in which patients with high-risk disease received a ‘supercastration’ regimen consisting of degarelix 240/80 mg subcutaneously every four weeks; abiraterone acetate 500 mg orally daily titrating upwards every two weeks by 250 mg to a final dose of 1000 mg daily; bicalutamide 50 mg orally daily; and prednisolone 5 mg orally twice daily for a total of 6 months (Australian New Zealand Clinical Trials Registry 12612000772842). Untreated patients with similar pre-treatment characteristics were obtained from a prospective prostatectomy biorepository [22,23]. Prior to ligation of the dorsal venous complex and prostate pedicles, the anterior prostate was defatted and the specimen was removed immediately, placed in a sterile container and transferred on ice for long-term storage in the vapour phase of liquid nitrogen. A total of 50–100 µg of adipose tissue was separated from fresh frozen samples stored at −160°C. RNA was isolated using the Qiagen RNeasy Lipid Tissue Mini Kit and eluted in 35 µL nuclease-free water. 0.5–1 µg of total RNA was used as the input for cDNA library synthesis using TruSeq RNA Sample Prep Kit v2 (Illumina), and libraries were constructed according to manufacturer’s instructions. Samples were sequenced on a HiSeq 2500 (Illumina) using 101 base paired-end chemistry, aiming for 50 million mapped paired-end reads per sample.	Illumina HiSeq 2500	11
EGAD00001004972	Exome sequencing of ~1,800 patients with non-syndromic Congenital Heart Defects (CHD) to identify genes enriched for damaging rare variants that increase risk of CHD. This dataset contains all the data available for this study on 2019-04-24.	Illumina HiSeq 2500	1
EGAD00001004975	Exome sequencing of ~1,800 patients with non-syndromic Congenital Heart Defects (CHD) to identify genes enriched for damaging rare variants that increase risk of CHD. This dataset contains all the data available for this study on 2019-04-24.	Illumina HiSeq 2500	1
EGAD00001004977	Bone marrow mononuclear cells from patients diagnosed with B cell precursor acute lymphoblastic leukemia were obtained at three sequential time points: first diagnosis, remission after chemotherapy and relapse. Genomic DNA was isolated and targeted gene panel sequencing was performed using a customized biotinylated RNA oligo pool (SureSelect, Agilent, Santa Clara, California) to hybridize the target regions comprising 362 kbp on a HiSeq2000. Target regions were selected to validate mutations previously identified in these samples using whole exome sequencing.	Illumina HiSeq 2000	150
EGAD00001004978	The study includes NGS-based WGBS on one sperm DNA sample pooled from 30 participants, and methylC-capture sequencing (MCC-Seq) on the same pooled sperm sample as well as 45 sperm DNA samples derived from both fertile and infertile individuals in two cohorts (Toronto, a fertile cohort; Montreal, an idiopathic infertility cohort). All the data were generated with 100bp paired-end reads using the Illumina HiSeq2000 or 4000 systems.	Illumina HiSeq 2000 Illumina HiSeq 4000	47
EGAD00001004979	The dataset reports the 16S rRNA gene sequencing of the fecal microbiota of donors from the Milieu Intérieur Cohort. The Milieu Intérieur cohort includes a total of 1,000 healthy individuals of western European ancestry, recruited in France as part of the Milieu Intérieur project. To assess their fecal microbiota composition, 16S rRNA profiles were generated from stool samples of 863 of the 1,000 donors. Human stool samples were produced at home no more than 24 hours before the scheduled medical visit and collected in a double-lined sealable bag maintaining strict anaerobic conditions. Upon reception at the clinical site, the fresh stool samples were aliquoted and stored immediately at -80°C. DNA was extracted from stool and barcoding PCR was carried out using indexed primers targeting the V3-V5 region of the 16S rRNA gene. Equal volumes of normalized PCR reaction were pooled and thoroughly mixed. The amplicon libraries were sequenced on Illumina MiSeq.	Illumina MiSeq	1311
EGAD00001004980	Comparative analysis of transcriptomes of skin fibroblasts in 17 Type 2 Diabetic individuals by RNA sequencing.	Illumina HiSeq 2000	1
EGAD00001004981	BAM files (Illumina HiSeq 2000) with whole genome sequencing data of 49 individuals of European/Romanian descent, and 50 individuals of Roma (Romani/Rroma) ethnic background from Romania.	Illumina HiSeq 2000	99
EGAD00001004982	VCF files with whole genome sequencing data of 49 individuals of European/Romanian descent, and 50 individuals of Roma (Romani/Rroma) ethnic background from Romania.		99
EGAD00001004984	Fastq files resulting from whole exome sequencing of trios of samples from 6 breast cancer patients: normal breast, pre-NAC biopsy and post-NAC surgical resection.	Illumina HiSeq 3000	17
EGAD00001004985	In this study we use expression data from breast cancer tumors to define immune clusters in breast cancer. Immune clusters have gradual levels of immune infiltration. In the intermediate immune infiltration cluster, we found a worse prognosis which is independent of known clinicopathological features. We also found the immune clusters associated with treatment response. We further used gene expression data and deconvolution algorithms to dissect the immune contexture of the clusters.	NextSeq 500	97
EGAD00001004987	This dataset pertains to mitochondrial DNA amplicon sequencing of paired DNA samples from gingivo-buccal oral cancer patients. DNA was isolated from the tumor and blood tissues of 89 patients (178 samples). The sequencing libraries were prepared from whole mitochondrial amplicons using Nextera XT DNA Library Preparation Kit (Illumina) and sequenced in Illumina HiSeq platform. The uploaded BAM files are generated by aligning paired-end reads to the mitochondrial reference sequence (rCRS) using BWA-MEM.	Illumina HiSeq 2500 Ion Torrent PGM	178
EGAD00001004988	Collection of RNA-seq, Illumina, paired-end fastq files for 370 archival tissues from a subset of patients with high grade serous ovarian carcinoma enrolled in the phase 3 ICON7 trial. Clinical data and digital pathology information for CD8 is also available.	Illumina HiSeq 2500	370
EGAD00001004989	This dataset contains a total of 10 families and 51 samples presented in the 'Non-invasive prenatal diagnosis by genome-wide haplotyping of cell-free plasma DNA' study, of which 9 are cfDNA samples and the rest are gDNA samples. All the samples are targeted sequencing data with a custom 45Mb capture library. Raw paired-end fastq files for each sample are available.	Illumina HiSeq 4000 NextSeq 500	51
EGAD00001004990	Raw RNA-seq data for WT and 2 Gorlin NES cells, tumors derived from MYCN mis-expressed WT NES cells, tumors derived from Gorlin NES cells, and tumors derived from Gorlin NES cells transduced with mutant DDX3X (R351W and R534S) and CRISPR/Cas9 targeting GSE1 (each sample has 3 replicates/tumors except Gorlin NES cell tumors have 4). Raw whole exome sequencing data for WT and Gorlin 1 NES cells, tumors derived from MYCN mis-expressed WT NES cells, and tumors derived from Gorlin NES cells (each sample has 3 replicates/tumors except Gorlin NES cell tumors have 4). Raw data for amplicon sequencing of GSE1 and KDM3B at regions targeted by CRISPR/Cas9 in Gorlin NES cells.	HiSeq X Ten Illumina HiSeq 4000 Illumina MiSeq	44
EGAD00001004991	Paired-end RNA-seq FASTQ files from 21 newborn screening dried blood spot (DBS) samples. These DBS samples were obtained from extremely low gestational age newborns, where 10 of them were affected by a fetal inflammatory response (FIR) before birth, and 11 were unaffected. Total RNA was sequenced using an Illumina NextSeq-500 instrument. The sample preparation protocol included the depletion of rRNA and globin mRNA using the Globin Zero Gold rRNA Removal Kit from Illumina. Libraries were prepared using the NEBNext Ultra II Directional RNA Library Prep Kit (New England Biolabs). Each sample was sequenced in 4 lanes, leading to 8 FASTQ files per sample and a total of 21x8=168 FASTQ files. There are an additional 8 FASTQ files corresponding to sample BS13, which was downsampled to 1/4 of its original depth (see BS13_README file for details).	NextSeq 500	21
EGAD00001004992	H3K27me3 ChIP-Seq of 6 samples and H3K27ac ChIP-Seq of 4 samples with respective input controls	Illumina HiSeq 2500	20
EGAD00001004993	Ribodepletion RNA-Seq of 8 samples and polyA RNA-Seq of 22 samples	Illumina HiSeq 2500	30
EGAD00001004994	WXS of 3 samples	Illumina HiSeq 2500	3
EGAD00001004996	Total RNA was extracted using RNAble (Eurobio), cleaned-up with RNeasy columns (Qiagen) and sequenced. The libraries were prepared at the Genomics Platform of the Cochin Institute, following the TruSeq Stranded mRNA protocol (Illumina), starting from 1 µg of high quality total RNA. Paired end (2 × 75 bp) sequencing was performed on a Nextseq 500 platform (Illumina). FASTQ sequences were aligned on hg19 (GRCh37) human reference genome with STAR (v.2.5.2a)	NextSeq 500	134
EGAD00001004997	Whole‐exome sequencing was performed using NimbleGen MedExome capture (Roche NimbleGen, Madison, WI, USA) from 1 μg of high quality genomic DNA, followed by sequencing of libraries using paired-end mode (2x 75bp) on a Nextseq 500 platform (Illumina, San Diego, CA, USA), at the Genomics Platform of the Cochin Institute. Reads were aligned on hg19 (GRCh37) using BWA V0.7.17.	NextSeq 500	86
EGAD00001004998	Small RNA (<100 bases in length) were purified from total RNA using miRNeasy kit (Qiagen), then sequenced. Libraries were prepared at the Genomics Platform of the Cochin Institute, following the TruSeq small RNA protocol (Illumina), starting from 1 µg of high quality total RNA. Single read (1 × 75 bp) sequencing was performed on a Nextseq 500 platform (Illumina). FASTQ sequences were aligned on miRBase v.2052, then counted with STAR (v.2.5.2a).	NextSeq 500	111
EGAD00001004999	Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients.	Illumina HiSeq 4000	41
EGAD00001005000	Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients.	HiSeq X Ten	42
EGAD00001005001	Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients.	Illumina HiSeq 4000	40
EGAD00001005002	This dataset maps gene expression regulation in human primary regulatory CD4+ T cells (Tregs). It includes whole genome sequence data for ATAC-seq (114 samples). The final quality filtered set included 73 individuals with ATAC-seq.	Illumina HiSeq 2500 Illumina MiSeq	114
EGAD00001005003	Isolation of bacteria in infected brains in patients with Parkinson's disease. Here we used next generation sequencing of 16S ribosomal RNA gene PCR amplicons (NGS 16S amplicon analysis).	Illumina MiSeq	22
EGAD00001005006	PAGE Dataset Mar 2019		2595
EGAD00001005007	PAGE Dataset Mar 2019		2595
EGAD00001005008	PAGE Dataset Mar 2019		875
EGAD00001005009	Paired-end RNA-seq BAM files from 21 newborn screening dried blood spot (DBS) samples. These DBS samples were obtained from extremely low gestational age newborns, where 10 of them were affected by a fetal inflammatory response (FIR) before birth, and 11 were unaffected. Total RNA was sequenced using an Illumina NextSeq-500 instrument. The sample preparation protocol included the depletion of rRNA and globin mRNA using the Globin Zero Gold rRNA Removal Kit from Illumina. Libraries were prepared using the NEBNext Ultra II Directional RNA Library Prep Kit (New England Biolabs). There is one BAM file per sample and there is an additional BAM file, corresponding to sample BS13, which was downsampled to 1/4 of its original depth (see BS13_README file for details).		21
EGAD00001005010	This dataset was conceived to characterize the epigenomic landscape of representative iBCP-ALL subtypes. To do so, we performed Whole Genome DNA bisulfite-sequencing on 2 MLL-AF4, 2 MLL-AF9 and 2 non-MLL rearranged leukemias, and also 2 pools of BCPs obtained from fetal liver.	HiSeq X Ten	8
EGAD00001005011	Whole genome sequencing and genotyping of samples from BFMOS (Mossi from Burkina faso)		111
EGAD00001005012	Whole genome sequencing and genotyping of samples from CMBAN (Bantu from Cameroon)		76
EGAD00001005013	Whole genome sequencing and genotyping of samples from CMSBA (Semibantu from Cameroon)		63
EGAD00001005014	Whole genome sequencing and genotyping of samples from TZWAS (Wasaamba from Tanzania)		174
EGAD00001005015	Whole genome sequencing and genotyping of samples from TZCHA (Chagga from Tanzania)		156
EGAD00001005016	Whole genome sequencing and genotyping of samples from TZPAR (Pare from Tanzania)		148
EGAD00001005017	B-cell acute lymphoblastic leukemia (B-cell ALL) is the most common cancer in childhood. Studying identical twins with B-cell ALL provides a unique and tractable model for deciphering the developmental timing of pre- and post-natal mutations contributing to clonal evolution. To date, this has mainly focused on major cytogenetic subgroups of childhood B-cell ALL, including MLL fusions, ETV6-RUNX1, hyperdiploidy, and BCR-ABL1. However, formal demonstration of the prenatal origin and “backtracking” the natural history of the leukemia remains understudied in “B-other”/Normal Karyotype (NK) B-cell ALL. To characterize the genetic landscape of this particular leukemia subtype, we performed whole genome DNA and B-cell receptor (BCR) sequencing on a pair of 8-month-old monozygotic twins diagnosed with concordant “B-other”/NK B-cell ALL.	HiSeq X Ten	4
EGAD00001005018	B-cell acute lymphoblastic leukemia (B-cell ALL) is the most common cancer in childhood. Studying identical twins with B-cell ALL provides a unique and tractable model for deciphering the developmental timing of pre- and post-natal mutations contributing to clonal evolution. To date, this has mainly focused on major cytogenetic subgroups of childhood B-cell ALL, including MLL fusions, ETV6-RUNX1, hyperdiploidy, and BCR-ABL1. However, formal demonstration of the prenatal origin and “backtracking” the natural history of the leukemia remains understudied in “B-other”/Normal Karyotype (NK) B-cell ALL. To characterize the epigenetic landscape of this particular leukemia subtype, we performed DNA bisulfite-sequencing on a pair of 8-month-old monozygotic twins diagnosed with concordant “B-other”/NK B-cell ALL.	HiSeq X Ten	2
EGAD00001005019	Whole genome sequencing of 92 individuals from 44 African indigenous populations. Sequences made with Illumina HiSeq 2000 sequencing system; data uploaded in BAM format.	Illumina HiSeq 2000	43
EGAD00001005020	The incidence of brain metastases in breast cancer (BCBM) patients is increasing. These patients have a very poor prognosis and therefore identification of blood-based biomarkers, such as circulating tumor cells (CTCs) and understanding the genomic heterogeneity could help to personalize treatment options. In this study, DNA from individual CTCs as well as corresponding primary tumors and brain metastases were analyzed by next generation sequencing (NGS) in order to evaluate copy number aberrations and single nucleotide variations (SNVs).	Illumina HiSeq 2000	28
EGAD00001005021	Bam files for 16 meningioma tumor samples; ChIPseq performed on Illumina HiSeq 2000	Illumina HiSeq 2000	16
EGAD00001005022	ONT Minion reads to provide 30x coverage for a patient with ataxia-pancytopenia syndrome.	MinION	1
EGAD00001005023	The COMPARE study enrolled 29,066 British blood donors between February 2016 and March 2017; the study aim is to find the optimum technology for haemoglobin screening (ISRCTN 90871183). All participants were at the time of recruitment active blood donors. The 4,796 participants in this dataset have consented to join the NIHR BioResource. Genotyping data was produced using the Thermo Fisher Scientific Axiom Genotyping platform. The UK Biobank version 2 array design was used; content on this array has been added to allow for accurate DNA based identification of human blood group antigens.		-
EGAD00001005025	Isolation of fungi in infected neural tissues in patients with Parkinson's disease. Here we used next generation sequencing of Internal Transcribed Spacer (ITS) regions, by PCR amplicons (NGS ITS amplicon analysis).	Illumina MiSeq	22
EGAD00001005026	The Donor InSight III study, undertaken by Sanquin research, recruited 3,046 Dutch blood donors between 2015 and 2016. The purpose of the study was to gain more insight into characteristics of donors, their motivations and health. All participants were at the time of recruitment active blood donors. Genotyping data was produced using the Thermo Fisher Scientific Axiom Genotyping platform. The UK Biobank version 2 array design was used; content on this array has been added to allow for accurate DNA based identification of human blood group antigens.		-
EGAD00001005027	Amplicon sequencing from 45 samples - Amplicons of 16s v3-v4	Illumina MiSeq	45
EGAD00001005028	Analysis of mutational signatures is becoming routine in cancer genomics, with implications for pathogenesis, classification, prognosis, and even treatment decisions. However, the field lacks a consensus on analysis and result interpretation. Using whole-genome sequencing of multiple myeloma (MM), chronic lymphocytic leukemia (CLL) and acute myeloid leukemia, we compare the performance of public signature analysis tools. We describe caveats and pitfalls of de novo signature extraction and fitting approaches, reporting on common inaccuracies: erroneous signature assignment, identification of localized hyper-mutational processes, overcalling of signatures. We provide reproducible solutions to solve these issues and use orthogonal approaches to validate our results. We show how a comprehensive mutational signature analysis may provide relevant biological insights, reporting evidence of c-AID activity among unmutated CLL cases or the absence of BRCA1/BRCA2-mediated homologous recombination deficiency in a MM cohort. Finally, we propose a general analysis framework to ensure production of accurate and reproducible mutational signature data.	HiSeq X Ten	5
EGAD00001005029	Paired immunoglobulin heavy and light chain sequences were obtained from 803 single IgA plasma cells isolated from duodenal biopsies of five celiac disease patients. The cells were specific to discrete antigenic regions of the enzyme TG2, which is the main autoantigen in celiac disease.	Illumina MiSeq	10
EGAD00001005030	Brain metastasis (BM) of colorectal cancer (CRC) is rare but lethal and lacks effective therapies or a good understanding of its genomic landscapes. We conduct an analysis of whole-exome sequencing (WES, Illumina HiSeq 2500 sequencing platform) on 11 patient-matched BMs, primary CRC tumours, and adjacent normal tissues; and whole-genome sequencing (WGS, Illumina HiSeq X Ten platform) on 8 patient-matched BMs, primary CRC tumors, and adjacent normal tissues to uncover the whole-genome mutational landscape of colorectal cancer with brain metastasis.		38
EGAD00001005031	RNA sequencing of frozen tumor biopsies from patients with blastic plasmacytoid dendritic cell neoplasm. 4 samples. Illumina HiSeq 4000.	Illumina HiSeq 4000	4
EGAD00001005032	Whole-genome sequencing of frozen tumor biopsies from patients with blastic plasmacytoid dendritic cell neoplasm. 10 samples. Illumina HiSeq X-Ten.	HiSeq X Ten	10
EGAD00001005033	The reference human genome is still incomplete, and several non-reference sequences have been derived from diverse populations. With the availability of whole genome sequencing data from multiple individuals, we could construct the pan-genome sequence. Here we provide high quality genome sequencing (~30x coverage) from 185 Han Chinese individuals. All samples were sequenced using Illumina HiSeq X10 sequencer and paired-end 150-bp reads were produced.	HiSeq X Ten	185
EGAD00001005034	Illumina BAMs for a patient with ataxia-pancytopenia syndrome and both of their parents.	Illumina HiSeq 2500	3
EGAD00001005035	Tumor mutational burden (TMB) has emerged as a predictive biomarker of response to immune checkpoint inhibitors. Standardization of TMB measurement is essential for implementing diagnostic tools to guide treatment. Here we evaluate bioinformatic TMB analysis by whole exome sequencing (WES) in formalin-fixed, paraffin-embedded samples. In CheckMate 026, TMB was retrospectively assessed in 312 patients with non-small cell lung cancer (58% of the intent-to-treat population) who received first-line nivolumab treatment or chemotherapy. We examined the sensitivity of TMB assessment to bioinformatic filtering methods and assessed concordance between TMB data derived by WES and the FoundationOne CDx™ assay. TMB scores comprising synonymous, indel, frameshift, and nonsense mutations (all mutations) were 3.1-fold higher than data including missense mutations only, but values were highly correlated (Spearman’s r = 0.99). Scores from CheckMate 026 samples including missense mutations only were similar to those generated from data in The Cancer Genome Atlas, but those including all mutations were generally higher. Using databases for germline subtraction (instead of matched controls) showed a trend for race-dependent increases in TMB scores. Parameter variation can therefore impact TMB calculations, highlighting the need for standardization. Encouragingly, WES and FoundationOne CDx outputs were highly correlated (Spearman’s r = 0.90) and differences could be accounted for by empirical calibration, suggesting that reliable TMB assessment across assays, platforms and centers is achievable.	Illumina HiSeq 2500	368
EGAD00001005037	This dataset contains 200 RNA-seq bam files (142 SLE, 58 healthy individuals). RNA libraries were prepared with the Illumina TruSeq sample preparation kit and were sequenced on Illumina HiSeq2000. 49 bp paired-end reads were mapped to the GRCh37 reference human genome using the GEM mapper. This dataset was generated as part of the following study: Panousis et al (2019). Combined genetic and transcriptome analysis of patients with SLE: Distinct, targetable signatures for susceptibility and severity.	Illumina HiSeq 2000	200
EGAD00001005038	This dataset contains the imputed genotypes for 197 individuals. All individuals were genotyped with the Illumina HumanCoreExome-24 array. The individuals were phased with SHAPEIT and imputed to the 1000 Genomes Project Phase III using IMPUTE2. This dataset was generated as part of the following study: Panousis et al (2019). Combined genetic and transcriptome analysis of patients with SLE: Distinct, targetable signatures for susceptibility and severity		197
EGAD00001005039	This dataset contains the RPKM and raw read counts of expression for all the individuals. This dataset was generated as part of the following study: Panousis et al (2019). Combined genetic and transcriptome analysis of patients with SLE: Distinct, targetable signatures for susceptibility and severity.		200
EGAD00001005040	This dataset contains the clinical phenotypes/covariates information for all the individuals. This dataset was generated as part of the following study: Panousis et al (2019). Combined genetic and transcriptome analysis of patients with SLE: Distinct, targetable signatures for susceptibility and severity.		200
EGAD00001005041	This dataset contains the eQTL summary statistics (nominal pass, significant eQTLs, best associated variant per gene). eQTL mapping was performed with fastQTL. This dataset was generated as part of the following study: Panousis et al (2019). Combined genetic and transcriptome analysis of patients with SLE: Distinct, targetable signatures for susceptibility and severity.		142
EGAD00001005042	This dataset contains the sQTL summary statistics (nominal pass, significant sQTLs). sQTL mapping was performed with QTLtools. This dataset was generated as part of the following study: Panousis et al (2019). Combined genetic and transcriptome analysis of patients with SLE: Distinct, targetable signatures for susceptibility and severity.		142
EGAD00001005044	The majority of embryos that are created through IVF do not implant. It seems plausible that rates of implantation would improve if we had a better understanding of molecular factors affecting embryo competence. Currently, the process of selecting an embryo for uterine transfer utilizes an ad-hoc combination of morphological criteria, the kinetics of development, and genetic testing for aneuploidy. However, no single criterion can ensure selection of a viable embryo. In contrast, RNA-sequencing of embryos could yield highly dimensional data, which may provide additional insight and illuminate the discrepancies among current selection criteria. Indeed, recent advances enabling the production of RNA-sequencing (RNA-seq) libraries from single cells have facilitated the application of this technique to the study of some transcriptional events in early human development. However, these studies have not assessed the quality of their constituent embryos relative to commonly used embryological criteria. Here, we perform proof-of-principle advancement to clinical selection procedures by generating high quality RNA-seq libraries from a trophectoderm biopsy as well as the remaining whole embryo. We combine state-of-the-art embryological methods with low-input RNA-seq to develop the first transcriptome-wide approach for use in future predictive embryology studies. Specifically, we demonstrate the capacity of RNA-seq as a promising tool in preimplantation screening by showing that biopsies of an embryo can capture valuable information content available in the whole embryo from which they are derived. Furthermore, we show that this technique can be used to generate a RNA-based digital karyotype, and to identify candidate competence-associated genes. Together, these data establish the foundation for a future RNA-based diagnostic in IVF.	Illumina HiSeq 2500	54
EGAD00001005046	The BAM files for WES and RNA-seq used in the article "Molecular Profiling Reveals Unique Immune and Metabolic Features of Melanoma Brain Metastases", Cancer Discovery 2019. PMID: 30787016; PMCID: PMC6497554. Authors: Grant M Fischer, ..., Michael A Davies.	Illumina HiSeq 2000 Illumina MiSeq	199
EGAD00001005047	In order to characterize the T cell receptor (TCR) repertoire of DQ2.2-glut-L1-specific T cells, we performed high-throughput DNA sequencing of rearranged TCR-α and TCR-β genes of the single HLA-DQ2.2:DQ2.2-glut-L1 tetramer binding CD4+ T cells isolated from six T-cell lines (TCLs) of four Celiac disease patients.	Illumina MiSeq	6
EGAD00001005048	In order to characterize the T cell receptor (TCR) repertoire of DQ2.5-hor-3-specific T cells, we performed high-throughput DNA sequencing of rearranged TCR-α and TCR-β genes of the single HLA-DQ2.5:DQ2.5-hor-3-tetramer binding CD4+ T cells isolated from biopsies of celiac disease patients. We also sequenced the TCR of the T-cell clones (TCCs) that were generated by cloning by limited dilution and antigen-free expansion of HLA-DQ2.5:DQ2.5-hor-3-tetramer binding CD4+ T cells from biopsies of celiac disease patients.	Illumina MiSeq	14
EGAD00001005049	5000 cells of each subset of CD8 T cells (CD103-KLRG1+, CD103-KLRG1- and CD103+ from LP and CD103+ IELs) were sorted into tubes. A modified SMART protocol was used in first-strand cDNA synthesis, and TCRalpha / TCRbeta genes were amplified in two rounds of semi-nested PCR reaction, following the method described in detail in Risnes et al., 2018.	Illumina MiSeq	40
EGAD00001005050	Single-cell TCRalpha-beta sequencing of LP CD103+ CD8 T cells from the grafted/native duodenum of two donors (Ptx#1 and Ptx#2) before and 1 year after transplantation. Single cells were sorted into 96-well plates. Paired TCRalpha and TCRbeta sequences were obtained after three nested PCR with multiplexed primers covering all TCRalpha and TCRbeta V genes, as described before (Risnes et al., 2018), and original protocol in (Han et al., 2014).	Illumina MiSeq	6
EGAD00001005052	This dataset contains RNA-sequencing of Bone marrow-derived CD34+ cells from Healthy Controls (n=2) and SLE patients (n=8). SLE patients are divided into two categories based on severity: patients with moderate/mild disease (n=3) and patients with severe disease (n=5). Libraries were generated using the Illumina TruSeq Sample Preparation kit v2. Single-end 75-bp mRNA sequencing was performed on Illumina NextSeq 500. The raw fastq files are uploaded.	NextSeq 500	10
EGAD00001005053	This dataset contains 4 batches of Indonesian RNA-seq data from Mentawai, New Guinea and Sumba islands. One RNA-seq batch was prepared without Globin depletion, and three batches were Globin depleted using the Illumina Globin-Zero Gold Kit. There are 179 runs in total, including 119 unique samples. Dataset includes multiple batch control samples.	Illumina HiSeq 2500	179
EGAD00001005054	We performed single cell RNA sequencing (scRNA-seq) for 208,506 cells derived from 58 lung adenocarcinomas from 44 patients, which covers primary tumour, lymph node and brain metastases, and pleural effusion in addition to normal lung tissues and lymph nodes. The extensive single cell profiles depicted a complex cellular atlas and dynamics during lung adenocarcinoma progression which includes cancer, stromal, and immune cells in the surrounding tumor microenvironments.	Illumina HiSeq 2500	80
EGAD00001005055	The goal of this study is to investigate the prevalence and heritability of age-related clonal haemopoiesis (ARCH) in healthy elderly individuals. We will use a bespoke bait set to pull down DNA regions of interest in whole blood samples, combined with deep HiSeq sequencing. By correlating findings from each individual to their respective twin we hope to elucidate whether heritable traits influence the development of ARCH. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2019-05-31.	Illumina HiSeq 2500	1
EGAD00001005056	To better understand the pattern of genetic changes over time, we performed whole exome sequencing of sequential bone marrow samples from 9 patients taken over time, including some paired SMM/newly diagnosed MM/Relapse MM samples. Samples from 9 patients (9 controls and 53 tumors) underwent whole exome sequencing with an additional capture for the IGH, IGK, IGL, and MYC loci. DNA was obtained from either CD138+ cells from the bone marrow of smoldering myeloma patients through time (tumor) or from stem cell harvests or peripheral blood cells from the same patient (control). 100 ng of DNA was fragmented, end-repaired, and adapters ligated using NimbleGen's MedExome. After PCR amplification hybridized libraries underwent further amplification before being sequenced on a NextSeq500 (Illumina) using 75 bp paired end reads.	NextSeq 500	62
EGAD00001005057		Illumina HiSeq 4000	6
EGAD00001005058	Identify and track clonal evolution of clones in consecutive human chronic lymphocytic leukemia samples identified by whole exome sequencing.	Illumina Genome Analyzer II	79
EGAD00001005059	This dataset reports whole genome sequences for 82 individuals from different populations from Mentawai, New Guinea, Sumatra and Sumba islands.	HiSeq X Ten	82
EGAD00001005060	May 2019 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	HiSeq X Ten Illumina HiSeq 2500	8
EGAD00001005061	Bam files for 124 samples (62 tumor vs blood pairs); Whole Genome Sequencing performed on Illumina HiSeq X Ten	HiSeq X Ten	124
EGAD00001005062	We performed shallow coverage whole genome sequencing on 147 glioma samples and analyzed their copy number profiles. The coverage of each sample is about 2x. Data are presented as VCF files which describe the copy number segments and their log2 ratio.		147
EGAD00001005063	Tumor and matching normal tissues were collected from 8 patients and organoids were derived from each tissue. Whole exome libraries were prepared for tumor tissue, normal tissue, tumor organoids and normal organoids, and paired-end sequencing was performed using Illumina Novaseq 6000 system. Only tumor and matching normal tissues were sequenced for the patients without available organoids.	Illumina NovaSeq 6000	24
EGAD00001005064	Bronchoscopies were collected from healthy and asthma volunteers. Cohort inclusion criteria for all subjects were: age between 40 – 65 years and history of smoking < 10 pack years. For the asthmatics, inclusion criteria were: age of onset of asthmatic symptoms ≤12 years, documented history of asthma, use of inhaled corticosteroids with(out) β2-agonists due to respiratory symptoms and a positive provocation test (i.e. PC20 methacholine ≤8mg/ml with 2-minute protocol). For the non-asthmatic controls, the following criteria were essential for inclusion: absent history of asthma, no use of asthma-related medication, a negative provocation test (i.e. PC20 methacholine >8 mg/ml and adenosine 5'-monophosphate >320 mg/ml with 2-minute protocol), no pulmonary obstruction (i.e. FEV1/FVC ≥70%) and absence of lung function impairment (i.e. FEV1 ≥80% predicted). Asthmatics stopped inhaled corticosteroid use 6 weeks prior to all tests. All subjects were clinically characterised with pulmonary function and provocation tests, blood samples were drawn, and finally subjects underwent a bronchoscopy under sedation. If a subject developed upper respiratory symptoms, bronchoscopy was postponed for ≥6 weeks. Fibreoptic bronchoscopy was performed using a standardised protocol during conscious sedation. Six macroscopically adequate endobronchial biopsies were collected for this study, located between the 3rd and 6th generation of the right lower and middle lobe. Extracted biopsies were processed directly thereafter, with a maximum of one hour delay. The medical ethics committee of the Groningen University Medical Center Groningen approved the study, and all subjects gave their written informed consent.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	4087
EGAD00001005065	Human lung tissue was obtained from deceased organ donors from whom organs were being retrieved for transplantation. Informed consent for the use of tissue was obtained from the donors’ families (REC reference: 15/EE/0152 NRES Committee East of England - Cambridge South). Fresh tissue from the peripheral parenchyma of the left lower lobe or lower right lobe of the lung was excised within 60 minutes of circulatory arrest and preserved in University of Wisconsin (UW) organ preservation solution (Belzer UW® Cold Storage Solution, Bridge to Life, USA) until processing.	Illumina HiSeq 4000	11
EGAD00001005066	Four iPSC lines were sequenced by WGS. One of them carries a modified MYBPC3 gene.	HiSeq X Five	4
EGAD00001005069	Whole genome and transcriptome sequencing of a pancreatic tumor harboring a RASGRP1 gene fusion	HiSeq X Ten Illumina HiSeq 4000	2
EGAD00001005070	Non-deduplicated bam files comprising Illumina HiSeq2500 SE100 low coverage whole genome data for 30 pre-treatment (BL) cfDNA samples and 20 matched post-treatment (PD) cfDNA samples.	Illumina HiSeq 2500	50
EGAD00001005071	We showed that mice in which Dnase1l3 had been deleted showed aberrations in the fragmentation of plasma DNA. We also observed a change in the ranked frequencies of end motifs of plasma DNA caused by the Dnase1l3 deletion.	NextSeq 500	41
EGAD00001005072	We have been applying whole genome and transcriptome sequencing across metastases collected during post mortem. Herein we show the findings for the first such patient.	HiSeq X Ten	3
EGAD00001005073	Despite multiple large-scale sequencing studies offering substantial insight into the genomic landscape of cutaneous melanoma, the molecular events surrounding disease progression and the resulting molecular heterogeneity between metastases have not been fully elucidated. We have been applying whole genome and transcriptome sequencing across metastases collected during post mortem. Herein we show the findings for the first such patient. This is a targeted pulldown validation in support of the whole-genome sequencing analysis of the metastatic tumours and targeted pulldown of the primary tumour.	Illumina HiSeq 2500	15
EGAD00001005074	This dataset includes 48 bam files, including those of 24 tumors and 24 paired normal samples.	Illumina HiSeq 2500	48
EGAD00001005075	Deep sequencing of viral samples (average ~9,000x coverage) from patients chronically infected with Hepatitis B (HBV). Whole HBV genome sequencing of 1467 patients (1102 in discovery and 365 in validation cohort) chronically infected with HBV at baseline. The patient population contained HBV genotypes A (98), B (285), C (716), D (356), E (7), and F (5) with 977 HBeAg-positive and 490 HBeAg-negative patients.	Illumina MiSeq	1467
EGAD00001005076	This dataset is for TrypanoGEN Phase 1: Variant discovery, and includes 233 samples sequenced to approximately 10X coverage. Samples are from Guinea, Côte d’Ivoire, DRC and Uganda and were sequenced using Illumina HiSeq 2500.		233
EGAD00001005077	This dataset contains 3 GBM stem cell samples profiled by RNA-seq. Two of those samples have been profiled with WGS as well (with matched blood WGS data available)	HiSeq X Five Illumina HiSeq 2500 NextSeq 500	7
EGAD00001005078	This dataset includes 139 bam files of mRNA sequencing. All subjects are tumor samples of pediatric acute myeloid leukemia.	Illumina HiSeq 2000	139
EGAD00001005079	We want to investigate mosaic mutations as a cause of childhood IBD. This dataset contains all the data available for this study on 2019-06-10.	Illumina HiSeq 4000	28
EGAD00001005080	This study involves mutagenizing a range of different cell lines with ENU to identify those mutations which engender resistance to targeted treatment. This dataset contains all the data available for this study on 2019-06-10.	Illumina HiSeq 2500	16
EGAD00001005081	This study involves mutagenizing 11-18 with ENU to identify those mutations which engender resistance to targeted treatment. This dataset contains all the data available for this study on 2019-06-10.	Illumina HiSeq 2500	120
EGAD00001005082	Exome Sequencing in a set of Asian Head and Neck cancer cell lines, to identify mutations that can be used to genomically classify the cell lines. This dataset contains all the data available for this study on 2019-06-10.	Illumina HiSeq 2500	21
EGAD00001005083	300-Obese cohort, Nijmegen, the Netherlands. Dataset contains gut microbiome data generated by metagenomic sequencing.	Illumina HiSeq 2000	297
EGAD00001005084	Whole genome sequencing of participants from the INTERVAL study. This dataset contains all the data available for this study on 2019-06-12.	HiSeq X Ten	5112
EGAD00001005085	15x Whole Genome Sequencing of 15,000 individuals from the INTERVAL study cohort, phase II. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2019-06-12.	HiSeq X Ten	5592
EGAD00001005086	15x Whole Genome Sequencing of 15,000 individuals from the INTERVAL study cohort, phase III. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2019-06-12.	HiSeq X Ten	-
EGAD00001005087	Multi-omic data for lung neuroendocrine neoplasms, including the first multi-omic sequencing data for the understudied lung atypical carcinoids. The data includes Whole-exomes, whole-genomes, RNA-seq, and EPIC 850K methylation array data.	HiSeq X Five Illumina HiSeq 2000	23
EGAD00001005088		NextSeq 500	236
EGAD00001005089	RNAseq of U251 and two ID1 gene knockouts	Illumina HiSeq 2000	3
EGAD00001005092	Whole transcriptome (n=4) and targeted RNA sequencing (n=15) .bam files of infantile glioma samples reported on at the Hospital for Sick Children	Illumina HiSeq 2500 NextSeq 550	19
EGAD00001005093		Illumina HiSeq 4000	118
EGAD00001005095	This dataset comprises 1440 whole genome sequenced samples from the Medical Genome Reference Bank. https://sgc.garvan.org.au/initiatives/mgrb The files are provided in cram format, aligned to hs37d5 with decoys, with no further processing applied. The dataset also contains phenotype information for each sample.		1440
EGAD00001005097	Cancer and germline exomes, and cancer RNA-seq consisting of FASTQ paired-end reads from melanoma and lung cancer samples	Illumina HiSeq 2500	22
EGAD00001005098	The dataset consists of forty VCF files produced by variant calling with the CaVEMan algorithm on matched BAM files (tumor and normal) from RRMM patients. The BAM files were obtained from whole exome sequencing.		40
EGAD00001005099	The dataset consists of two patient-derived xenograft models of myxoid liposarcoma, one sensitive and one resistant to trabectedin. Both models underwent the same treatment schedule with trabectedin (three time points and one treatment with doxorubicin). We performed genomic profiling using the Agilent OneSeq assay.	NextSeq 500	41
EGAD00001005100	Biopsies from visceral adipose tissue from the omental depot (OAT) were obtained from five obese individuals and one lean donor with participant informed consent obtained after the nature and possible consequences of the studies were explained under protocols approved by the Institutional Review Boards of the Perelman School of Medicine at the University of Pennsylvania, the Children’s Hospital of Philadelphia, or the Tel Aviv Sourasky Medical Center. The obese donors underwent bariatric surgery, the lean donor underwent cholecystectomy. OAT samples were placed in 1 mL of DMEM, and finely minced under sterile conditions before digestion in 50 mL of DMEM with 3 mg/1 mL collagenase IV (Gibco). Samples were incubated at 37°C in a rotating oven for 20-60 min. Adipocyte and stromal vascular fractions (SVF) were separated by centrifugation, and red blood cells (RBCs) were removed from the SVF by histopaque gradient (Sigma). Single-cell RNA-sequencing libraries were prepared using the MARS-seq pipeline, and sequenced on the MiSeq 500 or HiSeq 2500 Sequencing System (Illumina).	Illumina MiSeq	23
EGAD00001005101	Biopsies from visceral adipose tissue from the omental depot (OAT) were obtained from an obese individual with participant informed consent obtained after the nature and possible consequences of the studies were explained under protocols approved by the Institutional Review Boards of the Perelman School of Medicine at the University of Pennsylvania, the Children’s Hospital of Philadelphia, or the Tel Aviv Sourasky Medical Center. The obese donor underwent bariatric surgery, the lean donor underwent cholecystectomy. OAT samples were placed in 1 mL of DMEM, and finely minced under sterile conditions before digestion in 50 mL of DMEM with 3 mg/1 mL collagenase IV (Gibco). Samples were incubated at 37°C in a rotating oven for 20-60 min. Adipocyte and stromal vascular fractions (SVF) were separated by centrifugation, and red blood cells (RBCs) were removed from the SVF by histopaque gradient (Sigma). Single-cell RNA-sequencing libraries were prepared using the Chromium platform (10x genomics), and sequenced on the MiSeq 500 or HiSeq 2500 Sequencing System (Illumina).	Illumina HiSeq 2500	2
EGAD00001005103	RNA sequencing was performed on 54 bone marrow samples at diagnosis of paediatric patients with B lymphoblastic leukemia.	Illumina HiSeq 2500 NextSeq 500	54
EGAD00001005105	Sample set of 74 whole-exome sequencing samples from sporadic Burkitt lymphoma patients from the UK. 33 of these samples have matched constitutional data, giving a total number of 107 samples	unspecified	107
EGAD00001005107	To identify novel causes of hereditary thrombocytopenia, we performed a genetic association analysis of whole-genome sequencing (WGS) data from 13,037 individuals enrolled in the NIHR BioResource, including 233 cases with isolated thrombocytopenia. We found an association between rare variants in the transcription factor (TF)-encoding gene IKZF5 and thrombocytopenia. We report five causal missense variants in or near IKZF5 zinc fingers (Znfs), of which two occurred de novo and three co-segregated in three pedigrees. A canonical DNA-Znf binding model predicts that three of the variants alter DNA recognition. Expression studies showed that chromatin binding was disrupted in mutant compared to wild-type (WT) IKZF5 and electron microscopy (EM) revealed a reduced quantity of alpha granules in normally sized platelets. Proplatelet formation (PPF) was reduced in megakaryocytes (MKs) from seven cases relative to six controls. Comparison of RNA-seq data from platelets, monocytes, neutrophils and CD4+ T-cells from three cases and 14 healthy controls showed 1,194 differentially expressed genes (DEGs) in platelets but only four DEGs in each of the other blood cell types. In conclusion, IKZF5 is a novel transcriptional regulator of megakaryopoiesis and the eighth transcription factor associated with dominant thrombocytopenia in humans.	Illumina HiSeq 4000	51
EGAD00001005109	This dataset contains primary raw data of whole-genome sequencing, whole-genome bisulfite sequencing, ATAC-seq, ChIP-seq (ChIP-mentation) of histone variants and modifications, as well as RNA-seq of giant cell tumor of bone tissue and primary cell line samples.	HiSeq X Ten Illumina HiSeq 2000	29
EGAD00001005111	Most patients with late stage high-grade serous ovarian cancer (HGSOC) initially respond to chemotherapy but inevitably relapse and develop resistance, highlighting the need for novel therapies to improve patient outcomes. The MEK/ERK pathway is activated in a large subset of HGSOC, thus making it an attractive therapeutic target. Here, we systematically evaluated the extent of MEK/ERK pathway activation and efficacy of pathway inhibition in a large panel of well-annotated HGSOC patient-derived xenograft (PDX) models. The vast majority of models were nonresponsive to the MEK inhibitor cobimetinib (GDC-0973) despite effective pathway inhibition. Proteomic analyses of adaptive responses to GDC-0973 revealed that GDC-0973 upregulated the pro-apoptotic protein BIM, thus priming the cells for apoptosis regulated by BCL2-family proteins. Indeed, combination of both MEK inhibitor and dual BCL-2/XL inhibitor (ABT-263) significantly reduced cell number, increased cell death and displayed synergy in vitro in most models. In vivo, the GDC-0973 and ABT-263 combination was well tolerated and resulted in greater tumor growth inhibition than single agents. Detailed proteomic and correlation analyses identified two subsets of responsive models – those with high BIM at baseline that was increased with MEK inhibition and those with low basal Bim and high pERK levels. Models with low BIM and low pERK were non-responsive. Our findings demonstrate that combined MEK and BCL-2/XL inhibition has therapeutic activity in HGSOC models and provide a mechanistic rationale for clinical evaluation of this drug combination as well as the assessment of the extent to which BIM and/or pERK levels predict drug combination effectiveness in chemoresistant HGSOC.	Illumina HiSeq 2500 Illumina MiSeq	14
EGAD00001005112	The study focus was differential expression in bronchial biopsies between persistent asthma, asthma in remission and healthy controls using RNAseq. There were 184 samples that passed QC. RNA samples were processed using the TruSeq Stranded Total RNA Sample Preparation Kit (Illumina, San Diego, CA), using an automated procedure in a Caliper Sciclone NGS Workstation (PerkinElmer, Waltham, MA). In this procedure, all cytoplasmic and mitochondrial rRNA was removed (RiboZero Gold kit). The obtained cDNA fragment libraries were loaded in pools of multiple samples onto an Illumina HiSeq2500 sequencer using default parameters for paired-end sequencing (2 × 100 bp). Data are available as 221 pairs of FASTQ-files. Note that several samples are associated with multiple sequence runs.	Illumina HiSeq 2500	184
EGAD00001005113	50 samples of 16 individuals with Gastrointestinal Tumor. Patients were sequenced in various combinations of WGS, Exome and RNA sequencing.	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000	50
EGAD00001005114	A targeted gene panel that covers coding, noncoding, and short tandem repeat regions improves the diagnosis of patients with neurodegenerative diseases	NextSeq 500	136
EGAD00001005115	Additional unpublished RNA-seq data generated in conjunction with our multi-tissue ChIP-seq project.	Illumina HiSeq 2500	2
EGAD00001005116	Panel-based next-generation sequencing data of 150 human surgical liver samples from Caucasian donors. The panel was designed for 340 ADME (absorption, distribution, metabolism and excretion) and ADME-related genes. NGS was carried out on the Illumina HiSeq2500 system (Illumina Inc., San Diego, CA, United States) at high depth with 2 × 100 bps paired-end reads. Variants were called using samtools and varscan (2.3.5). Data on n=15,727 filtered variants for the 150 patients are comprised in one vcf file.		150
EGAD00001005118	The dataset contains Bam files from DigiPico runs as well as bulk sequencing data used in the "A highly accurate platform for clone-specific mutation discovery enables the study of active mutational processes" publication.	HiSeq X Ten Illumina HiSeq 4000 NextSeq 550	22
EGAD00001005120	Whole genome sequencing of AML blood or bone marrow at presentation and remission for 5 patients. Relapse samples are included for 2 patients, totaling 12 WGS BAM files.	NextSeq 500	12
EGAD00001005121	MASQ targeted amplicon sequencing of AML blood or bone marrow at presentation, and relapse, when available, for 5 patients. Remission samples of both blood and bone marrow are included for 5 patients. Multiple batches (b1,b2) are used for 2 patients. There are 25 assays of AML data. MASQ data demonstrating sensitivity, input range, and batch size are also included, as 12 assays. All data is provided in paired FASTQ files.	NextSeq 500	37
EGAD00001005122	Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Leber Hereditary Optic Neuropathy (LHON) Rare Disease domain	Illumina HiSeq 2000	-
EGAD00001005123	Short read whole genome sequencing (WGS) CRAM files for the NIHR BioResource Rare Diseases WGS project – Participants from the Ehlers-Danlos (ED) and ED-like Syndromes (EDS) Rare Disease domain.	Illumina HiSeq 2000	-
EGAD00001005124	ChIP-seq and RNA-seq of glioblastoma initiating cells and their differentiated counterparts with and without inhibition/knockdown of KDMs.	Illumina HiSeq 2000	90
EGAD00001005125	PARN sequences of Patients 1 and 2 carrying mutations as described in Benyelles et al. (EMBO Molecular Medicine, 2019)	Illumina HiSeq 2500	2
EGAD00001005126	All sequencing was performed within the DNAlink (Korea) by using the Solexa sequencing technology (Illumina, San Diego, CA). 1 ug of genomic DNA was sheared to an average size of 150 bp by using the Covaris System. The libraries were prepared by using TruSeq DNA Sample Prep Kit (Illumina). The purified DNA library was hybridized with the SureSelect Human All Exon V3 probe set (Agilent Technologies) to capture 50 Mb of targeted exons following the manufacturer’s instructions. Exome capture was carried out using the Agilent SureSelect Human All Exon 50Mb Kit. The captured exome libraries were sequenced on the Illumina HiSeq2000 using the manufacturer’s recommended protocols.	Illumina HiSeq 2000	224
EGAD00001005127	Single cell transcriptome atlas of immune cells in human small intestine and in Celiac disease	NextSeq 500	45
EGAD00001005128	We profiled two human fetal brainstem specimens at 17 and 19 post-conception weeks by single-cell RNA-seq using 10X Chromium Single Cell 3'. The BAM files are provided.	Illumina HiSeq 4000	2
EGAD00001005129	We profiled 11 patient tumor samples by single-cell and single-nuclei RNA-seq using 10X Chromium 3'. These include samples from the following entities: WNT-subtype medulloblastoma (N=3), embryonal tumors with multilayered rosettes (N=3), and atypical teratoid-rhabdoid tumors (N=5). The BAM files are provided.	Illumina HiSeq 4000 unspecified	11
EGAD00001005130	We profiled 43 normal human adult brain and 11 normal human fetal brain specimens by bulk RNA-seq. The raw fastqs are provided.	unspecified	54
EGAD00001005131	We profiled 186 patient tumor samples by bulk RNA-seq. These include 38 embryonal brain tumors, 101 high-grade gliomas, 24 low-grade gliomas, 10 medulloblastoma and 13 matched normals. The raw fastqs are provided.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000 unspecified	186
EGAD00001005133	RNA-sequencing was carried out on ascitic fluid-isolated mesothelial cells from low-grade serous ovarian cancer patients, high-grade serous ovarian cancer patients, chemotherapy-treated high-grade serous ovarian cancer patients and control mesothelial cells obtained from non-oncologic patients to identify differentially expressed genes associated with the mesothelial-to-mesenchymal transition process. The dataset contains 18 samples: - Control mesothelial cells: 4 samples - Group 1, high-grade serous ovarian cancer patients: 3 samples - Group 2, chemotherapy-treated high-grade serous ovarian cancer patients: 5 samples - Group 3, low-grade serous ovarian cancer patients: 6 samples	NextSeq 500	18
EGAD00001005134	We investigated the somatic genetic basis of Wilms' tumour and found complex phylogenetic relations between tumours	HiSeq X Ten	20
EGAD00001005135	We investigated the somatic genetic basis of Wilms' tumour and found complex phylogenetic relations between tumours	Illumina HiSeq 2500	59
EGAD00001005136	We investigated the somatic genetic basis of Wilms' tumour and found complex phylogenetic relations between tumours	Illumina HiSeq 4000	15
EGAD00001005137	Whole genome sequencing data of parent blood samples. Single cell full-length RNA-seq and PBAT-Seq data of in vitro cultured D6 to D14 human embryos.	Illumina HiSeq 4000	638
EGAD00001005138	Exome sequencing of 277 rainforest hunter-gatherers (RHG) and neighbouring farmers (AGR) from Central Africa was performed based on the Nextera Rapid Capture Expanded Exome Kit (62-Mb content) with the Illumina HiSeq 2500. After QC filters, exomes of 266 unrelated individuals were obtained at high coverage (Lopez et al., Curr Biol 2019).	Illumina HiSeq 2500	277
EGAD00001005139	Exome sequencing of 20 rainforest hunter-gatherers (RHG) and 20 neighbouring farmers (AGR) from western central Africa was performed using 101-bp paired-end reads on Illumina HiSeq 2000. All individuals presented very low rates of missing values ranging from 0.5% to 4%, and a mean depth of coverage of 6.5× (ranging from 4× to 13×)(Lopez et al., Curr Biol 2019).	Illumina HiSeq 2000	40
EGAD00001005140	In order to elucidate the biological pathways altered by sphingolipid modulation with N-(4-hydroxyphenyl) retinamide (4HPR) treatment in human HSPC that may contribute to the restraint in proliferation while promoting persistence of HSC self-renewal, as well as determine the mechanism of synergy in enhancement of HSC self-renewal with CB CD34+ agonists UM171 and StemRegenin 1 (SR1), we performed RNA-sequencing (RNA-Seq) of 3 pools of lin-CB cells following 2 or 4 days with DMSO, 4HPR, UM171+SR1 or 3-Factor (4HPR+UM171+SR1). We found that modulation of sphingolipid metabolism regulates self-renewal by activating coordinated stress pathways that coalesce on endoplasmic reticulum stress and autophagy programs.	Illumina HiSeq 2500	25
EGAD00001005141	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005142	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005143	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005144	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005145	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005146	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 24 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005147	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005148	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005149	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005150	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 36 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005151	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 36 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005152	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 24 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005153	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 24 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005154	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005155	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005156	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 24 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005157	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 24 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005158	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005159	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005160	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005161	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005162	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005163	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005164	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005165	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 3 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005166	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005167	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005168	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005169	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005170	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005171	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005172	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005173	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005174	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005175	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005176	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005177	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005178	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005179	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005180	Single-cell RNA-sequencing data generated using the 10X genomics chromium platform, comprising patient derived xenograft, cell line, and primary tumour data. This includes different digestion conditions (37C collagenase vs 6C cold protease) and FACS sorting for live, dying, and dead cells. 12 samples; filetype=fastq	Illumina HiSeq 2000	1
EGAD00001005181	Whole genome sequencing dataset of 54 tumor/normal samples of 25 cHCC-ICC cases.	Illumina HiSeq 4000	54
EGAD00001005182	The RNA-seq data of 97 tumor samples from 77 cHCC-ICC cases; data from two other HCC tumor samples are also included.	Illumina HiSeq 4000	99
EGAD00001005183	The whole exome sequencing data of 291 tumor/normal samples from 121 cHCC-ICC cases.	Illumina HiSeq 4000	291
EGAD00001005184	Long-read WGS of three cancer cell lines	PromethION	3
EGAD00001005185	RNA sequencing on cetuximab treated, untreated and release samples of one metastatic colorectal xenograft. 3 cetuximab treated samples, 3 placebo and 3 release - paired fastq.	Illumina HiSeq 2500	9
EGAD00001005186	WES on cetuximab treated, untreated and release samples of two metastatic colorectal xenografts. First case: 3 cetuximab treated samples, 3 placebo and 3 release. Second case: 2 cetuximab treated samples, 2 placebo and 3 release. Paired fastq.	Illumina HiSeq 2500	16
EGAD00001005187	RNA was isolated from purified human CD8 cells that were incubated with anti-HER2/CD3 TDB in the presence of SK-BR-3 cells. Sequencing libraries were generated and submitted for transcriptome profiling by high-throughput sequencing. Experiments were performed in triplicates for anti-HER2/CD3 TDB treatment and control. This Dataset is associated with the following ArrayExpress Experiment: E-MTAB-8211 - The effect of anti-HER2/CD3 TDB on transcription in human CD8 T cells (bulk)	Illumina HiSeq 4000	6
EGAD00001005188	Single-cell RNA-seq libraries were generated from human PBMCs that were incubated with anti-HER2/CD3 TDB in the presence of KPL-4 cells. This dataset is linked with the following ArrayExpress Experiment: E-MTAB-8212 - The effect of anti-HER2/CD3 TDB on transcription in human PBMCs (single-cell)	Illumina HiSeq 4000	4
EGAD00001005189	Here we provide a catalogue of variants called after sequencing the exomes of 50 Aboriginal individuals from the Northern Territory (NT) of Australia and compare these to 72 previously published exomes from a Western Australian (WA) population of Martu origin. Sequence data for both NT and WA samples were processed using an ‘intersect-then-combine’ (ITC) approach, using GATK and SAMtools to call variants. The data is provided as 2 VCF files, one for the WA population and one for the NT population.		122
EGAD00001005190	RNASeq for Genomic Analysis of Mucinous Tumours (GAMuT)	NextSeq 550	109
EGAD00001005191	To evaluate 3 different tissue dissociation protocols or fresh vs. frozen cell preparations, we performed single-cell RNA sequencing on cancer or distant normal tissue dissociates from 2 colorectal cancer patients. Total 18,409 cells from 10 sample preparations were analyzed (5 primary colorectal cancer and 5 matched normal mucosa). The results suggest highly consistent cellular proportions were recovered with different sample preparation methods.	Illumina HiSeq 2500	10
EGAD00001005192	BLUEPRINT EpiVar Whole Genome Sequencing Phase 2 genotypes		197
EGAD00001005193	That tobacco smoking causes lung cancer is well-established, but we lack quantitative understanding of its effects on genomes of normal bronchial epithelium. We sequenced whole genomes of 632 colonies derived from single bronchial epithelial cells across 16 subjects. Tobacco smoking is the major influence on mutation burden, adding 1000-10,000+ mutations/cell, massively increasing both between-subject and within-subject variance, and generating several distinct signatures of substitutions and indels. A population of cells in subjects with smoking history had mutation burdens equivalent to that expected for never-smokers: these cells lacked tobacco-specific mutational signatures, were four-fold more frequent in ex-smokers than current smokers, and had significantly longer telomeres than their more mutated counterparts. Driver mutations increased in frequency with age, affecting 4-14% of cells in middle-aged never-smokers. In current smokers, ≥25% of cells carried driver mutations and 0-6% cells had 2 or even 3 drivers. Thus, tobacco smoking increases mutation burden, cell-to-cell heterogeneity and driver mutations, but quitting promotes replenishment of bronchial epithelium from mitotically quiescent cells that have avoided tobacco mutagenesis.	HiSeq X Ten	644
EGAD00001005194	15x whole genome sequencing in samples from the isolated population of Orkney. This dataset contains all the data available for this study on 2019-07-23.	HiSeq X Ten	1360
EGAD00001005195	Whole exome sequencing of CD19- relapses in CARPALL study	Illumina HiSeq 3000	10
EGAD00001005196	This data set concerns DNA copy number alterations and mutation data from 30 IBD-associated dysplastic lesions and 13 IBD-associated cancers. DNA was isolated from formalin-fixed, paraffin-embedded material. Whole-genome shallow seq and Truseq amplicon cancer panel (Illumina) were used for detection of DNA copy number alterations and gene mutations, respectively.	Illumina HiSeq 2500	43
EGAD00001005197	Improving the understanding of cardiometabolic syndrome pathophysiology and its relationship with thrombosis are ongoing healthcare challenges. Using plasma biomarkers analysis coupled with the transcriptional and epigenetic characterisation of cell types involved in thrombosis, obtained from two extreme phenotype groups (obese and lipodystrophy) and comparing these to lean individuals and blood donors, the present study identifies the molecular mechanisms at play, highlighting patterns of abnormal activation in innate immune phagocytic cells and shows that extreme phenotype groups could be distinguished from lean individuals, and from each other, across all data layers. The characterisation of the same obese group, six months after bariatric surgery shows the loss of the patterns of abnormal activation of innate immune cells previously observed. However, rather than reverting to the gene expression landscape of lean individuals, this occurs via the establishment of novel gene expression landscapes. Netosis and its control mechanisms emerge amongst the pathways that show an improvement after surgical intervention. Taken together, by integrating across data layers, the observed molecular and metabolic differences form a disease signature that is able to discriminate, amongst the blood donors, those individuals with a higher likelihood of having cardiometabolic syndrome, even when not presenting with the classic features.	Illumina HiSeq 3000 Illumina HiSeq 4000	-
EGAD00001005198	To understand intrinsic cancer cell signatures and the surrounding microenvironment and their interactions, we performed single-cell RNA sequencing on 63,689 cells from 23 patients with 23 primary colorectal cancer and 10 matched normal mucosa samples. Analysis of primary colorectal cancer and normal mucosa samples shows a comprehensive cellular landscape of colon cancer, which is a valuable resource for the development of therapeutic strategies.	Illumina HiSeq 4000	33
EGAD00001005199	BLUEPRINT WP10 Quantitative Trait Loci (QTLs) Phase 2 full summary statistics data include five molecular traits (eQTL, hQTL(H3K27ac), hQTL(H3K4me1), mQTL, and psiQTL) for three primary blood cells (Monocytes, Neutrophils, and T-cells). Each full summary statistics file contains the associations for all tested variants for each phenotype ID.		197
EGAD00001005200	BLUEPRINT WP10 Quantitative Trait Loci (QTLs) Phase 2 summary statistics of most significant association data include five molecular traits (eQTL, hQTL(H3K27ac), hQTL(H3K4me1), mQTL, and psiQTL) for three primary blood cells (Monocytes, Neutrophils, and T-cells). Each summary statistics file contains the most significant association for each phenotype ID.		197
EGAD00001005201	13 ATAC-Seq datasets of human pancreatic islets from 13 donors	Illumina HiSeq 2500	13
EGAD00001005202	6 ChIP-Seq datasets of Mediator in human pancreatic islets from 6 donors	Illumina HiSeq 2500	6
EGAD00001005203	3 ChIP-Seq datasets of Cohesin in human pancreatic islets from 3 donors	Illumina HiSeq 2500	3
EGAD00001005204	14 ChIP-Seq datasets of H3K27ac in human pancreatic islets from 14 donors, where islets were treated in high (11mM) glucose conditions. Sample IDs HI-129, HI-130, HI-131, HI-132, HI-135, HI-137 and HI-152 were also cultured in low glucose conditions.	Illumina Genome Analyzer IIx Illumina HiSeq 2500	14
EGAD00001005205	7 H3K27ac ChIP-Seq datasets from 7 donors where islets were treated in low (4mM) glucose conditions.	Illumina HiSeq 2500	7
EGAD00001005206	4 Promoter-Capture Hi-C datasets of human pancreatic islets from 4 human islet donors	Illumina HiSeq 2500	4
EGAD00001005207	7 RNA-Seq datasets in human pancreatic islets from 7 donors, where islets were treated in high (11mM) glucose conditions	Illumina HiSeq 2500	7
EGAD00001005208	7 RNA-Seq datasets in human pancreatic islets from 7 donors, where islets were treated in low (4mM) glucose conditions.	Illumina HiSeq 2500	7
EGAD00001005209	Input dataset of human islets. To be used in conjunction with Cohesin and Mediator ChIP-Seq datasets.	Illumina HiSeq 2500	1
EGAD00001005210	2 Circular chromosome conformation capture (4C-Seq) datasets of the human beta cell line EndoC-bh1.	Illumina HiSeq 2500	1
EGAD00001005211	This data includes whole exome sequencing of matched normal-tumor samples of patients who have received immunotherapy. '-1' refers to matched normal sample and '-2' refers to matched tumor sample.	Illumina HiSeq 2500	294
EGAD00001005212	WGS with linked reads of pediatric glioblastoma. For each patient, blood and tumor tissue were sequenced. For two patients, we also provide sequencing data for the blood of their parents.	HiSeq X Ten	26
EGAD00001005214	All normal somatic cells are thought to acquire mutations but understanding of the rates, patterns, causes and consequences of somatic mutation in normal cells is limited. Uterine endometrium adopts multiple physiological states over a lifetime and is lined by a gland-forming epithelium. Whole genome sequencing of normal endometrial glands from women aged 19 to 81 years showed them to be clonal cell populations derived from recent common ancestors, with total mutation burdens that increase with age at ~29 base substitutions/year and which are many-fold lower than endometrial cancers. Normal endometrial glands frequently carry driver mutations in cancer genes. Driver mutation burdens increase with age and correlate negatively with parity. Phylogenetic trees of normal endometrial glands constructed using whole genome sequences indicated that clones with drivers often originate during the first decades and spread to colonise the endometrial epithelial lining. The results show that driver mutation landscapes differ between normal cell types, perhaps shaped by differences in normal tissue physiology, and suggest that the procession of neoplastic changes leading to endometrial cancer is initiated early in life.	HiSeq X Ten	-
EGAD00001005215	The aim of this work is to apply an integrated systems approach to understand the biological underpinnings of large joint (hip and knee) osteoarthritis which culminates in the need for total joint replacement (TJR). We will obtain diseased and non-diseased cartilage as well as other disease-relevant tissue following TJR, coupled with a blood sample. We will generate genotype data and will characterise the pairs of diseased and non-diseased tissue samples in terms of methylation, transcription (RNASeq) and expression (quantitative proteomics). We will apply integrative approaches to combine information across the omics levels to characterise genes, pathways, and networks that underlie osteoarthritis progression. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2019-08-01.	Illumina HiSeq 2500	210
EGAD00001005216	Gene expression from human iPSC derived motor neurons.	NextSeq 500	6
EGAD00001005217	RNA sequencing of 31 pancreatic cancer organoids	NextSeq 500	31
EGAD00001005221	Dataset contains Exome sequencing data (aligned and base quality score recalibrated BAM files) for 236 tumor samples + matched normal from blood, collected from 21 patients with adult diffuse glioma. The majority of these samples were spatially-mapped during sample collection, enabling the genomic information derived from them to be mapped in 3D space.		236
EGAD00001005222	Dataset includes 160 double-stranded RNAseq libraries collected from 16 patients with adult diffuse glioma. The majority of these samples were spatially-mapped during sample collection, enabling the genomic information derived from them to be mapped in 3D space.		160
EGAD00001005223	Trichoblastoma (TB): Exome; Merkel cell carcinoma (MCC): Genome and Exome; Healthy tissue (HT): Exome	Illumina NovaSeq 6000	3
EGAD00001005224	Whole transcriptome, strand-directional RNAseq data for paired primary and recurrent samples from patients with glioblastoma	Illumina HiSeq 2500	22
EGAD00001005225	This dataset contains RNA sequencing data for 20 intra/extra hepatic bile duct organoids. Data is in BAM format and was processed by STAR.	NextSeq 500	20
EGAD00001005226	The cohort of 15 patients included ten patients with available tissue from the primary tumor and ≥1 metastatic site, four patients with pairs of metastases only, and one patient with an anastomotic recurrence five years after initial resection of the primary tumor	Illumina HiSeq 2000	36
EGAD00001005227	This data includes matched exome data of patients who received immunotherapy.	Illumina HiSeq 2500	120
EGAD00001005228	This dataset comprises a 2572 sample joint called vcf from the Medical Genome Reference Bank.		2572
EGAD00001005229	WGS and WTS data of a single patient diagnosed with HSTCL. Whole-genome sequencing (WGS) was performed for the tumor-normal sample. Genomic DNA from tumor tissue was extracted with QIAamp DNA Mini Kit. The DNA for the matching normal was obtained from blood or buccal swabs and purified by Blood and Cell Culture DNA Mini kit or E.Z.N.A. Tissue DNA Kit (Omega Bio-tek) according to manufacturer’s instructions. The quantity and quality were assessed by Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen) and agarose gel electrophoresis. All sequencing libraries were prepared using TruSeq Nano DNA Library Prep Kit (Illumina). Paired-end sequencing was performed on Illumina HiSeq 4000 2x151 bp read length. Whole-transcriptome sequencing (WTS): Total RNA from snap frozen EITL tumor samples was extracted using TRIzol (Invitrogen) and purified with RNeasy Mini Kit (Qiagen) according to manufacturer’s instructions. The integrity of RNA was determined by electrophoresis using 2100 Bioanalyzer (Agilent Technologies). 500 ng of total RNA was reverse transcribed with iScript cDNA Synthesis Kit (Bio-Rad, Hercules, CA, USA). Quantification was performed using SsoFast EvaGreen Supermix and CFX96 Real-Time PCR System (both Bio-Rad). Sequencing libraries were prepared using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero (Illumina) and WTS was performed on Illumina HiSeq 2500 with 2x101 bp read length.	Illumina HiSeq 2500 Illumina HiSeq 4000	2
EGAD00001005230	Whole-transcriptome sequencing (WTS) of 36 samples from patients diagnosed with NKTL. Total RNA from snap frozen EITL tumor samples was extracted using TRIzol (Invitrogen) and purified with RNeasy Mini Kit (Qiagen) according to manufacturer’s instructions. The integrity of RNA was determined by electrophoresis using 2100 Bioanalyzer (Agilent Technologies). 500 ng of total RNA was reverse transcribed with iScript cDNA Synthesis Kit (Bio-Rad, Hercules, CA, USA). Quantification was performed using SsoFast EvaGreen Supermix and CFX96 Real-Time PCR System (both Bio-Rad). Sequencing libraries were prepared using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero (Illumina) and WTS was performed on HiSeq 2500 and HiSeq 3000 (Illumina) with 2x101 bp and 2x151 bp read length, respectively.	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500	36
EGAD00001005231	Whole-genome sequencing (WGS) was performed for 50 pairs of tumor-normal samples from patients diagnosed with NKTL. Genomic DNA from tumor tissue was extracted with QIAamp DNA Mini Kit. The DNA for the matching normal was obtained from blood or buccal swabs and purified by Blood and Cell Culture DNA Mini kit or E.Z.N.A. Tissue DNA Kit (Omega Bio-tek) according to manufacturer’s instructions. The quantity and quality were assessed by Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen) and agarose gel electrophoresis. All sequencing libraries were prepared using TruSeq Nano DNA Library Prep Kit (Illumina). Paired-end sequencing was performed on Illumina HiSeq 2000 or HiSeq X Ten as 2x101 bp or 2x151 bp, respectively.	HiSeq X Ten Illumina HiSeq 2000	120
EGAD00001005232	Whole genome sequencing of immune cells from patients diagnosed with psoriatic arthritis. This dataset contains all the data available for this study on 2019-08-07.	HiSeq X Ten	8
EGAD00001005233	This study is a benchmarking exercise to explore potential sources of variation between different CRISPR drop out libraries. This dataset contains all the data available for this study on 2019-08-07.	Illumina HiSeq 2500	37
EGAD00001005234	The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, King's College London will characterise the mutational signatures induced by putative human carcinogens in order to identify the origins of mutational signatures found in human cancers. To achieve this, human organoid cell cultures will be exposed to a representative catalogue of known or suspected human carcinogens and mutagens and, using whole genome sequencing, the patterns of mutations induced by them will be determined. Somatic mutational signatures will be subsequently extracted by non-negative matrix factorisation methods and correlated with exposure data. Through an enhanced understanding of cancer aetiology, Mutographs' unprecedented effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. This dataset contains all the data available for this study on 2019-08-07.	HiSeq X Ten	12
EGAD00001005235	DNA and RNA isolated from FFPE blocks of patients who received immune checkpoint inhibition (ICI) are sequenced with the aim to identify therapeutically tractable vulnerabilities in tumors to prevent ICI resistance. This dataset contains all the data available for this study on 2019-08-07.	Illumina HiSeq 2500	141
EGAD00001005236	Most adults with intellectual disabilities (ID) do not undergo genetic diagnostic investigation as part of their clinical care and have 'missed the boat' with regard to the WES and WGS genetic testing that is now being provided for children with ID. There is a dramatically increased risk of psychiatric disorders in adults with ID, e.g. the risk of psychoses is 10X higher than in the general population. It remains an open question as to how much of adult ID is genetic in origin and how similar the genetic forms of adult ID are to those being diagnosed in children, in part due to survivor bias. There is also the opportunity to identify adults with treatable forms of ID, of which over 80 have been described, thus improving their clinical management. Furthermore, analysis of medical records of adults with genetic forms of ID can help to characterise the 'natural history' of individual disorders, resulting in more accurate prognoses for diagnosed children and identifying opportunities for improved management and possibly therapeutic intervention (e.g. optimal anti-epileptic therapy). Here we propose to exome sequence (to ~50X coverage) 200 adults with ID and co-morbid psychiatric disorders. This cohort has previously been assayed with chromosomal microarrays (Wolfe et al 2017 EJHG, 25, 66-72) identifying a diagnostic yield of ~11% which is comparable to the CNV diagnostic yield in various child ID cohorts (10-15%). The authors observed no substantive biases in diagnostic yield between different psychiatric diagnostic classes. The WES data will be analysed using the diagnostic workflows developed in the DDD study to ensure comparability between child and adult ID datasets. This study is intended as a pilot study to demonstrate the value of WES in adults with ID. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2019-08-07.	Illumina HiSeq 4000	200
EGAD00001005237	Whole exome sequencing and RNAseq data.	Illumina HiSeq 2500 unspecified	827
EGAD00001005238	RNA-seq of primary and metastatic sites from highly clinically annotated HGSC samples. Samples were obtained pre-treatment based on a laparoscopic triage algorithm from patients who underwent R0 tumor debulking, or received neoadjuvant chemotherapy (NACT) with excellent (ER) or poor response (PR).	Illumina HiSeq 2000	74
EGAD00001005239	T200 cancer panel sequencing on primary and metastatic sites from highly clinically annotated HGSC samples. Samples were obtained pre-treatment based on a laparoscopic triage algorithm from patients who underwent R0 tumor debulking, or received neoadjuvant chemotherapy (NACT) with excellent (ER) or poor response (PR).	Illumina HiSeq 2000	75
EGAD00001005240	The diversity and heterogeneity within high-grade serous ovarian cancer (HGSC) is not well understood. We performed whole genome sequencing on primary and metastatic sites from highly clinically annotated HGSC samples. Samples were obtained pre-treatment based on a laparoscopic triage algorithm from patients who underwent R0 tumor debulking, or received neoadjuvant chemotherapy (NACT) with excellent or poor response.	HiSeq X Ten	103
EGAD00001005246	Whole genome sequencing of infant high grade gliomas. BAM files of paired end reads aligned to GRCh37 with bwa	Illumina NovaSeq 6000	22
EGAD00001005247	Whole exome sequencing of infant high grade gliomas. BAM files of paired end reads aligned to GRCh37 with bwa	Illumina HiSeq 2000	13
EGAD00001005248	Targeted sequencing of infant high grade gliomas. BAM files of paired end reads aligned to GRCh37 with bwa. This targeted panel covers the exons of 435 genes commonly mutated in high grade gliomas in children.	Illumina MiSeq	16
EGAD00001005249	Exome and RNA sequencing data for EGAS00001003776 - one female patient with neurofibroma/schwannoma hybrid nerve sheath tumor (N/S HNST)	Illumina HiSeq 2500	2
EGAD00001005250	This dataset includes bam files of WES of three fibroblast samples derived from patients with aplastic anemia.	Illumina HiSeq 2000	3
EGAD00001005251	Retinoblastoma (RB), the commonest eye cancer in children, was the first cancer for which a genetic cause was identified: the Rb1 gene is a tumour suppressor gene that is mutated in RB. The Rb1 gene defect alone does not predict the clinical outcome. We propose to study other possible mechanisms: 1. Stepwise further mutations occur in RB, increasing its carcinogenesis. We will sequence the whole genome in RB tissue, and relate the different genes expressed to the treatments used. 2. Extracellular matrix proteins contribute to a tumour permissive environment for RB to continue to grow. This includes Small Leucine Rich Proteoglycans (SLRP), a family of 15 secreted extracellular matrix proteins involved in eye development. 3. Cancer stem cells (CSC), a subpopulation of treatment resistant cells, drive RB tumours, and whether these stem cells can be manipulated for new therapies. The aim of this study is to assist finding targeted diagnostic techniques and treatments for RB. This dataset contains all the data available for this study on 2019-08-14.	HiSeq X Ten	47
EGAD00001005252	Immortalised HaCaT keratinocytes were transduced with Cas9 and the CRISPR-KO v1.1 genome-wide gRNA library. The gRNA library was prepared from genomic DNA isolated 14 days post library transduction. gRNA representation will be compared to the original CRISPR-KO v1.1 library to reveal genes essential for HaCaT survival and growth. This dataset contains all the data available for this study on 2019-08-14.	Illumina HiSeq 2500	19
EGAD00001005253	This data set contains whole exome sequences of individuals with self-stated parental relatedness from the East London Genes & Health cohort. Rare frequency functional variants in these healthy individuals will be studied with respect to the genetic health of the participants and loss-of-function analysis of human genes. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2019-08-14.	Illumina HiSeq 4000	1574
EGAD00001005254	Whole genome sequences at 15X depth of patients with Inflammatory Bowel Disease. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2019-08-14.	HiSeq X Ten	2546
EGAD00001005255	In this study a collection of core biopsies from breast cancer patients receiving neoadjuvant chemotherapy as part of the ChemoNEAR trial will be investigated. The study is intended to detect early acquired resistance in women with diagnosed breast carcinoma. Samples will undergo whole genome sequencing and analysis, including use of HR Predict. This dataset contains all the data available for this study on 2019-08-14.	HiSeq X Ten	2
EGAD00001005261	CB: Aligned sequences used in the error modeling study		10
EGAD00001005262	This dataset contains whole genome sequencing data on 25 individuals with myasthenia gravis. The data was generated using Illumina sequencing technology and is presented as BAM files for each sample.		25
EGAD00001005263	Primary T cell immunodeficiency disorders have a heterogeneous genetic basis. This study will focus on one case characterised by severe T cell lymphopenia in the index case. We aim to sequence the complete exomes of this individual, her three unaffected siblings and parents in an effort to identify the causative genetic mutation responsible for this disorder. We will perform exome capture using Agilent SureSelect system, followed by sequencing on the HiSeq platform. Our study has the potential to uncover genes important for T cell development and novel therapeutic strategies to treat T cell immunodeficiencies. This dataset contains all the data available for this study on 2019-08-19.	Illumina HiSeq 2000	-
EGAD00001005264	The objective of this study is to identify the causative genes in two unrelated congenital neutropenia families. We aim to whole exome sequence the affected individuals, unaffected siblings and parents in both cases in an effort to identify the causative genetic mutation. Exome capture will be performed using Agilent SureSelect system. Subsequently, exome libraries will be sequenced using the Illumina HiSeq platform. Sequence variant calling will be done in house and common variants excluded using public databases and data from unaffected family members. This dataset contains all the data available for this study on 2019-08-19.	Illumina HiSeq 2000	8
EGAD00001005265	We plan to sequence the exomes of 4 AML cases (tumour and germline) in an effort to discover new mutations in this disease that could improve our understanding of leukaemogenesis and guide the development of new targeted therapies. The Sanger Institute will sequence the exomes of 4 Acute Myeloid Leukaemia cases including tumour and germline DNA so that somatically-acquired, AML-specific mutations can be accurately designated. This dataset contains all the data available for this study on 2019-08-19.	Illumina HiSeq 2000	6
EGAD00001005266	Deep whole genome sequencing of samples from the Cilento isolates. The samples are sequenced using the Illumina HiSeq X Ten system. This dataset contains all the data available for this study on 2019-08-19.	HiSeq X Ten	20
EGAD00001005267	Whole genome sequencing of samples from the NSPHS cohort. The samples are sequenced using the Illumina HiSeq X Ten system. This dataset contains all the data available for this study on 2019-08-19.	HiSeq X Ten	20
EGAD00001005268	Whole genome sequencing of samples from an isolated population from the Val Borbera valley in Italy. The samples are sequenced using the Illumina HiSeq X Ten system. This dataset contains all the data available for this study on 2019-08-19.	HiSeq X Ten	20
EGAD00001005269	Deep whole genome sequencing of samples from the Orkney Complex Disease Study (ORCADES), each with data on up to 300 quantitative traits and other risk factors associated with cardiovascular, metabolic and other complex diseases. The samples are sequenced using the Illumina HiSeq X Ten system. This dataset contains all the data available for this study on 2019-08-19.	HiSeq X Ten	20
EGAD00001005270	AML-MRD: Aligned sequences used in the error modeling study		96
EGAD00001005271	This data contains DNA methylation data obtained from the PBMCs obtained from type 2 diabetes adolescents and controls. There are 21 diabetic samples and 10 controls. This dataset also contains metabolic data obtained from the serum of 155 samples. There are 113 diabetic and 42 control samples.	AB 5500xl Genetic Analyzer	21
EGAD00001005272	This dataset consists of 6 BAM files. These are whole exome sequencing data of pediatric patients with myelodysplastic syndrome.	Illumina MiSeq	6
EGAD00001005273	50 lymphoblastoid cell lines of adult female Twins which are part of the MuTHER study, processed with the FAIRE assay in order to generate maps of open chromatin. This dataset contains all the data available for this study on 2019-08-21.	Illumina HiSeq 2000	78
EGAD00001005274	50 lymphoblastoid cell lines of adult female Twins which are part of the MuTHER study, subjected to Chromatin Immunoprecipitation using an antibody for H3K4me1. H3K4me1 peaks mark distal regulatory elements. This dataset contains all the data available for this study on 2019-08-21.	Illumina HiSeq 2000	174
EGAD00001005275	The objective is to identify new disease genes involved in calcific aortic valve stenosis (CAVS) by screening newly identified candidate genes. The recruitment of patients with CAVS has been achieved by l’institut du thorax (Nantes, France). DNA from the selected patients has been analysed by targeted capture (Agilent SureSelect) and massively parallel sequencing (Illumina). This dataset contains all the data available for this study on 2019-08-21.	Illumina HiSeq 2000	485
EGAD00001005276	These samples include exome sequences of samples from patients who suffered Sudden Unexplained Death in Epilepsy. They all are of European descent. This dataset contains all the data available for this study on 2019-08-21.	Illumina HiSeq 2000	28
EGAD00001005277	This project is a pilot study, in collaboration with Maria Grazia Spillantini and Mariangela Iovino (Cambridge Centre for Brain Repair), to investigate the utility of IPS-derived neurons for the study of neurodegenerative disorders. Our aim is to characterise the transcriptional consequences of tauopathies using neurons derived from differentiated IPSCs as a model system. We will use IPS cells derived from six individuals, four with known mutations in the tau protein, 2 without. RNA will be extracted at Day 0 and Day 65 of differentiation by which time the neuronal tauopathy is apparent. RNA will be extracted and the transcriptome of each line characterised using RNAseq. We will then search for genes that are differentially expressed between the transcriptomes of individuals with tau mutations versus those in controls. My lab will analyse the RNAseq data, comparing both affected and controls and both time-points, to establish candidate genes. Darren Logan’s lab, along with our collaborators, will experimentally verify and further investigate these genes in additional lines and animal models. From this analysis we will generate a list of candidate genes that are differentially expressed between cases and controls. This study will not only help us understand the molecular basis of tauopathies, but also identify gene candidates for biomarkers of neurodegenerative disease. It will serve as a proof of principle for future planned studies into generating and transcriptomically analysing an allelic series of Tau mutations in IPSCs with a controlled genetic background. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2019-08-21.	Illumina HiSeq 2000	11
EGAD00001005278	We have collected material from a patient who had BrafV600E mutant melanoma that was treated with PLX4032. We have germline DNA from the patient and DNA and RNA from distinct lesions before and after treatment with PLX4032. We would like to exome sequence these samples to gain a snapshot of the mechanisms of resistance that are operative. This dataset contains all the data available for this study on 2019-08-21.	Illumina HiSeq 2000	42
EGAD00001005279	High-grade serous ovarian cancer (HGSOC) likely originates from the fallopian tube (FT) epithelium, but advanced stages are mostly found outside the FT. We used ex-vivo cultures of HGSOC and knock-out of tumor suppressors in FT organoids to study changes in epithelial cells and niche requirements for normal and transformed FT cells. We found that transformed cells require BMP signaling and are growth arrested in Wnt rich medium. A SureSelectXT Automation Custom Capture Library (Agilent) target enrichment panel was designed. The enrichment panel comprised all coding exons of 121 genes associated with ovarian cancer. Capture was performed according to the manufacturer’s instructions using an NGS Workstation Option B (Agilent) for automated library preparation starting with 3 μg DNA per sample. Sequencing was performed on an Illumina HiSeq 2500 system generating 2x100bp paired end reads and a target coverage of >200 per sample. Sequence reads were mapped to the haploid human reference genome (hg19) using BWA. Variants were called with FreeBayes v1.1.		20
EGAD00001005280	Epigenomic and transcriptomic analysis of Langerhans cell histiocytosis (LCH) biopsies. Single-cell RNA-seq (10x Genomics) of seven LCH biopsies. ATAC-seq data from four sorted cell populations (in duplicate) from one LCH biopsy.	Illumina HiSeq 4000	13
EGAD00001005281	RNA-seq sequencing data of human germinal centre B-cells (42 samples)	Illumina HiSeq 4000	42
EGAD00001005282	This dataset includes bam files of tumor and paired normal samples derived from 15 patients with myelofibrosis. Tumor samples include those before and after treatment.	Illumina HiSeq 2500	45
EGAD00001005283		Illumina HiSeq 2000 Illumina HiSeq 4000	30
EGAD00001005284		Illumina HiSeq 2000 Illumina HiSeq 4000	22
EGAD00001005285	Identifying high risk smoldering myeloma patients and progression mechanism is a prerequisite to implement effective inception strategies and curb myeloma related morbidity and mortality. We hypothesize that genomics may help identify determinants of progression that may help predict outcome and offer effective chemo preventive targets. Eighty-two patients underwent a custom targeted panel sequencing with an additional capture for the translocation loci. These results were compared to 223 newly diagnosed patients (EGAS00001003223 and EGAD00001004117) and 17 MGUS and 10 early myeloma patients. DNA was obtained from either CD138+ cells from the bone marrow of multiple myeloma patients (tumor) or from stem cell harvests or peripheral blood cells from the same patient (control). 100 ng of DNA was fragmented, end-repaired, and adapters ligated using the HyperPlus kit (KAPA Biosystems). After PCR amplification the libraries were hybridized with probes against either a targeted panel consisting of 140 genes and chromosomal regions (Nimblegen) using SeqCap reagents (Nimblegen). Hybridized libraries underwent further amplification before being sequenced on a NextSeq500 (Illumina) using 75 bp paired end reads. Overall, these data highlight the importance of dysregulation of the MAPK pathway in the progression to MM.	NextSeq 500	241
EGAD00001005286	We developed a new bioinformatics method for detecting the eccDNA in plasma. We revealed that the biological properties of eccDNA and linear DNA are different. eccDNA could potentially serve as a new class of circulating biomarker.	Illumina HiSeq 1500 Illumina HiSeq 2500	15
EGAD00001005287	Deep WGS (germline and 2-5 tumour regions) was performed on 20 patients: 13 lung adenocarcinoma, 5 squamous cell carcinoma, 2 small-cell lung cancers. A total of 68 BAM files are provided, where tumours were sequenced to 60x or 150x depth.	HiSeq X Ten	68
EGAD00001005288	Exome sequencing of 87 Fibromyalgia patients	Illumina HiSeq 2500	87
EGAD00001005289	Exome sequencing data captured with Agilent v4 (71k) and sequenced on Illumina technology. Data from a total of 40 samples, of which 14 are vitiligo cases with familial history of vitiligo or immune disease, and the remaining are alopecia areata cases.	Illumina HiSeq 2500	19
EGAD00001005290	Cytokines affect T cell responses by polarising them to different phenotypes. We isolated T cells from healthy platelet donors and cultured them in resting and stimulated condition, as well as in the presence of Th2, iTreg and Th17 polarizing cocktail. To characterize the efficacy of cytokine induced polarization and subpopulation specific response, we profiled single cell transcriptome five days following polarization using 10x platform (3' v2).	Illumina HiSeq 4000	16
EGAD00001005291	Cytokines affect T cell responses by polarising them to different phenotypes. We isolated T cells from healthy platelet donors and cultured them in resting and stimulated conditions, as well as in the presence of Th1, Th2, Th17 and iTreg cocktail. In addition, T cells were stimulated in the presence of IL-10, IL-21, IL-27, IFNb and TNFa. We performed bulk RNA sequencing to assess impact of different disease-relevant cytokines upon T cell response.	Illumina HiSeq 2500	141
EGAD00001005296	Genome wide CRISPR screen was performed to find resistance to targeted drugs for melanoma and lung. This dataset contains all the data available for this study on 2019-08-28.	Illumina HiSeq 2500	237
EGAD00001005297	A targeted gene screen of 365 known cancer genes in luminal breast cancer samples pre-chemotherapy and at resection post-chemotherapy to evaluate clonal expansion of chemotherapy cancer cells. This dataset contains all the data available for this study on 2019-08-28.	Illumina HiSeq 2000 Illumina HiSeq 2500	133
EGAD00001005298	This project aims to evaluate the transcriptional response to disease measured in whole blood of participants who developed enteric fever after challenge and, importantly, those who were challenged but stayed well throughout the challenge period. This data will provide unique coverage of the transcriptome and will yield invaluable insight after integration with a wealth of clinical data collected during this trial. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2019-08-28.	Illumina HiSeq 2000	195
EGAD00001005299	This study involves exome sequencing of blood/bone marrow DNA from patients with myeloid malignancies. Blood DNA samples have been taken from patients at different timepoints of disease phenotype. We hope to elucidate mechanisms of clonal evolution in these patients. This dataset contains all the data available for this study on 2019-08-28.	Illumina HiSeq 2000 Illumina MiSeq	46
EGAD00001005300	Study to stimulate WT and IL-10RB mutant macrophages with LPS in presence or absence of recombinant IL-10 and compare their gene expression profiles by RNASeq. These data are part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2019-08-28.	Illumina HiSeq 2000	32
EGAD00001005302	Metadata summarizes participants (n=198), samples (n=396), basic clinical information, and analysis. analysis1: raw sequencing reference alignment files (bam/bai) analysis2: error-corrected sequencing reference alignment files (bam/bai) analysis3: variant calling using error-corrected sequencing reference alignment (vcf)		396
EGAD00001005303	211 NKTL FFPE specimens were screened for somatic mutations using deep targeted capture sequencing. FFPE rolls or slides were extracted using QIAamp DNA FFPE Tissue kit (QIAGEN). The FFPE genomic DNA was treated with NEBNext FFPE DNA Repair Mix and assessed by Quant-it PicoGreen dsDNA Assay Kit (Invitrogen). The library was generated from 10-200 ng DNA with SureSelectXT Low Input Target Enrichment System for Illumina Paired-End Sequencing Library (Agilent Technologies) according to manufacturer’s instructions. RNA based probe was designed with SureDesign (Agilent Technologies) to target-capture 140 genes. Next, the captured libraries were pooled in equimolar concentration and sequenced on Illumina Novaseq 6000 platform with SP or S1 chip.	Illumina NovaSeq 6000	214
EGAD00001005305	This dataset contains genomic and transcriptomic profiling of skin samples (74) from patients with CYLD cutaneous syndrome	Illumina HiSeq 2500 Illumina MiSeq Illumina NovaSeq 6000	69
EGAD00001005306	Primary plasma cell leukemia (pPCL) samples were sequenced using the Nimblegen MedExome Plus hybridization capture to detect translocations, copy number changes, and mutations in 3 pPCL samples and patient matched controls. Sequencing was performed on a NextSeq500 using 75 bp paired end reads.	NextSeq 500	7
EGAD00001005307	This data set includes 72 mate pair sequenced osteosarcomas (36 as part of a discovery cohort and 36 as part of a validation cohort). It also includes RNA-sequencing data on 67 osteosarcomas (mostly overlapping with the above mate pair sequenced cases) and 13 osteoblastomas used as controls for gene expression levels.	NextSeq 500	101
EGAD00001005308		Illumina HiSeq 2000 Illumina HiSeq 4000	112
EGAD00001005310	Basic phenotypic data (country, ethnicity and sex) for 348 samples of the H3Africa Chip Design Study. Divided into 8 datasets of 41 samples from Zambia, 24 samples from Cameroon, 50 samples from Mali, 26 samples from Cameroon, 49 samples from Nigeria, 48 samples from Botswana, 50 samples from Benin, 60 samples from Burkina Faso and Ghana.		348
EGAD00001005311	This study entails whole genome sequencing of an interleukin (IL)-12 b-1 receptor-deficient individual who presented with a chronic systemic Salmonella Enteritidis infection that did not resolve with standard IFNg and antibiotic treatment. Whole genome sequencing of the patient's parents is also included. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2019-09-05.	HiSeq X Ten	3
EGAD00001005312	This study investigates the genomic and transcriptomic characteristics of Wilms' tumour organoids. This dataset contains all the data available for this study on 2019-09-05.	HiSeq X Ten	-
EGAD00001005313	Swift kit whole genome bisulphite of MPN colonies. This dataset contains all the data available for this study on 2019-09-05.	HiSeq X Ten	16
EGAD00001005314	Single Nuclei ATAC seq data from GBM tumor samples. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	6
EGAD00001005315		NextSeq 500	9
EGAD00001005316	Next generation RNA-Sequencing (RNA-seq) is a flexible approach that can be applied to e.g. global quantification of transcript expression, the characterization of RNA structure such as splicing patterns and profiling of expressed mutations. Many RNA-seq protocols require up to microgram levels of total RNA input amounts to generate high quality data, and thus remain impractical for the limited starting material amounts typically obtained from rare cell populations, such as those from early developmental stages or from laser micro-dissected clinical samples. Here, we present an assessment of the contemporary ribosomal RNA depletion-based protocols, and identify those that are suitable for inputs as low as 1-10 ng of intact total RNA and 100-500 ng of partially degraded RNA from formalin-fixed paraffin-embedded tissues.	Illumina HiSeq 2500 Illumina MiSeq	3
EGAD00001005317	Patient-derived lung cancer organoids cram files : targeted seq 13 samples, whole exome seq 12 samples mutation profiles of PDO and matched tissue : aggregated vcf 1 file details : https://www.nature.com/articles/s41467-019-11867-6	Illumina HiSeq 2500 Illumina MiSeq	44
EGAD00001005318		Illumina HiSeq 4000	51
EGAD00001005319		BGISEQ-500 HiSeq X Ten Illumina NovaSeq 6000	59
EGAD00001005320	This dataset includes "clinical exome" profiling (approximately 4000 genes related to diseases) on individuals (n=7) from a family with a familial history of Alzheimer's disease. Two affected cases and five cases without dementia are included.	Illumina MiSeq	7
EGAD00001005321	The dataset includes Fastq files from WES experiments performed on a proband presenting with syndromic optic atrophy and his healthy parents. Exons were captured by hybridization and sequenced on an Illumina platform	Illumina HiSeq 2500	3
EGAD00001005322	DNA extracted from sorted CD19+ tumor cells (18 samples - 16 patients) was used for exome capture with the SureSelect All Exon Kit following the standard protocols. Paired-end sequencing (2 x 100 bp) was performed using HiSeq2000 sequencing instruments. The files are in FASTQ format.	Illumina HiSeq 2000	18
EGAD00001005323	DNA extracted from sorted CD3+ cells (16 patients) was used for exome capture with the SureSelect All Exon Kit following the standard protocols. Paired-end sequencing (2 x 100 bp) was performed using HiSeq2000 sequencing instruments. The files are in FASTQ format.	Illumina HiSeq 2000	16
EGAD00001005324	RNA was extracted from flow-sorted CD19+. RNA-Seq was performed on 32 samples of 30 patients (2 replicates per sample). RNA-Seq libraries were subjected to non-stranded paired-end (2 x 75 bp) sequencing on HiSeq 2500 (Illumina). The files are in FASTQ format.	Illumina HiSeq 2500	64
EGAD00001005331	A total of 180 gynecologic tumor specimens were subjected to targeted-exome and/or whole-transcriptome sequencing.	Illumina HiSeq 2500	180
EGAD00001005335	August 2019 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	HiSeq X Ten Illumina HiSeq 2500	17
EGAD00001005337	Genomic data obtained from the joint processing and variant calling of 4,810 individuals from Singapore. VCF files are by Chromosome (chr. 1-22 plus X) for all 4,810 individuals. Self-reported ethnicity is found in the "Region" column of metadata file.		4810
EGAD00001005338	Whole genome sequencing of single cells identifies stochastic aneuploidies, genome replication, states, and clonal repertoires for library A96146A 1195 samples; filetype=bam	HiSeq X Five	3
EGAD00001005339	The dataset for Genome-wide cell-free DNA fragmentation in patients with cancer includes 538 bam files from whole genome next-generation sequencing on the Illumina HiSeq2500. The samples analyzed include plasma samples from healthy individuals and patients with cancer.	Illumina HiSeq 2500	537
EGAD00001005340	Whole genome sequencing of single cells identifies stochastic aneuploidies, genome replication, states, and clonal repertoires for library A96172B 1694 samples; filetype=bam	HiSeq X Five	3
EGAD00001005341	WGS from 4 patients, WTS from only 3 patients (insufficient tissue from 4th patient for WTS). Whole-genome sequencing (WGS) was performed for 60 pairs of tumor-normal samples from patients diagnosed with NKTL. Genomic DNA from tumor tissue was extracted with QIAamp DNA Mini Kit. The DNA for the matching normal was obtained from blood or buccal swabs and purified by Blood and Cell Culture DNA Mini kit or E.Z.N.A. Tissue DNA Kit (Omega Bio-tek) according to manufacturer’s instructions. The quantity and quality were assessed by Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen) and agarose gel electrophoresis. All sequencing libraries were prepared using TruSeq Nano DNA Library Prep Kit (Illumina). Paired-end sequencing was performed on Illumina HiSeq 2000. Whole-transcriptome sequencing (WTS): Total RNA from snap frozen EITL tumor samples was extracted using TRIzol (Invitrogen) and purified with RNeasy Mini Kit (Qiagen) according to manufacturer’s instructions. The integrity of RNA was determined by electrophoresis using 2100 Bioanalyzer (Agilent Technologies). 500 ng of total RNA was reverse transcribed with iScript cDNA Synthesis Kit (Bio-Rad, Hercules, CA, USA). Quantification was performed using SsoFast EvaGreen Supermix and CFX96 Real-Time PCR System (both Bio-Rad). Sequencing libraries were prepared using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero (Illumina) and WTS was performed on Illumina HiSeq 2500 with 2x101 bp read length. Description of prefix used in filenames: T: Tumor samples N: Normal samples (Blood) P: PDX samples	Illumina HiSeq 2000 Illumina HiSeq 2500	9
EGAD00001005343	random whole-genome shotgun sequencing of cfDNA in control samples (NPH*) and late-stage cancer samples. First letter denotes primary cancer tissue (C: Colon, B: Breast, P: Prostate)	Illumina NovaSeq 6000 NextSeq 550	41
EGAD00001005344	The dataset includes the BAM files from WES experiments performed on a proband presenting with syndromic optic atrophy and his healthy parents - Family 2 in our study	Illumina HiSeq 2000	1
EGAD00001005345	Whole genome sequencing of single cells identifies stochastic aneuploidies, genome replication, states, and clonal repertoires for library A96226B 1274 samples; filetype=bam	HiSeq X Five	4
EGAD00001005346	A family trio from Uganda (Baganda ethno-linguistic group) has been sequenced to high depth (ca. 30x) on the Illumina HiSeq 2500 platform.	Illumina HiSeq 2000	3
EGAD00001005347	Whole genome sequencing of single cells identifies stochastic aneuploidies, genome replication, states, and clonal repertoires for library A96193B 2410 samples; filetype=bam	HiSeq X Five	3
EGAD00001005348	Whole genome sequencing of single cells identifies stochastic aneuploidies, genome replication, states, and clonal repertoires for library A96199A 843 samples; filetype=bam	HiSeq X Five	3
EGAD00001005351	Analysis of mutational signatures caused by exposure to known mutagens in human induced pluripotent stem (iPS) cells. A reference human iPS cell-line will be exposed to 100 chemicals known or proposed to be mutagenic. Following exposure to mutagen, cells will undergo a period of recovery before sub clones are generated and sequenced. The progenitor "parental" IPS cell-line will be used to generate reference sequence data, in order to determine the mutational signatures acquired as a result of exposure to different mutagens.	HiSeq X Ten	6
EGAD00001005353	Whole genome sequencing of single cells identifies stochastic aneuploidies, genome replication, states, and clonal repertoires for library A96199B 1170 samples; filetype=bam	HiSeq X Five	3
EGAD00001005354	Whole genome sequencing of single cells identifies stochastic aneuploidies, genome replication, states, and clonal repertoires for library A96211C 1397 samples; filetype=bam	HiSeq X Five	3
EGAD00001005355	Whole genome sequencing of single cells identifies stochastic aneuploidies, genome replication, states, and clonal repertoires for library A96225C 1034 samples; filetype=bam	HiSeq X Five	2
EGAD00001005356	We used 200 ccRCC samples from 51 tumors to simultaneously isolate DNA, RNA, and protein according to an established protocol. RNA quality was assessed using an Agilent Bioanalyzer, and total RNA with RIN>7 was used for further RNA sequencing. 184 ccRCC samples from 49 tumors passing initial quality control underwent RNA sequencing at Admera Health Inc. (Genohub Inc., Austin, TX). RNA sequencing libraries were prepared using the Illumina TruSeq Stranded mRNA high throughput (HT) sample preparation kit following the manufacturers’ protocol. Paired-end RNA-Seq data was deposited in this cohort.	Illumina HiSeq 4000	173
EGAD00001005357	Whole exome sequencing data from a series of 5 patient derived organoids (PDOs) established from metastatic colorectal cancers (CRCs).	Illumina HiSeq 2500	10
EGAD00001005358	All sequencing was performed within the DNAlink (Korea) by using the Solexa sequencing technology (Illumina, San Diego, CA). mRNA was isolated from total RNA using poly-T oligo-attached magnetic beads and was fragmented with fragmentation buffer to an average size of 300 bp. The libraries were prepared by using TruSeq RNA Library Prep Kit v2 (Illumina) and were sequenced on the Illumina HiSeq2000 using the manufacturer’s recommended protocols	Illumina HiSeq 2000	220
EGAD00001005359	This is 2nd part of data for original Control iPSC lines with clinically annotated genetic variants for versatile multi-lineage differentiation,	HiSeq X Five	1
EGAD00001005361	231 HCC exome sequencing with Sureselect 50Mb	Illumina HiSeq 2000	452
EGAD00001005362	Paired single-cell sequencing dataset of T-cell receptors from IELs, from both treated and untreated celiac patients and from controls. (Amplicon sequencing, paired-end fastq files).	Illumina MiSeq	20
EGAD00001005363	Tumor biopsies from LAM disease were retrospectively analyzed by multiple techniques to characterize the alterations in patients, to elucidate the landscape of genetic/genomic alterations.	NextSeq 500	61
EGAD00001005364	Exome sequencing data of two siblings with a neurodegenerative phenotype due to SMVT deficiency. Exonic sequences were enriched using the SeqCap EZ Human Exome Library v3.0 kit (Roche NimbleGen) and libraries sequenced as 100bp paired-end reads on the HiSeq 2000 platform (Illumina).	Illumina HiSeq 2000	2
EGAD00001005365	Single-cell sequencing of human pancreatic cells on 10X 5' platform.	Illumina HiSeq 4000	8
EGAD00001005366	Whole-genome sequencing (WGS) was performed for 13 pairs of tumor-normal and 5 tumor-only samples from patients diagnosed with angiosarcoma. Genomic DNA from tumor tissue was extracted with QIAamp DNA Mini Kit. The DNA for the matching normal was obtained from blood or buccal swabs and purified by Blood and Cell Culture DNA Mini kit or E.Z.N.A. Tissue DNA Kit (Omega Bio-tek) according to manufacturer’s instructions. The quantity and quality were assessed by Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen) and agarose gel electrophoresis. All sequencing libraries were prepared using TruSeq Nano DNA Library Prep Kit (Illumina). Paired-end sequencing was performed on Illumina HiSeq X Ten as 2x151 bp.	HiSeq X Ten Illumina HiSeq 2000	31
EGAD00001005367	Whole-transcriptome sequencing (WTS) of 6 tumor-normal and 6 tumor-only samples from patients diagnosed with angiosarcoma. Total RNA from snap frozen EITL tumor samples was extracted using TRIzol (Invitrogen) and purified with RNeasy Mini Kit (Qiagen) according to manufacturer’s instructions. The integrity of RNA was determined by electrophoresis using 2100 Bioanalyzer (Agilent Technologies). 500 ng of total RNA was reverse transcribed with iScript cDNA Synthesis Kit (Bio-Rad, Hercules, CA, USA). Quantification was performed using SsoFast EvaGreen Supermix and CFX96 Real-Time PCR System (both Bio-Rad). Sequencing libraries were prepared using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero (Illumina) and WTS was performed on HiSeq 2500 and HiSeq 3000 (Illumina) with 2x101 bp and 2x151 bp read length, respectively.	Illumina HiSeq 2500	18
EGAD00001005368	Single Cell RNA sequencing for 5 low grade glioma samples. NovaSeq6000 was used for scRNA Seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	5
EGAD00001005369	Single Cell RNA sequencing for 4 high grade glioma samples. NovaSeq6000 was used for snRNA Seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	4
EGAD00001005370	WES from two human osteosarcoma with two samples each from the corresponding cell line, BAM files	NextSeq 500	6
EGAD00001005371	The biology of cell-free DNA fragmentation and the roles of DNASE1, DNASE1L3 and DFFB	NextSeq 500	40
EGAD00001005372	12 tissues from the warm autopsy are selected for this project. Using 10X Chromium technology we will generate ~1000 single cell/nuclei genomic libraries per tissue. Each tissue will be whole genome sequenced (~2 lanes per 1000 cells) on HiSeq X10. Per single cell we will generate a CNV profile and investigate the level of genomic heterogeneity within tissue and across different tissues. This dataset contains all the data available for this study on 2019-10-02.	HiSeq X Ten	6
EGAD00001005373	In this study, we performed systematic comparative analysis of seven widely-used SNV-calling methods, including SAMtools, the GATK Best Practices pipeline, CTAT, FreeBayes, MuTect2, Strelka2 and VarScan2, on both simulated and real single-cell RNA-seq datasets. We generated SMART-seq2 data for 70 CD45- single cells, which were derived from two colorectal cancer patients (P0411 and P0413). The average sequencing depths of these cells were 1.4 million reads per cell. We also generated tumor and adjacent normal bulk WES data, as well as tumor bulk RNA-seq data for these patients.	Illumina HiSeq 4000	75
EGAD00001005374	ChIP-seq for AR, FOXA1 and HOXB13 on 8 prostatectomy samples, both regions with and without tumor cells, Fastq files.	Illumina HiSeq 2500	50
EGAD00001005375		Illumina HiSeq 2000	73
EGAD00001005376		Illumina HiSeq 2000	11
EGAD00001005377		Illumina HiSeq 2000	10
EGAD00001005378	30x whole genome sequencing of samples from the VIKING Health Study - Shetland. 500 DNA samples were sequenced using the Illumina HiSeq X system. FASTQ files are deposited	HiSeq X Ten	500
EGAD00001005379	This study is the first to interrogate the whole mtDNA in BP patients and controls and to implicate multiple novel mtDNA variants in disease susceptibility. Whole mtDNA of German BP patients (n=180) and age- and sex-matched healthy controls (n=188) were sequenced using next generation sequencing (NGS) technology, followed by the replication study using Sanger sequencing of an additional independent BP (n=89) and control cohort (n=104). While the BP and control groups showed comparable mitochondrial haplogroup distributions, the haplogroup T exhibited a tendency of higher frequency in BP patients suffering from neurodegenerative diseases (ND) compared to BP patients without ND (p= 0.1448, Fisher’s exact test)	Illumina MiSeq	368
EGAD00001005380	Contains RNAseq data for 14 transduced/non-transduced organoids	Illumina NovaSeq 6000	14
EGAD00001005381	Woodcock et al TenMenDeep EGA Dataset A. These are Illumina based deep sequencing data based on bait capture sequencing. See Woodcock et al methods for more detail. Note: the Amplicon sequencing data type is selected because the EGA Website currently has no option to select Bait Capture Sequencing or similar.	Illumina HiSeq 2500	117
EGAD00001005382	Woodcock et al TenMenDeep EGA Dataset B. These are Illumina based deep sequencing data based on bait capture sequencing. See Woodcock et al methods for more detail. Note: the Amplicon sequencing data type is selected because the EGA Website currently has no option to select Bait Capture Sequencing or similar.	Illumina HiSeq 2500	33
EGAD00001005383		Illumina HiSeq 2000	20
EGAD00001005384		Illumina HiSeq 2000	9
EGAD00001005385		Illumina HiSeq 2000	64
EGAD00001005386	113 DNA samples were derived from the tumors of the low grade glioma patients and sequenced using Illumina WES (exome seq) paired-end technology. Dataset contains 113 BAM files aligned to hg19 using BWA v.0.5.9. After mapping duplicated reads were removed, reads were re-aligned around InDels and read base quality score was re-calibrated.		-
EGAD00001005387	44 DNA samples were derived from the tumors of the low grade glioma patients and sequenced using Illumina RNAseq paired-end technology. Dataset contains 44 BAM files aligned to hg19 using Tophat 2.		-
EGAD00001005388	Data supporting: “Deep molecular phenotyping reveals the identity of Barrett’s esophagus and its malignant transition.” Nowicki-Osuch, Zhuang et al. RNAseq (BAM files) 241 tumour samples	Illumina HiSeq 2000	1
EGAD00001005389	Whole genome sequencing of 35 osteosarcoma patients (primary, relapsed, and metastatic) with matched normals. Tumors were sequenced at target 60X and matched normals at target 30X.	HiSeq X Ten	72
EGAD00001005390	Single Cell-RNA Seq IDHR132H Wild-type Primary GBM Female, 76. Single Cell RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through cellranger count (10xGenomics.)	Illumina NovaSeq 6000	1
EGAD00001005391	SF11977 single cell RNA-seq IDHR132H Wild-type GBM Female, 61 Single Cell RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through Cellranger count (10xGenomics.)	Illumina NovaSeq 6000	-
EGAD00001005392	SF11956 IDHR132H WT GBM. Male, 63. Single Cell RNA seq from high grade glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through Cellranger count (10xGenomics.)	Illumina NovaSeq 6000	-
EGAD00001005393	SF11644 Primary GBM Gender Male age 57. Single Cell RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through Cellranger count (10xGenomics.)	Illumina NovaSeq 6000	-
EGAD00001005394	Single cell RNA-Seq Primary diffuse astrocytoma G2. IDH mutant, ATRX mutant. Gender Male Age 34. Single Cell RNA seq from primary astrocytoma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through Cellranger count (10xGenomics.)	Illumina NovaSeq 6000	-
EGAD00001005395	Single cell Primary astrocytoma G2. IDH mutant, ATRX negative. Male, 44. Single Cell RNA seq from primary astrocytoma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through Cellranger count (10xGenomics.)	Illumina NovaSeq 6000	-
EGAD00001005396	Single cell RNA-Seq Low Grade Astrocytoma IDHR132H mutant. Male, 64. Single Cell RNA seq from primary astrocytoma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through Cellranger count (10xGenomics.)	Illumina NovaSeq 6000	1
EGAD00001005397	SF11949 Primary oligodendroglioma G3 IDH1 Mutant. Male, 40 Single Cell RNA seq from primary astrocytoma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through Cellranger count (10xGenomics.)	Illumina NovaSeq 6000	-
EGAD00001005398	IDH1 mutant oligodendroglioma male, 40. Single Nuclei ATAC seq data from low grade human glioma sample. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	-
EGAD00001005399	Single Nuclei RNA-Seq Primary High-grade Glioma. Gender Female Age 51. Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference	Illumina NovaSeq 6000	1
EGAD00001005401	Single Nuclei RNA-Seq Primary High-grade Glioma. Gender Male Age 73. Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	1
EGAD00001005402	Single Nuclei RNA-Seq Primary High-grade Glioma. Gender Female age 40. Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	1
EGAD00001005403	Single Nuclei RNA Seq of primary GBM. Gender Female Age 44. Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference	Illumina NovaSeq 6000	-
EGAD00001005405	IDH1 mutant GBM 55, Male. Single Nuclei ATAC seq data from high grade human glioma samples. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	-
EGAD00001005406	IDHR132H Wildtype GBM. Male, 63 Single Nuclei ATAC seq data from high grade human glioma samples. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	-
EGAD00001005407	High grade glioma sample, Gender Male Age 46. Single Nuclei ATAC seq data from high grade human glioma samples. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	-
EGAD00001005408	SF11331 Primary GBM Male,55 Single Nuclei ATAC seq data from high grade human glioma sample. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	-
EGAD00001005409	SF10022 single nuclei RNA-Seq Primary High-grade glioma. gender Male Age 65. Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference	Illumina NovaSeq 6000	1
EGAD00001005410	Single Nuclei RNA-Seq Primary GBM. Gender Female Age 51. Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	1
EGAD00001005411	Single Cell RNA seq from Recurrent oligodendroglioma sample. Gender Male Age 67. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference through Cellranger count (10xGenomics.)	Illumina NovaSeq 6000	-
EGAD00001005412	Single Nuclei RNA-Seq Primary IDHR132H Wild-type GBM. Male, 61. Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	1
EGAD00001005413	SF11612 Recurrent oligodendroglioma. Gender Male Age 67. Single Nuclei ATAC seq data from low grade human glioma samples. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	-
EGAD00001005414	Single Nuclei ATAC Seq IDHR132H mutant Astrocytoma . Male, 64. Single Nuclei ATAC seq data from low grade human glioma sample. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	-
EGAD00001005415	Single Nuclei RNA-Seq Primary High-grade Glioma. Gender male age 39. Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference	Illumina NovaSeq 6000	1
EGAD00001005416	Organoid cultures were exposed to two different E. coli strains and a dye control with three biological duplicates. Their original culture was harvested as a control. In total 10 organoid cultures were whole-genome sequenced using the NovaSeq 6000 platform. The data is deposited in .bam format.	Illumina NovaSeq 6000	20
EGAD00001005417	Disease: Severe congenital deafness, early onset cataracts and various neurological features Family: 3 affected individuals originated from the same small village (Amarat) in the Kayseri region of Turkey and belonging to the same large extended consanguineous family. Dataset: 5 BAM files. Whole-genome sequencing (WGS) was applied to the three affected individuals (II.2, II.4 and II.7) and two healthy individuals (II.1 and II.3).	Illumina HiSeq 2000	5
EGAD00001005418	SF11979 snATAC, IDHR132H WT GBM Female, 76 Single Nuclei ATAC seq data from high grade human glioma samples. NovaSeq6000 was used for ATAC seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	1
EGAD00001005419	Diffuse large B-cell lymphoma (DLBCL) is the most common histologic subtype of non-Hodgkin lymphoma and is notorious for its clinical heterogeneity. Patient outcomes can be predicted by cell-of-origin (COO) classification, demonstrating that the underlying transcriptional signature of malignant B-cells informs biological behavior in the context of standard combination chemotherapy regimens. In the current study, we used mass cytometry (CyTOF) to examine tumor phenotypes at the protein level with single cell resolution in a collection of 27 diagnostic DLBCL biopsy specimens from treatment naïve patients. We found that malignant B-cells from each patient occupied unique regions in 37-dimensional phenotypic space with no apparent clustering of samples into discrete subtypes. Interestingly, variable MHC class II expression was found to be the greatest contributor to phenotypic diversity. Within individual tumors, a subset of cases showed multiple phenotypic subpopulations, and in one case we were able to demonstrate direct correspondence between protein-level phenotypic subsets and DNA mutation-defined subclones. In summary, CyTOF analysis can resolve both inter- and intra-tumoral heterogeneity among primary samples, and reveals that each case of DLBCL is unique and may be comprised of multiple, genetically distinct subclones.		17
EGAD00001005420	Here we performed single-cell RNA sequencing to address repertoire stability and subset plasticity during IL-15 driven homeostatic proliferation. Sorted NK cell subsets representing discrete stages of NK cell differentiation are compared with the corresponding subsets after proliferation and further sorted into two subsets depending on the rate of proliferation.	NextSeq 500	8
EGAD00001005421	These are 21 metastatic melanoma exomes matched with 7 germlines from 7 multisite metastatic melanoma cases.	Illumina HiSeq 2000	-
EGAD00001005422	Dataset contains 854 single cell sequenced colorectal cancer organoids.	Illumina NovaSeq 6000 NextSeq 500	854
EGAD00001005423	Whole Genome sequencing. 1 ug of genomic DNA from each lymph node sample was used for the construction of a TruSeq DNA PCR Free (350) library before sequencing in an Illumina HiSeq X Ten (2 × 151 bp). Mean coverage 30x.	HiSeq X Ten	2
EGAD00001005424	Exome Sequencing. 3 ug of genomic DNA from each lymph node sample were sheared and used for the construction of a paired-end sequencing library as described in the paired-end sequencing sample preparation protocol provided by Illumina. Enrichment of exonic sequences was then performed for each library using either the Sure Select Human All Exon 50 Mb or All Exon+UTRs v4 kits following the manufacturer’s instructions (Agilent Technologies). Exon-enriched DNA was pulled down by magnetic beads coated with streptavidin (Invitrogen), followed by washing, elution and 18 additional cycles of amplification of the captured library. Enriched libraries were sequenced in one lane of an Illumina GAIIx sequencer or in two lanes of a HiSeq2000 when using pools of eight samples.	unspecified	13
EGAD00001005425	Whole Exome Sequencing for a cohort of 20 B-ALL samples : 5 Down syndrome (DS), 7 Hyperdiploid (HeH), 3 iAMP21 and 5 others.	Illumina HiSeq 2000	40
EGAD00001005426	RNA-sequencing for a cohort of B-ALL samples : 5 Down Syndrome (DS), 16 Hyperdiploid (HeH), 6 iAMP21, 9 other. RNA-sequencing for B-cell progenitors from 3 healthy. It also contains RNA-sequencing datasets of Patient-Derived Xenografts (X) developed from the B-ALL samples : 4 Down Syndrome (DS), 4 Hyperdiploid (HeH), 1 iAMP21, 3 other.	Illumina HiSeq 2000	51
EGAD00001005427	Three SpCas9-ABE (R785X/R785X) and three xCas9-ABE-repaired organoid clones (F508del/R553X) and their respective unrepaired control organoids were paired-end whole genome sequenced using Illumina Novaseq 6000 system. The reads were mapped to hg19 genome assembly and data is provided as BAM files.	Illumina NovaSeq 6000	8
EGAD00001005428	Single Nuclei Primary GBM 73 Male. Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	-
EGAD00001005429	Single Cell Primary high grade glioma IDHR132H Wild-type Female 76. Single Cell RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	-
EGAD00001005430	Single Nuclei Primary GBM IDHR132H Wildtype. Female 76. Single Nuclei RNA seq from high grade primary glioma sample. NovaSeq6000 was used for RNA seq. The files uploaded are bam files created with grch38 reference.	Illumina NovaSeq 6000	1
EGAD00001005431	Our understanding of the BCR repertoire in the context of immune-mediated diseases is incomplete, and defining this could provide new insights into pathogenesis and therapy. Here, we compared the BCR repertoire in systemic lupus erythematosus, anti-neutrophil cytoplasmic antibody (ANCA)-associated vasculitis, Crohn’s disease, Behçet’s disease, eosinophilic granulomatosis with polyangiitis, and immunoglobulin A (IgA) vasculitis by analysing BCR clonality, use of immunoglobulin heavy-chain variable region (IGHV) genes and—in particular—isotype use. An increase in clonality in systemic lupus erythematosus and Crohn’s disease that was dominated by the IgA isotype, together with skewed use of the IGHV genes in these and other diseases, suggested a microbial contribution to pathogenesis. Different immunosuppressive treatments had specific and distinct effects on the repertoire; B cells that persisted after treatment with rituximab were predominately isotype-switched and clonally expanded, whereas the inverse was true for B cells that persisted after treatment with mycophenolate mofetil. Our comparative analysis of the BCR repertoire in immune-mediated disease reveals a complex B cell architecture, providing a platform for understanding pathological mechanisms and designing treatment strategies.	Illumina MiSeq	167
EGAD00001005432	To be added...	Illumina HiSeq 4000	27
EGAD00001005433	The dataset contains plasma DNA methylation data derived from metastatic prostate cancer patients.	Illumina HiSeq 2500	115
EGAD00001005434	Data supporting: "Genomic evidence supports a clonal diaspora model for metastases of esophageal adenocarcinoma." Noorani et al. WGS (BAM files) 134 samples for 18 cases Includes primary, lymph-node, distant metastatic, Barrett's and normal samples.	Illumina HiSeq 2000	1
EGAD00001005435	This dataset contains 9 RNA-seq BAM files. RNA was derived from TERT promoter mutant GBM cell lines and sequenced on an Illumina HiSeq4000 sequencer with paired-end reads and an average read length of 50 base pairs. Reads were aligned with TopHat (v2.0.14) using a GENCODE V19 transcriptome-guided alignment.		9
EGAD00001005438	Data supporting: “Deep molecular phenotyping reveals the identity of Barrett’s esophagus and its malignant transition.” Nowicki-Osuch, Zhuang et al. scRNAseq (BAM files) 38 Barrett's and normal samples	unspecified	1
EGAD00001005439	This dataset contains small RNA sequencing data and mRNA capture sequencing data from 20 different human biofluids (amniotic fluid, aqueous humor, ascites, bile, bronchial lavage fluid, breast milk, cerebrospinal fluid, colostrum, gastric fluid, pancreatic cyst fluid, plasma, saliva, seminal fluid, serum, sputum, stool, synovial fluid, sweat, tear fluid and urine). In total, 180 samples were sequenced. Files are provided in fastQ format. Samples were sequenced on a NextSeq 500.	NextSeq 500	180
EGAD00001005442	Exome sequencing of ID trios and sibpairs	Illumina HiSeq 2000	123
EGAD00001005443	Metagenomes of stool samples from 46 Lifelines control subjects (no antimicrobial use in the past three months before sampling, no occupational livestock contact). Samples were sequenced and analysed as part of the EFFORT project and derived from the LifeLines cohort from the Northern parts of the Netherlands. http://www.lifelines.nl.	Illumina HiSeq 4000	46
EGAD00001005444	Metagenomes of stool samples from 54 pig farmers, 24 broiler farmers and 70 slaughter line workers. Note: Access to the data will only be granted for antibiotic resistance studies in accordance with the EFFORT consents issued by the participants.	Illumina HiSeq 4000	148
EGAD00001005445	This dataset includes 44 bam files derived from 21 patients with IVL. Tumor samples are derived from cfDNA (n = 18), PDX (n = 4) and bone marrow (n = 2). Normal samples are derived from peripheral blood.	Illumina HiSeq 2500	44
EGAD00001005446	Tregs were sorted as CD4+CD25+CD127- cells from peripheral blood of 14 healthy individuals, 8 patients with mild/severe rheumatoid arthritis, 1 patient with systemic lupus erythematosus/rheumatoid arthritis, 2 patients with ulcerative colitis and 2 patients with Crohn's disease. RNA was extracted and polyA libraries were prepared using the Illumina Truseq sample preparation kit v.2. Single-end 75bp sequencing was performed on NextSeq500.	NextSeq 500	27
EGAD00001005448	Single cell RNA sequencing (scRNA-seq) is widely used for profiling transcriptomes of individual cells. The droplet-based 10X Genomics Chromium (10X) approach and the plate-based Smart-seq2 full-length method are two frequently-used scRNA-seq platforms, yet there are only a few thorough and systematic comparisons of their advantages and limitations. Here, by directly comparing the scRNA-seq data by the two platforms from the same samples of CD45- cells, we systematically evaluated their features using a wide spectrum of analysis. Smart-seq2 detected more genes in a cell, especially low abundance transcripts as well as alternatively spliced transcripts, but captured higher proportion of mitochondrial genes. The composite of Smart-seq2 data also resembled bulk RNA-seq data better. For 10X-based data, we observed higher noise for mRNA in the low expression level. Despite the poly(A) enrichment, approximately 10-30% of all detected transcripts by both platforms were from non-coding genes, with lncRNA accounting for a higher proportion in 10X. 10X-based data displayed more severe dropout problem, especially for genes with lower expression levels. However, 10X-data can better detect rare cell types given its ability to cover a large number of cells. In addition, each platform detected different sets of differentially expressed genes between cell clusters, indicating the complementary nature of these technologies. Our comprehensive benchmark analysis offers the basis for selecting the optimal scRNA-seq strategy based on the objectives of each study.	Illumina HiSeq 4000	78
EGAD00001005449	This dataset includes ChIP-seq data from two cell lines (HKCI-11 (GOFp53) and MIHA(WT p53)). All the experiments were performed on Illumina HiSeq 2000 platform with raw reads stored in fastq format.	Illumina HiSeq 2000	2
EGAD00001005450	This dataset contains target capture sequence data from 255 samples, including 154 tumors and 101 normal samples. All the experiments were performed on Illumina HiSeq 2000 platform with raw reads stored in fastq format.	Illumina HiSeq 2000	255
EGAD00001005451	This dataset contains whole genome sequence data from 24 samples, including 16 tumors and 8 normal samples. All the experiments were performed on Illumina HiSeq 2000 platform with raw reads stored in fastq format.	Illumina HiSeq 2000	24
EGAD00001005452	This dataset contains whole genome sequence data from 12 samples from 1 patient, including 8 tumor sectors and 4 normal samples. All the experiments were performed on Illumina HiSeq platform with raw reads stored in fastq format.	Illumina HiSeq 2000	12
EGAD00001005453	This dataset contains whole exome sequence data from 86 samples from 6 patients. All the experiments were performed on Illumina HiSeq platform with raw reads stored in fastq format.	Illumina HiSeq 2000	86
EGAD00001005454	Illumina platform sequencing of whole genome libraries prepared from paired tumour/normal samples from 103 cases of melanoma Uveal subtype		-
EGAD00001005455	Data supporting: “Deep molecular phenotyping reveals the identity of Barrett’s esophagus and its malignant transition.” Nowicki-Osuch, Zhuang et al. Single cell metadata and analysis 38 Barrett's and normals		1
EGAD00001005456	There are two samples, 42 (control) and 49. To test the role of activated CRLF2/IL7RA in leukemia initiation we expressed CRLF2 together with IL7RA in human CB hematopoietic progenitors. Human CRLF2 and wild type and/or activated mutant form of human IL7RA (IL7RAwt/ins) were cloned into a lentiviral vector with a bi-cistronic cassette under the expression control of an Eμ-B29 promoter/enhancer to augment expression in B-cell precursors. Backbone vector expressing GFP (BB). Whole genome sequencing Leukemic (49) and BB transduced (42) corresponding CB cells were collected from transplanted mice. Sequencing libraries were prepared from these samples	Illumina HiSeq 2500	2
EGAD00001005457	Whole-exome sequencing (WES) and whole-genome sequencing (WGS) were performed on matched adjacent normal tissues, multiregionally sampled adenomas at different stages and carcinomas from 5 patients with FAP and 1 patient with MUTYH-associated polyposis (MAP) (n=56 exomes; n=56 genomes; n=8,757 single cells).	Illumina HiSeq 4000	165
EGAD00001005458	Whole exome sequencing of 15 DNA samples, and whole genome sequencing of 2 matched DNA samples. Whole exome sequencing is of RMS samples (both alveolar and embryonal) and from cell lines as well as patient samples. Patient samples are of pediatric RMS patients.	Illumina Genome Analyzer IIx Illumina HiSeq 2500	17
EGAD00001005459	Whole genome sequencing of HSPC and SI clones of 2 disomy- and 1 trisomy 21 fetuses samples (HiSeq X Ten samples). 5 disomy clones and 5 trisomy clones were included in this experiment. Three bulk samples were also included.	HiSeq X Ten	13
EGAD00001005460	BAM files corresponding to sequencing of 18 circulating tumor DNA and matched tumor samples from SCLC patients. Each ctDNA sample was sequenced twice.	Ion Torrent PGM Ion Torrent Proton	18
EGAD00001005461	This dataset contains two experiments. 1) Single cell RNA-seq of diagnostic samples from patients with MLL-rearranged infant ALL that underwent relapse or not (samples ending in R relapsed, samples ending in N did not). For some of the patients, multiple independent plates were produced (each plate is a sample). 2) In vitro prednisolone exposure experiment. Diagnostic bone marrow samples from patient 4662R were cultured for three days with and without prednisolone. Single cell experiments were conducted according to Muraro et al (Cell Systems, 2016, doi:10.1016/j.cels.2016.09.002). Cell barcodes and UMI sequences are embedded in the header of each fastq entry. Cell barcodes irrelevant to this experiment were removed before submission.	NextSeq 500	11
EGAD00001005462	BAM files corresponding to sequencing of 28 circulating tumor DNA and matched tumor samples from SCC patients. Each ctDNA sample was sequenced twice.	Ion Torrent PGM Ion Torrent Proton	28
EGAD00001005463	BAM files corresponding to the sequencing of 125 circulating cell-free DNA from 125 healthy patients. Each sample was sequenced twice.	Ion Torrent Proton	125
EGAD00001005464	Single-cell RNA-seq of tumor-infiltrating lymphocytes from 14 cancer patients before treatment, taken from tumor, normal adjacent tissue, and peripheral blood. Dataset consists of paired-end FASTQ files, including replicate libraries and runs.	Illumina HiSeq 4000	45
EGAD00001005465	Single-cell TCR-seq of tumor-infiltrating lymphocytes from 14 cancer patients before treatment, taken from tumor, normal adjacent tissue, and peripheral blood. Dataset consists of paired-end FASTQ files, including replicate libraries and runs.	Illumina HiSeq 2500	88
EGAD00001005466	Whole genome sequencing of 100 unrelated Uzbeks in order to impute genotypes into PE cases and controls from Uzbekistan and to provide genetic data and infrastructure for future genetic studies in Uzbekistan and Central Asia more generally and to fill a gap in worldwide information as Central Asia is not adequately represented in available genomic data. This dataset is one component of the InterPregGen FP7 project. DNA samples for this component were collected by InterPregGen Consortium collaborators in Tashkent, Uzbekistan at the Institute of Immunology, Uzbek Academy of Sciences and at the Republic Specialized Scientific Practical Medical Centre of Obstetrics and Gynecology.	Illumina HiSeq 2000	100
EGAD00001005467	Whole genome sequencing of 100 unrelated Kazakhs in order to impute genotypes into PE cases and controls from Kazakhstan and to provide genetic data and infrastructure for future genetic studies in Kazakhstan and Central Asia more generally and to fill a gap in worldwide information as Central Asia is not adequately represented in available genomic data. This dataset is one component of the InterPregGen FP7 project. DNA samples for this component were collected by InterPregGen Consortium collaborators at the Scientific Center of Obstetrics, Gynecology and Perinatology, Almaty, Kazakhstan (Gulnara Svyatova, Principal Investigator)	Illumina HiSeq 2000	100
EGAD00001005468	Dataset includes 2 scRNA-seq samples from a 6.5-7 post-conception weeks human embryonic heart and 19 samples from 4.5-9 post-conception weeks human embryonic hearts analyzed with the Spatial Transcriptomics method. H&E stains can be sent if requested.	Illumina HiSeq 2500 NextSeq 500	21
EGAD00001005469	This includes variant calls (single nucleotide variants and small insertions/deletions) from 8086 (mostly British Pakistani/British Bangladeshi) individuals from the following studies: 1. 5236 British Pakistani/British Bangladeshi adults from East London Genes and Health (ELGH); 2. 2624 British South Asian mothers from Born in Bradford (mostly Pakistani) (BiB); 3. 1061 British South Asian adults from Birmingham (mostly Pakistani) (Birm). All of the Birmingham and most of the Born in Bradford samples were previously sequenced as part of PMID: 26940866. In the sample list file, the columns of interest to most people will be: vcf.id - sample ID from the vcf; cohort - which cohort they're in; sex.assigned - sex inferred from coverage on the X and Y chromosomes (individuals for whom this did not match their reported sex have been discarded); total, chrX and chrY - coverage within bait regions across all chromosomes, chrX and chrY respectively. Mapping was done with bwa-mem and variant calling was carried out with GATK HaplotypeCaller. We removed variant sites for which the following was true: SNPs: "QD < 2.0 || FS > 30 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0"; Indels: "QD < 2.0 || FS > 30 || ReadPosRankSum < -20.0"		-
EGAD00001005470	Whole-exome sequencing data from Illumina NextSeq 500. It consists of 88 paired-end FASTQ files from 44 primary, residual, relapsed tumors and normal samples from the blood.	NextSeq 500	44
EGAD00001005472	Low coverage nanopore sequencing of ovarian cancer tumors	GridION MinION	4
EGAD00001005473	Low coverage nanopore sequencing of prostate cancer tumors	GridION	5
EGAD00001005474	This dataset contains all available targeted and exome sequencing paired fastq files from our study, "Identification of hypermutation and defective mismatch repair in ctDNA from metastatic prostate cancer". Patient identifiers are denoted by the first three characters of the sample aliases (e.g. "P01"), and additional information is appended to reflect the panel used (targeted 73 gene panel: "PC", or whole-exome panel: "WXS"), and whether the sample represents cell-free DNA ("cfdna") or paired white-blood cell control ("wbc"). Several patients have multiple serial collections available, and these are denoted by the characters "C1, C2, C3," etc. All samples were sequenced using Illumina technology.	Illumina HiSeq 2500 Illumina MiSeq	154
EGAD00001005475	The dataset includes exome sequencing results for a patient with SSBP1 mutations that cause a complex optic atrophy spectrum disorder		1
EGAD00001005476	WGS nanopore sequencing of organoid line HGS-3.1 and matching blood reference HGS-3	MinION	2
EGAD00001005477	LBC1921 and LBC1936 GVCFs called with GATK's HaplotypeCaller were combined and subject to variant quality score recalibration. This VCF contains the subset of samples (n = 296) from the LBC1921 cohort.		1
EGAD00001005478	LBC1921 and LBC1936 GVCFs called with GATK's HaplotypeCaller were combined and subject to variant quality score recalibration. This VCF contains the subset of samples (n = 1068) from the LBC1936 cohort.		1068
EGAD00001005479	Whole-exome sequencing coupled with RNA-seq of preinvasive (n=98) and invasive (n=99) lung adenocarcinoma samples.	HiSeq X Ten	394
EGAD00001005480	Whole-genome sequencing (WGS) data for 546 Singaporean volunteers used to estimate WGS-LTL in the study. Samples were sequenced using Illumina Hiseq X to a mean coverage of 30X.	HiSeq X Ten	546
EGAD00001005481	This dataset contains single cell RNA sequencing data of PBMC samples from 10 bladder cancer patients. cDNAs and single cell RNA libraries were prepared following manufacturer’s user guide (10x Genomics). Each library was sequenced in HiSeq4000 (Illumina) to achieve ~300 million reads following manufacturer’s sequencing specification.	Illumina HiSeq 4000	10
EGAD00001005482	Bacterial 16S V4 rDNA was amplified using two differently barcoded V4 fusion primers. Pooled PCR samples were purified and paired-end sequenced on MiSeq instrument for 250 cycles. The steps from DNA quantification to sequencing were conducted at Second Genome Inc.	Illumina MiSeq	109
EGAD00001005483	These are caveman, pindel, battenberg and brass calls for index patients' metastatic melanoma genomes within this study.		-
EGAD00001005484	WXS files for Zhang PanNBL paper titled "Pan-neuroblastoma analysis reveals age- and signature-associated driver alterations"	Illumina HiSeq 2000	634
EGAD00001005486	This dataset contains WGBS sequencing results for HEMa_LP. The cells were cultured in Medium 254 supplemented with PMA-Free Human Melanocyte Growth Supplement-2 (HMGS-2) at 37°C, 5% CO2.	HiSeq X Ten NextSeq 500	1
EGAD00001005487	These are caveman, pindel and sequenza calls for the metastatic melanoma exomes within this study.		-
EGAD00001005488	Paired tumor/normal WGS and RNA-seq of primary neuroblastoma.	HiSeq X Ten Illumina HiSeq 4000	117
EGAD00001005489	Nimblegen SeqCap (sequence capture) deep targeted DNA sequencing of pNET	Illumina HiSeq 2500	98
EGAD00001005491	The dataset contains WGS and RNA-seq from Myeloma XI trial	HiSeq X Ten Illumina HiSeq 2500	246
EGAD00001005492	Content: 60 GB patient tumours and 4 normal brain samples combined in pairs by region (x2=8 total input samples). RNAseq: 1 lane per sample, total strand-specific rRNA-depleted (normal samples were combined = 2 lanes/samples per brain region). WGBS: 2 lanes per sample (normal samples were combined = 2 lanes/samples per brain region). ChIPseq (histone mark): a subset of 20 GB samples were profiled. For the same modification were multiplexed and sequenced on 4 lanes each (H3K27ac, H3K4me1) or a single lane (all others). WGS: used as matching input control for the 20 ChIPseq samples. Data type and technology: RNA-seq: PE 100bp sequenced on HiSeq2000. WGBS: PE 100bp sequenced on HiSeq2000/4000. ChIPseq: SE 50bp sequenced on HiSeq2000/4000. WGS: PE 150bp sequenced on HiSeq X.	HiSeq X Ten Illumina HiSeq 2000	172
EGAD00001005493	Content: 2 GB RTK I cell lines (LN229, ZH487) in two conditions (NT control and shSOX10). RNAseq: single replicates per condition, polyA+ RNA sequencing, SE. ATACseq: biological replicates per condition, SE. ChIPseq (histone H3 modifications, LN229 only): all marks for each condition were pooled and sequenced on two lanes for each pool. ChIPseq (BRD4 and SOX10): SOX10 libraries were sequenced on single lanes. BRD4 samples were multiplexed and sequenced in two lanes. ChIPseq input samples are also included. Data type and technology: RNAseq: SE 50bp sequenced on HiSeq2000/4000. ATACseq: SE 50bp sequenced on HiSeq2000/4000. ChIPseq: SE 50bp sequenced on HiSeq2000/4000.	Illumina HiSeq 2000 Illumina HiSeq 4000	6
EGAD00001005494	Nimblegen SeqCap Custom Panel Sequencing for pNet	Illumina HiSeq 2500	96
EGAD00001005495	The genomic hallmark of clear cell renal cell carcinoma is the loss of the short arm of chromosome three. This appears to be the earliest genomic event in the formation of these cancers. Often chromosome 3 is lost at the same time as part of chromosome 5 is duplicated via an unbalanced translocation, often with features consistent with focal chromothripsis. In this study, we sought to reconstruct the chromothriptic event that underlies the initiation of kidney cancer. We used long read sequencing (promethION, Oxford Nanopore Technologies) of patient tumour-derived DNA to elucidate how a single cell division error can generate cancer genome complexity.	PromethION	2
EGAD00001005497	TTN gene targeted sequencing for AMC cohort (n=24)	Illumina MiSeq	24
EGAD00001005498	Whole Exome sequencing of a set of Spanish patients suffering rare genetic diseases. The set consists of 3 patients, two were diagnosed with Aniridia (ANI-0006 and ANI-0023) and another one was diagnosed with Retinitis Pigmentosa (RP-0247).	unspecified	3
EGAD00001005499	Targeted next-generation sequencing of 13 pediatric bithalamic diffuse gliomas. BAM files of targeted next-generation DNA sequencing data of 13 pediatric gliomas, with multi-region sequencing data from 2 of these cases (17 total tumor samples). Genomic DNA was extracted from formalin-fixed, paraffin-embedded blocks of tumor tissue using the QIAamp DNA FFPE Tissue Kit (Qiagen). Capture-based next-generation DNA sequencing was performed at the University of California, San Francisco Clinical Cancer Genomics Laboratory, using an assay that targets all coding exons of 480 cancer-related genes, select introns of 47 genes, and TERT promoter with a total sequencing footprint of 2.8 Mb (UCSF500 Cancer Panel). Sequencing libraries were prepared from genomic DNA, and target enrichment was performed by hybrid capture using a custom oligonucleotide library (Nimblegen SeqCap EZ Choice). Captured libraries were sequenced as paired-end 100 bp reads on an Illumina HiSeq 2500 instrument. Duplicate sequencing reads were removed computationally to allow for accurate allele frequency determination and copy number calling.	Illumina HiSeq 2500	17
EGAD00001005500	Illumina platform sequencing of whole genome libraries prepared from paired tumour/normal samples from 87 cases of melanoma Acral subtype. 63 cases also have RNASeq sequencing from the tumour sample.		-
EGAD00001005501	Illumina RNASeq sequencing of tumour samples from 41 cases of melanoma		-
EGAD00001005502	TST170 DNA FASTQ files	Illumina HiSeq 2500	16
EGAD00001005503	DNA BAM files		16
EGAD00001005504	18 WGBS lanes for 9 samples of pilocytic astrocytoma.	Illumina HiSeq 2000	9
EGAD00001005506	WGS files for Mullighan_GL_reALL paper titled "Mutational landscape and patterns of clonal evolution in relapsed pediatric acute lymphoblastic leukemia"	Illumina HiSeq 2000	99
EGAD00001005507		Illumina HiSeq 4000 Illumina NovaSeq 6000	25
EGAD00001005508	3' mRNA-Seq obtained from distinct isolated cell types (epithelial cells, immune cells, fibroblasts) of endoscopically obtained esophageal adenocarcinoma tissue as well as normal esophageal mucosa. Libraries for RNA-sequencing were prepared using the QuantSeq 3' mRNA-Seq Library Prep Kit FWD for Illumina according to the low input protocol. Libraries were sequenced on a HiSeq 4000 (Illumina) by 1x 50 bases.	Illumina HiSeq 4000	31
EGAD00001005509	WXS files for Mullighan_GL_reALL paper titled "Mutational landscape and patterns of clonal evolution in relapsed pediatric acute lymphoblastic leukemia"	Illumina HiSeq 2000	276
EGAD00001005510	RNAseq files for Mullighan_GL_reALL RNASEQ2 paper titled "Mutational landscape and patterns of clonal evolution in relapsed pediatric acute lymphoblastic leukemia"	Illumina HiSeq 2000	34
EGAD00001005511	RNASeq files for Mullighan_GL_reALL RNASEQ1 paper titled "Mutational landscape and patterns of clonal evolution in relapsed pediatric acute lymphoblastic leukemia"	Illumina HiSeq 2000	81
EGAD00001005512	RNAsequencing data from human pancreatic islets from 191 donors, Lund University. Processed for the Inspire consortium.	Illumina HiSeq 2000	191
EGAD00001005519	Files from DNA and RNA sequencing from primary tumors and metastases from pancreatic cancer patients along with matched normal tissues.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000	252
EGAD00001005520	This dataset consists of three bam files (two cell-free DNA and one germline DNA) from a metastatic bladder cancer patient with BAP1 variants. Bam files were generated from targeted Illumina sequencing data.	Illumina HiSeq 2500 Illumina MiSeq	3
EGAD00001005521	Genotyping data (Imputed) from human pancreatic islets from 191 donors from Lund that were analysed as part of the Inspire consortium.		191
EGAD00001005523	Phenotype data from human pancreatic islets from 191 donors, Lund University. Processed for the Inspire consortium.		191
EGAD00001005524	Colorectal cancer (CRC) is characterized by functional intratumor heterogeneity that shares many similarities with the hierarchical organization of the normal intestinal epithelium. In order to relate transcriptional subtypes to functional tumor cell heterogeneity we applied scRNA-seq to 12 patient-derived CRC spheroid cultures. We identified shared expression programs that relate to intestinal lineages and revealed metabolic signatures that are linked to cancer cell differentiation. In addition, we validated and complemented sequencing results by quantitative microscopy using live-dyes and multiplexed RNA fluorescence in situ hybridization, thereby revealing metabolic compartmentalization and potential cell-cell interactions. Finally, we demonstrate functional differences between metabolically distinct lineage subtypes that might have strong implications for future treatment strategies of CRC.	NextSeq 500	8714
EGAD00001005525	Validation data containing sequencing data of 13 samples. A hybrid capture approach was used to validate findings of both Manta and GRIDSS for the samples in the validation set. The dataset also contains the reference sequences used.	Illumina NovaSeq 6000	13
EGAD00001005526		Illumina HiSeq 2000	180
EGAD00001005707	WGS of more samples in the ovarian cancer organoid biobank dataset.	HiSeq X Ten	23
EGAD00001005709	To identify what factors cause a different reactivity to MLN4924, 15 cells were categorized into high, intermediate, and low MLN4924 resistance groups based on the half-maximal inhibitory concentration (IC50) of MLN4924. PDC1, PDC2, PDC3, PDC4, and PDC5 showed high MLN4924 sensitivity, whereas PDC12, PDC13, PDC14, and PDC15 showed low MLN4924 sensitivity. Whole-transcriptome sequencing of these 9 patient-derived glioblastoma stem cells was performed.	Illumina HiSeq 2000	9
EGAD00001005710	TST170 Pilot RNA VCF	Illumina HiSeq 2500	16
EGAD00001005711	TST170 Pilot RNA FASTQ	Illumina HiSeq 2500	16
EGAD00001005712	46 BAM files from 23 urothelial bladder cancer patients on an immunotherapy clinical trial. PBMC normal samples and solid tumor samples are paired. Alignment was done by BWA with reference genome hg19.	Illumina HiSeq 2000	46
EGAD00001005713	Whole exome sequencing and RNA sequencing data from 30 patients with prostate cancer.	Illumina HiSeq 2000	87
EGAD00001005714	Single cell atlas of human airways from 10 healthy volunteers by 10X Genomics 3’ RNA-seq profiling. 77,969 cells were collected by bronchoscopy at 35 distinct locations, from the nose to the 12th division of the airway tree, either by forceps (46,791 cells), or brush biopsies (31,178 cells).	NextSeq 500	35
EGAD00001005715	RNA-Seq data of 36 HPV-negative HNSCC specimens from patients treated at The Netherlands Cancer Institute, Amsterdam. HNSCC biopsy samples obtained prior to treatment (chemo-radiotherapy) were used for polyA mRNA sequencing.	Illumina HiSeq 2000	36
EGAD00001005716	RNA-Seq data of 55 HPV-negative HNSCC specimens from patients treated at the VU medical Centre, Amsterdam, The Netherlands. HNSCC biopsy samples obtained prior to treatment (chemo-radiotherapy) were used for polyA mRNA sequencing.	Illumina HiSeq 2000	46
EGAD00001005717	RNA-Seq data of 17 HPV-negative HNSCC specimens from patients treated at the MAASTRO clinic, Maastricht, The Netherlands. HNSCC biopsy samples obtained prior to treatment (chemo-radiotherapy) were used for polyA mRNA sequencing.	Illumina HiSeq 2000	17
EGAD00001005718	Low-coverage whole genome sequencing data of 37 HPV-negative HNSCC specimens from patients treated at The Netherlands Cancer Institute. HNSCC biopsy samples obtained prior to treatment (chemo-radiotherapy) were used for WGS to a depth of approx. 0.5X.	Illumina HiSeq 2000	37
EGAD00001005719	Low-coverage whole genome sequencing data of 37 HPV-negative HNSCC specimens from patients treated at the VU Medical Centre, Amsterdam. HNSCC biopsy samples obtained prior to treatment (chemo-radiotherapy) were used for WGS to a depth of approx. 0.5X.	Illumina HiSeq 2000	37
EGAD00001005720	RNA-Seq data of 8 HNSCC specimens from patients diagnosed with metastatic disease at The Netherlands Cancer Institute, Amsterdam. Primary HNSCC biopsy samples obtained prior to treatment were used for polyA mRNA sequencing.	Illumina HiSeq 2000	8
EGAD00001005721	RNA-Seq data of 25 HNSCC specimens from patients treated at The Netherlands Cancer Institute, Amsterdam and enrolled in the ARTFORCE trial. HNSCC biopsy samples obtained prior to treatment (chemo-radiotherapy) were used for polyA mRNA sequencing.	Illumina HiSeq 2000	25
EGAD00001005722	RNA-Seq data of 28 HPV-negative HNSCC specimens from patients treated at the Netherlands Cancer Institute (Amsterdam), VU Medical Centre (Amsterdam) or MAASTRO Clinic (Maastricht) in The Netherlands. HNSCC biopsy samples were obtained prior to treatment (chemo-radiotherapy) for the prospective study within the DESIGN project and used for polyA mRNA sequencing.	Illumina HiSeq 2000	28
EGAD00001005723	This dataset contains sequencing data from a large-scale study of mtDNA variations measured, using a sensitive mtDNA-targeted sequencing method called STAMP, in lymphoblast and blood samples of Huntington’s Disease patients.	Illumina HiSeq 2500	2602
EGAD00001005724	This dataset contains whole genome sequencing data from Illumina short-reads sequencing (2X150bp) and 10X Genomics linked-reads sequencing. Both the sequencing technologies were used to sequence MCF7 cell line and a primary breast triple-negative cancer sample. The fastq of paired-end reads for both the samples sequenced with both the technologies is available.	Illumina NovaSeq 6000	4
EGAD00001005728	aCGH CNV detection by CNsolidate for 6,827 DDD probands		1
EGAD00001005729	WGS files for Mullighan BiTE WGS paper titled "Tumor intrinsic and extrinsic mechanisms of response and resistance to blinatumomab in relapsed/refractory acute lymphoblastic leukemia"	Illumina HiSeq 2000	56
EGAD00001005730	WXS files for Mullighan BiTE WXS paper titled "Tumor intrinsic and extrinsic mechanisms of response and resistance to blinatumomab in relapsed/refractory acute lymphoblastic leukemia"	Illumina HiSeq 2000	60
EGAD00001005731	RNAseq files for Mullighan BiTE RNASEQ1 paper titled "Tumor intrinsic and extrinsic mechanisms of response and resistance to blinatumomab in relapsed/refractory acute lymphoblastic leukemia"	Illumina HiSeq 2000	41
EGAD00001005732	lowinput RNASEQ files for Mullighan BiTE RNASEQ2 paper titled "Tumor intrinsic and extrinsic mechanisms of response and resistance to blinatumomab in relapsed/refractory acute lymphoblastic leukemia"	Illumina HiSeq 2000	10
EGAD00001005733	single cell RNASEQ files for Mullighan BiTE RNASEQ3 paper titled "Tumor intrinsic and extrinsic mechanisms of response and resistance to blinatumomab in relapsed/refractory acute lymphoblastic leukemia"	Illumina HiSeq 2000	10
EGAD00001005734	Exome Sequencing and RNA Sequencing Data for PDX Samples	Illumina Genome Analyzer	30
EGAD00001005735	This data set contains the raw .fastq files from two RNA-sequencing experiments and two small RNA-sequencing experiments. Both control brain tissue and tissue from sufferers of mesial temporal lobe epilepsy were sequenced. Two different brain regions were sequenced; the cortex and the hippocampus. For more details please see: Mills, James D., et al. "Coding and non-coding transcriptome of mesial temporal lobe epilepsy: Critical role of small non-coding RNAs." Neurobiology of disease 134 (2020): 104612.	Illumina HiSeq 4000	33
EGAD00001005736	In the brain, the cells that control inflammation are a type of white blood cell called microglia. Microglia are located throughout the brain and spinal cord and account for 10–15% of all cells found within the brain. As the resident white blood cells, they are the main active immune defence in the central nervous system (CNS). Microglia are part of an important class of cells known as macrophages that have two main states: M1 and M2. M1 cells are pro-inflammatory, leading to more inflammation, while M2 are anti-inflammatory, and drive wound healing. In this study, we will collect primary microglia from surgical biopsies of 100 individuals. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/		-
EGAD00001005737	WES using IDT xGen Research Exome on Illumina NovaSeq 2x150bp: Normal sample (buffy coat), cecum tumor biopsy at diagnosis, ileocecal valve region tumor sample at week 19, pericolonic metastasis at week 19, lymph node metastasis at week 19, peritoneal metastasis at week 19. Week 19 samples from hemicolectomy. Deep coverage cfDNA NGS using PanCeq pan-cancer panel on NextSeq 2x150bp: week 2 and week 10.	Illumina NovaSeq 6000 NextSeq 500	8
EGAD00001005738	79 RNAseq samples from 56 patients with melanoma who have undergone immune checkpoint blockade immunotherapy.		-
EGAD00001005739	Single-cell gene expression was profiled for 22 Hodgkin lymphoma tumors and 5 reactive lymph nodes (2 replicates were performed for RLN-1). Library preparation was performed with the 10x Chromium platform (3' version 2 assay). Sequencing was performed on an Illumina NextSeq. The BAM files were generated from the raw sequencing data using Cell Ranger (v2.1.0) mkfastq and count commands.	Illumina HiSeq 2500	28
EGAD00001005740	To define the cellular characteristics of malignant ascites of advanced gastric cancer patients and search for therapeutic strategies, we obtained 5 malignant ascites and 1 cerebrospinal fluid from five patients with gastric cancer. We analyzed single-cell RNA-seq data of 180 cells from 4 malignant ascites and 1 cerebrospinal fluid metastasis using Fluidigm® C1™ System. Whole exome sequencing data was also generated from blood or tumor tissue.	Illumina HiSeq 2500	11
EGAD00001005741	When comparing the differentiation capacities of pluripotent stem cell lines that have different genetic backgrounds, batch-to-batch experimental variability poses a significant challenge, especially when trying to identify smaller effects. One way to address this issue is to differentiate several different lines in the same culture dish, thereby eliminating experimental variation. In addition, it allows researchers to analyze many more lines with fewer experiments. Parallel single cell RNA-Seq exploits that individual cells are tagged and hence each cell can be reliably assigned to the donor of origin based on the genetic variants it contains. In addition, analyzing the genetic signature of single cells within a differentiating population can reveal differentiation stages that are not easily detected in bulk RNAseq data. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2500	13433
EGAD00001005743	Fastq files for 80 Multiple Myeloma Patients and 12 cell-lines	454 GS Junior	12
EGAD00001005744		Illumina HiSeq 2500	28
EGAD00001005745	This dataset contains whole genome sequencing data aligned to the b37 reference genome for 4 spatially and temporally distinct tumors from one patient with a matched normal blood sample.	HiSeq X Ten Illumina NovaSeq 6000	5
EGAD00001005746	Whole Exome sequencing of a set of Spanish patients suffering rare genetic diseases. The set consists of 4 patients, one was diagnosed with Retinitis Pigmentosa (RP-1629), another one was diagnosed with Macular Dystrophy (MD-0235) and two were diagnosed with Leber's Congenital Amaurosis (LCA-0081 and LCA-0103).	unspecified	4
EGAD00001005747	RNAseq sample used in study titled "Immune-awakening revealed by peripheral T cell dynamics after one cycle of immunotherapy".	Illumina HiSeq 2500	1
EGAD00001005748	Exome sequencing data from patient with Chronic Lymphocytic Leukemia. DNA was extracted from sorted B-CLL and T cells or granulocytes.	Illumina HiSeq 2500	36
EGAD00001005749	This study reports the whole-genome sequencing data for 20 inflammatory breast cancer patients, each of whom has one normal blood sample and one breast tumor sample. Overall, there are 40 files included in this study, in the format of BAM.	Illumina HiSeq 2500	40
EGAD00001005750	The dataset for white blood cell and cell-free DNA analyses for detection of residual disease in gastric cancer includes 169 bam files from targeted deep sequencing on the Illumina HiSeq2500. The samples analyzed include genomic DNA from white blood cells and cell-free DNA from longitudinal blood collections of patients with gastric cancer.	Illumina HiSeq 2500	167
EGAD00001005751	In this study we aim to characterise the landscape of mutation and clonal selection in the human pancreas. The study combines targeted sequencing and whole-genome sequencing of microbiopsies from the pancreas. The range of patients studied will include healthy individuals, both smokers and non-smokers, and patients with pancreatic ductal adenocarcinoma. This dataset contains all the data available for this study on 2019-12-17.	HiSeq X Ten	136
EGAD00001005753	Four micrograms of total RNA was used for cDNA library construction using the KAPA Stranded mRNA-Seq Kit (KR0960-v3.15), following manufacturer's protocol. The adaptor-ligated libraries were enriched by 6 cycles of polymerase chain reaction (PCR). Libraries were sequenced using the Novaseq 6000 with paired end 151bp reads.	Illumina HiSeq 1500 Illumina NovaSeq 6000	55
EGAD00001005754	Five hundred fifty nanograms of genomic DNA were input for library preparation after fragmentation by Covaris S2, following the KAPA Hyper Prep Kit (KR0961-V1.14) protocols, with selection for a library size range of 250-450 bp. Three hundred nanograms per library DNA each from 12 samples were normalized and combined into a single pool for exome capture using the xGen Lockdown Probes and Reagents based on their standard protocols	HiSeq X Ten Illumina HiSeq 1500 Illumina NovaSeq 6000	117
EGAD00001005756	Paired-end DNA-seq FASTQ files from 16 carriers of the BMPR2 p.Arg491Gln mutation in a family affected by hereditary pulmonary arterial hypertension (HPAH). Whole genome sequencing of these samples was performed in an Illumina HiSeq 4000 instrument. Libraries were prepared using the Fisher PE Kit (Kapa Biosystems). Each sample was multiplexed across flowcells and lanes, leading to a total number of 86 FASTQ files.	Illumina HiSeq 4000	16
EGAD00001005757	Paired-end DNA-seq BAM files from 16 carriers of the BMPR2 p.Arg491Gln mutation in a family affected by hereditary pulmonary arterial hypertension (HPAH). Whole genome sequencing of these samples was performed in an Illumina HiSeq 4000 instrument. Libraries were prepared using the Fisher PE Kit (Kapa Biosystems). FASTQ files were processed at the CNAG (Barcelona) using the GEM short-read aligner on the human genome version hs37d5, producing a total of 16 BAM files.		16
EGAD00001005758	VCF file from 16 carriers of the BMPR2 p.Arg491Gln mutation in a family affected by hereditary pulmonary arterial hypertension (HPAH). Whole genome sequencing of these samples was performed in an Illumina HiSeq 4000 instrument. Libraries were prepared using the Fisher PE Kit (Kapa Biosystems). BAM files were processed at the CNAG (Barcelona) with their pipeline, including GATK v3.6 for genotyping and other tools such as snpEff for annotating variants, to produce this VCF file with a total of 9,643,070 variants, out of which 7,891,370 are SNVs.		16
EGAD00001005759	Five hundred nanograms of genomic DNA was fragmented by Covaris S2; the fragmented DNA then underwent end-repair, A-tailing at the 3 prime end, and adaptor ligation with an IDT dual-indexed UMI adaptor system at the terminal ends. Adapter-ligated libraries with a size range of 300-750bp were selected by the dual-SPRI method. Twenty percent of the size-selected PCR-free libraries were enriched by 5 PCR cycles prior to library size assessment by Bioanalyzer Fragment Analyzer. The PCR-free libraries were quantified by qPCR. The PCR-free libraries were denatured and diluted to optimal concentration. Illumina NovaSeq 6000 was used for paired-end 151bp sequencing.	Illumina NovaSeq 6000	9
EGAD00001005760	Transcriptomics for samples obtained from six patients (MBR01, MBR03, MBR05, MBR07, MBR10, MBR11)	Illumina HiSeq 2500	14
EGAD00001005761	Bevacizumab is an approved anti-angiogenic drug for patients with metastasized colorectal cancer (mCRC) targeting VEGF. The survival benefit of anti-VEGF therapy in mCRC patients is limited to a few months and acquired resistance mechanisms are largely unknown. Using plasma DNA, we studied the evolution of tumor genomes in a cohort of patients with mCRC (n=150) and observed a recurrent focal amplification (8.7% of cases) on chromosome 13q12.2. Analysis of TCGA data (n=619) suggested an association with later stages, which we confirmed by longitudinal plasma analyses. We defined the minimally amplified region and studied the mechanistic consequences of copy number gain of the involved genes. The amplification of one gene, POLR1D, impacted cell proliferation, resulting in upregulation of VEGFA, an important regulator of angiogenesis which has been implicated in the resistance to bevacizumab. In several patients, we observed the emergence of this 13q12.2 amplicon under bevacizumab treatment, which was invariably associated with evolution of therapy resistance. Hence, we describe a novel resistance mechanism against a widely applied treatment in mCRC patients which will impact clinical management.	Illumina MiSeq NextSeq 550	38
EGAD00001005762	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study	PromethION	1
EGAD00001005763	Genome and transcriptome sequence data from a metastatic pancreatic neuroendocrine patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study	PromethION	1
EGAD00001005764	Whole exome sequencing study for 8 pairs of primary NSCLCs and distant metastases. This dataset includes a total of 30 samples.	Illumina HiSeq 2000	30
EGAD00001005765	RNA sequencing study for 8 pairs of primary NSCLCs and distant metastases. This dataset includes a total of 29 samples.	Illumina HiSeq 2000	29
EGAD00001005766	Fifty-two newly generated gastric tumor specimens were subjected to targeted-exome and/or whole-transcriptome sequencing	Illumina HiSeq 2500	52
EGAD00001005767	The appearance of type 1 diabetes (T1D)-associated autoantibodies is the first and only measurable parameter to predict progression toward T1D in genetically susceptible individuals. However, autoantibodies indicate an active autoimmune reaction, wherein the immune tolerance is already broken. Therefore, there is a clear and urgent need for new biomarkers that predict the onset of the autoimmune reaction preceding auto-antibody positivity or reflect progressive b-cell destruction. Here we report the mRNA sequencing-based analysis of 306 samples including fractionated samples of CD4+ and CD8+ T cells as well as CD4, CD8 cell fractions and unfractionated PBMC samples longitudinally collected from seven children who developed beta-cell autoimmunity (case subjects) at a young age and matched control subjects.	Illumina HiSeq 2500	306
EGAD00001005768	This was a single-cell RNA sequencing study of PBMC samples from four Finnish children at risk of developing Type 1 diabetes and their gender-, age- and HLA-matched control children. All four case children were positive for multiple islet-specific autoantibodies and two of them also progressed to clinical disease during the follow-up, whereas the control children remain negative for all autoantibodies. Single-cell analysis confirmed some of the signatures obtained from the bulk data. It identified that high IL32 in case samples in the bulk RNA-seq was contributed mainly by activated T cells and NK cells. Trajectory analysis of the scRNA-seq data suggested that IL32 expression increased as the T cells moved towards an activated state.	Illumina HiSeq 3000	8
EGAD00001005769	Interstitial deletion of the long arm of chromosome 5 (del(5q)) is the commonest structural genomic variant in myelodysplastic syndromes (MDS). Lenalidomide (LEN) is the treatment of choice for patients with del(5q) MDS, but half of the responding patients become resistant within two years. TP53 mutations are detected in ~20% of patients who become resistant to LEN. Our data show that patients who become resistant to LEN harbor either TP53 or RUNX1 mutations or loss of RUNX1 expression. Here we show that LEN-induced degradation of IKZF1 permits a RUNX1/GATA2 complex to drive megakaryocytic differentiation and consequent del(5q) MDS progenitor cell death via CRBN-mediated CSNK1A1 degradation. Overexpression of GATA2 is able to restore LEN sensitivity in the context of RUNX1 or TP53 mutations by enhancing LEN-induced megakaryocytic differentiation. Screening for TP53 and RUNX1 mutations or downregulation should identify patients resistant to LEN, and strategies to activate GATA2 may resensitize del(5q) MDS cells to LEN.		16
EGAD00001005770	The aim of this study is to reconstruct the phylogenetic development of childhood tumours	HiSeq X Ten	8
EGAD00001005772	Paired blood and saliva samples from five unrelated individuals were directly compared for quality of whole genome sequencing. Two (Sample Pairs 1 and 2) were female probands diagnosed with tetralogy of Fallot, a type of congenital heart disease, and three (Sample Pairs 3, 4 and 5) were male probands diagnosed with hypertrophic cardiomyopathy. WGS was performed using Illumina HiSeq X to a target average coverage depth of 30x and a read length of 150 bp. The resulting reads were not filtered for minimum quality in order to avoid losing possible contaminant reads. Sequencing read alignment was done using Isaac Aligner to human genome build hg19. Short variant i.e. single-nucleotide variant (SNV) and small insertion-deletion (indel) calling was performed using Isaac Variant Caller with default parameters.		10
EGAD00001005773	We have sequenced whole genomes of 10 melanoma samples (1 cell line; A375 and 9 patient derived short term cultures). Libraries were prepared with 10X linked reads technology in order to obtain phase information and subsequently sequenced on Illumina NovaSeq6000.	Illumina NovaSeq 6000	10
EGAD00001005774	Fastq files from amplicon sequencing in 106 Multiple Sclerosis patients and 105 healthy volunteers in CD4 T cells, CD8 T cells and genomic DNA using PE300 Illumina MiSeq.	Illumina MiSeq	633
EGAD00001005775	This dataset contains 7 paired end fastq files obtained with Illumina Hiseq and Nextseq sequencing of whole exomes relevant to a study of pseudodiastrophic dysplasia (PDD). It includes 3 patients from 2 unrelated families diagnosed with PDD, together with the four parents.	Illumina HiSeq 2500 NextSeq 500	7
EGAD00001005776	Deep WGS sequencing (160x) of 2 different sites of disease of a patient with a RET fusion positive cancer. Amplicon sequencing of 19 other sites of the same patient for the RET fusion.	Illumina MiSeq Illumina NovaSeq 6000	20
EGAD00001005777	Whole genome sequencing data from four affected and one unaffected individuals from two families with familial adult myoclonic epilepsy, one of Sri Lankan origin and one of Indian origin. BAM files aligned to hg19 reference genome.	HiSeq X Ten	5
EGAD00001005778	Aligned BAM files from NextSeq500 targeted panel sequencing of 84 samples from matched tumour-normal pairs of 42 melanoma patients. The dataset consists of 30 non-responders and 12 responders to ICB.	NextSeq 500	84
EGAD00001005779	Files from DNA sequencing from primary tumors and metastases from pancreatic cancer patients along with matched normal tissues. Sequencing files include those derived from whole exome sequencing as well as MSK-IMPACT sequencing.	Illumina HiSeq 2500	81
EGAD00001005780	Whole-exome sequencing for 95 PMBCL cases (including 21 with matching normal DNA) was performed using a targeted capture approach with the SureSelect Human All Exon V6+UTR bait (Agilent Technologies) followed by massively parallel sequencing of enriched fragments on the HiSeq2500 platform (Illumina). Five libraries were pooled per lane and a 125bp paired-end mode was used. Tumor and normal DNA samples were sequenced to an average of 115X (SD 24X). All reads were aligned to the human reference genome (hg19) using bwa-mem version 0.7.5a29 with optical and PCR duplicates removed using the Picard tool.		116
EGAD00001005781	Whole genome sequencing data for 101 BL patients and transcriptome sequencing for 82 (out of 101) BL patients.	HiSeq X Ten Illumina HiSeq 2500	183
EGAD00001005782	This dataset contains 9 bam files of exome sequencing for an experiment of evolved resistance. Here a barcoded cell line (HCC827 - POT) has been treated under high concentrations of gefitinib (GEF) and trametinib (TRM) until resistance has evolved, as well as under control conditions (DMSO). The dataset contains exome sequencing of confluent cells for three replicates for each anti-cancer drug as well as two replicates of growth under DMSO conditions. The original barcoded cell line (POT) was also exome sequenced and is included in the cohort. Sequencing was performed on the Illumina NovaSeq platform.	Illumina NovaSeq 6000	9
EGAD00001005784	CRISPR/Cas9 lethality screens in a set of Asian head and neck cancer cell lines to identify novel targets. This dataset contains all the data available for this study on 2020-01-15.	Illumina HiSeq 2500	100
EGAD00001005785	The aim of this study is to describe the transcriptome of single arthritic cells. This dataset contains all the data available for this study on 2020-01-15.	HiSeq X Ten Illumina HiSeq 4000	510
EGAD00001005786	Cancer is a genetic disease caused by an accumulation of mutations, however many of these mutations have been identified in pathologically normal tissue. We aim to use laser-capture microscopy (LCM) to sample individual clones from the lung tissue of individuals with a variety of lung diseases (COPD, UIP, IPF, Emphysema, pulmonary hypertension). This will allow us to identify whether cancer-associated mutations appear in this normal tissue, assess the mutational burden present, and identify the mutational processes causing these mutations. Smoking is a large risk factor for developing many of these lung diseases so we are particularly keen to determine whether there is evidence of a smoking signature in these patients. This dataset contains all the data available for this study on 2020-01-15.	HiSeq X Ten	190
EGAD00001005787	Cancer is a genetic disease caused by an accumulation of mutations, however many of these mutations have been identified in pathologically normal tissue. We aim to use laser-capture microscopy (LCM) to sample individual clones from breast tissue to identify whether cancer-associated mutations appear in this normal tissue, assess the mutational burden present, and identify the mutational processes causing these mutations. We will sample from a wide age range of individuals (<20 to >70 years old) to determine whether these processes differ in pre- and post-menopausal women. We will also be comparing the tissue from healthy individuals (samples from breast reduction surgery) to those at elevated risk of breast cancer (mastectomy from BRCA1/2 patients) and those who have breast cancer (adjacent normal, distal normal, and tumour tissue from mastectomy). This will allow us to determine how these processes are different between these groups of individuals, and gain insight into the earliest stages of tumour development. This dataset contains all the data available for this study on 2020-01-15.	Illumina HiSeq 4000	689
EGAD00001005788	We will be testing the hypothesis that MBD4 PTV germline carriers also show an increased number of C to T germline mutations in their offspring. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2020-01-15.	HiSeq X Ten	39
EGAD00001005789	Samples prepared by LCM - 5 cases for pilot study. Bulk DNA not available. This dataset contains all the data available for this study on 2020-01-15.	HiSeq X Ten	16
EGAD00001005790	Falciparum malaria is clinically heterogeneous and yet in most cases the risk of life-threatening disease dramatically declines after the first few infections of life because children rapidly acquire disease tolerance (resistance to severe malaria without improved control of parasite burden). Identifying the factors that determine clinical outcome in a malaria-naive host is therefore paramount to reduce malaria mortality. However, the relative contribution of disease-causing variants of the Plasmodium var gene family versus pathogenic inflammatory cytokine cascades remains fiercely debated - we sought to reconcile these conflicting arguments by studying their interaction in vivo. To this end, two human challenge models were used to reveal the parasite-host interactions that underpin variation in falciparum malaria. To capture the diversity of human immune responses, each individual was analysed independently by tracking dynamic changes in their whole blood transcriptome through time. And to uncover evidence of preferential expansion of disease-causing variants, var gene expression was tracked in vivo from the start to end of infection. In this way, we could show that group A var genes are always expressed upon liver egress but in a minority population that does not increase over 10-days of blood cycling; there is no selection of disease-causing variants in the naive host. In fact, parasites do not respond in any way to differences or changes in host environment. On the other hand, host-intrinsic variation determines the intensity of inflammation and progression to clinical malaria. And furthermore, regulation of the interferon signaling network controls host fate. These data emphasise the role of human immune decision-making in shaping course & outcome of infection. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2020-01-15.	Illumina HiSeq 2500	30
EGAD00001005791	Whole exome sequence data in fastq format was aligned to the GRCh38 reference genome. Aligned sequence was preprocessed with GATK for Indel Realignment and Base Quality Score Recalibration. Duplicates were marked with Picard Mark Duplicates. Aligned sequence is in bam format. Details of the alignment can be found in the bam header. Tumour samples were classified as Anaplastic Thyroid, Poorly-differentiated or well-differentiated cancers.		-
EGAD00001005795	The study includes methylC-capture sequencing (MCC-Seq) on 94 sperm DNA samples derived from both fertile and infertile individuals who were recruited from the Men’s Health Clinic at the Royal Victoria Hospital, Montreal, Quebec. All the data were generated with 100bp paired-end reads using the Illumina NovaSeq 6000 systems.	Illumina NovaSeq 6000	94
EGAD00001005796	The dataset for Multimodal Genomic Features Predict Outcome of Immune Checkpoint Blockade in Non-small Cell Lung Cancer includes 106 bam files from whole exome next-generation sequencing on the Illumina HiSeq2500. The samples analyzed include matched tumor/normal samples from non-small cell lung cancer patients treated with immunotherapy.	Illumina HiSeq 2500	106
EGAD00001005797	Fastq files of chromatin run-on (14 fibrolamellar carcinoma, 3 non-malignant liver; single-end) and transcriptome (23 fibrolamellar carcinoma, 2 non-malignant liver; paired-end) sequencing of fibrolamellar carcinoma	NextSeq 500	30
EGAD00001005798	The sequencing results provided in this study are enriched through liquid phase hybridization capture. The data set comprises 35 clinical cfDNA samples showing a dominant peak at 166bp and 35 clinical cfDNA samples showing a dominant peak at 134/144bp.	HiSeq X Ten Illumina HiSeq 4000 Illumina MiSeq	70
EGAD00001005799	Bam and fastq files from RNA-seq of PDAC samples described in Transcription phenotypes of pancreatic cancer are driven by genomic events during tumour evolution	Illumina HiSeq 2500 unspecified	34
EGAD00001005800	RNA-seq of SMARCA2/4 knock-down prostate cancer cell lines (LNCaP and 22Rv1, 15 samples altogether). Dataset contains BAM files from RNA-seq performed using Illumina HiSeq 2500.	Illumina HiSeq 2500	15
EGAD00001005801	JAK and STAT alterations in CD30 positive LPD, panel sequencing, 12 cutaneous lymphoma patients, 40 samples	Illumina MiSeq Ion Torrent PGM	40
EGAD00001005802	This dataset contains targeted sequencing of breast tumors with germline BRCA1/2 mutations (n = 30) and those without. Breast cancer related genes (n = 115) have been captured and sequenced.	Illumina HiSeq 2500	60
EGAD00001005803	RNA sequencing	unspecified	39
EGAD00001005804	Paired end shallow whole genome sequencing (sWGS) data for the identification of somatic copy number alterations (SCNA) and the estimation of tumor fractions in plasma DNA of renal cell carcinoma (RCC) patients (MonRec Cohort)	Illumina MiSeq NextSeq 550	117
EGAD00001005805	Paired end shallow whole genome sequencing (sWGS) data of cell-free DNA from plasma from self-reporting healthy individuals (MonRec Cohort)	NextSeq 550	22
EGAD00001005806	Mutation analysis of 10 frequently mutated genes in renal cell carcinoma (BAP1, KDM5C, MET, MTOR, PBRM1, PIK3CA, PTEN, SETD2, TP53, VHL) in plasma DNA of RCC patients using a custom QIASeq panel (MonRec Cohort)	Illumina MiSeq NextSeq 550	276
EGAD00001005807	Bulk RNA-sequencing was performed on CD4+ T cells isolated from the blood of visceral leishmaniasis patients (n = 12) and endemic controls (EC; n = 12). CD4+ T cells were obtained by magnetic-activated cell sorting (MACS). Alterations in the transcripts of T helper (Th) cells during infection were identified.	Illumina NovaSeq 6000	48
EGAD00001005808	Raw whole exome sequencing data (fastq) for the GATCI project	unspecified	-
EGAD00001005809	Whole exome sequencing data for 381 TGA probands	HiSeq X Ten	381
EGAD00001005810	Raw RNA sequence data (fastq) for the GATCI project	unspecified	8
EGAD00001005812	Whole exome sequencing (WES) of tumor tissues from RCC patients (DIAMOND cohort)	Illumina HiSeq 4000	74
EGAD00001005813	A 2.077Mb (57306 probes) personalised capture panel [Tailored Panel Sequencing (TAPAS)] was designed based upon the somatic SNVs identified by WES of RCC patient FF and FFPE tissue samples and applied to cfDNA in plasma and urine.	Illumina HiSeq 4000	62
EGAD00001005814	Paired end shallow whole genome sequencing (sWGS) data of cell-free DNA from plasma and urine from RCC patients (DIAMOND cohort)	Illumina HiSeq 4000	106
EGAD00001005815	Paired end shallow whole genome sequencing (sWGS) data of tumor tissue from RCC patients (DIAMOND cohort)	Illumina HiSeq 4000	45
EGAD00001005816	This dataset includes whole genome sequence data for ChIPmentation assays (18 H3K4me3, 20 H3K27ac and 3 input samples) of human stimulated and cultured CD4+ Treg cells.	Illumina HiSeq 2500 Illumina MiSeq	1
EGAD00001005817	Sequence data in fastq format was aligned to the GRCh38 reference genome with BWA-MEM and preprocessed with GATK for indel realignment and base quality score recalibration. Aligned sequence was analyzed with GATK HaplotypeCaller to generate germline variant calls. Variant calls are in VCF format. In total, there are 60 tumour samples from 38 patients, all with matched normal. Further details can be found in the vcf headers		-
EGAD00001005818	Sequence data in fastq format was aligned to the GRCh38 reference genome with BWA-MEM and preprocessed with GATK for indel realignment and base quality score recalibration. Aligned sequence was analyzed with SomaticSniper to generate somatic variant calls. Variant calls are in VCF format. In total, there are 60 tumour samples from 38 patients, all with matched normal. Further details can be found in the vcf headers		-
EGAD00001005819	We performed whole exome sequencing of 8 samples derived from a patient with metastatic melanoma. These represent six different regions of a metastatic melanoma biopsy that was treated with anti-PD-1 inhibitor, one pre-treatment biopsy that was treatment naive and one post-PD-1 inhibitor treated lesion. Exome sequencing data was generated using methods as previously described, including library preparation using the Agilent SureSelect XT Target Enrichment protocol (#5190-8646) prior to sequencing on an Illumina HiSeq 2000/2500 v3 system using 76bp paired-end reads. Raw sequencing data was then processed using Saturn V, the next generation sequencing data processing and analysis pipeline developed by the Department of Genomic Medicine at the UT MD Anderson Cancer Center.	Illumina HiSeq 2500	8
EGAD00001005820	We performed RNA sequencing of 48 different regions sub-sampled from a metastatic melanoma biopsy that was treated with anti-PD-1 inhibitor. RNAseq was performed on samples with a minimum RNA integrity number (RIN) of 5.5 except for two cases (6A10 and 8A3) with RINs greater than 3. A minimum of 700ng of RNA were required for all samples undergoing RNAseq. Paired-end transcriptome reads were aligned using TopHat2, to the UCSC hg19 reference genome.	Illumina HiSeq 2500	48
EGAD00001005821	We performed deep targeted DNA sequencing for a panel of 265 cancer-related genes. This included subsampling 35 different regions of a metastatic melanoma biopsy that was treated with anti-PD-1 inhibitor. Samples with cancer cell purity greater than 80% based on pathologic assessment were used for cancer gene panel DNA sequencing. Mean sequencing coverage was 861x and paired-end reads in FASTQ format were generated by the Illumina pipeline and aligned to the reference human genome hg19 build using the Burrows-Wheeler Alignment Tool (BWA, v0.7.5) with default settings. Aligned reads were further processed using GATK with best practices for removing duplicates, indel removal and recalibration.	Illumina HiSeq 2500	35
EGAD00001005822	Sequence data in fastq format was aligned to the GRCh38 reference genome with BWA-MEM and preprocessed with GATK for indel realignment and base quality score recalibration. Aligned sequence was analyzed with MuTect to generate somatic variant calls. Variant calls are in VCF format. In total, there are 60 tumour samples from 38 patients, all with matched normal. Further details can be found in the vcf headers.		-
EGAD00001005823	Genome and transcriptome sequence data from a breast ductal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005824	Genome and transcriptome sequence data from a uveal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005825	Genome and transcriptome sequence data from a metastatic rectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005826	Genome and transcriptome sequence data from a colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005827	Genome and transcriptome sequence data from a primary unknown- upper GI or pulmonary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005828	Genome and transcriptome sequence data from a metastatic choroidal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005829	Genome and transcriptome sequence data from an ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005830	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005831	Genome and transcriptome sequence data from a poorly differentiated adenocarcinoma more consistent with metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005832	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005833	Genome and transcriptome sequence data from a metastatic rectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005834	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005835	Genome and transcriptome sequence data from a metastatic squamous cell carcinoma of the esophagus patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005836	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005837	Genome and transcriptome sequence data from a metastatic adenocarcinoma of unknown primary (upper GI?) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005838	Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005839	Genome and transcriptome sequence data from a metastatic choroid melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005840	Genome and transcriptome sequence data from a carcinoma primary unknown origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005841	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005842	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005843	Genome and transcriptome sequence data from a metastatic gastrointestinal stromal tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005844	Genome and transcriptome sequence data from a metastatic adrenocortical carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005845	Genome and transcriptome sequence data from a metastatic unknown primary cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005846	Genome and transcriptome sequence data from a cavernous sinus meningioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005847	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005848	Genome and transcriptome sequence data from a chondrosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005849	Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005850	Genome and transcriptome sequence data from a metastatic squamous cell carcinoma of the lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005851	Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005852	Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005853	Genome and transcriptome sequence data from a lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005854	Genome and transcriptome sequence data from a pancreatic ductal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005855	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005856	Genome and transcriptome sequence data from a cervical adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005857	Genome and transcriptome sequence data from a sex-cord stromal tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005858	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005859	Genome and transcriptome sequence data from an endometrioid ovary carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005860	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005861	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005862	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005863	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005864	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005865	Genome and transcriptome sequence data from a tongue squamous cell carcinoma (head and neck) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005866	Genome and transcriptome sequence data from a neuroendocrine tumor (GIC) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005867	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005868	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005869	Genome and transcriptome sequence data from a transformed diffuse large B cell lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005870	Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005871	Genome and transcriptome sequence data from a chordoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005872	Genome and transcriptome sequence data from a cervical squamous cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005873	Genome and transcriptome sequence data from a colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005874	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005875	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005876	Genome and transcriptome sequence data from a colorectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005877	Genome and transcriptome sequence data from a malignant epithelial mesothelioma (THR) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005878	Genome and transcriptome sequence data from a melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005879	Genome and transcriptome sequence data from an ovarian cystadenocarcinoma low grade patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005880	Genome and transcriptome sequence data from a lung adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005881	Genome and transcriptome sequence data from a colorectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005882	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005883	Genome and transcriptome sequence data from a malignant choroid melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005884	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005885	Genome and transcriptome sequence data from a large intestine-colon cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005886	Genome and transcriptome sequence data from a colorectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005887	Genome and transcriptome sequence data from a parotid gland adenoid cystic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005888	Genome and transcriptome sequence data from a breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005889	Genome and transcriptome sequence data from a pancreatic ductal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005890	Genome and transcriptome sequence data from a melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005891	Genome and transcriptome sequence data from a thyroid hurthle cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005892	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005893	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005894	Genome and transcriptome sequence data from a Ewing sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005895	Genome and transcriptome sequence data from a cecum adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005896	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005897	Genome and transcriptome sequence data from a metastatic adrenocortical cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005898	Genome and transcriptome sequence data from a malignant peripheral nerve sheath tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005899	Genome and transcriptome sequence data from an endometrial stromal sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005900	Genome and transcriptome sequence data from a hemangioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005901	Genome and transcriptome sequence data from a pancreatic ductal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005902	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005903	Genome and transcriptome sequence data from a metastatic gallbladder adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005904	Genome and transcriptome sequence data from a pancreatic ductal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005905	Genome and transcriptome sequence data from an adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005906	Genome and transcriptome sequence data from a metastatic clear cell carcinoma of the ovary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005907	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005908	Genome and transcriptome sequence data from a squamous cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005909	Genome and transcriptome sequence data from an alveolar soft part sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005910	Genome and transcriptome sequence data from a gastroesophageal junction adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005911	Genome and transcriptome sequence data from a colon adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005912	Genome and transcriptome sequence data from an unknown tissue unknown histology patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005913	Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001005914	Sequence data in fastq format was aligned to the GRCh38 reference genome with BWA-MEM. Aligned sequence was preprocessed with GATK for Indel Realignment and Base Quality Score Recalibration. Duplicates were marked with Picard Mark Duplicates. Aligned sequence is in bam format. Details of the alignment can be found in the bam header		-
EGAD00001005915	Data supporting: "Repurposing of KLF5 activates a cell cycle signature during the progression from Barrett’s Oesophagus to Oesophageal Adenocarcinoma." Rogerson et al. RNA-seq data 2 samples	Illumina HiSeq 2000	1
EGAD00001005916	Exome sequences were aligned to the GRCh38 reference genome. Aligned sequence was analyzed with GATK Haplotype Caller to generate germline variant calls. Variant calls are in VCF format. Details for the call can be found in the VCF header.		-
EGAD00001005917	Exome sequences were aligned to the GRCh38 reference genome. Aligned sequence was analyzed with GATK/MuTect to generate somatic variant calls. Somatic variant calls are in VCF format. Details for the MuTect call can be found in the VCF header.		-
EGAD00001005918	Exome sequences were aligned to the GRCh38 reference genome. Aligned sequence was analyzed with GATK/SomaticSniper to generate somatic variant calls. Somatic variant calls are in VCF format. Details for the call can be found in the VCF header.		-
EGAD00001005919	We will be using the G&T method to sequence single cell genome and transcriptome derived from FS13B iPSCs cell line. The cell cycle state of each of the single cells is known. Hence, we will be analysing the genome and transcriptome of single cells from each of the cell cycle states to generate a copy number profile and transcriptome profile per given cell cycle stage: G1, S, G2, S. This dataset contains all the data available for this study on 2020-01-29.	Illumina HiSeq 4000	192
EGAD00001005920	Sequencing of LCM-derived microbiopsies from 20 women who underwent risk-reducing reduction mastectomies due to germline BRCA1/2. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Targeted data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with cancer. This dataset contains all the data available for this study on 2020-01-29.	Illumina HiSeq 4000	49
EGAD00001005921	Sequencing of LCM-derived microbiopsies from 20 women who underwent risk-reducing reduction mastectomies due to germline BRCA1/2. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with cancer. This dataset contains all the data available for this study on 2020-01-29.	Illumina HiSeq 4000	8
EGAD00001005922	Sequencing of LCM-derived microbiopsies from 40 women who underwent mastectomies due to breast cancer. LCM and sequencing will be conducted on both normal, unaffected breast, and, where possible, tumour tissue. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue, and compare findings between the normal and associated cancer tissues. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those who are BRCA carriers. This dataset contains all the data available for this study on 2020-01-29.	HiSeq X Ten	46
EGAD00001005923	Sequencing of LCM-derived microbiopsies from 30 women who underwent mastectomies due to breast cancer. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Targeted data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with germline BRCA 1/2 mutations. This dataset contains all the data available for this study on 2020-01-29.	Illumina HiSeq 4000	29
EGAD00001005924	Sequencing of LCM-derived microbiopsies from 30 women who underwent mastectomies due to a breast cancer diagnosis. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with germline BRCA 1/2 mutations. This dataset contains all the data available for this study on 2020-01-29.	Illumina HiSeq 4000	2
EGAD00001005925	Sequencing of LCM-derived microbiopsies from explanted lung from a pulmonary fibrosis patient. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Targeted sequencing will be conducted on samples to identify drivers of interest and clonality of the samples; well-performing samples will be sent for subsequent whole-genome sequencing. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. This dataset contains all the data available for this study on 2020-01-29.	Illumina HiSeq 4000	27
EGAD00001005926	Raw whole genome sequencing data (fastq) for the GATCI project	HiSeq X Ten unspecified	-
EGAD00001005928	This dataset contains cell-free reduced representation bisulfite sequencing data from 60 pediatric cancer samples. Files are provided in fastq format. Samples were sequenced on a NextSeq 500.	NextSeq 500	60
EGAD00001005929	This is the dataset used in the benchmarking of ProSolo, a new probabilistic single nucleotide variant caller for single cell DNA sequencing data that provides control over the false discovery rate of different single cell events at genomic sites (e.g. alternative allele presence or allele dropout). It provides the whole exome sequencing data used in assessing ProSolo's performance that is not available elsewhere, namely bulk whole exome sequencing data of a patient with a constitutional mismatch repair defect (MSH6-) and their parents and siblings, a bulk whole exome sample of granulocytes from that patient and 5 single granulocytes whole exome sequenced after whole genome amplification.	Illumina HiSeq 2500	12
EGAD00001005931	RNASeq data from paired malignant/benign prostate tissues	Illumina HiSeq 2500	32
EGAD00001005932	This data set contains small RNA-sequencing and RNA-sequencing data from subependymal giant cell astrocytomas (SEGA) resected from tuberous sclerosis complex patients. Small RNA-sequencing and RNA-sequencing were performed on the same set of SEGAs (n=19) and periventricular controls (n=8). For full details on library preparation and patients please refer to the paper "The coding and non-coding transcriptional landscape of subependymal giant cell astrocytomas." (PMID: 31834371 DOI: 10.1093/brain/awz370).	Illumina HiSeq 2500	27
EGAD00001005933	Fastq files of Reduced Representation Bisulfite Sequencing data (HaeIII, covering about 7 million CpGs per sample) of induced pluripotent stem cells (iPSC), definitive endoderm (DE) and hepatocyte-like cells (HLC). The dataset comprises data generated by the in vitro differentiation protocol Cellartis (Takara Bio, "CEL", n = 4).	Illumina HiSeq 2500	12
EGAD00001005934	Fastq files of ATAC-seq data of induced pluripotent stem cells (iPSC), definitive endoderm (DE), hepatocyte-like cells (HLC) and primary human hepatocytes (PHH). The dataset comprises data from two different in vitro differentiation protocols: Cellartis (Takara Bio, "CEL", n = 4) and as described by Wang et al. (PMID: 28287600, "HAY", n = 1), as well as from 3 PHH donors.	Illumina HiSeq 2500	15
EGAD00001005935	Fastq files of mRNA-seq data of induced pluripotent stem cells (iPSC), definitive endoderm (DE) and hepatocyte-like cells (HLC). The dataset comprises data from the in vitro differentiation protocol Cellartis (Takara Bio, "CEL", n = 4) and several interventions (11x3 replicates).	Illumina HiSeq 2500	45
EGAD00001005936	WXS files for Mullighan Leventaki ALCL paper titled "Integrative molecular analysis of pediatric Anaplastic large cell lymphoma reveals subtypes with distinct immune suppression signatures."	Illumina HiSeq 2000	42
EGAD00001005937	Raw Whole Exome Sequencing data from Blood samples drawn from related Female participants presenting severe congenital neutropenia.	Illumina HiSeq 2500	2
EGAD00001005938	This dataset contains 3 pairs of exomes, germline (from whole blood) and patient-derived xenograft (PDX), from human pancreatic ductal adenocarcinoma patients. The data is referred to in the publication: "Pro-immunogenic impact of MEK inhibition synergizes with agonist anti-CD40 immunostimulatory antibodies in tumor therapy" (Nature Communications, 2020) Abstract: Cancer types with lower mutational load and a non-permissive tumor microenvironment are intrinsically resistant to immune checkpoint blockade. While the combination of cytostatic drugs and immunostimulatory antibodies constitutes an attractive concept for overcoming this refractoriness, suppression of immune cell function by cytostatic drugs may limit therapeutic efficacy. Here we show that targeted inhibition of mitogen-activated protein kinase (MAPK) kinase (MEK) does not impair dendritic cell-mediated T-cell priming and activation. Accordingly, combining MEK inhibitors (MEKi) with agonist antibodies (Abs) targeting the immunostimulatory CD40 receptor resulted in potent synergistic anti-tumor efficacy. Detailed analysis of the mechanism of action of MEKi GDC-0623 by means of flow cytometric analysis of the tumor immune infiltrate and whole tumor transcriptomics showed that, in addition to its cytostatic impact on tumor cells, this drug exerts multiple pro-immunogenic effects, including the suppression of M2-type macrophages, myeloid derived suppressor cells and CD4+ T-regulatory cells. In addition, MEKi was found to induce tumor-cell intrinsic interferon signaling, which contributed to antigen presentation by tumor cells. Finally, the tumoricidal impact of MEKi involves the activation of multiple pro-inflammatory pathways involved in immune cell effector function in the tumor microenvironment. Our data therefore indicate that the combination of MEK inhibition with agonist anti-CD40 Ab is a promising therapeutic concept, especially for the treatment of mutant Kras-driven tumors such as pancreatic ductal adenocarcinoma.	Illumina HiSeq 2500	6
EGAD00001005941	Paired melanoma tumor and normal (PBMC) WES data from a cohort of 26 patients subsequently treated with combined immune checkpoint blockade.	Illumina HiSeq 2000	52
EGAD00001005945	50 paired benign/cancer samples from prostate tissue generated in 2 different runs on 3 plates on the IonTorrent Proton. Total of 200 fastq.gz single end runs. Read length ~300 bp. %GC: 44. Approximately 1 million sequences per file.	Ion Torrent Proton	100
EGAD00001005946	Fastq files of deeply sequenced single cell RNA-seq data (Smartseq2, approx. 2 million reads / sample) of hepatocyte-like cells (HLC) and primary human hepatocytes (PHH). The dataset comprises data from two different in vitro differentiation protocols: Cellartis (Takara Bio, "CEL", n = 3) and as described by Wang et al. (PMID: 28287600, "HAY", n = 1), as well as PHH from 3 donors. Each replicate comprises 96 single cells.	Illumina HiSeq 2500	7
EGAD00001005947	The dataset contains exome sequencing fastq from 5 ovarian cancer patients, paired with tumor normal blood samples. Three tumor samples were sequenced from each patient: a biopsy sample ("-1" suffix in the file name), a local sample (multiple regions around the biopsy pooled together, with the "-2" suffix in the file name), and a global sample (multiple regions from the tumor pooled together, with a "-3" suffix in the file name).	NextSeq 500	20
EGAD00001005948	Whole genome transcriptome poly-A selected strand specific 100bp paired-end RNA sequencing of post-mortem brain tissue from prefrontal cortex and orbitofrontal cortex was performed. Brain tissue samples were collected from four different biobanks in England and USA.	Illumina HiSeq 2500	223
EGAD00001005949	This study assessed molecular determinants of response in a cohort of patients with AML that were treated with venetoclax in combination with either DNA methyltransferase inhibitors or low dose cytarabine. RNA sequencing was performed on 31 patients from three different response classes [Group A - Durable remission (n=10), Group B - Relapsed (n=10) and Group C - Refractory (n=11)]. Library preparation and sequencing were performed at the Australian Genome Research Facility, using the Truseq Stranded mRNA library kit. Technical and batch replicate samples are included. Gene count data are provided with the original publication. The use of the sequencing data is subject to a data transfer agreement and is restricted to ethically approved research into blood cell malignancies and cannot be used to assess germline variants.	Illumina HiSeq 2500	39
EGAD00001005950	Gray Platelet Syndrome (GPS) is a rare recessive bleeding disorder resulting from biallelic variants in NBEAL2. As part of a comprehensive evaluation of the phenotype and genotype in 47 patients with GPS, four different blood cell-types (platelets, neutrophils, monocytes, and CD4-lymphocytes) were evaluated using bulk RNA-seq in five patients and five controls. These data are deposited in this archive in FASTQ format.	Illumina HiSeq 4000	40
EGAD00001005951	RNASeq files for Mullighan Leventaki ALCL Project paper titled "Integrative molecular analysis of pediatric Anaplastic large cell lymphoma reveals subtypes with distinct immune suppression signatures."	Illumina HiSeq 2000	32
EGAD00001005952	Familial Multiple Sclerosis study dataset, including variant calling files from 138 samples with three different phenotypes: Multiple Sclerosis (MS), other Autoimmune Diseases (AID) and unaffected individuals.		138
EGAD00001005953	Part of the DEEP project results, which led to the publication of 'Integrative analysis of single-cell expression data reveals distinct regulatory states in bidirectional promoters', Epigenetics & Chromatin (2018), Fatemeh et al., DOI: 10.1186/s13072-018-0236-7, PMID: 30414612, PMCID: PMC6230222. This dataset contains the subset of DEEP data related to that study.	Illumina HiSeq 2500	1
EGAD00001005954	Additional histone modification data, not yet released as part of IHEC, for cell line 01_HepG2_LiHG_Ct1, H3K122ac.	Illumina HiSeq 2500	1
EGAD00001005955	Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients.	HiSeq X Ten	7
EGAD00001005956	Whole exome sequencing of tumors and paired adjacent uninvolved tissues from 222 early stage NSCLC patients, in order to identify genomic drivers present in early-stage non-small cell lung cancer and determine the overall tumor mutational burden in early-stage non-small cell lung cancer.	Illumina HiSeq 2500	540
EGAD00001005957	The dataset referenced by EGA Study ID EGAS00001004208 includes 20 human exome sequencing data sets and 11 human RNA sequencing data sets from tumor or normal tissues. Each sequencing data set includes two paired-end short read files in fastq format.	Illumina HiSeq 2500	31
EGAD00001005958	Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients.	Illumina HiSeq 4000	11
EGAD00001005959	In this experiment we investigated the effect of HDAC3 inhibition on the transcriptome of IFNg-primed macrophages under different tolerization conditions. Peripheral blood mononuclear cells (PBMCs) were isolated from 3 healthy donors. PBMCs were isolated from whole blood of healthy donors using Ficoll gradient (Invitrogen). Monocytes (CD14+ cells) were positively selected from PBMCs using CD14 Microbeads according to the manufacturer’s instructions (Miltenyi Biotec). Monocytes were subsequently treated with or without 500 nM HDAC3i (ITF3100) for 30 minutes prior to overnight IFNg priming (50 ng/mL). Cells were then kept without LPS (non-LPS; N), treated with 10 ng/mL LPS once (non-tolerized; NT), or treated with LPS twice (tolerized; T; second LPS concentration: 100 ng/mL). In total, there were 18 samples included.	Illumina HiSeq 4000	18
EGAD00001005960	The immune microenvironment of hepatocellular carcinoma (HCC) is poorly characterized. Combining two single-cell RNA sequencing technologies, we produced transcriptomes of CD45+ immune cells for HCC patients from five immune-relevant sites: tumor, adjacent liver, hepatic lymph node (LN), blood, and ascites. This dataset is part of Smartseq2 data	Illumina HiSeq 4000	17
EGAD00001005961	The immune microenvironment of hepatocellular carcinoma (HCC) is poorly characterized. Combining two single-cell RNA sequencing technologies, we produced transcriptomes of CD45+ immune cells for HCC patients from five immune-relevant sites: tumor, adjacent liver, hepatic lymph node (LN), blood, and ascites. This is the droplet data of this study	Illumina HiSeq 4000	19
EGAD00001005963	RNA seq analysis of 6 CUP metastases (each in triplicate), analysed by paired sequencing with NextSeq 500. Whole exome sequencing of 15 CUP metastases, analysed by paired sequencing with NextSeq 500.	NextSeq 500	16
EGAD00001005964	This dataset includes whole transcriptome data of human stimulated and cultured CD4+ Treg cells (39 samples).	Illumina HiSeq 2500	39
EGAD00001005965	Single-cell ATAC-seq data for 5 CLL samples (2 controls, 3 tumor) of the CancerEpiSys-PRECiSe project.	Illumina HiSeq 2000	5
EGAD00001005966	Tagged-WGBS for 3 Naive B Cell samples of the CancerEpiSys-PRECiSe project.	Illumina HiSeq 2000	3
EGAD00001005967	ATAC-seq data for 26 CLL samples (7 controls, 19 tumor) of the CancerEpiSys-PRECiSe project.	Illumina HiSeq 2000 Illumina HiSeq 4000	26
EGAD00001005968	Long RNA data for 27 CLL samples (8 controls, 19 tumor) of the CancerEpiSys-PRECiSe project.	Illumina HiSeq 2000	27
EGAD00001005969	ChIPseq data for 31 CLL samples (12 controls, 19 tumor) of the CancerEpiSys-PRECiSe project; containing histone H3, histone modifications and transcription factor binding sites (CTCF, EBF1).	Illumina HiSeq 2000 NextSeq 550	31
EGAD00001005970	WGBS data for 75 paired fastq, spread over 31 samples (4 healthy T-cell, 7 healthy B-cell, 20 B-cell CLL tumors) of the CancerEpiSys-PRECiSe project.	Illumina HiSeq 2000	31
EGAD00001005971	This dataset contains 14 paired-end FASTQ sequences from mRNA-Seq on single human M-II stage oocytes that were collected from ovarian tissue from unstimulated patients undergoing fertility preservation treatments due to cancer diagnoses, which did not influence ovarian function. Cumulus-oocyte-complexes were matured in vitro according to Gruhn et al. (Science 365: 1466-1469) and short term flash frozen prior to lysis, RNA extraction, full length cDNA preparation and amplification using the Ultra-low-input SMART-Seq2 v4 kit from Takara Clontech. Further, these cDNA were used to prepare libraries for sequencing according to the Nextera XT DNA library preparation kit from Illumina.	NextSeq 500	14
EGAD00001005972	Dataset contains CYP2D6 sequencing data of 566 individuals who used tamoxifen as adjuvant breast cancer therapy. Phenotype data consists of the ratio between the metabolites endoxifen and desmethyltamoxifen (Metabolic ratio (MR)) as a proxy for CYP2D6 enzyme activity. Each sample is linked to one bam-file containing the CYP2D6 sequence.	PacBio RS II	2
EGAD00001005973	We are interested in inter-individual variation in transcriptional response to immune checkpoint blockade. We have analysed poly A purified RNA expression from CD8 T cells (297 transcriptomes in total) isolated from metastatic melanoma patients (n=106) prior to and during treatment with either single agent (Pembrolizumab) or combination (Ipilimumab/ Nivolumab) immune checkpoint blockade. We compare expression at different stages of treatment and additionally contrast this with that from healthy controls (n=68).	Illumina HiSeq 4000	297
EGAD00001005974	Oxford Nanopore long-read sequencing of A17-LAxillaryLN2Met-23312 PELICAN sample, identified as D051965 in the Pan-Cancer Analysis of Whole Genomes study, and identified as PD13412a by the prior Gundem et al. whole genome sequencing study (PMID 25830880). Data used to support Figure 6 in PubMed ID 32025007 "Pan-Cancer Analysis of Whole Genomes Consortium." Nature 2020 578:82-93.	PromethION	1
EGAD00001005975	Genome Asia VCF files		1163
EGAD00001005976	This dataset contains NanoString gene expression of PBMC from patients from IMvigor210, IMvigor211 and IMmotion150 cohorts	unspecified	3
EGAD00001005977	Single Cell RNAseq of blood and tumor from renal cancer patients	Illumina HiSeq 4000	8
EGAD00001005978	To identify the therapeutic targets in a treatment-refractory cancer patient, we performed single-cell RNA sequencing for 3,115 cells from primary bladder cancer (BC159-T#3) and patient-derived xenografts (BC159-T#3-PDX-vehicle and BC159-T#3-PDX-tipifarnib). Matched time-series bulk tumor tissues were also sequenced using whole exome target probe (WES) and whole transcriptome target probe (WTS).	Illumina HiSeq 2500	10
EGAD00001005981	These are the raw sequencing files for the 50 brain tumour, 3 extracranial tumour and 34 matched normals for the patients in the discovery cohort.	Illumina HiSeq 4000	87
EGAD00001005982	These are the variant calls for the 50 brain tumour samples in the discovery cohort.		50
EGAD00001005983	These are the Sequenza copy number calls for the 30 brain tumour samples within the discovery cohort.		30
EGAD00001005984	Raw sequencing files for the 18 (brain tumour-only) samples within the external validation cohort.	Illumina HiSeq 4000	18
EGAD00001005985	These are the raw sequencing files for the orthogonal validation brain tumour samples in the discovery cohort.	Illumina HiSeq 4000	31
EGAD00001005987	Whole transcriptome and targeted DNA sequencing (Ampliseq) of pediatric low-grade glioma samples at the Hospital for Sick Children	Illumina HiSeq 2500 Illumina MiSeq	101
EGAD00001005990	Sequencing of LCM-derived microbiopsies from explanted lung from COPD patient. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. . This dataset contains all the data available for this study on 2020-02-20.	HiSeq X Ten	12
EGAD00001005991	Sequencing of LCM-derived microbiopsies from explanted lung from pulmonary fibrosis patient. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. . This dataset contains all the data available for this study on 2020-02-20.	HiSeq X Ten	18
EGAD00001005992	Using whole genome sequencing of lymphocytes excised from human tissue using laser capture microscopy (LCM), we identify the mutations arising in these microenvironments. This work will contribute towards developing a catalogue of mutations present in tissue resident lymphocytes across a range of tissues, and will characterize the mutational signatures that result from each microenvironment. . This dataset contains all the data available for this study on 2020-02-20.	HiSeq X Ten	9
EGAD00001005993	The aim of this study is to define the mutational landscape of human liver tumours. . This dataset contains all the data available for this study on 2020-02-20.	HiSeq X Ten	6
EGAD00001005994	The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of them are still unknown. Within Mutographs, the International Agency for Research on Cancer is coordinating the recruitment of 5000 individuals with cancer (colorectal, renal, pancreatic, oesophageal adenocarcinoma or oesophageal squamous cancers) across 5 continents to explore whether different mutational signatures explain marked variation in incidence. In brief, through an international network of collaborators around the world, biological materials are collected, along with demographic, histological, clinical and questionnaire data. Whole genome sequences of tumour-germline DNA pairs are generated at the Wellcome Trust Sanger Institute. Somatic mutational signatures are subsequently extracted by non-negative matrix factorisation methods and correlated with risk factors data. Through an enhanced understanding of cancer aetiology, Mutographs unprecedented effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. . This dataset contains all the data available for this study on 2020-02-20.	HiSeq X Ten	4
EGAD00001005995	The study will use WGS to aid in benchmarking different culture conditions in a set of genetically annotated human organoid lines. The data will be used to assess whether any clonal differences are introduced when culturing these lines in different conditions. This dataset contains all the data available for this study on 2020-02-20.	HiSeq X Ten	30
EGAD00001005996	Bacterial isolation in infected brains in patients with Huntington's disease. Here we used next generation sequencing of 16S ribosomal RNA gene PCR amplicons (NGS 16S amplicon analysis)	Illumina MiSeq	1
EGAD00001005998	106 bulk RNA-seq samples of primary human keratinocytes and 8 single cell RNA-seq samples of human epidermis	Illumina HiSeq 2500 Illumina NovaSeq 6000	112
EGAD00001005999	Contains 46 aGCT tumor sample WGS BAMs from 33 patients and corresponding 33 germline reference WGS BAMs from those 33 patients	Illumina NovaSeq 6000	79
EGAD00001006000	Single cell RNA-seq profiling of ~62k purified CD8+ T cell transcriptomes, from six healthy older adult donors using 10X genomics. Cells from each donor were separated based on their IL-7R protein expression (i.e. CD8+ IL7R+ and CD8+ IL7R- T cells).	Illumina HiSeq 4000	12
EGAD00001006001	This dataset includes 52 samples from 19 individuals with pancreatic neuroendocrine tumours. These samples were analyzed using RNA-sequencing, whole-exome sequencing, shallow (~0.3x) whole-genome sequencing, and a 21-gene panel targeted capture sequencing. Everything was sequenced using the Illumina HiSeq and is provided in BAM or FASTQ format.	Illumina HiSeq 2000	51
EGAD00001006003		Illumina HiSeq 2500	144
EGAD00001006004	Whole-exome sequencing dataset for the Australian Ovarian Cancer Study (AOCS) and The Jikei University School of Medicine (JIKEI) ovarian clear cell carcinoma (OCCC) clinical outliers project, representing 10 donors. Consists of 20 fastq files: one normal and one tumour from each patient.	Illumina HiSeq 4000	20
EGAD00001006005	liver cancer paired with normal controls, viral and non-viral origin		108
EGAD00001006006	Illumina Nextseq 500 whole transcriptome RNAseq from PBMCs - run1	NextSeq 500	2
EGAD00001006007	In this study, we evaluated the effect of preservation agents on the methylation pattern of cell-free DNA. The methylation pattern was assessed with cell-free reduced representation bisulfite sequencing (cf-RRBS). All 45 samples were sequenced on a NovaSeq 6000, and samples are provided as raw fastq files.	Illumina NovaSeq 6000	45
EGAD00001006008	RNAseq was performed on CDX, CDX-derived cell line and LNCaP cell line, with triplicates.	Illumina HiSeq 4000 Illumina NovaSeq 6000	26
EGAD00001006009	WES was performed on 2 TURP, 6 biopsies, 1 CDX, 1 cell line, 6 CTCs, 1 DNA germline and 1 WBC from prostate cancer	Illumina HiSeq 4000	18
EGAD00001006010	This dataset consists of 20 fastq files in total from exome and myeloid gene panel sequencing of 15 carriers of germline RUNX1 mutations from 10 different families.	Illumina HiSeq 2500 Ion Torrent PGM Ion Torrent Proton NextSeq 500	16
EGAD00001006011	8 matched-pair melanocytic nevi; Agilent SureSelect Human All Exon V5 plus UTR	Illumina HiSeq 2500	16
EGAD00001006012	The dataset is comprised of 3 blood plasma samples (Patient 1-3_plasma_cfDNA) paired with genomic DNA (Patient 1-3_tumor gDNA) from the corresponding primary neuroblastoma from three patients. For all these samples whole-exome sequencing (WES) data have been generated.	Illumina HiSeq 4000	6
EGAD00001006013	Three technical replicates of FACS-sorted T cells (CD45+CD3+) and one replicate of FACS-sorted tumor cells (MCSP+) were loaded to a targeted 10,000 cells per lane on the 10X Genomics Chromium Controller with the single cell 5’ Immune Repertoire and Gene Expression profiling kit. In total, we loaded ~30,000 individual tumor infiltrating lymphocytes (TILs) and ~10,000 melanoma cells on the 10X platform (10X Genomics, CA, USA). Reverse transcription, TCR enrichment, and library preparations were performed according to the 10X Genomics 5’ V(D)J protocol revision C. Transcriptome libraries were pooled and sequenced on the Illumina NovaSeq 6000 S2 flow cell with 26 R1, 8 i7, and 91 R2 cycles respectively. The TCR libraries were pooled and sequenced on the Illumina MiSeq V2 150 cycles paired-end. Single cell transcriptomic and TCR data were processed with the 10X Genomics Cell Ranger Pipeline version 2.2.0 with the software-provided GRCh38 reference transcriptomes. After quality control, RNAseq profile data were available from 6267 immune and 4303 melanoma cells. Downstream processing and visualization were performed with Seurat and tSNE plots.	Illumina MiSeq Illumina NovaSeq 6000	5
EGAD00001006014		HiSeq X Ten Illumina HiSeq 4000	2
EGAD00001006016		HiSeq X Ten Illumina HiSeq 2000	12
EGAD00001006017	Single Cell RNA-Seq of Primary GBM. Gender: Female, Age: 57.	Illumina NovaSeq 6000	1
EGAD00001006018	Mixed sample of scRNA-Seq from primary low grade glioma. Gender: Male, Ages: 34, 44.	Illumina NovaSeq 6000	1
EGAD00001006019	Single Cell-RNA Seq of Wildtype Primary GBM for Female, Age 50.	Illumina NovaSeq 6000	1
EGAD00001006020	Primary diffuse astrocytoma G3 Male, 74	Illumina NovaSeq 6000	1
EGAD00001006022	The motor cortex is the earliest affected brain region in ALS. This dataset contains total RNA sequencing (stranded, 2x101bp) data derived from the motor cortex of 11 sporadic ALS patients and 8 healthy controls.	Illumina HiSeq 2500	19
EGAD00001006023		HiSeq X Ten Illumina HiSeq 4000	2
EGAD00001006024	Paired-end FASTQ files of whole genome sequence data from 40 Roma individuals. This dataset contains the FASTQ files obtained from Illumina HiSeq X reads at ~30X coverage: 10 Macedonian Roma (Balkan Roma); 10 Spanish Roma (North/Western Roma); 10 Hungarian Roma (Vlax and Romungro Roma); 5 Lithuanian Roma (North/Western Roma); 5 Ukrainian Roma (Romungro Roma).	HiSeq X Ten	40
EGAD00001006025	We have performed a comprehensive and integrative genomic study of mantle cell lymphoma (MCL) to elucidate the features that may determine the different clinical and biological behavior of the two molecular subtypes of this lymphoma, conventional (cMCL) and leukemic non-nodal MCL (nnMCL). These data, integrated with epigenomics and transcriptomics, have allowed us to uncover novel molecular mechanisms in the origin and development of these tumors and provide relevant information to stratify patients in different risk groups.	Illumina MiSeq	114
EGAD00001006026	Immune checkpoint inhibitors targeting the PD-1 pathway have transformed the management of many advanced malignancies, including clear cell renal cell carcinoma (ccRCC), but the drivers and resistors of PD-1 response remain incompletely elucidated. Here, we analyzed 592 tumors collected from advanced ccRCC patients enrolled in prospective clinical trials of treatment with PD-1 blockade (or mTOR inhibition as control arm) by whole-exome and RNA-sequencing, integrated with immunofluorescence analysis, to define the somatic alteration landscape of late-stage ccRCC and to uncover the immunogenomic determinants of therapeutic response. While conventional genomic markers (tumor mutation burden, neoantigen load) and degree of CD8+ T cell infiltration were not associated with clinical response, we discovered numerous chromosomal alterations in advanced ccRCC associated with response or resistance to PD-1 blockade. These advanced tumors were highly CD8+ T cell infiltrated, with only 22% and 5% with an immune desert and immune excluded phenotype, respectively. Our analysis revealed that CD8+ infiltrated tumors are depleted of favorable PBRM1 mutations and are enriched for unfavorable chromosomal losses of 9p21.3 when compared to non-infiltrated tumors. These data demonstrate how the interplay of somatic alterations and immunophenotypes impacts therapeutic efficacy.	Illumina HiSeq 2500	53
EGAD00001006027	Immune checkpoint inhibitors targeting the PD-1 pathway have transformed the management of many advanced malignancies, including clear cell renal cell carcinoma (ccRCC), but the drivers and resistors of PD-1 response remain incompletely elucidated. Here, we analyzed 592 tumors collected from advanced ccRCC patients enrolled in prospective clinical trials of treatment with PD-1 blockade (or mTOR inhibition as control arm) by whole-exome and RNA-sequencing, integrated with immunofluorescence analysis, to define the somatic alteration landscape of late-stage ccRCC and to uncover the immunogenomic determinants of therapeutic response. While conventional genomic markers (tumor mutation burden, neoantigen load) and degree of CD8+ T cell infiltration were not associated with clinical response, we discovered numerous chromosomal alterations in advanced ccRCC associated with response or resistance to PD-1 blockade. These advanced tumors were highly CD8+ T cell infiltrated, with only 22% and 5% with an immune desert and immune excluded phenotype, respectively. Our analysis revealed that CD8+ infiltrated tumors are depleted of favorable PBRM1 mutations and are enriched for unfavorable chromosomal losses of 9p21.3 when compared to non-infiltrated tumors. These data demonstrate how the interplay of somatic alterations and immunophenotypes impacts therapeutic efficacy.	Illumina HiSeq 2500	31
EGAD00001006028	Genomic characterization (through whole-exome sequencing) of circulating tumor cells, bone marrow clonal plasma cells and extramedullary plasmacytomas from multiple myeloma patients.	Illumina HiSeq 2000 Illumina NovaSeq 6000	214
EGAD00001006029	Immune checkpoint inhibitors targeting the PD-1 pathway have transformed the management of many advanced malignancies, including clear cell renal cell carcinoma (ccRCC), but the drivers and resistors of PD-1 response remain incompletely elucidated. Here, we analyzed 592 tumors collected from advanced ccRCC patients enrolled in prospective clinical trials of treatment with PD-1 blockade (or mTOR inhibition as control arm) by whole-exome and RNA-sequencing, integrated with immunofluorescence analysis, to define the somatic alteration landscape of late-stage ccRCC and to uncover the immunogenomic determinants of therapeutic response. While conventional genomic markers (tumor mutation burden, neoantigen load) and degree of CD8+ T cell infiltration were not associated with clinical response, we discovered numerous chromosomal alterations in advanced ccRCC associated with response or resistance to PD-1 blockade. These advanced tumors were highly CD8+ T cell infiltrated, with only 22% and 5% with an immune desert and immune excluded phenotype, respectively. Our analysis revealed that CD8+ infiltrated tumors are depleted of favorable PBRM1 mutations and are enriched for unfavorable chromosomal losses of 9p21.3 when compared to non-infiltrated tumors. These data demonstrate how the interplay of somatic alterations and immunophenotypes impacts therapeutic efficacy.	Illumina HiSeq 2500	278
EGAD00001006030	Germline exome sequencing data (paired Fastq files) from 516 BRCA1/2-negative women affected with familial high-grade serous (or similar) ovarian carcinoma, as analysed and described in Subramanian et al (Nature Communications, 2020).	Illumina HiSeq 2500	516
EGAD00001006031	Whole genome, exome and RNA sequencing of uveal melanoma metastases, primary tumors and matched normal DNA.	HiSeq X Ten Illumina HiSeq 2500 Illumina NovaSeq 6000 NextSeq 500	240
EGAD00001006032	Previously unpublished WGS reads mapping within the IG loci used in the benchmark of IgCaller.	unspecified	176
EGAD00001006033	Data supporting: "Genomic copy number predicts esophageal cancer years before transformation." Killcoyne, Gregson et al. sWGS data 1000 samples BAM files	Illumina HiSeq 2500 Illumina HiSeq 4000	1000
EGAD00001006034	This is the PacBio long read data used for performing de novo assembly of the EGYPT individual (mapped against GRCh38).	Sequel	1
EGAD00001006035	10x Genomics linked read data used in variant phasing and de novo assembly scaffolding for the EGYPT individual (mapped against GRCh38).	HiSeq X Ten	1
EGAD00001006036	This is the blood RNA-Seq read data used for expression analysis such as haplotypic expression (mapped against GRCh38).	Illumina NovaSeq 6000	1
EGAD00001006037	High-coverage WGS	HiSeq X Ten	9
EGAD00001006038	This data set contains for 10 Egyptian individuals the WGS reads mapping to chrM. These were subsequently used for haplogroup assignment.	HiSeq X Ten	10
EGAD00001006039	This data set comprises WGS small variants and structural variants called in a cohort of 110 Egyptian individuals (10 individuals have been sequenced as part of this study and 100 are from EGAD00001001372/EGAD00001001380).		10
EGAD00001006040	This data set contains for 217 Egyptian individuals the amplicon sequencing reads mapping to chrM. These were subsequently used for haplogroup assignment.	Illumina MiSeq	217
EGAD00001006041	The dataset includes 174 FASTQ files from paired-end WXS sequencing on Illumina HiSeq2500 for 39 patients.	Illumina HiSeq 2500	87
EGAD00001006042	The dataset includes 77 FASTQ files from single-end total RNA sequencing on Illumina HiSeq2500 for 39 patients.	Illumina HiSeq 2500	77
EGAD00001006043	TST170 Pilot DNA VCF files		16
EGAD00001006044	TST170 Pilot RNA BAM files		16
EGAD00001006045	WGS BAM files for the 18 samples used in Michealraj et al. Cell 2020. The dataset includes PFA ependymoma tissue, derived line and blood samples of 6 patients. WGS data were aligned with BWA to the hg38 human reference genome (igenome) and further processed according to the GATK best practice pipeline.	HiSeq X Five	18
EGAD00001006046	RNA-seq fastq files for the 16 samples used in Michealraj et al. Cell 2020. The samples include PFA and ST ependymoma tissues, normal pediatric brain as control and PFA ependymoma lines.	Illumina HiSeq 2500	16
EGAD00001006047	DNA was isolated from aberrant plasma cells (aPCs) and peripheral blood of 12 ALA, 10 ALA+MM and 29 MM individuals. DNA from aPCs was amplified using REPLI-g Mini Kit (Qiagen). In total, we analyzed 102 samples from 51 patients. One batch of exome libraries (paired tumor-normal samples from 12 ALA, 10 ALA+MM and 6 MM) was prepared using SureSelect Human All Exon V5 Kit (Agilent Technologies) and sequenced on Illumina HiSeq 4000 platform, 100 cycles. The second batch (23 MM samples; IDs ARK01-ARK26) was prepared using SureSelect Human All Exon V5 + IGH, IGK, IGL, MYC (Agilent Technologies) library preparation kit and sequenced on Illumina HiSeq 2000 platform in paired-end settings, 75 cycles. The reads were mapped using BWA-MEM on human genome GRCh38 without alternate loci.	Illumina HiSeq 2500	99
EGAD00001006048	ChIP-seq was performed for the following histone modifications in both megakaryocytes and granulocytes: H3K27ac, H3K4me2, and H3K36me3. ChIP-seq was performed for H3K27me3 and CTCF in megakaryocytes only. All ChIP experiments consist of n=3 replicates for each of QPD and Control, with the exception of control granulocyte H3K4me2 for which only n=2 replicates were performed. 4C-seq datasets consist of 2 viewpoints (PLAU promoter and an intergenic enhancer), profiled in megakaryocytes from n=2 controls and n=4 QPD.	Illumina HiSeq 2500	77
EGAD00001006049	379 tissue samples from various parts of the developing human embryo brain were dissociated and single cells were collected and processed without bias for mRNA-seq using 10X chromium 3' protocol. Libraries were sequenced on Illumina NovaSeq and reads aligned against the human GRCh38 genome.	Illumina NovaSeq 6000	379
EGAD00001006050	5 trios were whole genome sequenced with PacBio Sequel to a depth of 15X (Trios 1-4) or 40X (Trio 5). For each trio the child was affected with severe ID, and the parents were unaffected. Dataset consists of Trio 2 samples: T2P, T2F and T2M	Sequel	15
EGAD00001006051	Here we focused on whole blood from PAXgene tubes from healthy Tanzanians and the impact of urbanization and diet on innate immune responses. We performed RNA-sequencing of these whole blood samples and investigated how transcriptional changes depend on diet and location.	NextSeq 500	316
EGAD00001006053	RNA-seq was performed on cultured megakaryocytes and peripheral-blood derived granulocytes from individuals with QPD and unaffected controls. Each group consisted of n=3 biological replicates.	Illumina HiSeq 2500	12
EGAD00001006054	Data were generated by next-generation sequencing (Illumina) in a fastq format. This dataset involved sequencing data from pregnant women and patients with hepatocellular carcinoma (HCC). For HCC samples, the paired buffy coat and tumor DNA tissue samples were also sequenced.	Illumina HiSeq 4000	29
EGAD00001006055	Mutational signatures are imprints of cell-intrinsic and extrinsic pathophysiological processes that have occurred through tumorigenesis. Experimental efforts to explore signature etiologies have produced a compendium of signatures of exogenous mutagens previously. Here, we unearth major sources of endogenous DNA damage and the genes that are critical to mitigating this innate stream of DNA damage under normal, physiological circumstances. We performed whole genome sequencing of 173 subclones of CRISPR-Cas9 knockouts of 43 genes in a human induced pluripotent stem cell system, in the absence of any added DNA damage. We reveal substitution and indel signatures that arise from those genes which are essential guardians of the genome. By detailed dissection and comparisons to cancer-derived signatures, we demonstrate interminable sources of constitutive DNA damage, and some mechanistic knowledge into how guardian genes preserve the genome. Based on these experimental insights, we develop and benchmark a tool for clinical classification of tumor samples.	HiSeq X Ten	173
EGAD00001006056	The aim of this project is to differentiate human embryonic stem cells to an extra-embryonic fate, specifically the hypoblast. This is of utmost importance given the current lack of human hypoblast stem cells. We hypothesized that the pluripotent characteristics of the starting human embryonic stem cell population may dictate the competency for extra-embryonic cell fate specification. Based on this hypothesis and using human embryonic stem cells maintained in different naïve-like culture regimes, we have now developed conditions that allow the differentiation of human embryonic stem cells to a stable GATA6+ SOX2- population. This suggests that these cells may be putative human hypoblast stem cells. To validate this finding here we propose to perform RNA sequencing experiments of the differentiated human embryonic stem cells. By comparing their RNA expression profile to the single cell sequencing data of the human embryo that we are currently generating, we will be able to determine the identity of our GATA6+ SOX2- cells, and establish whether they represent the in vivo human hypoblast. This dataset contains all the data available for this study on 2020-04-20.	Illumina HiSeq 4000	7
EGAD00001006057	paired WGS sequencing of nodal B-cell lymphoma, one tumor and one control, one patient (H021). Sequencing on Hiseq XTen with TruSeq Nano library preparation kit.	HiSeq X Ten	2
EGAD00001006058	paired WGS data of one tumor of one patient with nodal B-cell lymphoma. Tumor cells were sorted according to CD48 expression in a high and low fraction. Library preparation with TruSeq Nano and sequencing on Hiseq XTen.	HiSeq X Ten	2
EGAD00001006059	Tumors and control of nodal B-cell lymphoma of one patient. WES sequencing on Illumina HiSeq 4000 with Agilent SureSelect V5+UTRs. Bam files were aligned with bwa mem to hg19.	Illumina HiSeq 4000	5
EGAD00001006060	paired EXOME sequencing on Illumina HiSeq 4000 using Agilent SureSelect V6 of one tumor sample of one patient with B-cell lymphoma. The bam-file was mapped to the hg19 genome.	Illumina HiSeq 4000	1
EGAD00001006061	Whole genome and whole exome sequencing data supporting the manuscript 'Somatic evolution in the non-neoplastic IBD affected colon' by Sigurgeir Olafsson et al.	HiSeq X Ten Illumina NovaSeq 6000	693
EGAD00001006062	This dataset includes BAM files of 16 paired tumor/normal samples of extranodal NK/T cell lymphomas.	Illumina HiSeq 2500	32
EGAD00001006063	Illumina platform RNA-seq data from 47 Pancreatic neuroendocrine tumour samples		41
EGAD00001006064	A 19-sample data set containing data from FFPE high grade serous ovarian cancer biopsies. The library was made with a custom hybridization kit (EZ-Cap, Roche) spanning 7 genes.	NextSeq 500	19
EGAD00001006065	Most patients with rare diseases do not receive a molecular diagnosis and the aetiological variants and mediating genes for more than half such disorders remain to be discovered. We implemented whole-genome sequencing (WGS) in a national healthcare system to streamline diagnosis and to discover unknown aetiological variants, in the coding and non-coding regions of the genome. In a pilot study for the 100,000 Genomes Project, we generated WGS data for 13,037 participants, of whom 9,802 had a rare disease, and provided a genetic diagnosis to 1,138 of the 7,065 patients with detailed phenotypic data. We identified 95 Mendelian associations between genes and rare diseases, of which 11 have been discovered since 2015 and at least 79 are confirmed aetiological. Using WGS of UK Biobank1, we showed that rare alleles can explain the presence of some individuals in the tails of a quantitative red blood cell (RBC) trait. Finally, we reported 4 novel non-coding variants which cause disease through the disruption of transcription of ARPC1B, GATA1, LRBA and MPL. Our study demonstrates a synergy by using WGS for diagnosis and aetiological discovery in routine healthcare.	Illumina HiSeq 4000	1
EGAD00001006066	Maps of H3K27ac from normal 2nd- and 3rd-trimester cytotrophoblasts, preterm severe preeclampsia cytotrophoblasts, and 2nd-trimester amnion. Maps of H3K27me3, H3K27me3, H3K36me3, and H3K4me1 from 2nd- and 3rd-trimester cytotrophoblast. Maps of H3K9me3 from 2nd- and 3rd-trimester cytotrophoblast, smooth chorion, and basal plate. RNA-seq from 2nd- and 3rd-trimester cytotrophoblasts.	Illumina HiSeq 2000 Illumina HiSeq 2500	32
EGAD00001006067	We generated global skeletal muscle transcriptomic data from long-term endurance (9 men, 9 women) and strength (7 men) trained individuals. These data were compared with healthy age-matched untrained controls (7 men, 8 women). All 40 samples were then multiplexed in 1 lane and sequenced (2x50bp paired end) on the Illumina NovaSeq 6000.	unspecified	40
EGAD00001006069	The dataset consists of FASTQ files from Seq-Well and some Chromium (10x Genomics) libraries from 10 control donors and 10 COPD GOLD2 patients.	NextSeq 500	37
EGAD00001006070	Targeted sequencing was performed on samples of PDX engrafted with breast cancer bone metastases, including 2 PDX with acquired resistance to palbociclib.	Illumina HiSeq 2500	11
EGAD00001006071	Exome sequencing data obtained from 11 samples of PDX engrafted with bone metastases, matched primary tumors and/or metastases, and normal tissues.	Illumina NovaSeq 6000	20
EGAD00001006072	Raw sequencing data (PE, fastq.gz) from NovaSeq 6000 sequencing runs of: 1. Blood-derived cell-free DNA from six healthy controls, enzymatic cytosine conversion (between 1 and 3 replicates each) 2. Urine-derived cell-free DNA from three healthy controls, enzymatic cytosine conversion (between 1 and 3 replicates each) 3. Blood-derived cell-free DNA from six patients with acute or chronic kidney disease with/without other relevant organ dysfunction, enzymatic cytosine conversion (between 1 and 3 replicates each) 4. Urine-derived cell-free DNA from three patients with acute kidney disease, enzymatic cytosine conversion (between 1 and 3 replicates each) 5. Blood-derived cell-free DNA from three healthy controls, bisulfite cytosine conversion (single datasets) 6. Conventional whole-genome sequencing on genomic DNA of two healthy kidney donors for two of the studied patients (single datasets).	Illumina NovaSeq 6000	45
EGAD00001006073	35 paired samples of resected HCC	unspecified	70
EGAD00001006074	Leukemic bone marrow in primary ETV6-RUNX1 positive acute lymphoblastic leukemia samples (Six diagnostic and two 15 days after treatment) using Chromium 3' single cell RNA-seq. Samples sequenced on 1-3 lanes and raw fastq files provided for each sample.	Illumina HiSeq 3000	8
EGAD00001006075	This dataset contains whole-exome sequencing (WES) of human acute erythroid leukemia patient samples.	Illumina HiSeq 2000	22
EGAD00001006076	This dataset contains RNAseq performed on 35 human acute erythroid leukemia patient samples and Xenografts.	Illumina HiSeq 2000	35
EGAD00001006077	WES (N=18) and WGS (N=2) of OSCC tumors and normals with the aim of identifying novel mutational signatures in Asian tumors	HiSeq X Ten Illumina HiSeq 3000	40
EGAD00001006079	The dataset comprises muscle samples from three patients with mitochondrial disease: Patient 1, age 9; Patient 2, age 16; and Patient 3, age 58.	Illumina NovaSeq 6000	3
EGAD00001006080	Four samples of 2,000,000 HSPCs in 2 ml media each were prepared and treated for 24 hours: carboplatin-high (150 µg/ml carboplatin), carboplatin-low (18.75 µg/ml carboplatin), gemcitabine (25 ng/ml gemcitabine) and control (no drug). Single cells were subsequently extracted using the ddSEQ™ Single-Cell Isolator (Bio-Rad) and later sequenced using the SureCell™ Whole Transcriptome Analysis 3' Library Prep Kit (Illumina) on the NextSeq 500 System (Illumina) using the NextSeq 500/550 High Output Kit v2.5 150 Cycles (Illumina), all following the manufacturer's instructions. The raw FASTQ files are available as read 1 and read 2 files from 4 lanes for each sample, giving a total of 16 FASTQ files. The FASTQ files were also processed following the ddSeeker (Romagnoli, D., et al., BMC Genomics 2018, doi:10.1186/s12864-018-5249-x) and Drop-seq (Macosko, E.Z., et al., Cell 2015, doi:10.1016/j.cell.2015.05.002) protocols for processing scRNA-seq data to yield the final digital gene expression (dge) for each cell of each sample. This has resulted in one dge text file and one dge summary text file per sample. These 8 text files with dge data are found in the zip-compressed analysis data file.	NextSeq 500	4
EGAD00001006081	PURPOSE: To determine the impact of basal-like and classical subtypes in advanced PDAC and to explore GATA6 expression as a surrogate biomarker. EXPERIMENTAL DESIGN: Within the COMPASS trial, patients proceeding to chemotherapy for advanced PDAC undergo tumour biopsy for RNA sequencing. Overall response rate (ORR) and overall survival (OS) were stratified by subtypes and according to chemotherapy received. Correlation of GATA6 with the subtypes using gene expression profiling and in situ hybridization (ISH) was explored. RESULTS: Between December 2015-May 2019, 195 patients (95%) had enough tissue for RNA sequencing; 39 (20%) were classified as basal-like and 156 (80%) as classical. RECIST response data were available for 157 patients; 29 basal-like and 128 classical where the ORR was 10% vs. 33% respectively (p=0.02). In patients with basal-like tumours treated with modified FOLFIRINOX (mFFX) (n=22) the progression rate was 60% compared to 15% in classical PDAC (p= 0.0002). Median OS in the intention to treat population (n=195) was 9.3 months for classical vs. 5.9 months for basal-like PDAC (HR 0.47 95% CI 0.32-0.69, p=0.0001). GATA6 expression by RNAseq highly correlated with the classifier (p<0.001) and ISH predicted the subtypes with sensitivity of 89% and specificity of 83%. In a multivariable analysis, GATA6 expression was prognostic (p=0.02). In exploratory analyses, basal-like tumours could be identified by keratin 5, were more hypoxic and were enriched for a T cell inflamed gene expression signature. CONCLUSIONS: The basal-like subtype is chemoresistant and can be distinguished from classical PDAC by GATA6 expression.	Illumina HiSeq 2500 unspecified	101
EGAD00001006083	The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, the International Agency for Research on Cancer is coordinating the recruitment of 5000 individuals with cancer (colorectal, renal, pancreatic, oesophageal adenocarcinoma or oesophageal squamous cancers) across 5 continents to explore whether different mutational signatures explain marked variation in incidence. In brief, through an international network of collaborators around the world, biological materials are collected, along with demographic, histological, clinical and questionnaire data. Whole genome sequences of tumour-germline DNA pairs are generated at the Wellcome Trust Sanger Institute (40X and 20X depth respectively). Somatic mutational signatures are subsequently extracted by non-negative matrix factorisation methods and correlated with risk factors data. Through an enhanced understanding of cancer aetiology, the unprecedented Mutographs effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. This dataset contains all the data available for this study on 2021-09-27.	Illumina NovaSeq 6000	520
EGAD00001006084	The dataset comprises an aggregate-level VCF (version 4.1), containing the somatic point and indel mutations found across the glioblastoma cohort (SweGBM-1, n=38 samples). The VCF file is in accordance with the HTS format specifications (https://samtools.github.io/hts-specs/).		38
EGAD00001006085	RNA sequencing of frozen tumor biopsies from patients with primary cutaneous CD8+ aggressive epidermotropic cytotoxic T-cell lymphoma. 6 samples. Illumina HiSeq 4000.	Illumina HiSeq 4000	6
EGAD00001006086	Whole-genome sequencing of frozen tumor biopsies from patients with primary cutaneous CD8+ aggressive epidermotropic cytotoxic T-cell lymphoma. 12 samples. Illumina HiSeq X-Ten.	HiSeq X Ten	12
EGAD00001006087	18 DLBCL genomes		18
EGAD00001006088	Somatic mutations accumulate in healthy tissues as we age, giving rise to cancer and potentially contributing to ageing. To study somatic mutations in non-neoplastic tissues, we developed a series of protocols to sequence the genomes of small populations of cells isolated from histological sections. Here, we describe a complete workflow that combines laser-capture microdissection (LCM) with low-input genome sequencing, whilst circumventing the use of whole-genome amplification (WGA). The protocol is subdivided broadly into 4 steps: tissue processing, LCM, low-input library generation and mutation calling and filtering. The tissue processing and LCM steps are provided as general guidelines which may require tailoring based on the specific requirements of the study at hand. Our protocol for low-input library generation utilises enzymatic rather than acoustic fragmentation to generate WGA-free whole-genome libraries. Finally, the mutation calling and filtering strategy has been adapted from previously published protocols to account for artefacts introduced via library creation. To date, we have used this workflow to perform targeted and whole-genome sequencing of small populations of cells (typically 100-1,000 cells) in thousands of microbiopsies from a wide range of human tissues. The low-input DNA protocol is designed to be compatible with liquid handling platforms and make use of equipment and expertise standard to any core sequencing facility. However, obtaining low-input DNA material via LCM requires specialized equipment and expertise. The entire protocol from tissue reception through whole-genome library generation can be accomplished in as little as a week, though 2-3 weeks would be a more typical turnaround time.	HiSeq X Ten Illumina NovaSeq 6000	18
EGAD00001006090	Serial samples from one AT-AML patient as described in publication Goldgraben et al Pediatric Blood & Cancer 2020. Whole exome sequencing of an AT-'germline' blood sample, one bone marrow sample (at AML diagnosis) and 3 AML blood samples. Library prepared using the Illumina Nextera Rapid Capture Exome Enrichment Kit, and sequenced as PE150 on HiSeq4000. Provided: 5 BAM files (GRCh37); 2 VCF analyses (germline and somatic)	Illumina HiSeq 4000	5
EGAD00001006092	Ten individuals from three families segregating esophageal atresia.		10
EGAD00001006093	Targeted panel sequencing on Illumina HiSeq X Ten of brainstem glioma primary tumor and blood samples	HiSeq X Ten	42
EGAD00001006094	RNAseq on Illumina HiSeq X Ten of brainstem glioma primary tumor sample	HiSeq X Ten	75
EGAD00001006095	RNA-seq data for three Glioblastoma stem cell (GSC) lines exposed to PRMT5 inhibitor and control samples.	Illumina HiSeq 2500	6
EGAD00001006096	The plasma samples and white blood cell samples were collected from 30 non-small-cell lung cancer patients and 3 healthy individuals. The solid tumor biopsy samples from 14 patients (a subset of the 30 patients) were collected. The cfDNA was extracted from their plasma samples using the QIAamp circulating nucleic acid kit from QIAGEN (Germantown, MD). The cfDNA WES library was constructed with the SureSelect XT HS kit from Agilent Technologies (Santa Clara, CA) according to the manufacturer’s protocol. In brief, 10ng of cfDNA was used as input material. After end repair/dA-tailing of cfDNA, the adaptor was ligated. The ligation product was purified with Ampure XP beads (Beckman-Coulter, Atlanta, GA) and the adaptor-ligated library was amplified with index primer in 10-cycle PCR. The amplified library was purified again with Ampure XP beads, and the amount of amplified DNA was measured using the Qubit 1xdsDNA HS assay kit (ThermoFisher, Waltham, MA). 700-1000 ng of DNA sample was hybridized to the Agilent SureSelect Human All Exon V6 (Agilent) capture library and pulled down by streptavidin-coated beads. After washing the beads, the DNA library captured on the beads was re-amplified with 10-cycle PCR. The final libraries were purified by Ampure XP beads. The library concentration was measured by Qubit, and the quality was further examined with Agilent Bioanalyzer before the final step of 2x150bp paired-end sequencing on the Illumina HiSeq X10 platform (Illumina) at an average coverage of 200. Whole-exome capture libraries of genomic DNA of the 30 non-small cell lung cancer patients were constructed via Roche SeqCap EZ Exome V3 (Roche); whole-exome capture libraries of genomic DNA of the 3 healthy individuals were constructed via Agilent SureSelect Human All Exon V6 (Agilent). Enriched exome libraries were sequenced on the Illumina HiSeq 3000 platform (Illumina) to generate 2x100bp paired-end reads at an average coverage of 200.	HiSeq X Ten Illumina HiSeq 3000	83
EGAD00001006097	We profiled 16 high-grade glioma patient tumour samples by single-cell and single-nuclei RNA-seq and 3 normal-matched samples by single-cell RNA-seq using 10X Chromium 3'. The fastq files are provided.	Illumina HiSeq 4000 Illumina NovaSeq 6000	21
EGAD00001006098	We profiled 18 high-grade glioma patient tumor samples by bulk RNA-seq. The raw fastq files are provided.	Illumina HiSeq 2000 Illumina HiSeq 4000 Illumina NovaSeq 6000 unspecified	20
EGAD00001006099	We profiled 23 high-grade glioma patient tumor samples and 4 normal-matched patient samples by whole exome sequencing. The raw fastq files are provided.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000 unspecified	27
EGAD00001006100	We profiled 16 patient tumour samples by ChIP-seq. H3K27ac and Input are provided for 16 samples and H3K27me3 is provided for 14 samples. Among the 16 samples, 9 are G34WT and 7 are G34R/V. The raw fastq or bam files are provided.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina NovaSeq 6000	78
EGAD00001006101	Paired end (47/98) and single end (51/98) shallow whole genome sequencing (sWGS) data for the identification of somatic copy number alterations (SCNA) and the estimation of tumor fractions in plasma DNA of colorectal cancer (CRC) patients.	Illumina MiSeq NextSeq 550	52
EGAD00001006102	The Genomic DNA Clean & Concentrator kit (ZYMO Research) was used to remove EDTA from the DNA samples. Sample libraries were prepared using 100 ng of input according to the KAPA HyperPlus Kit (Roche) using Unique Dual Index adapters (Integrated DNA Technologies, Inc.). Exomes were captured using the SeqCap EZ MedExome (Roche Nimblegen) according to SeqCap EZ HyperCap Library v1.0 Guide (Roche) with the xGen Universal blockers – TS Mix (Integrated DNA Technologies, Inc.). The amplified captured sample libraries were paired-end sequenced (2x100 bp) on the Novaseq 6000 platform (Illumina) and aligned to the hg19 reference genome using the Burrows-Wheeler Aligner (BWA).	Illumina NovaSeq 6000	10
EGAD00001006103	Mutation analysis of 17 genes (ALK, APC, BRAF, BRCA1, BRCA2, DPYD, EGFR, ERBB2, KIT, KRAS, MET, NRAS, PDGFRA, RET, ROS1, TP53, UGT1A1) in plasma DNA of CRC patients using the AVENIO ctDNA Targeted Kit.	NextSeq 550	26
EGAD00001006104	Mutation analysis in plasma samples with low ctDNA levels using a molecular barcoding technology, i.e. the single target approach SiMSen-seq (Simple, multiplexed, PCR-based barcoding of DNA for sensitive mutation detection using sequencing).	Illumina MiSeq NextSeq 550	53
EGAD00001006105	Targeted deep sequencing for the KRAS p.Gly12Asp, p.Gly12Val and p.Ala146Thr mutations in plasma samples of CRC patients.	Illumina MiSeq	27
EGAD00001006106	RNA was isolated using phenol-chloroform extraction followed by DNase digestion or using the Qiagen Allprep DNA/RNA kit and protocol (Qiagen, #80204). cDNA synthesis was done using the SuperScript II Reverse Transcriptase kit (Invitrogen). Quantitative real-time PCR was performed by using primers as described previously [13,21] on the 7500 Fast Real-time PCR System (Applied Biosystems). Relative levels of gene expression were calculated using the ΔΔCt method.	Illumina NovaSeq 6000	26
EGAD00001006107	Whole Exome Sequencing was performed on radical prostatectomy formalin-fixed paraffin-embedded sample pairs (n = 6). Library prep was done by exon capture using the Illumina Truseq Exome kit and sequenced as 75bp paired end on Illumina NextSeq 500. Sequences were aligned to the human genome (hg38) using BWA.	NextSeq 500	10
EGAD00001006108	RNA-Seq was performed on radical prostatectomy formalin-fixed paraffin-embedded sample pairs (n = 27). Library prep was done by removal of rRNA and sequenced as 75bp paired end on Illumina NextSeq 500. Sequences were aligned to the human genome (hg38) using STAR-Fusion.	NextSeq 500	27
EGAD00001006109	smMIP-Seq was performed on radical prostatectomy formalin-fixed paraffin-embedded sample pairs (n = 18). Library prep was done by capture of targeted sequences with single molecule (unique molecular index) tagged molecular inversion probes (MIP) and sequenced as 75bp paired end on Illumina NextSeq 500. Sequences were aligned to the human genome (hg38) using BWA.	NextSeq 500	36
EGAD00001006110	Each run contains single cell RNA-seq data from unbiased sampling of single cells from the indicated human tissue. Single cell suspensions were prepared using enzymatic dissociation followed by trituration. The samples were processed using the 10X Chromium 3' v3 sequencing pipeline, sequenced on an Illumina NovaSeq 6000, and analyzed using the cellranger software and aligned to the human GRCh38 genome version 93.	Illumina NovaSeq 6000	11
EGAD00001006111	Targeted long-read nanopore sequencing. Abstract: Fusion genes are hallmarks of various cancer types and important determinants for diagnosis, prognosis and treatment. Fusion gene partner choice and breakpoint-position promiscuity restricts diagnostic detection, even for known and recurrent configurations. To accurately and impartially identify fusions, we developed FUDGE: FUsion Detection from Gene Enrichment. FUDGE couples target-selected and strand-specific CRISPR/Cas9 activity for fusion gene driver enrichment - without prior knowledge of fusion partner or breakpoint-location – to long-read Nanopore sequencing with the bioinformatics pipeline NanoFG. FUDGE has flexible target-loci choices and enables multiplexed enrichment for simultaneous analysis of several genes in multiple samples in one sequencing run. We observe on-average 665 fold breakpoint-site enrichment and identify nucleotide resolution fusion breakpoints - within two days. The assay identifies cancer cell line and tumor sample fusions irrespective of partner gene or breakpoint-position. FUDGE is a rapid and versatile fusion detection assay, providing unparalleled opportunity for diagnostic pan-cancer fusion detection.	GridION	17
EGAD00001006112	RNA-Sequencing of 27 functionally validated LSC and blast fractions from 9 AML patients. Three healthy hematopoietic stem and progenitor cells from age-matched controls.	Illumina HiSeq 2000	30
EGAD00001006113	In this study, we aim to characterise the landscape of mutation and clonal selection in the human bladder. The data in this study will be generated by whole-genome sequencing of laser-dissected microbiopsies from the bladder. The samples utilised in this study will include urothelium from transplant donors with no history of bladder cancer and cystectomy specimens from patients with bladder cancer. This dataset contains all the data available for this study on 2020-05-05.	HiSeq X Ten	84
EGAD00001006114	In this study, we aim to characterise the landscape of mutation and clonal selection in the human bladder. The study includes targeted sequencing of laser-dissected microbiopsies from the bladder. The samples utilised in this study will include urothelium from transplant donors with no history of bladder cancer and cystectomy specimens from patients with bladder cancer. This dataset contains all the data available for this study on 2020-05-05.	Illumina HiSeq 4000	1916
EGAD00001006115	In this study, we aim to characterise the landscape of mutation and clonal selection in the human bladder. The data in this study will be generated by whole-exome sequencing of laser-dissected microbiopsies from the bladder. The samples utilised in this study will include urothelium from transplant donors with no history of bladder cancer and cystectomy specimens from patients with bladder cancer. This dataset contains all the data available for this study on 2020-05-05.	Illumina HiSeq 4000	103
EGAD00001006116	In this study, we aim to characterise the landscape of mutation and clonal selection in the human bladder. The data in this study will be generated by whole-genome sequencing of laser-dissected microbiopsies from the bladder. The samples utilised in this study will include urothelium from transplant donors with no history of bladder cancer and cystectomy specimens from patients with bladder cancer. This dataset contains all the data available for this study on 2020-05-05.	Illumina NovaSeq 6000	24
EGAD00001006117	In this study, we aim to characterise the landscape of mutation and clonal selection in the human bladder. The study includes targeted sequencing of laser-dissected microbiopsies from the bladder. The samples utilised in this study will include urothelium from transplant donors with no history of bladder cancer and cystectomy specimens from patients with bladder cancer. This dataset contains all the data available for this study on 2020-05-05.	Illumina NovaSeq 6000	575
EGAD00001006118	In this study we will perform targeted sequencing on the bulk samples of in vitro colonies. This dataset contains all the data available for this study on 2020-05-05.	Illumina HiSeq 4000 Illumina NovaSeq 6000	595
EGAD00001006119	25 Whole genome sequencing data cases	Illumina NovaSeq 6000	24
EGAD00001006120	Homologous recombination DNA repair deficiency and PARP inhibition activity in primary triple negative breast cancer. RNA-Seq data for paired baseline and end of treatment samples	Illumina HiSeq 2500	39
EGAD00001006121	This dataset contains the fastq files used for this study	Illumina HiSeq 4000	154
EGAD00001006122	This dataset contains all available targeted exon sequencing bam files from our study, "Activating AKT1 and PIK3CA mutations in metastatic castration-resistant prostate cancer". Patient identifiers are denoted by the first segment of the sample aliases (e.g. "P1"), and additional information is appended to reflect which serial sample is referenced (1st, 2nd, 3rd, etc.), and whether the sample represents cell-free DNA ("cfdna") or paired white-blood cell control ("WBC"). All samples were sequenced using Illumina technology.	Illumina HiSeq 4000 Illumina MiSeq	114
EGAD00001006123	3q-capture DNA sequencing was performed as we described previously [13]. In summary, genomic DNA was fragmented using the Covaris shearing device (Covaris), and sample libraries were assembled following the TruSeq DNA Sample Preparation Guide (Illumina). After ligation of adapters and an amplification step, target sequences of chromosomal regions 3q21.1-q26.2 were captured using custom in-solution oligonucleotide baits (Nimblegen SeqCap EZ Choice XL). The design of target sequences was based on the human genome assembly hg19: chr3q21.1:126036241-130672290 - chr3q26.2:157712147-175694147. Amplified captured sample libraries were paired-end sequenced (2x100 bp) on the HiSeq 2500 platform (Illumina) and aligned against the hg19 reference genome using the Burrows-Wheeler Aligner (BWA) [25].	Illumina NovaSeq 6000	33
EGAD00001006124	The samples are from patients of multiple cancer types. The library preparation protocol was developed by the laboratory. The DNA libraries were then sequenced with 150bp paired-end reads.	HiSeq X Ten	388
EGAD00001006125	Whole exome sequencing of an alveolar rhabdomyosarcoma patient with RET germline mutation and subsequent analysis of potential therapeutic mechanisms associated with the patient's rare germline mutation. Patient samples were sequenced from an initial biopsy and from a relapse biopsy, in addition to normal blood as the matched normal DNA. RNA sequencing was performed on the relapse sample as the initial sample was unable to produce usable RNA for analysis.	Illumina HiSeq 4000	4
EGAD00001006126	Contains FASTQs for cells from 20 10x channels across 2 NovaSeq runs, 42 384-well plates across 6 NovaSeq runs, and 12 96-well plates across 4 NextSeq runs.	Illumina NovaSeq 6000 NextSeq 500	3
EGAD00001006127	Contains FASTQs for cells from 20 10x channels across 2 NovaSeq runs, 42 384-well plates across 6 NovaSeq runs, and 12 96-well plates across 4 NextSeq runs.	Illumina NovaSeq 6000 NextSeq 500	3
EGAD00001006128	Contains FASTQs for cells from 20 10x channels across 2 NovaSeq runs, 42 384-well plates across 6 NovaSeq runs, and 12 96-well plates across 4 NextSeq runs.	Illumina NovaSeq 6000 NextSeq 500	3
EGAD00001006129	Exome sequencing of 22 Pheochromocytoma/Paraganglioma (PPGL) primary tumors, both malignant and non-malignant. Tumor material was from snap-frozen (SF) or formalin-fixed-paraffin-embedded (FFPE) tissue.	Illumina HiScanSQ NextSeq 500	22
EGAD00001006130	The COVID-19 pandemic urgently needs therapeutic and prophylactic interventions. Here we report the rapid identification of SARS-CoV-2 neutralizing antibodies by high-throughput single-cell RNA and VDJ sequencing of antigen-enriched B cells from 60 convalescent patients.	Illumina HiSeq 2500	1
EGAD00001006131	Here, we performed a characterisation of 12 tumours and matched normal samples from 3 syCRC patients by whole genome sequencing: Patient A (tumours A1 and A2), Patient B (tumours B1-B5), and Patient C (tumours C1-C5). Somatic SNVs, indels and structural variants were called.		3
EGAD00001006132	fastq files for shallow whole genome sequencing data as described in Mouliere et al, 2018. Files are those not included in STM2 i.e. these are largely non-ovarian cancer samples	Illumina HiSeq 4000	292
EGAD00001006133	Bulk RNA-seq for cALL patient-derived PDX samples	NextSeq 500	74
EGAD00001006134	Single cell RNA-seq for PT1-derived PDX samples	NextSeq 500	757
EGAD00001006135	Single cell RNA-seq for primary samples	NextSeq 500	285
EGAD00001006136	Single cell WGS (low pass) for cord blood samples	NextSeq 500	24
EGAD00001006137	Single cell WGS (low pass) for PT1-derived PDX samples	NextSeq 500	539
EGAD00001006138	Single cell WGS (low pass) for primary samples	NextSeq 500	444
EGAD00001006139	Biopsies from the terminal ileum and rectum of healthy individuals are digested on ice to single cells and processed for single-cell RNA-sequencing (10X Genomics and Illumina) This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2020-05-12.	Illumina HiSeq 4000 Illumina NovaSeq 6000	19
EGAD00001006141	16S sequencing data from 2259 Flemish Gut Flora Project (FGFP) samples	Illumina HiSeq 2500	2259
EGAD00001006142		Illumina HiSeq 2000	9
EGAD00001006143		Illumina HiSeq 2000	29
EGAD00001006144		Illumina NovaSeq 6000	7
EGAD00001006145		Illumina NovaSeq 6000	70
EGAD00001006146	Genomic rearrangement calls generated using Delly in VCF format from the CPCGene 666PG study		304
EGAD00001006147	Indel calls generated using Pindel in VCF format from the CPCGene 666PG study		304
EGAD00001006148	SNV calls generated using SomaticSniper in VCF format from the CPCGene 666PG study		304
EGAD00001006149	This is the data from the eQTLs InsPIRE study. This dataset includes RNAseq and genotypes from pancreatic islets and FAC sorted beta-cells, as well as RPKM values, covariates and cell count estimates.	Illumina HiSeq 2000	255
EGAD00001006150	SMARTer Stranded Total RNA-Seq method of human platelet-rich plasma, platelet-free plasma, urine, conditioned medium, and extracellular vesicles (EVs) from these biofluids. Including a titration experiment with spikes.	NextSeq 500	30
EGAD00001006151	Characterization of somatic variants in patients with OpSCC	Ion Torrent PGM	51
EGAD00001006152	Bam files from WGS of PDAC samples described in: Transcription phenotypes of pancreatic cancer are driven by genomic events during tumour evolution		-
EGAD00001006156	Dataset consisting of sequence data from 36 glioma patients. Data include: whole exome sequencing of tumour tissue and matched germline; shallow whole genome sequencing of urine cell-free DNA; targeted capture sequencing of plasma and CSF cell-free DNA. Additional sequencing data are provided from non-glioma and healthy controls.	Illumina HiSeq 4000	213
EGAD00001006157	The impact of genetic variants on molecular pathways that give rise to neurodegenerative diseases such as Alzheimer's and Parkinson's is best elucidated in the appropriate cell types and molecular contexts. Existing studies have focused on bulk profiling of mixed cell types, but have ignored assaying genetic effects across development and cell differentiation. At the core of this proposal is the idea to use single-cell assays to study genetic effects during differentiation of dopaminergic and cortical neurons to identify the sequence of molecular events from variants to healthy and diseased cell states in a cell-specific manner. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2020-05-18.	Illumina HiSeq 4000	42
EGAD00001006158	The MYOSEQ project focuses on the application of next generation sequencing, in particular whole exome sequencing (WES), in a large cohort of patients with unexplained limb‐girdle weakness (LGW). Focusing on undiagnosed patients with a clearly defined clinical phenotype enables increased diagnostic rates for known genes (in particular Pompe disease, GNE related pathologies and other known LGMD subtypes) in this cohort, while the use of WES provides scope both for new gene discovery and for additional research into disease modifiers and genotype‐phenotype correlation with substantial cost effectiveness. The LGW patient cohort was collated by Newcastle University in collaboration with clinical centers across Europe. The sequencing was performed at the Broad Institute and jointly analyzed with Newcastle University.	HiSeq X Ten Illumina Genome Analyzer IIx Illumina HiSeq 2000	888
EGAD00001006159	Tumor and normal exomes for 51 MCL patients and tumor and normal genomes for 34 MCL patients.		170
EGAD00001006160	Inherited cardiac conditions (ICC) panel sequencing data of Egyptian healthy volunteers.	NextSeq 550	391
EGAD00001006161	This dataset comprises 76 cancer and normal whole genomes obtained from 11 SI-NET patients, in the form of two fastq files (forward and reverse reads) for each genome containing sequences generated by the Illumina NovaSeq 6000 system.	Illumina NovaSeq 6000	76
EGAD00001006162	In this study we will perform whole genome sequencing on in vitro colonies.	HiSeq X Ten Illumina HiSeq 4000 Illumina NovaSeq 6000	616
EGAD00001006163		Illumina HiSeq 2500 Illumina NovaSeq 6000	8
EGAD00001006164	BAM files of RNA sequencing (RNA-seq) experiment on multi-regional colorectal cancer (CRC) samples. 58 samples corresponding to 16 patients were sequenced and there is one BAM file for each sample.	Illumina HiSeq 4000	88
EGAD00001006165	BAM files of Whole Exome Sequencing (WES) experiment on multi-regional colorectal cancer (CRC) samples. 32 tumour and 16 normal samples corresponding to 16 patients were sequenced, which makes 32 tumour BAMs and 16 normal BAMs (48 BAMs in total).	Illumina HiSeq 4000	48
EGAD00001006166	Whole exome sequencing (WES) was performed on the matched tumor and organoid pairs from 7 cervical cancer patients. The DNA was sequenced on NovaSeq6000 platform with 8Gb sequencing coverage. WES data was mapped against human reference genome GRCh38 by using BWA (v0.7.5) mapping tool.	Illumina NovaSeq 6000	17
EGAD00001006167	This dataset contains the raw fastq-files and the VCF files of single cell targeted DNA sequencing with the MissionBio Tapestri platform. This was performed on 8 male pediatric T-ALL cases: X09-XB37-XB47-XD83-XF91-XF97-XF100-XF121. For some patients we have timepoints during treatment: XF100 and XG121. XD83 is a patient that relapsed twice.	Illumina NovaSeq 6000	48
EGAD00001006170	Data supporting: "The mutREAD method detects mutational signatures from low quantities of cancer DNA." Perner et al. WGS, sWGS, WES, and reduced representation sequencing data for tumour and normal samples (BAM files)	Illumina HiSeq 4000	48
EGAD00001006171	Germline blood DNA sequencing data generated in routine diagnostics of hereditary cancer using the I2HCP gene panel (~135 genes). There are 130 samples sequenced in a MiSeq machine and 108 sequenced in a HiSeq machine. There is a partial overlap between those two sets, meaning that some samples were sequenced in both machines. There is a strong enrichment in samples with copy-number variants (CNV), both single- and multi-exon, since this dataset was compiled for a benchmarking effort of CNV calling tools for genetic diagnostics.	Illumina HiSeq 2500 Illumina MiSeq	188
EGAD00001006172	Single-cell RNA sequencing was performed on a total of 20 PC specimens.	Illumina NovaSeq 6000	15
EGAD00001006173	Single-cell RNA sequencing was performed on bone marrow mononuclear cells from 8 acute myeloid leukemia patients at diagnosis. The profiling was performed using 10x Genomics 3' (5 samples) and 5' (3 samples) platforms. The raw data are available as fastq files.	Illumina HiSeq 2500 Illumina NovaSeq 6000	128
EGAD00001006175	Single Cell RNAseq of PBMC from renal cancer patients	Illumina HiSeq 4000	8
EGAD00001006176	Whole-genome sequencing (10X Genomics) of frozen tumor biopsies from patients with primary cutaneous anaplastic large cell lymphoma. 12 samples. Illumina HiSeq X-Ten.	HiSeq X Ten	12
EGAD00001006177	Whole-exome sequencing of matched frozen tumor biopsies/granulocytes from patients with primary cutaneous anaplastic large cell lymphoma. 7 paired tumor/germline samples. BGISEQ-500.	unspecified	14
EGAD00001006178	RNA sequencing of frozen tumor biopsies from patients with primary cutaneous anaplastic large cell lymphoma. 12 samples. Illumina HiSeq 4000.	Illumina HiSeq 4000	12
EGAD00001006179	These are selected exomes from the first year of the Children's Rare Disease Cohorts initiative at Boston Children's Hospital. Patients were drawn from the following cohorts: immunodeficiency, epilepsy, IBD, hearing loss, and orphan diseases. Raw sequencing was performed on Illumina NovaSeq 6000 machines, and aligned to hs37d5.	Illumina NovaSeq 6000	30
EGAD00001006180	This dataset contains three bam files. One normal blood sample and two matched FF and FFPE samples from the same metastatic prostate tumor.	HiSeq X Ten	3
EGAD00001006181	The dataset includes raw RNA-seq data (fastq files) for miRNAs of monocytes before and after 6-hour exposure to four different immune stimuli, measured in 200 African- and European-descent healthy donors from Belgium. The stimuli include ligands for TLR4 (LPS), TLR1/2 (Pam3CSK4) and TLR7/8 (R848), and a human seasonal influenza A virus (IAV).	Illumina HiSeq 2000	977
EGAD00001006182	This dataset contains single cell RNA sequencing data of PBMC samples from 20 bladder cancer patients. cDNAs and single cell RNA libraries were prepared following manufacturer’s user guide (10x Genomics). Each library was sequenced in HiSeq4000 (Illumina) to achieve ~300 million reads following manufacturer’s sequencing specification.	Illumina HiSeq 4000	20
EGAD00001006183	25 high coverage complete genome sequences of southern African Khoe-San individuals. Bam file format.	Illumina HiSeq 2000	25
EGAD00001006184	linking MASTER H021-Cohort to EGAS0001004157	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000	10
EGAD00001006185	ALI culture bronchial cells and alveolar lung surgical resection scRNA-Seq	Illumina HiSeq 4000	16
EGAD00001006186	Transcriptomic (N = 18) and epigenomic (N = 6) characterization of macrophages using RNA-sequencing and ChIP-sequencing (bait = SP140) in the presence or absence of SP140 inhibition.	Illumina HiSeq 2500 Illumina HiSeq 4000	30
EGAD00001006187	This dataset contains BAM files for RNA-sequencing of stage I lung adenocarcinomas from Asian patients. In total, there are 107 patients and 107 tumor samples.	Illumina HiSeq 2500	107
EGAD00001006188	This dataset contains BAM files for whole exome-sequencing of stage I lung adenocarcinomas from Asian patients. In total, there are 113 patients and 262 samples, including 113 tumor samples, 113 adjacent normal samples and 36 buffy coat samples.	Illumina HiSeq 2500	262
EGAD00001006189	Paired WGS and RNA-Seq data of patients with multiple myeloma (MM) refractory to immunomodulatory agents (IMiDs) and proteasome inhibitors (PIs). We performed whole genome and transcriptome sequencing of 39 heavily pretreated RRMM patients with at least double refractoriness revealing complex structural changes and a high mutational load.	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 4000	116
EGAD00001006190	Single-cell transcriptomes for 10 hepatocellular carcinoma (HCC) patients from 21 samples across four relevant sites: primary tumor (T), portal vein tumor thrombus (P), metastatic lymph node (L) and non-tumor liver (N). Single cells were sequenced using Chromium Single Cell 3’ Library (10x Genomics).	Illumina NovaSeq 6000	21
EGAD00001006193	Tregs were sorted as CD4+CD25+CD127- cells from peripheral blood of 14 healthy individuals, 8 patients with systemic lupus erythematosus, 9 patients with rheumatoid arthritis, and 11 patients with multiple sclerosis. RNA was extracted and polyA libraries were prepared using the Illumina Truseq sample preparation kit v.2. Single-end sequencing was performed on NextSeq500.	NextSeq 500	42
EGAD00001006194	The risk of getting non-melanoma skin cancer varies over 40-fold across the body. Here we map mutations in normal skin in high and low risk sites in normal donors and those with an increased risk of skin cancer. The density of mutations varied widely, with evidence of positive and negative genetic selection. Regional differences in mutational signatures in high and low cancer risk sites and preferential selection of mutants of TP53 in high risk skin and FAT1 in lower risk skin were observed. 10% of clones had copy number changes in cancer associated genes and the largest had multiple driver mutations with loss of heterozygosity. In hair follicles, a proposed site of origin of skin cancers, mutations in the upper follicle resembled adjacent skin, but the lower follicle was sparsely mutated. We conclude cancer risk reflects the efficiency of transformation of oncogenic mutants rather than the density of mutant clones.	HiSeq X Ten Illumina HiSeq 2500 Illumina NovaSeq 6000	805
EGAD00001006195	RNA-seq data for 54 Glioblastoma stem cell (GSC) lines. Fastq files of the strand-specific paired-end RNA-seq data are available.	Illumina HiSeq 2500	54
EGAD00001006196	Additional Neuroblastoma whole genome sequencing data	Illumina HiSeq 2000	3
EGAD00001006197	This dataset contains RNA-seq (Illumina Hiseq 4000) from macroscopically preserved and lesioned OA subchondral bone of patients that underwent joint replacement surgery due to OA (N=48).	Illumina HiSeq 4000	48
EGAD00001006198	Whole genome sequencing data of 84 Nama individuals with KhoeSan ancestry from southern Africa in phased VCF format. Variants were called/phased as part of the African Genome Resource. The data has been aligned to GRCh37.		84
EGAD00001006199	ABACUS is a single arm phase 2 study that investigated 2 cycles of atezolizumab (1200mg Q3) prior to cystectomy in 95 patients with muscle invasive transitional cell cancer (T2-4N0M0). Pathological complete response (pCR) occurring in ≥20% of patients was the primary endpoint. Biomarker analysis on sequential tissue was a co-primary endpoint. This dataset includes the TPM and raw counts tables.		-
EGAD00001006200	ABACUS is a single arm phase 2 study that investigated 2 cycles of atezolizumab (1200mg Q3) prior to cystectomy in 95 patients with muscle invasive transitional cell cancer (T2-4N0M0). Pathological complete response (pCR) occurring in ≥20% of patients was the primary endpoint. Biomarker analysis on sequential tissue was a co-primary endpoint. This dataset includes clinical phenotype data.		-
EGAD00001006201	ABACUS is a single arm phase 2 study that investigated 2 cycles of atezolizumab (1200mg Q3) prior to cystectomy in 95 patients with muscle invasive transitional cell cancer (T2-4N0M0). Pathological complete response (pCR) occurring in ≥20% of patients was the primary endpoint. Biomarker analysis on sequential tissue was a co-primary endpoint. This dataset includes the processed data from FMOne.		-
EGAD00001006202	This case represented the genomic findings of a pediatric glioblastoma patient who underwent multiple surgical resections and was treated with standard chemoradiation, as well as a novel recombinant poliovirus vaccine therapy. The results present the preservation of a STAG2 mutated clone, besides elimination and emergence of other clones with oncogenic mutations through disease progression under different treatment modalities. Although STAG2 deficiency comprises only a small subset of gliomas, this case adds clinical evidence to existing preclinical data supporting a role for STAG2 mutations in gliomagenesis and resistance to standard therapies.		3
EGAD00001006203	This data set contains the raw .fastq files from one RNA-sequencing experiment. Endothelial cells of the basilar artery and endothelial cells of the carotid artery were post-mortem derived with laser microdissection and sequenced. For this analysis both the basilar- and the carotid artery endothelial cells of eleven individuals were sequenced. For more details please see: DMA Hermkens et al. "Profiling the Unique Protective Properties of Intracranial Arterial Endothelial Cells" Acta Neuropathol Commun. 2019 Oct 14; PMID: 31610812.	NextSeq 500	22
EGAD00001006204		Illumina HiSeq 1500 MinION PromethION	5
EGAD00001006205	ABACUS is a single arm phase 2 study that investigated 2 cycles of atezolizumab (1200mg Q3) prior to cystectomy in 95 patients with muscle invasive transitional cell cancer (T2-4N0M0). Pathological complete response (pCR) occurring in ≥20% of patients was the primary endpoint. Biomarker analysis on sequential tissue was a co-primary endpoint. This dataset includes the raw RNA-seq data.	Illumina HiSeq 2500	148
EGAD00001006206	10xGenomics single-cell RNA sequencing of glioblastoma patient tumor	Illumina HiSeq 4000	25
EGAD00001006207	This dataset contains Mate Pair Sequencing data from 15 samples from 13 patients. Mate pair DNA library preparation was carried out using the Illumina MP v.2 reagents and protocol. In brief, fragmentation of genomic DNA was performed using a Hydroshear device to an insert size of 4.5 kb followed by sequencing with Illumina HiSeq 2000 instruments resulting in 30 Fastq files (paired end).	Illumina HiSeq 2000	15
EGAD00001006208	This dataset contains panel sequencing data from 33 samples. Targeted sequencing was performed by creating libraries using the Agilent SureSelect XT technology. Libraries were sequenced using molecular barcode-indexed ligation-based sequencing using a NextSeq500 (Illumina) instrument. Between three and six lanes per sample have been sequenced resulting in 262 Fastq files (paired end).	NextSeq 500	33
EGAD00001006209	The dataset contains two samples from one patient. As a representative FFPE tissue sample, ET174 was histologically identified, targeted and microdissected with a puncher for nucleic acid extraction. RNA was extracted using the automated Maxwell system with the Maxwell 16 LEV RNA FFPE Kit (Promega), according to the manufacturer’s instructions. To evaluate FFPE RNA quality, we used the percentage of RNA fragments >200 nt fragment determination value (DV200). Only RNA samples with DV200 > 70% were included for sequencing on a NextSeq 500 (Illumina). Eight lanes have been sequenced resulting in 16 Fastq files (paired end).	NextSeq 500	2
EGAD00001006210	This dataset contains exome sequencing data from 21 samples. Sequencing of samples using whole-exome sequencing was performed by creating libraries using the Illumina TruSeq exome enrichment kit following the manufacturer’s instructions after size selection. Size selection was performed by fractionation using a Covaris ultrasonicator and subsequent selection was performed using a 1.5% gel Pippin Prep cassette (Sage Science). One lane per sample has been sequenced resulting in 42 Fastq files (paired end).	Illumina HiSeq 2000	21
EGAD00001006211	This dataset contains whole genome sequencing data from 59 samples. WGS libraries were prepared using the Illumina TruSeq Nano DNA LT Library Prep or TruSeq Nano DNA HT Library Prep Kit following the manufacturer’s instructions. In brief, 100 ng of genomic DNA was fragmented to approximately 350 bp using a Covaris ultrasonicator (Covaris). The fragmented DNA was then end-repaired, size-selected using magnetic beads, extended with an ‘A’ base on the 3′ end and ligated with TruSeq paired-end indexing adapters. Up to four lanes per sample have been sequenced resulting in 222 Fastq files (paired end).	HiSeq X Ten Illumina HiSeq 2000	59
EGAD00001006212	Mutation accumulation over time in normal somatic cells contributes to cancer development and is proposed as a cause of ageing. DNA polymerases POLE and POLD1 replicate DNA with high fidelity during normal cell divisions. However, in some cancers defective proofreading due to acquired mutations in the exonuclease domains of POLE or POLD1 causes markedly elevated somatic mutation burdens with distinctive mutational signatures. POLE and POLD1 exonuclease domain mutations also cause familial cancer predisposition when inherited through the germline. Here, we sequenced normal tissue DNA from individuals with germline POLE or POLD1 exonuclease domain mutations. Increased mutation burdens with characteristic mutational signatures were found to varying extents in all normal adult somatic cell types examined, during early embryogenesis and in sperm. Mutation burdens were further markedly elevated in neoplasms from these individuals. Thus human physiology is able to tolerate ubiquitously elevated mutation burdens. Indeed, with the exception of early onset cancer, individuals with germline POLE and POLD1 exonuclease domain mutations are not reported to show abnormal phenotypic features, including those of premature ageing. The results, therefore, do not support a simple model in which all features of ageing are attributable to widespread cell malfunction directly resulting from somatic mutation burdens accrued during life.	Illumina HiSeq 4000 Illumina NovaSeq 6000	211
EGAD00001006213	WES data of paired primary and metastatic tumors	HiSeq X Ten Illumina HiSeq 4000	179
EGAD00001006215	Human dnase1l3 deficiency-Mouse AAV samples	NextSeq 500	5
EGAD00001006216	Plasma DNA profile in DNASE1L3 deficiency	NextSeq 500 Sequel	37
EGAD00001006217	The dataset contains Whole Exome Sequencing data (BAM files) of 22 samples from HER2+ metastatic breast cancer patients. For 9 of the 13 tumour samples there are paired controls available from normal tissue. There are 8 tumour samples from treatment-responder patients and 5 tumour samples from non-responder patients.	unspecified	15
EGAD00001006218	This dataset contains miRNA-seq data from 10 patients. Small RNAs were isolated as described previously from fresh-frozen tumour material. In brief, total RNA was extracted using guanidinium isothiocyanate/phenol extraction followed by 3′-adaptor ligation of barcoded adenylated adaptors. Samples were pooled in two sets of five samples. Subsequently, gel electrophoresis was used to isolate small RNAs (19–35 nt) and purified using ethanol precipitation. Fragments were then amplified using standard PCR, isolated using gel electrophoresis and purified using ethanol precipitation. Samples were sequenced on a HiSeq 2000 v.4 machine resulting in 10 Fastq files.	Illumina HiSeq 2000	10
EGAD00001006219	This dataset contains DRIP-seq data from 2 patients. DNA–RNA hybrids were extracted from tissue derived from ETMR patient-derived xenograft (PDX) models (BT183) that were treated using topotecan or saline as described previously. Tumours were subsequently frozen and pelleted using ultracentrifugation. DNA–RNA hybrids were extracted as described previously using the same protocol that is applied for cultured cells. DNA was extracted using proteinase K followed by phenol–chloroform extraction and ethanol precipitation. Subsequently the DNA was fragmented using the restriction enzymes HindIII, EcoRI, BsrGI, XbaI and SspI (New England Biolabs). Digested DNA was subsequently incubated with the anti-DNA–RNA hybrid antibody S9.6 (Merck, MABE1095) and immunoprecipitated using agarose beads. Bound DNA–RNA hybrids were eluted and incubated with proteinase K and cleaned with an additional phenol–chloroform–ethanol extraction. The DNA was subsequently sonicated and sequenced using a HiSeq 2000 machine with a 50-bp single-read protocol. Each treatment condition was performed in duplicate and both RNase H and the input were included as negative controls resulting in 10 Fastq files.	Illumina HiSeq 2000	2
EGAD00001006220	February 2020 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	HiSeq X Ten Illumina HiSeq 2500	28
EGAD00001006221	This dataset contains merged data from the 22 Hodgkin lymphoma and 5 reactive lymph node samples, including count data and cell cluster assignments.		28
EGAD00001006222	SPECTA Lung cancer RNA FASTQ files (Illumina TST170 targeted analysis)	Illumina HiSeq 2500	120
EGAD00001006223	SPECTA Lung cancer DNA FASTQ files (Illumina TST170 targeted analysis)	Illumina HiSeq 2500	154
EGAD00001006224	Whole Exome Sequencing data from a retrospective paediatric HIV-disease progression cohort defined on the World Health Organization's criteria for paediatric HIV progression. Data comprise BAM and VCF files for 314 participants from 2 countries: Botswana and Uganda. DNA and RNA samples are linked in a bio-repository.	Illumina HiSeq 2500	314
EGAD00001006226	Formalin-fixed, paraffin-embedded samples from 19 PSC-IBD-CRCs, 15 adjacent (non-tumour) mucosa samples and 18 non-mucosal DNA samples were collected via the nationwide network and registry of histo- and cytopathology in the Netherlands (PALGA). DNA was extracted for molecular analysis.	Illumina HiSeq 2500	52
EGAD00001006227	VCF for 87 Argentinean samples. Only SNPs (no indels) that passed the Affymetrix QC. Data from Luisi et al. 2020. Plos One. Fine-Scale Genomic Analyses Of Admixed Individuals Reveal Unrecognized Genetic Ancestry Components In Argentina. Reference Allele column does NOT contain reference allele from genome assembly.		87
EGAD00001006228	The following samples were generated from the patient samples used in the study: - Bulk DNA sequencing specifically targeting mutated sites of interest derived from clonal cell populations (288 monoclonal colonies - 2 replicates). - Targeted Muta-seq method (Patient P342: 2208 cells, Patient HRK: 1066 cells, Patient LAK: 618 cells, Patient P101: 1080 cells) - Smart-seq2 method (Patient P342 - 768 individual cells).	Illumina MiSeq NextSeq 500	4
EGAD00001006229	Data from a study of 148 samples from IPMNs, MCNs, and small associated invasive carcinomas from 18 patients using whole exome or targeted sequencing. Sequencing data from 77 samples out of the 148 samples in the complete study are available in this dataset, based on the permissions given by the participating patients and the informed consents.	Illumina HiSeq 2500 Illumina MiSeq	77
EGAD00001006230	Blood-based assays have shown increasing ability to detect circulating tumour DNA (ctDNA) in patients with early-stage cancer. However, detection of ctDNA in patients with non-small cell lung cancer (NSCLC) has continued to prove challenging. We performed retrospective analysis to quantify ctDNA levels in a cohort of 100 patients with early-stage NSCLC prior to treatment with curative intent. Where tumour tissue was available for whole exome sequencing, mutations identified were used to define patient-specific sequencing assays. For those 90 patients, plasma cell-free DNA was sequenced to high depth across capture panels targeting a median of 328 mutations specific to each patient. Data was analysed using Integration of Variant Reads (INVAR), detecting ctDNA in 66.7% of patients, including 52.7% (29 of 55) patients with stage I disease and >88% detection for patients with stage II and III disease (16/18 and 15/17). ctDNA was detected in plasma at fractional concentrations as low as 9.1 × 10⁻⁶, and in patients with tumour volumes as low as 0.23 cm³. A 36-gene sequencing panel (InVisionFirst-Lung™) was used to analyse plasma DNA in 27 samples including the 10 cases without tumour exome data, and detected ctDNA in 59% of samples tested (16 of 27). Across the entire cohort, detection rates were higher in squamous cell carcinoma patients compared to adenocarcinoma patients (81% vs. 59%). Detection of ctDNA prior to treatment was associated with significantly shorter time free from relapse, across all patients and in patient subgroups, with Hazard Ratios ranging from 2.25 to >11. Our analysis indicates that for patients with stage I NSCLC, the median ctDNA fraction in plasma is approx. 12 parts per million (0.0012%). This indicates the limits of detection that would be required for ctDNA-based liquid biopsies to detect ctDNA in the majority of patients with early-stage NSCLC.	Illumina HiSeq 4000	29
EGAD00001006231	Identification of patients with life-threatening diseases including leukemias or infections such as tuberculosis or COVID-19 is an important goal of modern precision medicine. However, there is an increasing divide between what is technically possible and what is allowed because of privacy legislation. We have recently illustrated that classical machine learning can identify leukemia patients based on their blood transcriptomes. To facilitate integration of any omics data from any data owner world-wide without violating privacy laws, we here introduce Swarm Learning (SL), a decentralized machine learning approach uniting edge computing, artificial intelligence (AI), blockchain and privacy protection without the need for a central coordinator thereby going beyond federated learning. To illustrate its feasibility, using more than 12,000 transcriptomes from peripheral blood mononuclear cells and more than 2,000 peripheral blood transcriptomes we demonstrate that SL of omics data distributed across different individual sites leads to disease classifiers that outperform those developed at individual sites. Yet, SL completely protects local privacy regulations by design. We propose this approach to noticeably accelerate the introduction of precision medicine.	NextSeq 500	650
EGAD00001006232	RNA, T cell receptor and B cell receptor single cell sequencing data generated on T and B cells derived from patients with CNS autoimmune disease	Illumina HiSeq 2500	4
EGAD00001006233	Whole genome sequencing for individualized cancer interpretation		94
EGAD00001006234	PolyA selection transcriptome profiling by high-throughput for individualized cancer interpretation		22
EGAD00001006235	Total RNA transcriptome profiling by high-throughput for individualized cancer interpretation		19
EGAD00001006236	Exome sequencing for individualized cancer interpretation		90
EGAD00001006237	Genome-wide cell-free DNA mutational integration enables ultra-sensitive cancer monitoring	HiSeq X Ten	167
EGAD00001006238	In this study, a total of 300 patients with MIBC receiving chemotherapy were included; 62 received NAC before cystectomy and 245 received first-line chemotherapy upon detection of locally-advanced (T4b) or metastatic disease. Treatment response was defined as pathological downstaging (< pTa,CIS,N0) after NAC or complete or partial response after first-line treatment (RECIST criteria). RNA-seq was performed using the QuantSeq FWD HT kit (Lexogen) using 500 ng input RNA from 121 tumor samples. Data provided here consist of 780 fastq files for RNA-seq.	NextSeq 500	121
EGAD00001006239	In this study, a total of 300 patients with MIBC receiving chemotherapy were included; 62 received NAC before cystectomy and 245 received first-line chemotherapy upon detection of locally-advanced (T4b) or metastatic disease. Treatment response was defined as pathological downstaging (< pTa,CIS,N0) after NAC or complete or partial response after first-line treatment (RECIST criteria). WES was performed using DNA from 165 tumors (76x median coverage) and associated germline DNA (46x median coverage). Data provided here consist of 5,828 fastq files for WES.	Illumina NovaSeq 6000 NextSeq 500	330
EGAD00001006241	This dataset includes 14 bulk RNA sequencing data (28 fastq files) in the study entitled "Three-dimensional human alveolar stem cell culture models reveal infection response to SARS-CoV-2". RNA sequencing library was generated with Truseq stranded total RNA Gold kit.	Illumina HiSeq 2500	14
EGAD00001006242	This dataset includes a total of 5 single cell RNA sequencing (scRNAseq) data of SARS-CoV-2 infected human alveolar stem cell culture models. Two scRNAseq data were obtained from SARS-CoV-2 infection with MOI of 1.0 (1 for control and 1 for infected case) and the other three were obtained from SARS-CoV-2 infection with MOI of 0.1 in the study entitled "Three-dimensional human alveolar stem cell culture models reveal infection response to SARS-CoV-2". The 10x Chromium Single Cell 3' Reagent kits were used to generate libraries.	Illumina HiSeq 2500	5
EGAD00001006243		HiSeq X Ten	2
EGAD00001006244	Phenotype data for shotgun metagenomic sequencing data of nasopharyngeal fluid for studying nasopharyngeal colonization dynamics with Streptococcus pneumoniae and associated antimicrobial-resistance in a South African birth cohort. https://www.ebi.ac.uk/ena/data/view/PRJEB37312		196
EGAD00001006245	RNA-seq data of the HCI011 and HCI011R models, GDC032 treated and control (total of 21 samples), from the paper: FOXM1 is a biomarker of resistance to PI3Kα inhibition in ER+ breast cancer that is detectable using metabolic imaging (Ros et al, 2020)	Illumina HiSeq 4000	21
EGAD00001006246	RNA-seq of skin from human subjects with and without lymphedema	Illumina HiSeq 2500	9
EGAD00001006247	Raw untargeted metabolomics profiled by Metabolon Inc. for 540 samples from healthy individuals. Files include sample names and run details which can be matched to their metagenomic sequencing samples from PRJEB11532 and PRJEB17643. Information regarding metabolite metadata is also available, including		3
EGAD00001006248	Longitudinal and germline exome sequencing analysis of a mother and son pair who both developed adult-onset diploid AML identified a novel germline missense mutation DNMT3A p.P709S.	Illumina HiSeq 2000	9
EGAD00001006249	This study aims to use RNAseq to identify differentially expressed transcripts in human melanoma cells that over-express the cell surface protein, LRRN4CL, relative to empty-vector control cells, to provide mechanistic insight into how LRRN4CL over-expression confers enhanced pulmonary metastatic colonisation abilities.	Illumina HiSeq 2500	48
EGAD00001006250	NABUCCO cohort 1 sequencing data. The dataset includes: * Whole exome DNAseq pre-treatment on tumor samples (n=24) matched with blood samples (n=24) * RNAseq pre-treatment on tumor samples (n=18) * RNAseq post-treatment on tumor samples (n=18). Not all pre-treatment samples are linked with pre-treatment samples * High coverage Whole exome DNAseq on pre-treatment tumor samples (n=3) matched with post-treatment metastasized lymph nodes isolated with laser microdissection (n=3) * All samples are labelled with the response phenotype (Complete Responder or Non-Complete Responder)	Illumina HiSeq 2500	69
EGAD00001006251	Multiregional whole-exome sequencing was done using 48 tumor samples (range: 4-10 tumor samples/patient) from 9 patients with adenocarcinomas of the stomach and gastroesophageal junction (GC)	Illumina HiSeq 4000	56
EGAD00001006253	WES from 51 cases initially diagnosed as Malignant Peripheral Nerve Sheath Tumours (MPNST) and RNA sequencing data from 10 MPNST cases. Find more information in article: Lyskjær et al, 2020, J Pathol, "H3K27me3 expression and methylation status in histological variants of malignant peripheral nerve sheath tumours".	Illumina HiSeq 2500	98
EGAD00001006255	Chronic liver disease is associated with metabolic dysregulation, liver failure and hepatocellular carcinoma. We analysed somatic mutations from 1202 genomes across 32 liver samples, including normal controls, alcohol-related and non-alcoholic fatty liver disease. Five of 27 patients with liver disease carried hotspot driver mutations in FOXO1, the major transcription factor downstream of insulin signalling. FOXO1 mutations were independently acquired by up to 5 distinct clones within the same patient’s sample, and impaired insulin-mediated nuclear export of FOXO1. GPAM, which produces storage triacylglycerol from dietary calories, also had significant excess of mutations, similarly exhibiting convergent evolution within biopsies. Telomeres were shorter in diseased than normal liver, with attrition more pronounced in larger clones. Multiple independent acquisitions of drivers within one small liver sample imply that such mutations could affect hundreds of grams of tissue across the whole organ, potentially contributing to systemic metabolic dysfunction.	HiSeq X Ten Illumina NovaSeq 6000	1111
EGAD00001006257	Whole transcriptome RNA-Sequencing was performed on 148 bone marrow or peripheral blood samples of B-ALL patients.	NextSeq 500	148
EGAD00001006258	RNA paired end sequencing of 59 adrenocortical tumors and 4 controls.	NextSeq 500	63
EGAD00001006259	Chronic hepatitis C virus (HCV) infection is associated with CD8+ T-cell exhaustion characterized by limited effector functions and thus compromised anti-viral activity. Exhausted HCV-specific CD8+ T cells are comprised of memory-like and terminally exhausted CD8+ T-cell subsets. So far, little is known about the molecular profile and fate of these cells after elimination of chronic antigen stimulation by direct acting antiviral therapy (DAA). Here, we report an antigen-driven molecular core signature underlying exhausted CD8+ T-cell subset heterogeneity in chronic viral infection with a progenitor/progeny relationship of memory-like and terminally exhausted HCV-specific CD8+ T cells via an intermediate stage. Furthermore, transcriptional profiling reveals that the memory-like cells remain after DAA-mediated cure while terminally exhausted HCV-specific CD8+ T-cell subsets are lost. Thus, the memory polarization of the overall HCV-specific CD8+ T-cell response after cure does not result from re-differentiation of exhausted T cells. Consequently, antigen elimination has little impact on the exhausted core signature of memory-like CD8+ T cells that remains clearly different from bona fide T-cell memory. These results identify a molecular signature of T-cell exhaustion that is imprinted like a chronic scar in HCV-specific CD8+ T cells even after HCV cure, highlighting the requirement of re-programming to elicit full effector potential of exhausted T cells.	NextSeq 500	19
EGAD00001006260	SPECTA Lung cancer VCF files		154
EGAD00001006261	Bam and fastq files from RNA-seq of PDAC samples used in the PCSI mismatch repair study	Illumina HiSeq 2500 unspecified	4
EGAD00001006262	Bam files from WGS of PDAC samples used in the PCSI mismatch repair study		-
EGAD00001006263	linking 3 samples out of EGAD00001002528 to EGAS0001004517	Illumina HiSeq 2000	3
EGAD00001006264	18 samples of RNA-Seq of serially passaged TIC-enriched spheres of colorectal cancer (CRC), sequenced on HiSeq2000 and HiSeq2500	Illumina HiSeq 2500	8
EGAD00001006265	WGS data of serially passaged TIC-enriched spheres of colorectal cancer	HiSeq X Ten	6
EGAD00001006266	WES data of serially passaged TIC-enriched spheres of colorectal cancer (CRC)	Illumina HiSeq 2000 Illumina HiSeq 2500	15
EGAD00001006268	RNAseq BAM files for Coding and non-coding mantle cell lymphoma driver mutations		102
EGAD00001006269	This dataset includes microRNA profiling of 61 early-passage metastatic melanoma cell lines. The data are provided as single-end small RNA seq fastq files.	Illumina HiSeq 2500	61
EGAD00001006270	This dataset includes transcriptome profiling of 68 early-passage metastatic melanoma cell lines. The data are provided as paired-end RNA seq fastq files.	Illumina HiSeq 3000	68
EGAD00001006271	This dataset includes whole exome profiling of 65 early-passage metastatic melanoma cell lines. The data are provided as BAM files for tumor and normal samples.	Illumina HiSeq 2000	126
EGAD00001006272	We retrospectively collected 150 non-metastatic, pretreatment, formalin-fixed, paraffin-embedded (FFPE) nasopharyngeal carcinoma (NPC) samples as validation cohort 1. Also, we prospectively collected 32 FFPE samples from NPC patients enrolled in a trial evaluating anti-PD-1 antibody as validation cohort 2. Total RNA was extracted and hybridised to an Affymetrix HTA 2.0 microarray. In this study, we investigated the immune status of the tumour microenvironment (TME) based on gene expression profiles to classify NPC into biologically distinct immune subtypes, and clarify their associations with prognosis and immunotherapy response.	unspecified	32
EGAD00001006273	We retrospectively collected 150 non-metastatic, pretreatment, formalin-fixed, paraffin-embedded (FFPE) nasopharyngeal carcinoma (NPC) samples as validation cohort 1. Also, we prospectively collected 32 FFPE samples from NPC patients enrolled in a trial evaluating anti-PD-1 antibody as validation cohort 2. Total RNA was extracted and hybridised to an Affymetrix HTA 2.0 microarray. In this study, we investigated the immune status of the tumour microenvironment (TME) based on gene expression profiles to classify NPC into biologically distinct immune subtypes, and clarify their associations with prognosis and immunotherapy response.	unspecified	150
EGAD00001006274	We retrospectively collected 150 non-metastatic, pretreatment, formalin-fixed, paraffin-embedded (FFPE) nasopharyngeal carcinoma (NPC) samples as validation cohort 1. Also, we prospectively collected 32 FFPE samples from NPC patients enrolled in a trial evaluating anti-PD-1 antibody as validation cohort 2. Total RNA was extracted and hybridised to an Affymetrix HTA 2.0 microarray. In this study, we investigated the immune status of the tumour microenvironment (TME) based on gene expression profiles to classify NPC into biologically distinct immune subtypes, and clarify their associations with prognosis and immunotherapy response.		32
EGAD00001006275	We retrospectively collected 150 non-metastatic, pretreatment, formalin-fixed, paraffin-embedded (FFPE) nasopharyngeal carcinoma (NPC) samples as validation cohort 1. Also, we prospectively collected 32 FFPE samples from NPC patients enrolled in a trial evaluating anti-PD-1 antibody as validation cohort 2. Total RNA was extracted and hybridised to an Affymetrix HTA 2.0 microarray. In this study, we investigated the immune status of the tumour microenvironment (TME) based on gene expression profiles to classify NPC into biologically distinct immune subtypes, and clarify their associations with prognosis and immunotherapy response.		150
EGAD00001006276	Whole genome sequencing data on D19-0702 (AUS1), presented in Martin et al. 2020 (AUS1). WGS (Illumina HiSeq) was performed at Kinghorn Centre for Clinical Genetics, Garvan Institute of Medical Research. Data was analyzed using the Seave bioinformatic analysis pipeline (https://www.seave.bio).	HiSeq X Ten	1
EGAD00001006278	HiChIP experiments with two sequencing libraries each. Illumina HiSeq 4000/2500.	Illumina HiSeq 4000	10
EGAD00001006279	ChIP-seq experiments: fastq files; both ChIP and Input for each sample. Illumina HiSeq 2500. ChIP-seq alignment files for trimmed, mapping q20 and nonredundant reads; both ChIP and Input for each sample. Software: Trim Galore v0.3.7; Bowtie 2 v2.1.0; samtools v1.7	Illumina HiSeq 2500	1
EGAD00001006280	Endometrial carcinoma, the most common gynecologic cancer, develops from endometrial epithelium which is composed of secretory and ciliated cells. Pathologic classification is unreliable and there is a need for prognostic tools. We used single cell sequencing to study organoid model systems derived from normal endometrium to discover novel markers specific for endometrial ciliated or secretory cells. We performed single cell sequencing on endometrial and ovarian tumours, and on organoids both treated with DBZ and normal and found both secretory-like and ciliated-like tumour cells.	NextSeq 550	18
EGAD00001006281	Temozolomide (TMZ) is an oral alkylating agent used for the treatment of glioblastoma and is now becoming a chemotherapeutic option in patients diagnosed with high-risk low-grade gliomas. The O-6-methylguanine-DNA methyltransferase (MGMT) is responsible for the direct repair of the main TMZ-induced toxic DNA adduct, the O6-Methylguanine lesion. MGMT promoter hypermethylation is currently the only known biomarker for TMZ response in glioblastoma patients. Here we show that a subset of recurrent gliomas carry MGMT genomic rearrangements that lead to MGMT overexpression, independently from changes in its promoter methylation. By leveraging the CRISPR/Cas9 technology we generated some of these MGMT rearrangements in glioma cells and demonstrated that they lead to TMZ resistance both in vitro and in vivo. Lastly we showed that such fusions can be detected in tumor-derived exosomes and could potentially represent an early detection marker of tumor recurrence in a subset of patients treated with TMZ.	Illumina HiSeq 2500	136
EGAD00001006282	We analyzed baseline and on-therapy tumor biopsies from 101 patients with advanced melanoma treated with nivolumab (anti-PD-1) alone or combined with ipilimumab (anti-CTLA-4). Analysis of whole transcriptome data showed that T cell infiltration and interferon-gamma signaling signatures corresponded most highly with clinical response to therapy, with a reciprocal decrease in cell cycle and WNT signaling pathways in responding biopsies. Clinical outcome differences were likely not due to differential melanoma cell responses to interferon-gamma, as 57 human melanoma cell lines exposed in vitro to this cytokine showed a conserved interferon-gamma transcriptome response unless they had mutations that precluded signaling from the interferon-gamma receptor. Therefore, the magnitude of the antitumor T cell response and the corresponding downstream interferon-gamma signaling are the main drivers of clinical response or resistance to immune checkpoint blockade therapy.	Illumina HiSeq 2000	54
EGAD00001006283	Whole exome sequencing of eight affected skin biopsies (“lesional”) from five giant CMN patients (age range, 4-58) with matching unaffected skin (not available in one patient) along with germline DNA. Agilent SureSelect V5+UTRs.	Illumina NovaSeq 6000	17
EGAD00001006284	We analyzed baseline and on-therapy tumor biopsies from 101 patients with advanced melanoma treated with nivolumab (anti-PD-1) alone or combined with ipilimumab (anti-CTLA-4). Analysis of whole transcriptome data showed that T cell infiltration and interferon-gamma signaling signatures corresponded most highly with clinical response to therapy, with a reciprocal decrease in cell cycle and WNT signaling pathways in responding biopsies. Clinical outcome differences were likely not due to differential melanoma cell responses to interferon-gamma, as 57 human melanoma cell lines exposed in vitro to this cytokine showed a conserved interferon-gamma transcriptome response unless they had mutations that precluded signaling from the interferon-gamma receptor. Therefore, the magnitude of the antitumor T cell response and the corresponding downstream interferon-gamma signaling are the main drivers of clinical response or resistance to immune checkpoint blockade therapy.	Illumina HiSeq 2000	70
EGAD00001006285	In the absence of recurrent gene mutations, evidence accumulates that epigenetic deregulation plays a prominent role in neuroblastoma biology. Here we provide genome wide H3K27ac profiles in 60 primary neuroblastoma samples.	Illumina HiSeq 2000	60
EGAD00001006286	In the absence of recurrent gene mutations, evidence accumulates that epigenetic deregulation plays a prominent role in neuroblastoma biology. Here we provide RNAseq profiles in 71 primary and relapse neuroblastoma samples.	Illumina HiSeq 2000	71
EGAD00001006287	NGS data of 12 patients enrolled in the Chinese Patient Assistance Program from multiple centers who received pemetrexed alone or combined with platinum as initial chemotherapy and continued pemetrexed maintenance therapy for advanced lung adenocarcinoma from November 2014 to June 2017.	HiSeq X Ten Illumina HiSeq 4000	12
EGAD00001006288	This is the raw data obtained from shallow whole-genome sequencing of plasma DNA (plasma-Seq) for calling of somatic copy number alterations as well as focal amplifications and deletions from patients with breast, colorectal and non-small cell lung cancer.	Illumina MiSeq NextSeq 550	48
EGAD00001006289	Targeted sequencing (t-NGS) of frozen advanced cancers tissue was established with three panels covering 395 to 560 candidate cancer genes. Sequencing was done using the 2x150-bp paired-end technology on the Illumina MiSeq and NextSeq500 platforms. The DNA libraries of all coding exons were done with the HaloPlex Target Enrichment System (Agilent, Santa Clara, CA, USA).	Illumina MiSeq NextSeq 500	735
EGAD00001006291	RNA sequencing data from breast cancers (n=18) and their matched HN tissues (n=36), healthy breast from cosmetic reduction mammoplasty (RM; n=5), and risk reducing mastectomies (RR, n=5), with peritumoral samples excised proximal to (TP, less than 2 cm) and distal from (TD, 5-10 cm) the primary tumor.	unspecified	66
EGAD00001006292	Single cell sequencing of fathers who have had children with autism and fathers who have had multiple children, but no children with autism. The data was processed on a 10X Chromium and sequenced on a NextSeq.	NextSeq 500	6
EGAD00001006293	Circulating tumor-derived DNA (ctDNA) can be used to monitor cancer dynamics noninvasively. Detection of ctDNA can be challenging in patients with low-volume or residual disease, where plasma contains very few tumor-derived DNA fragments. We show that sensitivity for ctDNA detection in plasma can be improved by analyzing hundreds to thousands of mutations that are first identified by tumor genotyping. We describe the INtegration of VAriant Reads (INVAR) pipeline, which combines custom error-suppression methods and signal-enrichment approaches based on biological features of ctDNA. With this approach, the detection limit in each sample can be estimated independently based on the number of informative reads sequenced across multiple patient-specific loci. We applied INVAR to custom hybrid-capture sequencing data from 176 plasma samples from 105 patients with melanoma, lung, renal, glioma, and breast cancer across both early and advanced disease. By integrating signal across a median of >10^5 informative reads, ctDNA was routinely quantified to 1 mutant molecule per 100,000, and in some cases with high tumor mutation burden and/or plasma input material, to individual parts per million. This resulted in median Area Under the Curve (AUC) values of 0.98 in advanced cancers, and 0.80 in early stage and challenging settings for ctDNA detection. We generalized this method to whole-exome and whole-genome sequencing, showing that INVAR may be applied without requiring personalized sequencing panels, so long as a tumor mutation list is available. As tumor sequencing becomes increasingly performed, such methods for personalized cancer monitoring may enhance the sensitivity of cancer liquid biopsies.	Illumina NovaSeq 6000	65
EGAD00001006294	Whole genome sequencing data of isogenic ATRX/TP53 knockout clones of the neuroblastoma cell line SK-N-SH	HiSeq X Ten	5
EGAD00001006295	SNPs and INDELs of novel hereditary neurological disease genes in Mali using Beckman 8800, ABI 3730/3730xl.		10
EGAD00001006296	Like many childhood cancers, malignant rhabdoid tumours (MRT) are thought to arise from aberrant foetal development. Although MRT predominantly exhibit a mesenchymal phenotype, it has been suggested that the foetal root of MRT lies in neural crest development. Here, we combine phylogenetic analyses of MRT, single cell mRNA assays, and functional experiments in patient-derived MRT organoids, to define the embryological origin of MRT and explore therapeutic avenues that may drive MRT differentiation. Phylogenetic analyses from the distribution of somatic mutations revealed that MRT were related to neural crest-derived, but not to mesodermal tissues, providing direct evidence of the neural crest origin of MRT in humans. In MRT organoids, reversal of the principal driver event underpinning MRT, SMARCB1 loss, induced differentiation along mesenchymal pathways. Together, these findings placed MRT cells on a developmental trajectory of neural crest to mesenchyme conversion, and defined the transcriptional changes underpinning MRT differentiation. Searching perturbation databases for agents that mimic these mRNA changes, we identified HDAC and mTOR inhibition as potential differentiation agents. Treatment of MRT organoids with this drug combination induced proliferation arrest with transcriptional changes akin to SMARCB1 re-expression. Our study defines the embryological root of MRT and proposes a differentiation treatment for this often fatal childhood cancer.	HiSeq X Ten Illumina HiSeq 4000 Illumina NovaSeq 6000 NextSeq 500	30
EGAD00001006297	270 samples with ALK-positive non-small cell lung cancer, targeted sequencing (198 kb panel size)	NextSeq 500	270
EGAD00001006298	268 samples with ALK-positive non-small cell lung cancer, ultra-low coverage whole genome sequencing	Illumina HiSeq 4000	268
EGAD00001006299	The dataset consists of 207 glioma samples of WHO grades II, III and IV. It includes tumor-derived genomic data for 182 samples covering ~700 cancer-related and epigenetic-related genes, with matched blood samples for 48 specimens. It also includes transcriptomic data for 105 specimens. In total, 335 BAM files were deposited.	Illumina HiSeq 1500	335
EGAD00001006301	The AVENIO ctDNA Expanded Kit is a next-generation sequencing (NGS) liquid biopsy assay with a 77 gene panel (192 kb) containing genes in U.S. National Comprehensive Cancer Network (NCCN) Guidelines and emerging cancer biomarkers. This pan-cancer assay was applied to 48 plasma samples from patients with breast, colorectal and non-small cell lung cancer. After sequencing 150bp paired-end, reads were aligned to the hg38 genome with the AVENIO Oncology Analysis Software (version 2.0). These files are the deduplicated alignments generated by the analysis software used for subsequent variant, indel and CNV calling.	NextSeq 550	48
EGAD00001006302	WES data from clonal pathologic bone marrow plasma cells of multiple myeloma patients. From each patient there are three samples: pathologic bone marrow plasma cells at diagnosis, pathologic bone marrow plasma cells after VRD treatment, and T lymphocytes as germline control. There are 14 patients in total.	Illumina NovaSeq 6000	10
EGAD00001006303	The valve methylation dataset consists of 12 bam files of human non-diseased valve tissue samples that are free from calcification (6 aortic and 6 mitral valves - matched; 10 males: 2 females; age range 42 – 64 years, mean age 52.2 years, SD 9.9682). Donor hearts are free from cardiovascular and valvular complications.	Illumina HiSeq 2500	12
EGAD00001006304	FASTQ files of the polyA+ (oligo-dT) RNA-Seq dataset from the POPS SGA (Small for Gestational Age) samples and their matched controls. The POP study placental biopsies were collected within 30 minutes of birth and flash frozen in RNAlater (ThermoFisher). For each biopsy, total placental RNA was extracted from approximately 5 mg of tissue using the “mirVana miRNA Isolation Kit” (Ambion) followed by DNase treatment (“DNA-free DNA Removal Kit”, Ambion). RNA quality was assessed with the Agilent Bioanalyzer and all the samples with RIN values ≥ 7.0 were used in the downstream experiments. RNA-libraries were prepared from 1g of total placental RNA with the TruSeq Stranded mRNA Library Prep Kit (Illumina) which captures polyA-tailed transcripts by oligo-dT beads, then pooled and sequenced (single-end, 50bp) using a Single End V4 cluster kit and Illumina HiSeq2500	Illumina HiSeq 2500	1
EGAD00001006305	To further understand the biology of Sonic hedgehog medulloblastoma and its molecular subtypes, we studied 250 human Shh-MB using strand-specific RNA sequencing. We identified novel alterations within the cAMP dependent pathway and found that 18% of tumors have genetic events that directly target the abundance and/or stability of MYCN. We also discovered an extensive network of fusions in focally amplified regions, and several loss-of-function fusions in tumor suppressor genes PTCH, SUFU and NCOR1. Molecular convergence on a core of specific genes by nucleotide variants, copy number aberrations, and gene fusions highlights key roles of specific pathways in the pathogenesis of Sonic hedgehog medulloblastoma.	Illumina HiSeq 2000 Illumina HiSeq 2500	82
EGAD00001006306	Purpose Exploratory analyses of CheckMate 066 and 067 trials were conducted to investigate associations of tumor mutational burden (TMB), a 4-gene inflammatory gene expression signature, and BRAF mutation status with tumor response, progression-free survival (PFS), and overall survival (OS) in patients with advanced melanoma. Patients and Methods Patients with known programmed death ligand 1 (PD-L1) expression and BRAF mutation status received nivolumab (NIVO) or dacarbazine in CheckMate 066 and either NIVO, ipilimumab (IPI), or NIVO+IPI in CheckMate 067. Whole exome sequencing and RNA sequencing were used to determine TMB and inflammatory gene expression signature scores, respectively. These biomarkers were evaluated in terms of their association with PFS and OS. Results In the NIVO, NIVO+IPI, and IPI arms of CheckMate 067, longer survival was associated with high (> median) versus low (≤ median) TMB with hazard ratios (HRs) (95% confidence interval [CI]) for PFS of 0.45 (0.30–0.65), 0.55 (0.38–0.81), and 0.60 (0.43–0.82), and for OS of 0.46 (0.30–0.71), 0.53 (0.34–0.82), and 0.52 (0.36–0.74), respectively. For NIVO-treated patients, these results were confirmed in CheckMate 066. A survival benefit was observed with high TMB and absence of BRAF mutation. Survival was associated with high versus low inflammatory signature scores with HRs (95% CI) for PFS of 0.56 (0.34–0.94), 0.40 (0.23–0.72), and 0.43 (0.27–0.70), and for OS of 0.37 (0.20–0.66), 0.38 (0.19–0.74), and 0.46 (0.27–0.79), in the NIVO, NIVO+IPI, and IPI arms, respectively. Weak correlations were observed between PD-L1, TMB, and the inflammatory signature. Conclusions Combined assessment of TMB, inflammatory gene expression signature, and BRAF mutation status may be predictive for response to immunotherapy in advanced melanoma.	Illumina HiSeq 2500	38
EGAD00001006307	RNA-SEQ for the Caldas Lab breast cancer PDTX collection. This includes both single and paired end runs	Illumina HiSeq 4000	117
EGAD00001006309	Purpose Exploratory analyses of CheckMate 066 and 067 trials were conducted to investigate associations of tumor mutational burden (TMB), a 4-gene inflammatory gene expression signature, and BRAF mutation status with tumor response, progression-free survival (PFS), and overall survival (OS) in patients with advanced melanoma. Patients and Methods Patients with known programmed death ligand 1 (PD-L1) expression and BRAF mutation status received nivolumab (NIVO) or dacarbazine in CheckMate 066 and either NIVO, ipilimumab (IPI), or NIVO+IPI in CheckMate 067. Whole exome sequencing and RNA sequencing were used to determine TMB and inflammatory gene expression signature scores, respectively. These biomarkers were evaluated in terms of their association with PFS and OS. Results In the NIVO, NIVO+IPI, and IPI arms of CheckMate 067, longer survival was associated with high (> median) versus low (≤ median) TMB with hazard ratios (HRs) (95% confidence interval [CI]) for PFS of 0.45 (0.30–0.65), 0.55 (0.38–0.81), and 0.60 (0.43–0.82), and for OS of 0.46 (0.30–0.71), 0.53 (0.34–0.82), and 0.52 (0.36–0.74), respectively. For NIVO-treated patients, these results were confirmed in CheckMate 066. A survival benefit was observed with high TMB and absence of BRAF mutation. Survival was associated with high versus low inflammatory signature scores with HRs (95% CI) for PFS of 0.56 (0.34–0.94), 0.40 (0.23–0.72), and 0.43 (0.27–0.70), and for OS of 0.37 (0.20–0.66), 0.38 (0.19–0.74), and 0.46 (0.27–0.79), in the NIVO, NIVO+IPI, and IPI arms, respectively. Weak correlations were observed between PD-L1, TMB, and the inflammatory signature. Conclusions Combined assessment of TMB, inflammatory gene expression signature, and BRAF mutation status may be predictive for response to immunotherapy in advanced melanoma.	Illumina HiSeq 2500	32
EGAD00001006311	SF11940 snATAC Sequencing. Anaplastic Astrocytoma, IDH-mutant. Tumor location: Left Frontal. Age: 29. Sex: Male.	Illumina NovaSeq 6000	1
EGAD00001006312	SF10679 snATAC Seq. Oligodendroglioma, Anaplastic (WHO gr. 3). Tumor Location: Frontal Age: 43. Sex: Male.	Illumina NovaSeq 6000	1
EGAD00001006313	SF11310 Oligodendroglioma, IDH-mutant. Tumor Location: Frontal. Age: 22. Sex: Female.	Illumina NovaSeq 6000	1
EGAD00001006314	SF10320 Unknown.	Illumina NovaSeq 6000	1
EGAD00001006315	SF12374 snATAC Oligodendroglioma. Tumor Location: Right frontal. Age: 33. Sex: Male.	Illumina NovaSeq 6000	1
EGAD00001006316	SF10619 snATAC Oligodendroglioma (WHO gr. 2). Tumor Location: Parietal. Age: 52. Sex: Male.	Illumina NovaSeq 6000	1
EGAD00001006317	SF4007 snATAC Seq. Oligodendroglioma (WHO gr. 2). Tumor Location: Left frontotemporal. Age: 33. Sex: Female.	Illumina NovaSeq 6000	1
EGAD00001006318	Astrocytoma (WHO gr. 2)	Illumina NovaSeq 6000	2
EGAD00001006319	SF10207 snATAC Seq. Oligodendroglioma, Anaplastic (WHO gr. 3). Tumor Location: Frontal. Age: 43. Sex: Male.	Illumina NovaSeq 6000	1
EGAD00001006321	RNA sequencing during time series	Illumina HiSeq 2000	58
EGAD00001006322	In the current study we report for the first time a unique collection of 6 leukemias and two sarcomas from XP-C. Comprehensive WGS-based mutational analysis provides a genetic explanation for the increased incidence of leukemia in XP-C and describes a unique mutational process in internal tumors associated with NER deficiency. Raw data are provided in FASTQ format and variant analysis as VCF files.	Illumina HiSeq 2500 unspecified	15
EGAD00001006324	Exome sequences of primary tumor/metastatic/germline DNA trios. Tumoral and germline samples were sequenced to an expected depth of respectively 150M and 50M reads in order to obtain differential depth (>100X versus >30X). Submitted data are paired-end fastq files.	Illumina HiSeq 2000	81
EGAD00001006325	Single cell full transcriptome sequencing of CD19 CAR T-cell infusion products used for standard of care treatment for relapsed/refractory large B-cell lymphoma.	Illumina HiSeq 4000	24
EGAD00001006326	The SARS-CoV-2 pandemic has led to increasing numbers of COVID-19 patients all over the world. Aetiopathologies range from no symptoms and mild flu-like illness to severe cases succumbing to respiratory failure. Reports of a dysregulated immune system in the severe cases, showing similarities to cytokine release syndrome, call for better characterization and understanding of the changes in the immune system, as well as their variance across COVID-19 patients, in order to design host-directed therapies accordingly. Here, we profiled blood transcriptomes of 39 COVID-19 patients and 10 control donors. Enriched granulocyte signatures in whole blood samples were verified in granulocyte samples from 49 COVID-19 patients in a second cohort.	NextSeq 500	79
EGAD00001006327	Single cell hybrid-capture targeted sequencing of CD19 CAR T-cell infusion products used for standard of care treatment for relapsed/refractory large B-cell lymphoma.	Illumina MiSeq	24
EGAD00001006328	Tumor and matching normal exomes for 28 GZL cases (n=56) and targeted capture sequencing for 42 GZL cases with 3 matching normals and 2 pooled normals (n=47)		103
EGAD00001006329	RNA-Seq samples from the BELOB clinical trial study to find transcriptome associations with response to Bevacizumab and CCNU in glioblastoma patients	Illumina HiSeq 2500	96
EGAD00001006330	Whole RNA-sequencing of CD34+ cells and neutrophils derived from MPN patients before hydroxycarbamide treatment and after 9-months of treatment. CD34+ cells= 5 patients, 10 samples; neutrophils= 7 patients, 14 samples. Fastq files provided.	Illumina HiSeq 4000	16
EGAD00001006331	Biopsies from the terminal ileum and rectum of healthy individuals are digested on ice to single cells and processed for single-cell RNA-sequencing (10X Genomics and Illumina). This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/.		1
EGAD00001006332	This dataset contains single cell RNA sequencing data from Organoids grown in high nutrient (H) and low nutrient (L) medium. Organoids were grown for 110 days. Organoids were grown from a patient IPS cell line with a heterozygous mutation in TSC2 (Patient 1 TSC2+/- iPSCs) and an isogenic control cell line (TSC2+/+).	NextSeq 550	6
EGAD00001006333	This dataset contains whole genome sequencing data of a patient IPS cell line with a heterozygous mutation in TSC2 (Patient 1 TSC2+/- iPSCs). This dataset further contains whole genome sequencing data of two tumors showing CNLOH. Tumors were isolated from organoids grown using the patient IPS cell line (Patient 1 TSC2+/- iPSCs).	Illumina NovaSeq 6000	3
EGAD00001006335	RNA-seq data for ALL patients as described in "The application of RNA sequencing for the diagnosis and genomic classification of pediatric acute lymphoblastic leukemia"	Illumina HiSeq 4000	133
EGAD00001006336	Paired whole exome sequencing data of the HIPO head and neck cancer (HNC) (n=83), using Agilent SureSelect V4+UTRs and V6+UTRs with the sequencing platforms HiSeq2000 and HiSeq2500. The reads were aligned to hg19. This is part of project H019.	Illumina HiSeq 2000 Illumina HiSeq 2500	166
EGAD00001006337	The human placenta harbours chromosomal aberrations that are absent from the fetus in one to two percent of pregnancies. This confined mosaicism suggests that embryonic genetic bottlenecks exist, which phylogenetically segregate placental tissue. Here, we studied the somatic genetic landscape of human placentas by whole genome sequencing of 86 placental biopsies and of 106 microdissections.	HiSeq X Ten Illumina NovaSeq 6000	278
EGAD00001006338	Whole genome sequencing data (Illumina HiSeq and NovaSeq) of clonal cultures derived from pediatric human bone marrow-derived hematopoietic stem and multipotent progenitor cells (in total 44 samples from 10 donors) and bulk pediatric acute myeloid leukemia blasts (in total 6 samples from 6 patients) to study the mutation accumulation.	HiSeq X Ten Illumina NovaSeq 6000	118
EGAD00001006339	To investigate the immune response and mechanisms associated with severe COVID-19, we performed single-cell RNA-seq on nasopharyngeal and bronchial samples from 19 clinically well-characterized patients with moderate or critical disease and from 5 healthy controls. We identified airway epithelial cell types and states vulnerable to SARS-CoV-2 infection. In COVID-19 patients, epithelial cells showed an average threefold increase in expression of the SARS-CoV-2 entry receptor ACE2, which correlated with interferon signals by immune cells. Compared with moderate cases, critical cases exhibited stronger interactions between epithelial and immune cells, as indicated by ligand–receptor expression profiles, and activated immune cells, including inflammatory macrophages expressing CCL2, CCL3, CCL20, CXCL1, CXCL3, CXCL10, IL8, IL1B and TNF. The transcriptional differences in critical cases compared with moderate cases likely contribute to clinical observations of heightened inflammatory tissue damage, lung injury and respiratory failure. Our data suggest that pharmacologic inhibition of the CCR1 and/or CCR5 pathways may suppress immune hyperactivation in critical COVID-19.	Illumina NovaSeq 6000	36
EGAD00001006340	Dataset contains paired-end whole exome sequencing data from 2 glioma patients (1 oligodendroglioma and 1 astrocytoma), derived cultured cells, and derived murine xenografts.		28
EGAD00001006341	This dataset contains high-throughput RNA-sequencing of 14 samples, each sample comprising oligodendrocytes derived from human induced pluripotent stem cells, from individuals with and without a balanced t(1;11) translocation which substantially increases risk of major mental illness. 5 samples derive from 2 control individuals, and 9 samples from 3 individuals carrying the translocation. Libraries were prepared from each total-RNA sample using the TruSeq Stranded Total RNA with Ribo-Zero kit. Libraries were then sequenced using the NextSeq 500/550 High-Output v2 (150 cycle) Kit on the NextSeq 550 platform. Raw paired-end sequencing data is stored in two FASTQ files per sample.	NextSeq 550	14
EGAD00001006342	This dataset was used to characterise T cell gene expression and clonality at sites of active inflammation within the joints of psoriatic arthritis (PsA) patients, and to compare these results with T cells from the peripheral blood of those same patients. Freshly sorted CD45RA negative CD3+CD4+ and CD3+CD8+ single cells from four patients were individually flow sorted into 96-well full-skirted plates (Eppendorf) containing 10µL of a 2% Dithiothreitol (DTT, 2M Sigma-Aldrich), RTL lysis buffer (Qiagen) solution. Cell lysates were sealed, mixed and spun down before storing at -80 ºC. Paired-end multiplexed sequencing libraries were prepared following the Smart-seq 2 protocol using the Nextera XT DNA library prep kit (Illumina). A pool of barcoded libraries from four different plates were sequenced across two lanes on the Illumina HiSeq 2500.	Illumina HiSeq 2500	4703
EGAD00001006343	Whole genome sequencing of HSPC and SI clones of 3 disomy- and 3 trisomy 21 fetus samples and 2 TMD samples (NovaSeq 6000 samples). 22 disomy clones and 20 trisomy clones were included in this experiment. 11 bulk samples were also included.	Illumina NovaSeq 6000	53
EGAD00001006344	Additional Neuroblastoma whole genome sequencing data	Illumina HiSeq 2000	30
EGAD00001006345	Raw FastQ Files of 69 samples of endometrial tissue from uterus (rudiments) of patients diagnosed with MRKH Type 1/2 or healthy controls. Each sample consists of 2 lanes paired-end RNA sequencing data.	Illumina NovaSeq 6000	69
EGAD00001006346	The dataset consists of a multisample VCF (version 4.1) and the corresponding annotated MAF file, containing the somatic point mutations found from exome sequencing across the high-grade T1 bladder cancer cohort (HGT1, n=61 samples). The VCF file is in accordance with the HTS format specifications (https://samtools.github.io/hts-specs/). The dataset comprises also a CSV file with clinical data.		61
EGAD00001006349	Data supporting: “Multi-omic cross-sectional cohort study of pre-malignant Barrett’s esophagus reveals structural variation and retrotransposon activity occur early in cancer evolution.” Katz-Summercorn, Jammula et al. WGS (BAM files)	Illumina HiSeq 2000	1
EGAD00001006350	Human organoids recapitulating the cell-type diversity and function of their target organ are valuable for basic and translational research. We developed light-sensitive human retinal organoids with multiple nuclear and synaptic layers, and functional synapses. We sequenced the RNA of 285,441 single cells from these organoids at seven developmental time points and from the periphery, fovea, pigment epithelium and choroid of light-responsive adult human retinas.	Illumina HiSeq 2500	124
EGAD00001006351	Here, we profiled the gut microbiota in a discovery (n = 1,011) and validation (n = 484) cohort comprising Swedish subjects naive for diabetes treatment and grouped by glycemic status.	Illumina HiSeq 4000	1495
EGAD00001006352	We performed whole genome sequencing to detect possible off-target mutations induced by prime editing. Liver organoids, derived from a healthy control, were transfected with either control (GFP) plasmids or prime editing plasmids (GFP+PE2+pegRNA+nickRNA) to induce a 6-bp deletion in CTNNB1. One control and two prime-edited organoid lines were clonally expanded from single cells. High-throughput sequencing was performed on the complete genomic DNA isolated from these clonal lines, as well as the starting culture (bulk). After correction for germline mutations in the starting culture, new mutations in the control and prime-edited lines were compared. The same approach was followed in small intestinal organoids, derived from a patient with disease-causing 3-bp deletion in DGAT1. In these small intestinal organoids, prime editing was used to insert the 3 missing nucleotides. Two corrected clones were compared to one control clone.	Illumina NovaSeq 6000	8
EGAD00001006353	Data supporting: “Multi-omic cross-sectional cohort study of pre-malignant Barrett’s esophagus reveals structural variation and retrotransposon activity occur early in cancer evolution.” Katz-Summercorn, Jammula et al. RNAseq (BAM files)	Illumina HiSeq 2000	1
EGAD00001006354	Phenotypic data for 475 human samples, including: demographics, anthropometrics, diet data, clinical data, and the time of day and season in which the serum sample was taken.		1
EGAD00001006355	Whole transcriptome RNA-seq of 25 UPS samples - raw FastQ sequences, 125x2 nc paired-end reads, min 30M PE, HiSeq technology	Illumina HiSeq 2000	25
EGAD00001006356	RNA-seq of Bone Metastasis from breast and prostate cancer (4 breast and 5 prostate samples). Dataset contains BAM files from RNA-seq performed using Illumina HiSeq 2500.	Illumina HiSeq 2500 Illumina NovaSeq 6000 Ion Torrent S5 XL	288
EGAD00001006357	Dataset with 81 whole exome sequences from Iberian Roma samples.	unspecified	81
EGAD00001006358	VCF file with genome-wide data for 62 Iberian Roma samples.		62
EGAD00001006359	A set of 56 EpCAM-positive cells derived from bone marrow aspirates of breast cancer patients or patients without a cancerous disease (30 cells from 21 M0-stage and 11 cells from five M1-stage breast cancer patients, 15 cells from seven non-cancer patients serving as controls). EpCAM-positive cells from breast cancer patients were considered disseminated tumor cells as they harbored copy number alterations and showed high expression of the epithelial marker EpCAM and the mammary luminal progenitor marker KIT in comparison to EpCAM-positive bone marrow cells from non-cancer patients. Paired-end RNA sequencing of the samples was performed on the Illumina NovaSeq 6000; raw data are provided in FASTQ format.	Illumina NovaSeq 6000	56
EGAD00001006360	Single-cell RNA sequencing was performed on bone marrow mononuclear cells of 2 acute myeloid leukemia patients at refractory stage. The profiling was performed using 10x Genomics Chromium Single Cell 3ʹ Gene Expression platform. The raw data are available as fastq files.	Illumina HiSeq 2500	2
EGAD00001006362	Sequencing data from patients with bladder cancer. BAM files from targeted DNA sequencing of bladder cancer driver genes in 344 circulating tumor DNA and tumor tissue samples. BAM files from whole exome sequencing of 49 circulating tumor DNA and tumor tissue samples. Paired FASTQ files from RNA sequencing of 86 tumor tissue samples.	Illumina HiSeq 2500 unspecified	344
EGAD00001006363	The hematological malignancy multiple myeloma (MM), also called Kahler's disease or plasma cell (PC) myeloma, is characterized by a clonal expansion of PCs originating in the bone marrow (BM). The expansion of these cells leads to an overproduction of antibodies and results in typical symptoms such as anemia, renal failure and bone lesions. All cases of MM are preceded by the asymptomatic, non-malignant pre-stage monoclonal gammopathy of undetermined significance (MGUS). Of all MGUS patients, only 1% per year will progress to MM. Despite efforts to elucidate the molecular mechanisms underlying the MGUS-to-MM progression, its pathogenesis still remains largely unknown. Additionally, the genetic profiles of MGUS patients have only been limitedly investigated due to the only incidental finding of MGUS, the difficulties in BM sampling and isolating a sufficient number of aberrant PCs from the BM aspirates of MGUS patients. Consequently, reliable biomarkers to individually predict which MGUS patients will progress to MM and which will not, are lacking. Therefore, it is highly required to study the molecular pathogenesis of MGUS and the role of genetic events in relation to the malignant transformation to MM.	Illumina NovaSeq 6000	42
EGAD00001006364	DNA extraction from human stool samples was performed at the Center for Microbiome Innovation (CMI) at University of California, San Diego. DNA sequencing libraries were prepared using Nextera XT (Illumina). Shotgun DNA sequencing was performed on the Illumina HiSeq4000 platform. Raw fastq reads were quality-checked. Skewer (version 0.2.2) was utilized with the paired-end mode. Human reads were identified and removed by Bowtie2 mapping against the human genome reference (hg19), followed by bam2fastq with --unaligned --no-aligned --force options.	Illumina HiSeq 4000	162
EGAD00001006365	Case report of an ER+ Her2- breast cancer patient. Whole exome and transcriptome sequencing at time of diagnosis and relapse, targeted DNA sequencing of a liver met	Illumina HiSeq 2500	6
EGAD00001006366	PCa-LINES: rRNA-minus RNA-seq of PCa cell-lines (VCaP & PC346c) and 4 additional patient samples	Illumina HiSeq 2500	6
EGAD00001006367	To study the evolution of DNA methylation at the genome level and methylation intra-tumor heterogeneity (ITH) during early lung carcinogenesis, we performed multiregional reduced representation bisulfite sequencing (RRBS) of 127 resected lung samples from 39 patients using single-end libraries on the HiSeq 3000.	Illumina HiSeq 3000	127
EGAD00001006368	This dataset consists of 106 bam files. Each sample from 10-20 consecutive patient extractions were combined into one DNA pool, generating a total of 106 DNA pools. We sequenced 11 genes implicated in hereditary breast cancer using the SureSelect Custom kit.	Illumina HiSeq 2500	106
EGAD00001006369	Whole genome sequencing of tumour-normal pairs in eight patients with clinically localised disease undergoing prostatectomy. A bespoke DNA capture and amplification panel against the highest prevalence, highest confidence aberrations for each individual was designed and used to interrogate ctDNA isolated from plasma prospectively obtained pre- and post- (24 hours and 6 weeks) surgery. Tagged-amplicon deep sequencing (TAm-Seq) across the TP53 gene in ctDNA in a cohort of 189 individuals.	HiSeq X Ten Illumina MiSeq NextSeq 500	224
EGAD00001006370	Exome sequencing files from 25 alopecia areata samples from Spain.	Illumina HiSeq 2500	26
EGAD00001006371	Exome sequencing data for 14 Vitiligo samples	Illumina HiSeq 2500	14
EGAD00001006373	Data supporting: “Longitudinal tracking of 97 esophageal adenocarcinomas (EAC) using liquid biopsy sampling.” Ococks, Frankell, Masque Soler et al. ctDNA (BAM files) 333 samples	NextSeq 500	48
EGAD00001006374	A study looking at Germline and Somatic biomarkers using WES data only		88
EGAD00001006375	A radiomics study integrating PET/CT, WES and RNAseq data		99
EGAD00001006376	This dataset contains DNA from B-lymphocytes from 2 Coriell families and 4 individuals hybridized to the HumanKaryomap BeadChip Array. Single cells from subjects GM12878 and GM7228 were amplified using multiple displacement amplification (SureMDA) according to the Infinium Karyomapping Assay Guide. Bulk DNA was processed and hybridized to an array for subjects GM12878, GM07224 and GM07225.		4
EGAD00001006379	Paired-end RNA-seq of follicular T cell lymphoma for the discovery of fusion transcripts	Illumina NovaSeq 6000	3
EGAD00001006380	RNASeq files for paper titled "Molecular classification improves risk assessment in adult B-lineage ALL: Patients on the international UKALLXII-ECOG2993 trial."	Illumina HiSeq 2000	57
EGAD00001006381	Illumina Nextseq total RNA sequencing profiles of skeletal muscle biopsies of 5 affected patients (F3/2M, F4/1F, F5/1M, F2/2F, F2/1M) compared to 6 control (Control040500, Control3509, Control3934, Control3949, Control4994, Control5106) and comparing 3 patients with additional EARS2 mutations (F2/2F, F2/1M, F5/1M) to 2 patients without (F3/2M, F4/1F). Illumina MiSeq total RNA sequencing profiles of skeletal muscle biopsies from patient F3/1M during affected disease phase (4 replicates: F3-1M_Affected_1, F3-1M_Affected_2, F3-1M_Affected_3, F3-1M_Affected_4) compared to recovered phase (F3-1M_Recovered_1, F3-1M_Recovered_2, F3-1M_Recovered_3, F3-1M_Recovered_4).	NextSeq 550	10
EGAD00001006382	Illumina MiSeq total RNA sequencing profiles of skeletal muscle biopsies from patient F3/1M during affected disease phase (4 replicates: F3-1M_Affected_1, F3-1M_Affected_2, F3-1M_Affected_3, F3-1M_Affected_4) compared to recovered phase (F3-1M_Recovered_1, F3-1M_Recovered_2, F3-1M_Recovered_3, F3-1M_Recovered_4).	NextSeq 550	8
EGAD00001006383	August 2020 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	HiSeq X Ten Illumina HiSeq 2500	4
EGAD00001006384	Shallow whole-genome sequencing (sWGS) data for the identification of somatic copy number alterations (SCNA) and the estimation of tumor fractions in plasma DNA of metastatic colorectal cancer patients (mCRC).	Illumina MiSeq NextSeq 550	45
EGAD00001006385	Modified Fast Aneuploidy Screening Test-Sequencing System (mFAST-SeqS) was applied to stratify samples based on their overall tumor fraction in cfDNA.	Illumina MiSeq	59
EGAD00001006386	All baseline samples and when available EOT were processed for high-resolution mutation analysis. We designed a SureSelectXT-HS custom panel (Agilent) covering 68 genes with a total size of 260kb using the Agilent SureDesign platform.	NextSeq 550	44
EGAD00001006387	Whole exome sequencing: 24 samples matched tumor-normal and one matched CSF. Focused exome sequencing: 17 samples matched tumor-normal-2 time point CSF.	Illumina HiSeq 2500 NextSeq 550	35
EGAD00001006388	RNA sequencing of frozen resected specimens of desmoplastic small round cell tumors (DSRCTs). Four patients have specimens from multiple tissue sites included in this dataset.	Illumina HiSeq 2500	24
EGAD00001006389	Whole Exome sequencing data of tumour samples for 112 patients with endometrioid ovarian carcinoma in FASTQ format. Data was derived as summarized below: Library Preparation: Libraries were prepared from each DNA sample using the Illumina TruSeq Exome Library Prep kit (#FC-150-1002) according to the provided protocol using modifications for working with FFPE sourced material. Libraries were quantified using the Qubit 2.0 Fluorometer and the Qubit DNA HS assay (#Q32854) and the size distribution of fragments was assessed using the Agilent Bioanalyser with the DNA HS Kit (#5067-4626). Library QC: Exome-captured sequencing library pools were quantified using the Qubit 2.0 Fluorometer and the Qubit DNA HS assay (#Q32854) and the size distribution of fragments was assessed using the Agilent Bioanalyser with the DNA HS Kit (#5067-4626). Fragment size and quantity measurements were used to calculate molarity for each library pool. Sequencing: Sequencing was performed using the NextSeq 500/550 High-Output v2 (150 cycle) Kit (# FC-404-2002) on the NextSeq 550 platform (Illumina Inc, #SY-415-1002).	NextSeq 550	112
EGAD00001006390	PitNET white blood cell DNA - tumor DNA exome sequencing samples with Illumina exome sequencing. Fifteen patients and total of 30 exomes.	NextSeq 500	30
EGAD00001006391	Whole exome sequencing of neuroendocrine cervical cancer	Illumina HiSeq 2000	29
EGAD00001006392	An investigation of clonal haematopoiesis in patients with neurodegenerative disease.	Illumina HiSeq 2500	181
EGAD00001006393	The dataset contains the targeted sequencing (TS) and the whole genome low pass (WGS) BAM files of the study. For the TS: Samples: TS_XXX There are normal and tumor DNA samples. Each DNA strand has been sequenced independently (PoolA and PoolB). A TruSeq Custom Amplicon panel of 20 genes frequently mutated in ILC and/or ER-positive BC in general was designed using DesignStudio from Illumina: AKT1, ARID1A, CDH1, ERBB2, ERBB3, ESR1, FOXA1, GATA3, IGF1R, JAK2, MAP2K4, MAP3K1, NF1, PIK3CA, PTEN, RB1, RUNX1, STAT3, TBX3, and TP53. For the WGS: Samples: WGS_XXX There are normal and tumor DNA samples. Samples were sequenced to an average target coverage of 0.5X.	Illumina HiSeq 4000 NextSeq 500	730
EGAD00001006394	Exome sequencing of frozen resected specimens of desmoplastic small round cell tumors (DSRCTs). Four patients have specimens from multiple tissue sites included in this dataset.	Illumina HiSeq 2000	39
EGAD00001006395	Whole-exome sequencing (WES) in a well-characterized sample of 14 matched EP tumour/healthy surrounding tissue samples. The sequencing was done with paired EXOME sequencing on Illumina HiSeq 4000 using Agilent SureSelect XT HS + Human All Exon V7.	Illumina HiSeq 4000	28
EGAD00001006396	To investigate the cellular composition of the human pancreas, we performed single-nucleus sequencing from snap frozen biopsies of pancreata from adult, neonatal and diseased (chronic pancreatitis) human donors.	Illumina HiSeq 4000 NextSeq 500	27
EGAD00001006397	Transcriptome analysis of nontumorous human breast tissues. 196 cases were included in the dataset.	Illumina HiSeq 4000	196
EGAD00001006398	ChIP-Seq files accompanying the paper titled "Identification of Therapeutic Targets in Rhabdomyosarcoma Through Integrated Genomic, Epigenomic, and Proteomic Analyses".	Illumina HiSeq 2000	242
EGAD00001006399	The dataset contains a full genomics characterization of 527 Asian breast tumours. This includes whole-exome sequencing of tumour tissue at 80X, whole-exome sequencing of matched normal (blood) tissue at 40X, shallow-whole genome sequencing at 0.1X for copy number analyses, and RNA-seq of tumour tissue at 40X coverage (>15 million reads). Whole-exome libraries were prepared using the Nextera Rapid Capture Exome Kit; exome capture was performed in pools of 3 and subjected to paired end 75 sequencing on a HiSEQ4000 platform. RNA libraries were prepared using the TruSeq Stranded Total RNA HT kit with Ribo-Zero Gold as per manufacturer’s instructions and also subjected to paired end 75 sequencing on a HiSEQ4000 platform. Uploaded bam files have been mapped to the hs37d5 human genome and processed using the standard GATK pipelines. Paired clinical, demographic, genotyping, and overall survival data for these patients are available from the associated publications or by request.	Illumina HiSeq 4000	2235
EGAD00001006400	1 cell line and 123 patient samples including 38 normal (22 paired normal and 16 unpaired), 85 tumor-initial, FASTQ file types, Agilent SureSelect Human All Exon V6 Kit	Illumina HiSeq 2500	124
EGAD00001006401	This dataset contains RNA-Seq data from 204 primary melanomas and 177 regional lymph nodes. More details can be found in the manuscript: "Tumour gene expression signature in primary melanoma predicts long-term outcomes: A prospective multicentre study"	unspecified	381
EGAD00001006402	Whole genome sequencing of 29 donors of healthy mammary tissue. BAM files of stromal and epithelial DNA are included.	unspecified	58
EGAD00001006403	Dataset contains genomic sequencing of 87 samples (blood germline, normal prostate tissues, human tumors, PDOs and PDXs). Sequencing was performed by whole-exome sequencing or targeted sequencing of prostate cancer genes.	Illumina HiSeq 2500 Illumina NovaSeq 6000 Ion Torrent S5 XL	87
EGAD00001006404	Dataset contains RNA-seq of 30 samples (normal prostate tissue, human prostate cancer, PDX and organoids).	unspecified	26
EGAD00001006406	RNA was extracted from eight diagnostic ETV6-RUNX1 positive acute lymphoblastic leukemia samples collected in PAXgene blood RNA tubes using PAXgene Blood RNA kit (cat #762174, Qiagen GmbH, Hilden, Germany), following the version 2 instructions for manual purification. Samples were processed with Globin-Zero Gold rRNA Removal Kit (Illumina) and directional libraries were prepared using NEBNext Ultra Directional RNA Library Prep kit (New England Biolabs). The library preparation and paired end (150 bp) sequencing were performed by Novogene (HK) Company Limited (Hong Kong, China) using Illumina Novaseq 6000 aiming at 70 million read pairs per sample.	Illumina NovaSeq 6000	8
EGAD00001006407	Here we provide a catalogue of variants called after sequencing the exomes of 45 babies from the State of Rio Grande do Norte in Brazil. Our data set provides a useful reference point for diagnosis of rare diseases in Brazil.		45
EGAD00001006408	This dataset includes whole-exome sequencing data for multifocal ileal tumor samples from two patients. Exonic sequences were enriched using the Agilent V2 capture probe set and sequenced by 76-bp paired-end reads using the Illumina Genome Analyzer IIx system with a mean coverage of 80x for each base	Illumina Genome Analyzer IIx	31
EGAD00001006409	Formalin-fixed, paraffin-embedded samples from 27 FIT interval CRC and 54 screen-detected CRCs collected in a pilot-program of FIT-based CRC screening in the southwest and northwest regions in the Netherlands, were used in this study. DNA was extracted for 1) Shallow Sequencing (copy number analysis) and 2) TSACP Amplicon Cancer Gene Panel (mutations) of 22 FIT Interval CRCs and 45 screen-detected CRCs.	Illumina HiSeq 2500	66
EGAD00001006410	Files from whole exome sequencing of matched normals and multiple tumors from 7 melanoma patients. The tumors include primary tumors and distant metastases.	Illumina HiSeq 2500	127
EGAD00001006414	1075 members of the LBC1936 were sequenced using the Illumina HiSeq X platform. This dataset contains the gvcfs.		1
EGAD00001006415	Tregs were sorted as CD4+CD25+CD127- cells from peripheral blood of patients with advanced metastatic melanoma, stage III(B-D)-IV, who were receiving treatment with anti-PD1 (n =26); and patients with kidney, non-small cell lung, liver and bladder cancer who were receiving treatment with anti-PD1. RNA was extracted and polyA libraries were prepared using the Illumina Truseq sample preparation kit v.2. Single-end sequencing was performed on NextSeq500.	NextSeq 500	49
EGAD00001006416	297 members of the LBC1921 were sequenced using the Illumina HiSeq X platform. This dataset contains the gvcfs.		1
EGAD00001006417	Single-cell sequencing of esophagus, stomach and duodenum: 4 esophagus samples, 9 gastric samples and 5 duodenum samples.	NextSeq 500	18
EGAD00001006418	These samples were sequenced at the Broad Institute on an Illumina HiSeqX at 30x -- PCR Free. The CRAMS and VCF are as produced by Broad. The VCFs produced were generated by the Broad using GATK.		1
EGAD00001006419	WGS data of plasma samples from CRC patients (N=12)	Illumina HiSeq 4000	12
EGAD00001006420	WGS data of plasma samples from BRCA patients (N=10)	Illumina HiSeq 4000	10
EGAD00001006421	WGS data of plasma samples from healthy individuals (N=29)	Illumina HiSeq 4000	29
EGAD00001006422	This dataset includes 289 samples from 46 high grade serous epithelial ovarian cancer patients. Data are from both tissue samples (either primary tumor, or synchronous metastases) and circulating cell-free DNA (cfDNA) of plasma samples taken during therapy and follow-up.	NextSeq 500	289
EGAD00001006423	Leukaemia and related blood cancers occur due to genetic changes that typically accumulate over many years. This study will employ targeted next-generation sequencing to retrace the preclinical evolution of several types of haematological malignancy. Investigating the progression of the earliest pre-malignant ancestral clones promises to offer valuable insights into early leukaemia evolution and therapeutic vulnerabilities of leukaemia stem cells.	HiSeq X Ten Illumina NovaSeq 6000	1
EGAD00001006424	Leukaemia and related blood cancers occur due to genetic changes that typically accumulate over many years. This study will employ targeted next-generation sequencing to retrace the preclinical evolution of several types of haematological malignancy. Investigating the progression of the earliest pre-malignant ancestral clones promises to offer valuable insights into early leukaemia evolution and therapeutic vulnerabilities of leukaemia stem cells.	Illumina HiSeq 2500 Illumina HiSeq 4000	35
EGAD00001006425	The phenotypic data for ~12500 samples of the AWI-Gen Phase 1 Population cross-sectional study of older adults (mostly between 40 and 60 years), men and women. Six study sites in four sub-Saharan African countries including Ghana, Burkina Faso, Kenya and South Africa. Some groups are missing data for specific variables. Data includes questionnaire data (demography, health history, family health history, behaviour and infection data); anthropometry; and laboratory assays on blood and urine.		1
EGAD00001006426	This study contains the WGS and WEX aligned BAM files and RNA-seq FASTQ files for human liver tumors.	HiSeq X Five Illumina HiSeq 2000 Illumina HiSeq 4000 Illumina NovaSeq 6000	183
EGAD00001006427	The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, the International Agency for Research on Cancer is coordinating the recruitment of 5000 individuals with cancer (colorectal, renal, pancreatic, oesophageal adenocarcinoma or oesophageal squamous cancers) across 5 continents to explore whether different mutational signatures explain marked variation in incidence. In brief, through an international network of collaborators around the world, biological materials are collected, along with demographic, histological, clinical and questionnaire data. Whole genome sequences of tumour-germline DNA pairs are generated at the Wellcome Trust Sanger Institute. Somatic mutational signatures are subsequently extracted by non-negative matrix factorisation methods and correlated with risk factor data. Through an enhanced understanding of cancer aetiology, the unprecedented Mutographs effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development.	Illumina NovaSeq 6000	-
EGAD00001006428		NextSeq 550	12
EGAD00001006429	Whole genome sequencing of EBV Associated Nasopharyngeal Carcinoma	Illumina HiSeq 2000	138
EGAD00001006431	Background: The development of retinoblastoma is thought to require pathological genetic changes in both alleles of the RB1 gene. However, cases exist where RB1 mutations are undetectable suggesting alternative pathways to malignancy. Methods: We applied comprehensive whole genome sequencing (WGS) and transcriptomics to sporadic retinoblastomas derived from twenty patients attending our clinic, contrasting these results to that obtained through customary clinical testing. We sought RB1 and other driver mutations, investigated mutation burden, mutational signatures and phylogenetic relatedness in one case of bilateral retinoblastoma. Results: At least one RB1 mutation was identified in all retinoblastomas. We confirmed RB1 mutations previously identified by clinical screening, identified three new RB1 mutations and provided clarity to the mechanism behind a further six mutations. Eight tumours carried structural rearrangements involving RB1 ranging from relatively simple to extremely complex rearrangement patterns, including a chromothripsis-like pattern in one tumour. Potential driver mutations included mutations in BCOR (5/20) and amplification of MYCN (2/20) and MDM4 (1/20). We show that RB1 mutations are not mutually exclusive of MYCN amplifications, and further reveal that all tumours demonstrate increased MYCN expression suggesting a universal role in retinoblastoma tumorigenesis. Bilateral tumours obtained from one patient harboured conserved germline but divergent somatic RB1 mutations, indicating independent evolution. In-keeping with previous WGS of paediatric cancers, the mutation burden in retinoblastomas was extremely low. Mutational signature analysis showed a predominance of signatures associated with cell division and an absence of ultraviolet-related DNA damage. In a tumour exposed to chemotherapy prior to enucleation, a profound platinum-related mutational signature was observed. 
Conclusions: WGS provides a complete picture of the genomic landscape of retinoblastomas, allowing the discovery of mutations otherwise undetected by conventional clinical screening approaches. The presence of at least one RB1 mutation in all retinoblastomas and the relative paucity of driver mutations in other genes suggests mutations beyond RB1, MYCN and BCOR are rare. Whilst most RB1 mutations are identifiable by clinical screening, the increased resolution and ability to detect otherwise elusive rearrangements of RB1 by WGS, confirming whether they are somatic or germline, has important repercussions on clinical management and advice on recurrence risks.	HiSeq X Ten	41
EGAD00001006433	Shallow whole genome sequencing of 29 BIA-ALCL patients for copy number analysis and 24 Alk-negative ALCL samples as control cohort. 7 Whole exome sequencing BIA-ALCL samples.	Illumina HiSeq 4000	66
EGAD00001006434	Whole exome seq (N=21) and RNA-seq (N=36) data of additional T-ALL	Illumina NovaSeq 6000	44
EGAD00001006435	The dataset contains metadata for all cells before scRNA-seq quality control and for cells passing quality control. It also contains a count matrix with Salmon gene counts for all cells passing quality control, and reconstructed B-cell receptor sequences using the computational tool BraCeR. The scRNA-seq data was generated using the Smart-seq2 protocol and sequenced on Illumina NextSeq500.		12
EGAD00001006436	This dataset contains scRNA-seq fastq files (trimmed for quality and adapters using Trim Galore) for 3739 intestinal plasma cells of known or unknown antigen specificities from in total 12 individuals (4 untreated coeliac disease patients, 3 treated coeliac disease patients, 5 controls). The data was generated using the Smart-seq2 protocol and sequenced on the Illumina NextSeq500 platform with 75 bp paired-end reads.	NextSeq 500	12
EGAD00001006438	Contains data for all cells sequenced for this study. Data is organized as one bam-file per sample. Individual cells can be identified through the CB tag in the bam-files.	NextSeq 500	7
EGAD00001006439	Illumina RNASeq sequencing of tumour samples from 53 cases of cutaneous melanoma and 61 cases of acral melanoma		-
EGAD00001006440	This dataset includes whole exome sequencing reads from 10 normal and 14 cell lines based on Agilent SureSelect XT Human All Exon v6. They are all 2*100bp reads sequenced using Illumina HiSeq4000.	Illumina HiSeq 4000	24
EGAD00001006441	This dataset includes whole transcriptome sequencing reads from 8 cell lines based on TruSeq stranded mRNA kit (Illumina). They are all 2*75bp reads sequenced using Illumina NextSeq500.	NextSeq 500	8
EGAD00001006442	WGS files for paper titled "Integrative Analysis of Pediatric Acute Leukemia Identifies Acute Myeloid/T-Lymphoblastic Leukemia Subtype that Spans a T Lineage and Myeloid Continuum with Distinct Prognoses"	Illumina HiSeq 2000	184
EGAD00001006443	WXS files for paper titled "Integrative Analysis of Pediatric Acute Leukemia Identifies Acute Myeloid/T-Lymphoblastic Leukemia Subtype that Spans a T Lineage and Myeloid Continuum with Distinct Prognoses"	Illumina HiSeq 2000	260
EGAD00001006444	RNASeq files for paper titled "Integrative Analysis of Pediatric Acute Leukemia Identifies Acute Myeloid/T-Lymphoblastic Leukemia Subtype that Spans a T Lineage and Myeloid Continuum with Distinct Prognoses"	Illumina HiSeq 2000	132
EGAD00001006445	Glioma is the most common and aggressive brain cancer in adults. While primary glioma has been widely studied, molecular characterization of recurrent glioma is still rare. The high-quality sequencing data that we generated provides a useful resource for the community. The CGGA project contains over 2,000 samples from Chinese cohorts. In total, it includes whole-exome sequencing (286), DNA methylation (159), mRNA sequencing (1,018), mRNA microarray (301) and microRNA microarray (198) data, along with matched clinical data. CGGA removes the barriers to researchers, providing rapid and convenient access to high-quality functional genomic data resources for biological research and clinical applications.	Illumina HiSeq 2000	572
EGAD00001006446	Dataset contains fastq files of tumor transcriptomes of 12 pituitary neuroendocrine tumors. Patients with and without somatostatin analogue treatment before tumor surgery can be compared. Sequencing was performed on MGISEQ-2000.	unspecified	12
EGAD00001006447	To elucidate the epigenetic changes which occur when human long-term hematopoietic stem cells (LT-HSC) become activated, we performed bulk ATAC-Seq on 13 sorted bulk hematopoietic populations from cord blood as well as single-cell ATAC-Seq upon CD34+CD38-CD45RA- cells enriched for HSC as well as CD34+/CD38+ progenitor cells, both from cord blood. These studies revealed gains of chromatin accessibility around CTCF binding sites during HSPC activation; as such, we additionally performed Low-C to directly profile the 3D conformation of human cord-blood derived LT-HSC and short-term hematopoietic stem cells (ST-HSC), as well as Hi-C, ATAC-Seq and CTCF ChIP-Seq upon the OCIAML-2 cell line in which CTCF sites gained during LT-HSC activation are enriched. Finally, we transduced human cord-blood LT-HSC with an shCTCF vector; in-vitro cultured LT-HSC cells harbouring shCTCF were used to perform RNA-Seq, and scATAC-Seq was performed on CD34+/CD38- human CB cells transduced with shCTCF, four weeks post xeno-transplantation into mice. Collectively these studies have helped us demonstrate the role of 3D chromatin conformation changes during human LT-HSC activation.	Illumina HiSeq 2000 Illumina HiSeq 2500 NextSeq 500 unspecified	62
EGAD00001006448		NextSeq 500	25
EGAD00001006449	37 surgical samples were interrogated by WXS, and 50 formalin-fixed, paraffin-embedded samples were interrogated by target-seq. Agilent SureSelect XT kit and SureSelect Human Exon V6 were used to generate exome libraries. Agilent SureSelect XT low input kit and custom capture panel designed on SureDesign were used to generate target-seq libraries. All libraries were sequenced on Illumina HiSeq 2500 platform.	Illumina HiSeq 2500	124
EGAD00001006450	We characterised H3K27M-mutant diffuse intrinsic pontine glioma (DIPG, n=21) and RNA-Seq (n=26 DIPG, 12 normal brain)	Illumina HiSeq 2500 Ion Torrent Proton	59
EGAD00001006451	A total of 9 brain metastases were sequenced. For 6/9, matched cerebrospinal fluid samples were collected prior to surgery and, in two cases, after surgery (+1 month from surgery) and after treatment (+3 months). Single-cell T cell receptor clonotypes were produced using the Chromium Single Cell 5’ Library and sequenced on an Illumina NovaSeq 6000.	Illumina NovaSeq 6000	17
EGAD00001006452	A total of 9 brain metastases were sequenced. For 6/9, matched cerebrospinal fluid samples were collected prior to surgery and, in two cases, after surgery (+1 month from surgery) and after treatment (+3 months). Single-cell gene expression was produced using the Chromium Single Cell 5’ Library and sequenced on an Illumina NovaSeq 6000.	Illumina NovaSeq 6000	19
EGAD00001006453	Whole exome sequencing of matched tumor (brain metastasis) -normal (blood) from 6 patients.	Illumina HiSeq 2500	12
EGAD00001006456	RNA sequencing data of over 200 HGSOC samples at diagnosis, after chemotherapy and during progression.	unspecified	212
EGAD00001006457	Genomic analysis between pre-invasive and invasive components of malignant pulmonary nodule (MPN) facilitates the description of lung adenocarcinoma (LUAD) evolutionary patterns. We conduct an analysis of gene-panel sequencing on 53 T1 stage LUAD cases, which extend the understanding of evolutionary trajectories during invasiveness acquisition in early LUAD.		174
EGAD00001006458	Whole genome sequencing was performed on 24 patients (tumor DNA paired to constitutional DNA). WGS libraries were subjected to paired-end (2 x 100 bp) sequencing on NovaSeq (Illumina). The 96 files are in FASTQ format.	Illumina NovaSeq 6000	48
EGAD00001006459	Bottleneck sequencing of human tissue including neurons, cord blood, sperm. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2020-10-20.	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000	43
EGAD00001006460	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0006_001 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006461	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0080_001 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006462	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0080_002 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006463	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0141_004 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006464	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0142_001 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006465	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0142_003 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006466	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0146_002 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006467	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0149_001 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006468	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0149_002 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006469	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0150_001 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006470	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0150_002 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006471	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0152_001 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006472	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0152_002 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006473	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0163_001 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006474	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0163_002 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006475	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0172_001 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006476	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0172_002 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006477	Transcriptome profiling by high-throughput sequencing for single cells for library TENX062 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006478	Transcriptome profiling by high-throughput sequencing for single cells for library SCRNA10X_SA_CHIP0063_000 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006479	Transcriptome profiling by high-throughput sequencing for single cells for library TENX064 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006480	Transcriptome profiling by high-throughput sequencing for single cells for library TENX065 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006481	Transcriptome profiling by high-throughput sequencing for single cells for library TENX066 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006482	Transcriptome profiling by high-throughput sequencing for single cells for library TENX068 1 samples; filetype=fastq	Illumina HiSeq 2500	1
EGAD00001006483	Transcriptome profiling by high-throughput sequencing for single cells for library TENX069 1 samples; filetype=fastq	NextSeq 550	1
EGAD00001006484	Whole-transcriptome characterization of cfRNA in cancer (stage III breast [n=46], lung [n=30]) and non-cancer (n=89) participants from the Circulating Cell-free Genome Atlas (NCT02889978). Dataset includes collapsed BAM files for plasma cfRNA from each patient, as well as collapsed BAM files for RNA from matched tumor tissue (when available).		303
EGAD00001006485	Raw FASTQ files obtained from in situ Hi-C of 16 normal B cells (3 naive B cells, 3 germinal center B cells, 3 plasma cells, and 3 memory B cells, together with a merge file for each subpopulation), 7 chronic lymphocytic leukemias (2 unmutated IGHV and 5 mutated IGHV), and 5 mantle cell lymphomas (2 conventional and 3 leukemic non-nodal).	Illumina HiSeq 2500	28
EGAD00001006486	Valid reads obtained after analyzing in situ Hi-C data of 16 normal B cells (3 naive B cells, 3 germinal center B cells, 3 plasma cells, and 3 memory B cells, together with a merge file for each subpopulation), 7 chronic lymphocytic leukemias (2 unmutated IGHV and 5 mutated IGHV), and 5 mantle cell lymphomas (2 conventional and 3 leukemic non-nodal).		28
EGAD00001006487	The dataset consists of BAM files of 2 pairs of matched tumor/normal samples of a man with advanced prostate cancer. One pair is whole genome sequencing: WGS_T/WGS_N for tumor and normal samples, respectively; and the other pair is whole exome sequencing: WES_T/WES_N for tumor and normal specimens, respectively. Details can be found in the publication titled: "Molecular Medicine Tumor Board: Whole Genome Sequencing to Inform on Personalized Medicine for a Man with Advanced Prostate Cancer"	HiSeq X Ten Illumina HiSeq 2500	4
EGAD00001006488	WGS data for 20 Glioblastoma stem cell (GSC) lines and matched blood samples. Fastq files are available. For 10 GSC samples and the 10 matched blood samples the reads are on 3 fastq files per sample	HiSeq X Five	80
EGAD00001006537		NextSeq 500	12
EGAD00001006538	WGBS data for EGAS00001004660, "Aggressive PDACs show hypomethylation of repetitive elements and the execution of an intrinsic IFN program linked to a ductal cell-of-origin"	Illumina HiSeq 2000	13
EGAD00001006539	RNA data for EGAS00001004660, "Aggressive PDACs show hypomethylation of repetitive elements and the execution of an intrinsic IFN program linked to a ductal cell-of-origin"	Illumina HiSeq 2000	23
EGAD00001006540	This dataset contains the WGS of 35 samples (high grade osteosarcoma). All cases were reviewed by an expert bone pathologist and have a tumour content of 50% minimum. Paired-end libraries from fresh frozen tumour samples were prepared using the Agilent SureSelectXT HumanV5 kit for whole-genome sequencing (WGS). These were sequenced together with a tumour complementary DNA on an Illumina HiSeq2500 (paired-end 100 bp). Sequencing reads were mapped to the GRCh37 human reference genome using HISAT2	Illumina HiSeq 2500	35
EGAD00001006541	The data provided here was critical in establishing that human long-term hematopoietic stem cells (LT-HSC), previously described as the most primitive HSC population, is actually composed of distinct subsets that can be prospectively isolated. Via mechanistic studies centering around the Rho-GTPase effector kinase PAK4 and its inhibitor INKA1, we identified the immune checkpoint ligand CD112 as a marker for hematopoietic stem and progenitor cells, that is most highly expressed on LT-HSC. More importantly, CD112 can be used to stratify functionally distinct subsets within LT-HSC: In response to regeneration-mediated stress, the CD112low subset exhibits a transient restraint (termed latency) before contributing to hematopoietic reconstitution, while the CD112high subset is primed to respond rapidly. High resolution RNA-seq of the CD112 surface expression spectrum within rare LT-HSC subsets (human umbilical cord blood) demonstrated that more genes are differentially upregulated in the deeper quiescent and less metabolically active subset. Genes enriched in this subset centre around cell adhesion and Rho-GTPase signaling. This is in agreement with the scRNAseq data from human G-CSF mobilized peripheral blood (mPB) generated here that was used as a model of in vivo activation/priming revealing via RNA-velocity and pseudo-time analysis that INKA1high versus PAK4high, CDK6high and CD112high enrichment are either detected early or late in diffusion pseudotime indicative of quiescent versus primed cell status, respectively. RNAseq following INKA1 overexpression in LT-HSC and ST-HSC revealed by GSEA an overall stemness preserving phenotype and particularly in LT-HSC, but not in short-term HSC (ST-HSC), suppression of transcriptional programs linked to activation. Collectively, our data decipher the molecular intricacies underlying HSC heterogeneity and self-renewal regulation and point to latency as an orchestrated physiological response that integrates quiescence control with HSC fate choices to preserve a stem cell reservoir.	Illumina HiSeq 2500	26
EGAD00001006542	This dataset contains: targeted proximity-ligation assay, enriched using capture probes (1092 samples); targeted proximity-ligation assay, enriched using 4C (1230 samples); genome-wide proximity-ligation assay, enriched using HiC (6 samples).	Illumina MiniSeq Illumina NovaSeq 6000	2328
EGAD00001006543	WGS BAMs of 19 adult patients with T-acute lymphoblastic leukemia with primary, remission and relapse sample per patient. Total = 58 samples sequenced with HiSeq 4000 or NovaSeq 6000 (Illumina).	Illumina HiSeq 4000	57
EGAD00001006544	ATAC-seq data for 2 glioblastoma cell lines (LN229, ZH487), NT and SOX10KD.	Illumina HiSeq 2000	2
EGAD00001006545	Whole genome sequencing data for 20 human glioblastoma patients.	HiSeq X Ten	20
EGAD00001006546	Whole Genome Bisulfite data for human glioblastoma patients, EGAS00001003953. 68 human samples	Illumina HiSeq 2000	68
EGAD00001006547	RNA data for human glioblastoma patients, EGAS00001003953. 64 human samples, 2 cell lines (LN229, ZH487).	Illumina HiSeq 2000 Illumina HiSeq 4000	66
EGAD00001006548	ChIPseq data for human glioblastoma patients, EGAS00001003953. Mix of input, H3K27ac, H3K27me1, H3K27me3, H3K36me3, H3K4me1, H3K4me3, H3K9me3 and BRD, 20 human samples, 2 cell lines (LN229, ZH487).	Illumina HiSeq 2000	22
EGAD00001006550	In a dual-center, two-cohort study, we performed single-cell RNA-sequencing of whole blood and peripheral blood mononuclear cells to determine changes in immune cell composition and activation in mild vs. severe COVID-19 over time. This study provides detailed insights into the systemic immune response to SARS-CoV-2 infection and reveals profound alterations in the myeloid cell compartment associated with severe COVID-19.	Illumina NovaSeq 6000	141
EGAD00001006551	RNA-seq profiling of 2 prostate cancer xenograft mouse models, each at the intact state (n=3), castrated state (n=4) and castrated + AR replacement (n=3).	Illumina NovaSeq 6000	20
EGAD00001006552	Raw data from cancer panel sequencing of lung adenocarcinomas from admixed Latin American populations. Predominantly samples carrying known oncogene mutations (n=581).	Illumina HiSeq 2500	578
EGAD00001006553	Off-target amplification can lead to false positive human brain microbiome detection. 16S rRNA amplicon samples from brain tissue of healthy individuals and Parkinson's disease patients.	Illumina MiSeq	114
EGAD00001006554	This dataset includes 1,359 paired-end shotgun metagenomics samples from 946 healthy donors of the Milieu Intérieur cohort. 413 of the donors provided two samples (V1 and V2).	Illumina HiSeq 2500	1359
EGAD00001006555	To investigate the mechanism by which GATA1s and STAG2 deficiency contribute to Down Syndrome leukemogenesis, specifically within the propagating CD34/CD117 cell fractions from primary xenografts, we carried out transcriptional and epigenetic profiling by RNAseq and ATACseq. The chromatin accessibility landscape was compared to bulk ATACseq of individually sorted N-FL HSPC subpopulations. To investigate the mechanism underlying the synergy between T21 and GATA1s in driving preleukemia development, we analyzed the binding occupancy of GATA1. We performed Cut&Run assays to profile genome-wide GATA1 binding sites and also to quantify binding changes upon GATA1s editing in N-FL and T21-FL CD34+ enriched HSPCs. Lastly, we profiled miRNAs from N-FL and T21-FL CD34+ enriched HSPCs by miRNA-Seq.	Illumina HiSeq 2500 NextSeq 500	91
EGAD00001006556	This dataset contains 19 scRNAseq samples generated from neuroblastoma patient biopsies immediately after surgery.	Illumina NovaSeq 6000	19
EGAD00001006557	This dataset contains 4 samples (2x input and 2x H3K27ac ChIPseq) in the IC-pPDXC-63 cell line.	Illumina NovaSeq 6000	4
EGAD00001006558	This dataset contains 72 RNAseq (BAM and Fastq files are available for each sample).	Illumina NovaSeq 6000	114
EGAD00001006559	RNASeq gene expression profiles from 3 icas9 human iPSC derived cortical neurons treated with and without doxycycline.	NextSeq 500	6
EGAD00001006560	Paired RNA-Seq data of VDH15 cells with and without deletion of NSUN3. The 11 samples were sequenced on HiSeq 4000 and prepared with SmarTer low input RNA and Chip NEBNext kit.	Illumina HiSeq 4000	11
EGAD00001006561	We performed whole-genome sequencing, rare variant filtering, segregation analysis and functional validation of PD cosegregating rare genetic variation in two families (6 samples) segregating PD associated GBA variants c.115+1G>A (ClinVar ID: 93445) and p.L444P (ClinVar ID: 4288) respectively. The paired WGS sequencing was run on HiSeq X Ten and the library preparation kit was Illumina TruSeq DNA nano.	HiSeq X Ten	6
EGAD00001006562	Updated INSPIRE whole exome sequencing data: PBMC controls	Illumina HiSeq 2500	45
EGAD00001006563	INSPIRE whole exome sequencing of tumors updated	Illumina HiSeq 2500	46
EGAD00001006564	INSPIRE whole transcriptome sequencing of tumors	Illumina HiSeq 2500	65
EGAD00001006565	The mutational status of 112 recurrently mutated genes in B-cell lymphoma was examined by targeted next-generation sequencing (NGS). Libraries were prepared with 150 ng of genomic DNA (gDNA) obtained from formalin-fixed paraffin-embedded (FFPE) biopsy using molecular-barcoded library adapters (ThruPLEX Tag-seq kit; Takara) coupled with a custom hybridization capture based method (SureSelect XT Target Enrichment System Capture strategy, Agilent Technologies Inc.) and sequenced in a MiSeq instrument (Illumina, 2x150bp).	Illumina MiSeq	45
EGAD00001006566	The mutational status of 112 recurrently mutated genes in B-cell lymphoma was examined by targeted next-generation sequencing (NGS). Libraries were prepared with 15-30 ng of cfDNA obtained from plasma using molecular-barcoded library adapters (ThruPLEX Tag-seq kit; Takara) coupled with a custom hybridization capture based method (SureSelect XT Target Enrichment System Capture strategy, Agilent Technologies Inc.) and sequenced in a MiSeq instrument (Illumina, 2x150bp).	Illumina MiSeq	79
EGAD00001006567	This dataset is composed of 86 samples: 15 samples of bronchoalveolar lavage fluid (BAL), 17 samples of non-malignant lung tissue, 14 samples of peritumoural tissue, 16 tumour tissues, 8 negative DNA extraction controls, 16 negative sampling controls for BAL. Samples were obtained from 17 NSCLC patients (average age 68 years). Sequenced region was 16S V3-V4. Fastq files are provided.	Illumina MiSeq	86
EGAD00001006568	This dataset contains 20 whole genome sequences from 10 tumor-normal pairs from conjunctival melanomas.	HiSeq X Ten	20
EGAD00001006569	Somatic SNVs and Indels for INSPIRE Tumor WES called using Mutect, Mutect2, Varscan2, Vardict, and Strelka2		-
EGAD00001006570	The clinical relevance of immune landscape intratumoural heterogeneity (immune-ITH) and its role in tumour evolution remain largely unexplored. Here, we uncover significant spatial and phenotypic immune–ITH from multiple tumour sectors and decipher its relationship with tumour evolution and disease progression in hepatocellular carcinomas (HCC). Immune–ITH is associated with tumour transcriptomic-ITH, mutational burden, and distinct immune microenvironments. Tumours with low immune–ITH experience higher immunoselective pressure and escape via loss of heterozygosity in human leukocyte antigens and immunoediting. Instead, the tumours with high immune-ITH evolve to a more immunosuppressive/exhausted microenvironment. This gradient of immune pressure along with immune-ITH represents a hallmark of tumour evolution, which is closely linked to the transcriptome-immune networks that contributes to disease progression and immune inactivation. Remarkably, high immune-ITH and its transcriptomic signature are predictive for worse clinical outcome in HCC patients. This in-depth investigation of ITH provides evidence on tumour-immune co-evolution along HCC progression.	HiSeq X Ten Illumina HiSeq 4000	70
EGAD00001006571	Raw data from cancer panel sequencing of lung adenocarcinomas from admixed Latin American populations, predominantly samples without known oncogene mutations (n=532).	Illumina HiSeq 2500	532
EGAD00001006572	The dataset contains somatic variants in 344 colorectal cancer samples. Variants are called with Mutect2 (GRCh38). Important: VCF-files include also variants, which have been annotated as "str_contraction" and "panel_of_normals". Please, use only "PASS" variants in studies, which are not microsatellite repeat related. Samples are sequenced with Novaseq 6000, HiSeq 2000, and HiSeq X Ten instruments (average coverage depth ~30+). The dataset consists of 257 MSS, 58 MSI, 25 MSS IBD, and 4 POLE mutant CRCs.		344
EGAD00001006573	Here we report successful gene knock-in (KI) in the eggs of Schistosoma mansoni by combining CRISPR/Cas9 with single-stranded oligodeoxynucleotides (ssODNs). We targeted the acetylcholinesterase (AChE) gene of S. mansoni using two synthetic guide RNAs (gRNAs), X5 and X7, respectively. Liver eggs of S. mansoni were exposed to CRISPR-vector containing X5 or X7 by electroporation. Simultaneously, eggs were transfected with a ssODN donor encoding a stop codon in all six frames. Next generation sequencing analysis revealed that CRISPR/Cas9-mediated editing in S. mansoni eggs resulted in Homology-Directed Repair (HDR) when template DNA ssODN was provided. Furthermore, soluble egg antigen (SEA) from AChE-modified eggs exhibited markedly reduced AChE activity compared with controls, indicating that programmed Cas9 cleavage mutated the AChE gene. Following injection of modified schistosome eggs into the tail veins of mice, a significant decrease in granuloma size was observed in the lungs of these animals. Notably, an enhanced Th2 response induced by eggs in lung, and splenocytes small intestine-draining mesenteric lymph node cells was also generated in mice injected with X5-KI eggs in different methods. These findings further demonstrate the power and utility of CRISPR/Cas9-based genome editing for undertaking functional genomics studies in schistosomes.	Illumina MiSeq	22
EGAD00001006574		Illumina NovaSeq 6000	30
EGAD00001006575	To identify dysfunctional neuronal subtypes underlying seizure activity in the human brain, we have performed single-nucleus transcriptomics analysis of >110,000 neuronal transcriptomes derived from temporal cortex samples of multiple temporal lobe epilepsy and non-epileptic subjects.	NextSeq 500	19
EGAD00001006576	The PMCC AML RNAseq dataset consists of 81 AML patient samples (clinical data in Supplemental Table 11 of manuscript), processed in two batches. These patient samples are able to engraft in the NSG (NOD.Cg PrkdcscidIl2rgtm1Wjl /SzJ) mouse model. Five patients (90543, 598, 90240, 110484, 100500) were included in both batches. Viably frozen material from the Leukemia Tissue Bank at Princess Margaret Cancer Centre/University Health Network was thawed by dropwise addition of X-VIVO + 50% fetal calf serum supplemented with DNase (100μg/mL final concentration, Roche). RNA was extracted from bulk peripheral blood mononuclear cells (PBMC) using the RNeasy Micro Kit (Qiagen Inc.). A paired-end 76 base-pair flow-cell lane Illumina HiSeq 2000 yielded an average of 240 million sequence reads aligning to genome per sample at the Genome Sciences Centre, BC Cancer Agency for cohort 1. Cohort 2 was subjected to 125 bp, paired-end RNA-sequencing on the Illumina HiSeq 2500 with an average of 50 million reads/sample at the Centre for Applied Genomics, Sick Kids Hospital.	Illumina HiSeq 2500	85
EGAD00001006577	RNAseq and WES of liver metastases samples (resections and biopsies) of CM and UM patients	Illumina HiSeq 2500	103
EGAD00001006578	This dataset contains three CRAM files for paired-end sequencing of a trio, sequenced with Illumina HiSeq 2500.	Illumina HiSeq 2500	3
EGAD00001006579	This dataset comprises Circle-seq data for 12 neuroblastoma cell lines supporting Koche et al. Extrachromosomal circular DNA drives oncogenic genome remodeling in neuroblastoma (2020).	Illumina HiSeq 2500 Illumina MiSeq NextSeq 500	12
EGAD00001006580	Circle-seq data for 21 primary neuroblastoma samples supporting Koche et al. Extrachromosomal circular DNA drives oncogenic genome remodeling in neuroblastoma (2020).	Illumina HiSeq 2500 MinION NextSeq 500	21
EGAD00001006581	This is human phenotype data for participants in a gut microbiome study. This data was collected at the same time as the stool samples used for the microbiome component. Participants were also part of the AWI-Gen Phase 1 main study. https://www.ebi.ac.uk/ena/data/view/PRJEB40733		1
EGAD00001006582	To investigate the molecular and biological pathways altered by S1PR3OE in human hematopoietic stem cells (HSC), we performed RNA-sequencing (RNA-seq) of LT- and ST-HSC 3 days after transduction with control or S1PR3 overexpression (OE) lentiviral vectors. LT-HSC and ST-HSC from 3 pools of CB lin- were FACS-purified, cells were prestimulated for 4 hours and transduced with lentiviral vectors. At day 3, 2000-5300 BFP+ cells were FACS-purified for RNA isolation with a PicoPure kit. We were able to isolate only 1600-1800 BFP+ cells from LT-HSC control samples as opposed to 4000-5400 BFP+ cells from S1PR3OE samples. Thus, we pooled all control BFP+ LT-HSC cells into one sample for RNA-seq analysis. BFP- LT-HSC from control vector transduction were purified from CB1 as an additional LT-HSC control. Nextera libraries generated from 10 ng RNA from 5 LT-HSC samples (2 controls, 3 S1PR3OE) and 6 ST-HSC samples (3 controls, 3 S1PR3OE) were subjected to 125 bp, paired-end RNA-sequencing on the Illumina HiSeq 2500 with an average of 50 million reads/sample at the Center for Applied Genomics, Sick Kids Hospital.	Illumina HiSeq 2500	11
EGAD00001006583	Human medulloblastoma xenograft isolated from mouse brain was frozen and genomic DNA or RNA was extracted. Bisulfite converted DNA was processed and hybridised to Illumina EPIC or 450K arrays using standard protocols. RNAseq was performed on total RNA.	unspecified	4
EGAD00001006584	A deeper understanding of the pathological mechanisms of SARS-CoV-2 infection is required to combat COVID-19. Through this dataset, we analyze postmortem lung cells from patients that are infected/uninfected with SARS-CoV-2 with snRNA-seq.	Illumina NovaSeq 6000	10
EGAD00001006585	In our study, we hypothesized that CD34 progenitors from cases with undetectable minimal residual disease (MRD) by flow cytometry would contain cells with leukemic-initiating potential that could be identified on genetic (rather than phenotypic) grounds by Whole Exome Sequencing.	Illumina NovaSeq 6000	30
EGAD00001006587	94 samples with multi-omics analysis of ALT-positive neuroblastoma tumors, RNA sequencing	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	715
EGAD00001006589	The dataset contains transcriptional data obtained using total RNA sequencing on an Illumina machine. 59 samples are from the Hammersmith Hospital (HH) cohort of human primary ovarian tumours and 20 samples are from ovarian cancer cell lines Kuramochi (3 replicates) and Ovsaho (2 replicates) treated with 1 uM DNMTi guadecitabine or vehicle, at an early (day 5) or late (day 8) timepoint.	Illumina HiSeq 2500 Illumina NovaSeq 6000	79
EGAD00001006591	Stem cells within prostate epithelium frequently undergo malignant transformation, but there is limited information on their clonal dynamics and mutation burden in healthy human prostates. We sequenced whole genomes from 409 microdissections of prostate epithelium across 8 donors, using phylogenetic reconstruction with spatial mapping in a 59-year-old man’s prostate to provide high-resolution reconstruction of tissue dynamics across the lifespan. Somatic mutation burden increases linearly with age, at ~16 mutations/year/clone, and is higher in peripheral than peri-urethral regions. The 24-30 independent glandular subunits are established as rudimentary ductal structures during fetal development by 5-10 embryonic cells each. Puberty induces formation of further side branches and terminal acini by local stem cells disseminated through the rudimentary ducts during development. During adult tissue maintenance, clonal expansions are small, with limited geographic scope and minimal migration. Driver mutations are rare in normal ageing prostate epithelium, but the one canonical driver we did observe generated a sizable intraepithelial clonal expansion. By resolving unbiased, continuously occurring lineage-marking mutations, we define stem cell dynamics through embryogenesis, puberty and ageing, with relevance for prostate cancer.	HiSeq X Ten	49
EGAD00001006593	Whole genome sequencing data used in the manuscript: DNA polymerase and mismatch repair exert distinct microsatellite instability signatures in normal and malignant human cells.	Illumina NovaSeq 6000	72
EGAD00001006594	EORTC RP1335 SPECTA Lung cancer data - Oncomine dataset	Ion Torrent PGM	350
EGAD00001006595	This dataset contains 160 single-cell derived blood colonies from two neonates and 6 adults. It also contains 18 samples that were used as matched normals to call mutations in NanoSeq data (dataset EGAD00001006459).	HiSeq X Ten Illumina NovaSeq 6000	13
EGAD00001006596	The goal of this project was to perform long-read RNA sequencing (LR-seq, PacBio) in combination with short-read RNA-seq for systematic characterization of the isoform diversity in primary breast tumor samples. We sequenced the full-length transcriptomes of 26 breast tumors and 4 normal breast samples.	NextSeq 500	25
EGAD00001006597	The goal of this project was to perform long-read RNA sequencing (LR-seq, PacBio) in combination with short-read RNA-seq for systematic characterization of the isoform diversity in primary breast tumor samples. We sequenced the full-length transcriptomes of 26 breast tumors and 4 normal breast samples.	PacBio RS II Sequel	26
EGAD00001006598	The dataset was generated for studying the metastatic mechanism of pancreatic ductal adenocarcinoma (PDAC). It consists of paired-end raw RNA sequencing reads of 33 fresh-frozen PDAC specimens, which include 6 tumor-adjacent normal tissues (N), 13 primary tumors (PT), and 14 hepatic metastases (HM) from 14 PDAC patients (6 N-PT-HM trios, 7 PT-HM pairs, and 1 HM).	HiSeq X Ten	32
EGAD00001006599	This dataset contains paired-end raw whole exome sequencing data of matched primary tumors (PT) and hepatic metastases (HM) of pancreatic ductal adenocarcinoma (PDAC). Eight tumor-adjacent normal tissues (N) were also evaluated. In total, there are 30 specimens generated from 11 PDAC cases, including 8 PT-HM-N trios and 3 PT-HM pairs.	HiSeq X Ten	30
EGAD00001006601	This dataset consists of ATAC-seq data from human monocytes, monocyte-derived dendritic cells or monocyte-derived macrophages as well as monocyte-derived cells that were subjected to siRNA treatment targeting TET2, IRF4 and EGR2. In total, it includes 26 samples.	NextSeq 550	24
EGAD00001006602	This dataset consists of ChIP-seq data from human monocytes, monocyte-derived dendritic cells as well as monocyte-derived cells that were subjected to mRNA transfection for PU.1, IRF4, and EGR2. In total, the dataset includes 18 samples.	Illumina HiSeq 3000	18
EGAD00001006603	This dataset consists of 5hmC capture-seq data from human monocytes, monocyte-derived dendritic cells. It includes two biological replicates and three time points. Including controls, the dataset comprises 10 samples in total.	Illumina HiSeq 1000 NextSeq 550	10
EGAD00001006604	This dataset consists of RNA-seq data from human monocytes, monocyte-derived dendritic cells or monocyte-derived macrophages as well as monocyte-derived cells that were subjected to siRNA treatment targeting TET2, IRF4 and EGR2. In total, it includes 43 samples.	NextSeq 550	43
EGAD00001006608	Single-cell RNA and VDJ sequencing of early breast cancer	Illumina NovaSeq 6000	84
EGAD00001006609	RNASeq files for paper titled "Prognostic and therapeutic significance of leukemia subtypes and minimal residual disease measurements in pediatric acute lymphoblastic leukemia treated with contemporary risk-directed trial: a cohort study"	Illumina HiSeq 2000 Illumina NovaSeq 6000	122
EGAD00001006610	We devised an approach to disentangle the TCR and CD28 pathways upon stimulation in naive and memory primary human CD4+ T cells (Tcons) in response to defined stimulatory signals. Sorted Tcons were activated using a titration of anti-CD3 and anti-CD28 in combination as well as individually. As a control we cultured cells in the same conditions but without the stimuli. In total, we defined seven conditions from four individuals for sequencing. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/.	Illumina HiSeq 2500	74
EGAD00001006611	We devised an approach to disentangle the TCR and CD28 pathways upon stimulation in naive and memory primary human CD4+ T cells (Tcons) in response to defined stimulatory signals. Isolated memory and naïve T cells were activated using anti-CD3, anti-CD28 or both in combination. As a control we cultured cells in the same conditions but without the stimuli. We carried out ChIPmentation using the H3K27ac antibody on 200,000 cross-linked cells. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute please see http://www.sanger.ac.uk/datasharing/ .	Illumina HiSeq 2500	18
EGAD00001006612	We devised an approach to disentangle the TCR and CD28 pathways upon stimulation in naive and memory primary human CD4+ T cells (Tcons) in response to defined stimulatory signals. Sorted Tcons were activated using a titration of anti-CD3 and anti-CD28 in combination as well as individually. As a control we cultured cells in the same conditions but without the stimuli. In total, we defined seven conditions from four individuals for sequencing. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ .	Illumina HiSeq 2500	30
EGAD00001006613	linking 82 samples/82 runs of WES from EGAS00001004338 Umbrella study to EGAS0001004786 study	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	-
EGAD00001006614	linking 58 samples/58 runs of WGS from EGAS00001004338 Umbrella study to EGAS0001004786 study	HiSeq X Ten	-
EGAD00001006615	linking 55 samples/74 RNA-Seq runs - out of EGAS00001004338 Umbrella study to EGAS0001004786	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	-
EGAD00001006616	This dataset contains somatic alteration calls summarized at the gene level for 715 patients profiled by FoundationOne (Foundation Medicine).		1538
EGAD00001006617	This dataset contains demographic, histology, PDL1 IHC, TMB and outcome data (PFS and ORR) for 836 patients, 823 of which had RNAseq, 715 had FMI, with 702 patients having both. The dataset also includes xCell deconvolution scores for patients with RNAseq data.		1538
EGAD00001006618	This dataset contains log2(TPM + 1) transformed counts for the 823 tumor samples profiled by RNAseq.		1538
EGAD00001006619	This dataset contains FASTq files for tumors from the 823 patients profiled by RNAseq.	Illumina HiSeq 4000	823
EGAD00001006620	RNAseq of tumours derived from NSCLC (n=3 PDX; n=1 CDX) and melanoma (n=1 CDX) xenograft models treated with ADC or controls (64 samples; 189 FASTQ files).	Illumina HiSeq 2500 Illumina NovaSeq 6000	64
EGAD00001006621	This dataset contains DNA sequencing data for twelve hepatoblastoma tumor samples, four of which have matched normals. Nine of the samples also have RNA sequencing data from the tumor sample. The dataset comprises six samples from patient tissues and six from cell lines; see metadata for details.	Illumina HiSeq 2500 Illumina HiSeq 4000	12
EGAD00001006622	This dataset comprises DNA sequencing for tumor and matched normal from an Alveolar Rhabdomyosarcoma patient. It also includes RNA sequencing data from the tumor sample.	Illumina HiSeq 4000	1
EGAD00001006623	We report a patient with mycobacterial disease due to inherited deficiency of the transcription factor T-bet. PBMCs from the patient and his heterozygous father were analyzed with scRNA-seq. These represent 2 single-cell RNA samples generated using the 10x Genomics technology and processed through Cell Ranger.	Illumina HiSeq 4000	2
EGAD00001006624	31 single-cell transcriptomes of neuroblastomas and normal human developing adrenal glands at various stages of embryonic and fetal development	Illumina HiSeq 4000 Illumina NovaSeq 6000	31
EGAD00001006625	144 samples from individuals with ALT-positive neuroblastoma tumors, ChIP-seq sequencing	Illumina HiSeq 2000 Illumina HiSeq 2500	144
EGAD00001006626	238 samples from individuals with ALT-positive neuroblastoma tumors, high coverage whole genome sequencing	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500	238
EGAD00001006627	Fastq files for single cell RNA sequencing of cells from ovarian cancer tumor and ascites (10X chromium 5' v1.1 libraries). Multiple cases are pooled in each sequencing library.	Illumina HiSeq 4000	2
EGAD00001006628	RNA was extracted from fresh frozen LMS material for 29 untreated tumors (24 primary tumors, 5 metastatic relapses) and 13 tumors treated with radiation (7 primaries, 6 metastatic relapses). RNA-Seq sequencing was performed using established protocols on Illumina HiSeq 2500.	Illumina HiSeq 2500	51
EGAD00001006629	The dataset contains FASTQ files referring to the study "Small RNA sequencing from CSF extracellular vesicles - PD/CTR". For this project, RNA was isolated from CSF extracellular vesicles obtained by ultracentrifugation. Libraries were prepared with the TruSeq Small RNA library prep Illumina, and sequencing conducted in the Illumina HiSeq4000.	Illumina HiSeq 4000	104
EGAD00001006630	Clinical data from IMvigor210, POPLAR, IMmotion150: Clinical data include demographics, tumor type, PD-L1 IHC, tumor mutation burden, objective response rate, overall survival and progression free survival for 611 patients across IMvigor210, POPLAR and IMmotion150. Clinical data from PCD4989g: Clinical data include tumor type, PD-L1 expression and objective response rate for 206 patients from PCD4989g.		1651
EGAD00001006631	RNAseq FASTq files from 817 bulk pre-treatment tumors from three indications (mUC, NSCLC and RCC) across three phase II (IMvigor210, POPLAR, IMmotion150) and a phase I (PCD4989g) clinical trials.	Illumina HiSeq 4000	817
EGAD00001006632	Whole exome sequencing FASTq files from 469 pre-treatment tumors from IMvigor210, POPLAR and IMmotion150, with matched PBMC samples.	Illumina HiSeq 4000	834
EGAD00001006634	We have in total 16 files, technical duplicates of 8 unique samples from Pre and Post BCG samples collected from four non muscle invasive bladder cancer patients. These are bulk RNAseq samples generated by high-throughput sequencing platform.	Illumina HiSeq 4000	16
EGAD00001006636	RNA sequencing of a total of 41 tumor biopsies taken from a total of 14 patients with colorectal cancer. Ribosomal RNA was removed using the Ribo-Zero Gold rRNA Removal Kit (Illumina, CA, USA) and paired-end sequencing was performed using the ScriptSeq v2 RNA-seq Library preparation Kit (Illumina). Data processing of the paired raw sequence reads was performed using TopHat2, with mapping to the human reference genome HG19. Forty-one BAM files with reads mapping to the human reference genome (HG19) are enclosed.	NextSeq 500	41
EGAD00001006637	89 samples of individuals with ALT-positive neuroblastoma tumors, exome sequencing	Illumina HiSeq 2500 Illumina HiSeq 4000	89
EGAD00001006638	97 samples of individuals with ALT-positive neuroblastoma tumors, low coverage whole genome sequencing	Illumina HiSeq 2500 Illumina HiSeq 4000	97
EGAD00001006639	63 samples of individuals with ALT-positive neuroblastoma tumors, high coverage whole genome sequencing	HiSeq X Ten	63
EGAD00001006640	Aligned whole-genome sequencing and RNA-seq of localised prostate cancer for study 'Loss of SNAI2 in prostate cancer correlates with clinical response to androgen deprivation therapy'.	HiSeq X Ten	109
EGAD00001006641	During the course of a lifetime normal human cells accumulate mutations. Here, using multiple samples from the same individuals we compared the mutational landscape in 29 anatomical structures from soma and the germline. Two ubiquitous mutational signatures, SBS1 and SBS5/40, accounted for the majority of acquired mutations in most cell types but their absolute and relative contributions varied substantially. SBS18, potentially reflecting oxidative damage, and several additional signatures attributed to exogenous and endogenous exposures contributed mutations to subsets of cell types. The mutation rate was lowest in spermatogonia, the stem cell from which sperm are generated and from which most genetic variation in the human population is thought to originate. This was due to low rates of ubiquitous mutation processes and may be partially attributable to a low cell division rate of basal spermatogonia. The results provide important insights into how mutational processes affect the soma and germline.	HiSeq X Ten Illumina NovaSeq 6000	1
EGAD00001006642	During the course of a lifetime normal human cells accumulate mutations. Here, using multiple samples from the same individuals we compared the mutational landscape in 29 anatomical structures from soma and the germline. Two ubiquitous mutational signatures, SBS1 and SBS5/40, accounted for the majority of acquired mutations in most cell types but their absolute and relative contributions varied substantially. SBS18, potentially reflecting oxidative damage, and several additional signatures attributed to exogenous and endogenous exposures contributed mutations to subsets of cell types. The mutation rate was lowest in spermatogonia, the stem cell from which sperm are generated and from which most genetic variation in the human population is thought to originate. This was due to low rates of ubiquitous mutation processes and may be partially attributable to a low cell division rate of basal spermatogonia. The results provide important insights into how mutational processes affect the soma and germline.	Illumina HiSeq 4000 Illumina NovaSeq 6000	-
EGAD00001006643	During the course of a lifetime normal human cells accumulate mutations. Here, using multiple samples from the same individuals we compared the mutational landscape in 29 anatomical structures from soma and the germline. Two ubiquitous mutational signatures, SBS1 and SBS5/40, accounted for the majority of acquired mutations in most cell types but their absolute and relative contributions varied substantially. SBS18, potentially reflecting oxidative damage, and several additional signatures attributed to exogenous and endogenous exposures contributed mutations to subsets of cell types. The mutation rate was lowest in spermatogonia, the stem cell from which sperm are generated and from which most genetic variation in the human population is thought to originate. This was due to low rates of ubiquitous mutation processes and may be partially attributable to a low cell division rate of basal spermatogonia. The results provide important insights into how mutational processes affect the soma and germline.	Illumina HiSeq 4000	85
EGAD00001006644	This dataset corresponds to single-cell RNA data from 3 patients with HPV-driven warts, generated through the 10x Genomics platform and aligned to the GRCh38 reference using the Cell Ranger tools.	Illumina NovaSeq 6000 NextSeq 550	4
EGAD00001006645	10 samples (one baseline, 9 on-treatment). Fastq files containing 5'GEx data, prepared using 10x Genomics pipeline, sequenced on Illumina HiSeq4000.	Illumina HiSeq 4000	10
EGAD00001006646	The data set consists of fastq raw files from RNA-seq of seven mucosal biopsies of the colon from seven patients, among them three patients with irritable bowel syndrome with mixed type symptoms. Paired end sequencing on Illumina NovaSeq 6000 was used.	Illumina NovaSeq 6000	7
EGAD00001006648	Genetic redundancy has evolved as a way for human cells to survive the loss of genes that are single copy and essential in other organisms, but also allows tumours to survive despite having highly rearranged genomes. In this study we CRISPR screen 1,191 gene pairs, including paralogues and known and predicted synthetic lethal interactions to identify 105 gene combinations whose co-disruption results in a loss of cellular fitness. 27 pairs influence fitness across multiple cell lines including the paralogues FAM50A/FAM50B, two genes of unknown function. Silencing of FAM50B occurs across a range of tumour types and in this context disruption of FAM50A reduces cellular fitness whilst promoting micronucleus formation and extensive perturbation of transcriptional programmes. This dataset includes CRISPR screening of cancer cell lines, RNA sequencing studies of cancer cell lines and also data from the sequencing of tumour xenografts collected from mice.	Illumina HiSeq 2500	27
EGAD00001006649	Genetic redundancy has evolved as a way for human cells to survive the loss of genes that are single copy and essential in other organisms, but also allows tumours to survive despite having highly rearranged genomes. In this study we CRISPR screen 1,191 gene pairs, including paralogues and known and predicted synthetic lethal interactions to identify 105 gene combinations whose co-disruption results in a loss of cellular fitness. 27 pairs influence fitness across multiple cell lines including the paralogues FAM50A/FAM50B, two genes of unknown function. Silencing of FAM50B occurs across a range of tumour types and in this context disruption of FAM50A reduces cellular fitness whilst promoting micronucleus formation and extensive perturbation of transcriptional programmes. This dataset includes CRISPR screening of cancer cell lines, RNA sequencing studies of cancer cell lines and also data from the sequencing of tumour xenografts collected from mice.	Illumina HiSeq 4000 Illumina NovaSeq 6000	55
EGAD00001006650	This dataset contains Raw Reduced Representation DNA bisulfite-sequencing data obtained from human brain samples corresponding to 3 Young and 3 Old individuals (aging context), and 3 normal and 3 Glioblastoma samples (tumor context). RRBS libraries were prepared at Diagenode SA and samples were sequenced using the Illumina Novaseq6000 sequencing platform. The accompanying samples from this study (mouse tissues) are located at the ENA database under the accession number PRJEB41460.	Illumina NovaSeq 6000	12
EGAD00001006653	This dataset contains paired-end whole-exome sequencing data (2x50 bp) from the normal sample, three synchronous primary tumors and the recurrence of a head and neck cancer patient.	Illumina NovaSeq 6000	5
EGAD00001006654	This dataset contains paired-end RNA sequencing data (2x50 bp) from the three synchronous primary tumors and the recurrence of a head and neck cancer patient.	Illumina HiSeq 2500	4
EGAD00001006655	This dataset contains PBMC genome-wide RNAseq reads from 21 samples and one expression matrix file after alignment and aggregation of the 21 samples. The samples are case-control drawn on day 6 from long-term GFD treated CD patients after 3 day oral gluten challenge, on day 0 from patient controls on long-term GFD treatment, and on day 6 from 4 week GFD treated healthy controls after 3 day oral gluten challenge.	Illumina HiSeq 2000	21
EGAD00001006656	For this project about non-muscle invasive bladder cancer (NMIBC), we analysed total RNA-seq data from 535 patients. Sequencing of total RNA was performed using ScriptSeq-v2 RNA-Seq Library Preparation Kit (Illumina) and KAPA RNA HyperPrep Kit with RiboErase HMR (Roche). RNA input was 500 ng for both kits. The dataset is composed of 1,596 fastq files.	Illumina HiSeq 2000 Illumina MiSeq Illumina NovaSeq 6000 NextSeq 550	535
EGAD00001006657	This dataset entails 40 Bulk-RNA sequenced patient-derived gastro-intestinal neuroendocrine (GEP-NEN) neoplasms.	Illumina HiSeq 4000	40
EGAD00001006658	Germline exome tumor/control pairs for 41 medulloblastoma cases, MBG cohort sequenced on Illumina machines from the paper "Germline Elongator mutations in Sonic Hedgehog medulloblastoma" (Waszak et al. 2020 Nature).	HiSeq X Ten unspecified	53
EGAD00001006659	218 control exomes, CEF cohort, sequenced on Illumina machines from the paper "Germline Elongator mutations in Sonic Hedgehog medulloblastoma" (Waszak et al. 2020 Nature).	Illumina HiSeq 2000 Illumina HiSeq 2500 unspecified	218
EGAD00001006660	Tumor/Control pairs for 8 medulloblastoma cases, MB cohort, mixed exome and whole genome data, sequenced on Illumina machines from the paper "Germline Elongator mutations in Sonic Hedgehog medulloblastoma" (Waszak et al. 2020 Nature).	HiSeq X Ten Illumina HiSeq 4000	31
EGAD00001006661	Exome controls for 70 individuals, PAN-GATC cohort, sequenced on Illumina machines from the paper "Germline Elongator mutations in Sonic Hedgehog medulloblastoma" (Waszak et al. 2020 Nature).	Illumina HiSeq 2000	70
EGAD00001006662	3 control whole genomes, SF cohort, sequenced on Illumina machines from the paper "Germline Elongator mutations in Sonic Hedgehog medulloblastoma" (Waszak et al. 2020 Nature).	HiSeq X Ten	3
EGAD00001006663	9 tumor/control exomes, SJMB samples, sequenced on Illumina machines from the paper "Germline Elongator mutations in Sonic Hedgehog medulloblastoma" (Waszak et al. 2020 Nature).	unspecified	18
EGAD00001006664	6 control exomes, TB cohort, sequenced on Illumina machines from the paper "Germline Elongator mutations in Sonic Hedgehog medulloblastoma" (Waszak et al. 2020 Nature).	unspecified	6
EGAD00001006665	225 clinical cases, control exomes with some paired tumor data, sequenced on Illumina machines from the paper "Germline Elongator mutations in Sonic Hedgehog medulloblastoma" (Waszak et al. 2020 Nature).	unspecified	264
EGAD00001006666	This dataset contains 45 pairs of colorectal tumor and adjacent normal tissue. Four of them are from a previous study, EGAS00001002477. For each sample a BAM was generated by aligning to GRCh37. These colorectal cancers all have the MSI phenotype.	Illumina HiSeq 2000	82
EGAD00001006667	This dataset contains 133 pairs of colorectal tumor and adjacent normal tissue. For each sample paired RNA-seq fastq were generated using an Illumina Myseq-2000. These colorectal cancers comprise 101 MSI and 32 MSS tumors.	Illumina HiSeq 2000	266
EGAD00001006668	WES performed on 15 CUP-derived samples	Illumina NovaSeq 6000	15
EGAD00001006669	This dataset contains 91 RNAseq paired reads, in fastq format. Samples were collected from fresh bone marrow and peripheral blood samples from AML patients.	Illumina HiSeq 2500	91
EGAD00001006670	This dataset contains 18 ATACseq reads, in fastq format. Samples were collected from fresh bone marrow and peripheral blood samples from AML patients.	Illumina HiSeq 2500	18
EGAD00001006671	The SARS-CoV-2 pandemic has led to increasing numbers of COVID-19 patients all over the world. Aetiopathologies range from no symptoms, mild flu-like to severe cases succumbing to respiratory failure. Reports on a dysregulated immune system in the severe cases, showing similarities to cytokine release syndrome, call for better characterization and understanding of the changes in the immune system as well as their variance across COVID-19 patients in order to be able to design host-directed therapies accordingly. Here, we profiled blood transcriptomes of 39 COVID-19 patients and 10 control donors. Enriched granulocyte signatures in whole blood samples were verified in granulocyte samples from 49 COVID-19 patients in a second cohort.	NextSeq 500	41
EGAD00001006673	Please note: This synthetic data set (with cohort “participants” / ”subjects” marked with FAKE) has no identifiable data and cannot be used to make any inference about cohort data or results. The purpose of this dataset is to aid development of technical implementations for cohort data discovery, harmonization, access, and federated analysis. In support of FAIRness in data sharing, this dataset is made freely available under the Creative Commons Licence (CC-BY). Please ensure this preamble is included with this dataset and that the CINECA project (funding: EC H2020 grant 825775) is acknowledged. For any questions please contact isuru@ebi.ac.uk or cthomas@ebi.ac.uk This dataset (CINECA_synthetic_cohort_EUROPE_UK1) consists of 2521 samples which have genetic data based on 1000 Genomes data (https://www.nature.com/articles/nature15393), and synthetic subject attributes and phenotypic data derived from UKBiobank (https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1001779). These data were initially derived using the TOFU tool (https://github.com/spiros/tofu), which generates randomly generated values based on the UKBiobank data dictionary. Categorical values were randomly generated based on the data dictionary, continuous variables generated based on the distribution of values reported by the UK Biobank showcase, and date / time values were random. Additionally we split the phenotypes and attributes into 4 main classes - general, cancer, diabetes mellitus, and cardiac. We assigned the general attributes to all the samples, and the cardiac / diabetes mellitus / cancer attributes to a proportion of the total samples. Once the initial set of phenotypes and attributes were generated, the data was checked for consistency and where possible dependent attributes were calculated from the independent variables generated by TOFU. 
For example, BMI was calculated from height and weight data, and age at death generated by date of death and date of birth. These data were then loaded to the development instance of Biosamples (https://www.ebi.ac.uk/biosamples/) which accessioned each of the samples. The genetic data are derived from the 1000 Genomes Phase 3 release (https://www.internationalgenome.org/category/phase-3/). The genotype data consists of a single joint-call VCF file with called genotypes for all 2504 samples, plus bed, bim, fam, and nosex files generated via plink for these samples and genotypes. The genotype data has had a variety of errors introduced to mimic real data and as a test for quality control pipelines. These include gender mismatches, ethnic background mislabelling and low call rates for a randomly chosen subset of sample data as well as deviations from Hardy-Weinberg equilibrium and low call rates for a random selection of variants. Additionally 40 samples have raw genetic data available in the form of both bam and cram files, including unmapped data. The gender of the samples in the 1000 genomes data has been matched to the synthetic phenotypic data generated for these samples. The genetic data was then linked to the synthetic data in BioSamples, and submitted to EGA.	Illumina HiSeq 2000	448
EGAD00001006674	RNASeq files for paper titled "The Acquisition of Molecular Drivers in Pediatric Therapy-Related Myeloid Neoplasms"	Illumina HiSeq 2000	56
EGAD00001006675	WXS files for paper titled "The Acquisition of Molecular Drivers in Pediatric Therapy-Related Myeloid Neoplasms"	Illumina HiSeq 2000	137
EGAD00001006676	WGS files for paper titled "The Acquisition of Molecular Drivers in Pediatric Therapy-Related Myeloid Neoplasms"	Illumina HiSeq 2000	35
EGAD00001006677	Single cell RNA-sequencing of sternal bone marrow residing Hematopoietic Stem Cells (HSCs) and Megakaryocytes (MKs) from individuals undergoing elective open heart valve replacement. HSCs were defined as Lineage-, CD34+, CD38-, CD45RA-, CD90+, CD49f+ cells. MKs were CD41a+, CD42b+ and ploidy was determined with Hoechst. A sternal bone marrow scraping was taken directly following median sternotomy using a Volkmann’s spoon. The sample was collected into an EDTA Vacutainer tube containing 1.8mg/ml EDTA. 4mL of Dulbecco’s phosphate buffered saline (PBS, Sigma) containing 10% human serum albumin (HSA, Gemini Bio Products) was added and the whole volume was resuspended by pipetting 2-3 times. The sample was then put on metallic thermal beads (ThermoFisher Scientific) at a temperature between 0-4°C and transported to the University of Cambridge for further processing. For HSC isolation the cells were stained with the following antibody cocktail: PECy5 conjugated anti-lineage specific antibodies: CD2 (BD), CD3 (BD), CD10 (BD), CD11b (BD), CD11c (BD), CD19 (BD), CD20 (BD), CD56 (BD), biotinylated CD42b (Pab5, NHS Blood and Transplant, International Blood Group Reference Laboratory [IBGRL]), biotinylated GP6 (Pab5, NHS Blood and Transplant, International Blood Group Reference Laboratory [IBGRL]) used in combination with PECy5 conjugated streptavidin (Biolegend). Alexa Fluor 700 conjugated anti-CD34 (BD), PerCP-Cy5.5 conjugated anti-CD38 (BD), Pacific Blue conjugated anti-CD45RA (Invitrogen), PECy7 conjugated anti-CD90 (BD), PE conjugated anti-CD49f (BD). After staining cells were kept at 4°C before sorting using a FACS Aria Fusion flow sorter (BD). Single HSCs defined as Lineage-, CD34+, CD38-, CD45RA-, CD90+, CD49f+ cells were sorted by FACS directly into individual wells of a 96-well plate. Index sort data was collected for each single cell. 
For MK isolation the cells were stained for surface MK markers with mouse anti-human CD41a APC conjugated antibody (BD) and mouse anti-human CD42b PE conjugated antibody (BD) and for ploidy analysis with 1ug/ml Hoechst 33342 (Invitrogen). After incubation at 37°C for 30 minutes, the cells were kept at 4°C before sorting using a FACS Aria Fusion flow sorter (BD). Single cells and MK pools of 20 cells were sorted by FACS according to ploidy level using a 100 µm nozzle directly into individual wells of a 96-well plate. cDNA synthesis and poly(A) enrichment was performed following the G&T-seq protocol (Macaulay et al. 2015), a variation of the Smart-seq2 protocol. ERCC spike-in RNA (Ambion) was added to the lysis buffer in a dilution of 1:4,000,000.	Illumina HiSeq 4000	2383
EGAD00001006678	HiC files for GenomePaint paper titled "Exploration of coding and non-coding variants in cancer using GenomePaint."	Illumina HiSeq 2000	8
EGAD00001006679	WGS files for GenomePaint paper titled "Exploration of coding and non-coding variants in cancer using GenomePaint."	Illumina HiSeq 2000	1
EGAD00001006680	RNASeq files for GenomePaint paper titled "Exploration of coding and non-coding variants in cancer using GenomePaint."	Illumina HiSeq 2000	1
EGAD00001006681	DNAs were genotyped on Illumina Infinium HumanCoreExome Beadchips (Illumina Inc., San Diego, CA, USA) with probes for 551,004 single nucleotide variants (SNVs): 282,373 informative across ancestries; 268,631 exome-focused. Human genome build 37 (hg19) was used.		1
EGAD00001006701	As more clinically-relevant genomic features of myeloid malignancies are revealed, it has become clear that targeted clinical genetic testing is inadequate for risk stratification. Here, we developed and validated a clinical transcriptome-based assay for stratification of acute myeloid leukemia (AML). Comparison of RNA-Seq to whole genome and exome sequencing revealed that as a standalone assay, RNA-Seq offered the greatest diagnostic return, enabling identification of expressed gene fusions, single nucleotide and short insertion/deletion variants, and whole-transcriptome expression information. Expression data were used to develop a novel risk score which, when combined with molecular risk guidelines, allowed for the re-stratification of 22.1 to 25.3% of AML patients from three independent cohorts into correct risk groups. Within the adverse-risk subgroup, we identified a subset of patients characterized by dysregulated integrin signaling and RUNX1 or TP53 mutation. We show that these patients may benefit from therapy with inhibitors of focal adhesion kinase (PTK2), demonstrating additional utility of transcriptome-based testing for therapy selection in myeloid malignancy.		275
EGAD00001006730	The dataset includes whole exome DNA sequencing on pre-treatment tumor biopsies of lymph node metastases (n=60) matched with blood samples (n=60)	Illumina NovaSeq 6000	120
EGAD00001006731	The dataset includes RNA sequencing on pre-treatment tumor biopsies of lymph node metastases (n=65)	Illumina HiSeq 2500	65
EGAD00001006732	Mutational signatures in esophageal squamous cell carcinoma from eight countries of varying incidence – patient metadata (Mutographs)		-
EGAD00001006733	This dataset contains all available targeted exon sequencing bam files from our study, "BRCA2, ATM, and CDK12 defects differentially shape prostate tumor driver genomics and clinical aggression". Patient identifiers are denoted by the first segment of the sample aliases (e.g. "P1"), and additional information is appended to reflect which sample is referenced. These include serial cfDNA samples ("-1", "-2", "-3", etc.), paired white blood cell or benign tissue control samples ("-Control"), or primary archival tissue samples derived from a diagnostic biopsy, prostatectomy, or transurethral resection of the prostate ("-Tissue"). All samples were sequenced using Illumina technology.	Illumina HiSeq 2500	368
EGAD00001006734	Human fecal WMS data from patients treated with combined anti-CTLA-4 and anti-PD-1 immunotherapy for advanced melanoma.	NextSeq 550	46
EGAD00001006735	Human fecal 16S rRNA gene sequencing data from patients treated with combined anti-CTLA-4 and anti-PD-1 immunotherapy for advanced melanoma.	Illumina MiSeq	54
EGAD00001006736	Genotyping of 244 early RA patients and 44 vaccine recipient controls was performed using the Illumina InfiniumCoreExome-24-v1-1 according to the manufacturer’s SOP. Raw idats from the Illumina iScan instrument were imported into GenomeStudio (v2011.1). Samples < 90 % call rate were excluded. Data was exported to PLINK PED/MAP format on the forward strand. Data was converted from PED/MAP to BED/BIM/FAM using PLINK v1.07.		276
EGAD00001006737	Proteome data of neuroblastoma patients		34
EGAD00001006738	Data supporting: "Evidence that polyploidy in esophageal adenocarcinoma originates from mitotic slippage caused by defective chromosome attachments", Scott et al. WGS and RNAseq sequencing data. Organoid, tumour and normal samples. BAM files.	Illumina HiSeq 2000	7
EGAD00001006739		HiSeq X Ten	17
EGAD00001006740	40 samples of WES and their normal controls; 33 samples of RNAseq data.	Illumina HiSeq 4000	113
EGAD00001006741	Matrices of TPM-normalized counts from RNAseq data for the three phase II clinical trials (IMvigor210, POPLAR, IMmotion150) and the phase I clinical trial PCD4989g.		817
EGAD00001006742	Pan Prostate Cancer Group UK BAM files		561
EGAD00001006743	We performed whole-exome sequencing on 46 pairs of PSCCE and matched normal samples. Somatic mutations were called using MuTect2.		46
EGAD00001006744	RNA exome	Illumina NovaSeq 6000	1
EGAD00001006745	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006746	Non-tumorous breast tissues from BRCA1 or BRCA2 carriers were subject to RNA sequencing. The total number of samples is 130.	Illumina HiSeq 4000	130
EGAD00001006747	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006748	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006749	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006750	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006751	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006752	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006753	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006754	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006755	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006756	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006757	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006758	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006759	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006760	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006761	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006762	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006763	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006764	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006765	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006766	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006767	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006768	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006769	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006770	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006771	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006772	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006773	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006774	CSC DDR dataset contains 4 bam files of two pairs of colorectal CSCs sensitive and resistant to ATR or CHK1 inhibitor	NextSeq 500	1
EGAD00001006777	We generated HPRT-only knockout lines as well as the combination of HPRT with MSH2, UNG and XPC. Whole genome sequencing was performed on generated clones and subclones. By subtracting variants present in the clones from those in the subclones, the somatic mutations, that accumulated in between the clonal steps, were determined.	HiSeq X Ten Illumina NovaSeq 6000	11
EGAD00001006778	Single cell RNA and CITE sequencing of newly-diagnosed and recurrent GBM	Illumina HiSeq 4000 Illumina NovaSeq 6000	15
EGAD00001006779	T cells were isolated from human blood and tissues: skin and fat. Subsequently, CD4+ T cells and CD25+ T cells were FACS sorted. scATAC libraries were prepared using the 10x Genomics Kit (CG000168_ChromiumSingleCell_ATAC_ReagentKits_UserGuide_RevD.pdf) and sequenced on an Illumina NextSeq550. In total 15 samples were prepared	NextSeq 500	15
EGAD00001006780	Whole genome sequencing BAMs of DNA obtained from blood/saliva of 8 patients with adult granulosa cell tumors. These patients come from four independent families, with each family having 2 affected family members.	Illumina NovaSeq 6000	8
EGAD00001006781	This dataset contains RNA-seq raw data in fastq format from 14 tumor samples. The samples are from primary tumors or metastasis and represent various cancer entities. The samples are formalin-fixed paraffin-embedded (FFPE) treated. For target enrichment SureSelect XT Human All Exon V6 was used. The libraries were sequenced in paired-end mode (2 x 50 nt) on a NovaSeq6000 S2 flow cell.	Illumina NovaSeq 6000	14
EGAD00001006782	In this study, we aimed to identify somatic structural variation of acute myeloid leukemia (AML) at the single-cell level and investigate its direct consequence on the nucleosome occupancy using the scNOVA approach. For this purpose, we performed strand-specific single-cell sequencing of primary leukemia samples from a 32-year-old male donor.	NextSeq 500	42
EGAD00001006783	This dataset contains RNA-seq raw data in fastq format from 9 melanoma samples. The samples are formalin-fixed paraffin-embedded (FFPE) treated. For target enrichment SureSelect XT Human All Exon V6 was used. The libraries were sequenced in paired-end mode (2 x 50 nt) on a NovaSeq6000 S2 flow cell.	Illumina NovaSeq 6000	9
EGAD00001006784	This study aims to identify novel candidate variants from human Y-chromosomal genes DAZ, BPY2 and CDY1/2 by resequencing the coding regions of these genes from male patients with spermatogenic impairment. The coding regions of the genes plus a selection of phylogenetically informative Y-chromosomal markers have been amplified by standard PCR, amplicon lengths range from 178 to 486 bp. Amplicons were quantified by gel electrophoresis and pooled in approx. equimolar concentrations per patient. For each of the 480 submitted samples, approx. 1 microgram of amplified DNA pool was provided in a total volume of 120 microlitres. The samples were indexed and libraries prepared for PE250bp Illumina MiSeq runs. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ .	Illumina MiSeq	480
EGAD00001006785		Illumina HiSeq 4000	9
EGAD00001006786	Multi region samples are collected from patients, with consent, immediately after resection of the tumour. Samples are digested and sorted using FACS as single cells into lysis buffer. Cells are then stored until further processing for G&T-seq. After sequencing, we will explore intra-tumour heterogeneity using computational approaches to integrate RNA and DNA data onto the tumour phylogeny. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ .	Illumina HiSeq 4000	672
EGAD00001006789	eQTLsummary&GeneTable from eQTL study in 299 intestinal biopsy samples from IBD		1
EGAD00001006790	intrstinal.eQTLsummary		1
EGAD00001006791	eQTL.study.release.phenotype		1
EGAD00001006792	eQTLsummary&GeneTable		1
EGAD00001006793	Whole genome sequencing of tumour (90X) - normal (30X) patient pair and bulk transcriptome sequencing (80M PE reads) of tumour sample.	HiSeq X Ten NextSeq 500	1
EGAD00001006794	Bulk RNA-seq data from a tumour sample.	NextSeq 500	1
EGAD00001006795	Plasmablastic lymphoma (PBL) represents a clinically heterogeneous subtype of aggressive B-cell non-Hodgkin lymphoma. Although targeted sequencing studies and a single center whole exome sequencing (WES) study in HIV+ patients recently revealed several genes associated with PBL pathogenesis, the global mutational landscape and transcriptional profile of PBL remain elusive. To inform on disease-associated mutational drivers, mutational patterns and perturbed pathways in HIV+ and HIV- PBL we performed WES and RNA-sequencing (RNA-seq) of 34 PBL tumors.	Illumina HiSeq 2500	73
EGAD00001006796	PBMCs isolated from 27 individuals (11 narcolepsy type 1 patients, 16 healthy controls) were stimulated with the peptide Neuraminidase 175-189 or Protein-O-mannosyl transferase 1 (POMT1) 675-689 or media as control. FACS sorted CD4+ and CD8+ lymphocytes from one patient were subjected to the same stimulation. Transcriptome profiling was done with a 3' tagging protocol. T cell receptor repertoires were profiled with amplicon sequencing (Rep-seq). The dataset contains FASTQ files with sequencing reads, transcript count matrices and TCR clonotypes.	Illumina MiSeq NextSeq 500	82
EGAD00001006797	PBMCs isolated from 27 individuals (11 narcolepsy type 1 patients, 16 healthy controls) were stimulated with the peptide Neuraminidase 175-189 or Protein-O-mannosyl transferase 1 (POMT1) 675-689 or media as control. FACS sorted CD4+ and CD8+ lymphocytes from one patient were subjected to the same stimulation. Transcriptome profiling was done with a 3' tagging protocol. T cell receptor repertoires were profiled with amplicon sequencing (Rep-seq). The dataset contains FASTQ files with sequencing reads, transcript count matrices and TCR clonotypes.	Illumina MiSeq NextSeq 500	82
EGAD00001006798	eQTL.study.release.inflammation.eQTLsummary		1
EGAD00001006799	This dataset includes single cell amplicon based sequencing from 10 samples from SDS patient bone marrow samples, including one patient with serial samples. There are two fastq files per sample.	Illumina NovaSeq 6000	20
EGAD00001006800	This dataset includes amplicon based sequencing of myeloid malignancy associated genes as well as EIF6 in 99 patients with Shwachman-Diamond syndrome and 11 patients who are "SDS-like". SDS-like patients are those that have clinical features of the disease but do not have a confirmed disease-causing mutation. There are 421 samples from serial timepoints, denoted alphabetically or numerically. For each timepoint, there is a single BAM file.	HiSeq X Ten	421
EGAD00001006801	RNA-seq of dermal fibroblasts treated ± TGF-β from control and Shprintzen-Goldberg syndrome patients.	Illumina HiSeq 4000	36
EGAD00001006802	Single nuclei RNA-sequencing of snap frozen glioblastoma tumor tissue with 10x Genomics 3' expression (v2 chemistry). Aligned to GRCh38 reference genome with intron using CellRanger.	Illumina HiSeq 2500	10
EGAD00001006803	Single cell RNA-sequencing of fresh glioblastoma tumor biopsies with 10x Genomics 3' expression (v2 chemistry). Aligned to GRCh38 reference genome using CellRanger.	Illumina HiSeq 2500	23
EGAD00001006804	Single cell RNA-sequencing of glioblastoma stem cell (GSC) lines with 10x Genomics 3' expression (v2 chemistry). Aligned to GRCh38 reference genome using CellRanger.	Illumina HiSeq 2500	29
EGAD00001006807	Neutrophils at timepoint 0h	HiSeq X Ten	3
EGAD00001006808	Neutrophils infected with Leishmania donovani at timepoint 6h	HiSeq X Ten	6
EGAD00001006809	Neutrophils at timepoint 6h	HiSeq X Ten	3
EGAD00001006811	Dataset includes cell-free ChIP-seq data of 268 samples (from 61 self-declared healthy donors, four patients with acute myocardial infarction, 29 patients suffering from autoimmune, metabolic, or viral liver diseases and 56 metastatic colorectal carcinoma (CRC) patients). DNA libraries preparation is documented in the methods section. Libraries were paired end sequenced by Illumina NextSeq 500 and aligned to the human genome (hg19) using bowtie2 (2.3.4.3) with ‘no-mixed’ and ‘no-discordant’ flags. This dataset includes fastq and BAM files of all samples.	NextSeq 500	271
EGAD00001006812	This dataset includes the RNA sequencing of 14 samples. Samples are FACS sorted CD8+ T cells expressing or not the integrin CD103. The paired samples (TRM and non-TRM) were sorted from the tumor of 7 lung cancer patients.	Illumina HiSeq 2000	14
EGAD00001006813	RNA was extracted from GSCs using the Qiagen RNeasy Plus kit. RNA sample quality was measured by Qubit (Life Technologies) for concentration and by Agilent Bioanalyzer for RNA integrity. All samples had RIN above 9. Libraries were prepared using the TruSeq Stranded mRNA kit (Illumina). Two hundred nanograms from each sample were purified for polyA tail containing mRNA molecules using poly-T oligo attached magnetic beads, then fragmented post-purification. The cleaved RNA fragments were copied into first strand cDNA using reverse transcriptase and random primers. This is followed by second strand cDNA synthesis using RNase H and DNA Polymerase I. A single “A” base was added and adapter ligated followed by purification and enrichment with PCR to create cDNA libraries. Final cDNA libraries were verified by the Agilent Bioanalyzer for size and concentration quantified by qPCR. All libraries were pooled to a final concentration of 1.8nM, clustered and sequenced on the Illumina NextSeq500 as a pair-end 75 cycle sequencing run using v2 reagents to achieve a minimum of ~40 million reads per sample.	NextSeq 500	87
EGAD00001006814	The dataset consists of Oxford Nanopore targeted RNA-based amplicon data of 12 classical HLA genes (HLA-A, -B, -C, -DRA, -DRB1, -DRB3, -DRB4, -DRB5, -DQA1, -DQB1, -DPA1, and DPB1) of 50 healthy individuals. The 12 classical genes were sequenced in two separate gene pools on R9.4 flowcells using MinION sequencer. Per individual, gene pool 1 contains HLA-A, -B, -C, -DRB1, -DRB3, -DRB4, -DRB5, and -DPB1 and gene pool 2 HLA-DRA, -DQA1, -DQB1, and -DPA1. The dataset includes 100 fastq files of Oxford Nanopore 2D reads (50 for gene pool 1 and 50 for gene pool 2).	MinION	100
EGAD00001006815	This dataset includes paired WES from bone marrow samples of patients with SDS and paired bone marrow-derived fibroblasts as a germline reference. Some patients have multiple samples collected serially over time. There are 74 BAM files included in this dataset.	AB 5500 Genetic Analyzer Illumina HiSeq 2500	74
EGAD00001006816	The dataset includes the whole-exome sequencing (WES) of an extramedullary tumor anterior to the spinal cord at T4, which was resected and diagnosed as gliosarcoma. The patient was initially diagnosed with a low-grade brain glioma via biopsy, followed by adjuvant radiation and temozolomide treatment. WES was performed using Illumina NovaSeq6000 with 2x100 bp reads. Mean coverage of 152.4x and 230.6x was achieved for normal and tumor, respectively.	Illumina NovaSeq 6000	1
EGAD00001006817	CTCF ChIP-seq of 14 leukemia patients: 6 AML without 3q rearrangements, 1 AML with 3q26, 1 AML with t(3;8) and 6 T-ALL. H3K27ac ChIP-seq of AML patients: 4 cases with t(3;8), another with inv(3) and another with normal karyotype. H3K27ac ChIP-seq of CD34+ cells from one healthy donor. RUNX1 ChIP-seq of one t(3;8) AML patient.	Illumina HiSeq 2500	21
EGAD00001006818	The viewpoints used in the 4C-seq data were either the EVI1 promoter or the MYC super-enhancer	Illumina HiSeq 2500	1
EGAD00001006819	RNA-seq was generated to investigate differences in gene expression between t(3;8) AML and other primary AMLs. Briefly, sample libraries were prepared using 500 ng of input RNA according to the KAPA RNA HyperPrep Kit with RiboErase (HMR) (Roche) using Unique Dual Index adapters (Integrated DNA Technologies, Inc.). Amplified sample libraries were paired-end sequenced (2x100 bp) on the NovaSeq 6000 platform (Illumina).	Illumina HiSeq 2500 Illumina NovaSeq 6000	13
EGAD00001006820	This dataset contains DNA sequencing of the chromosome 3q region in 28 primary AML cases with 3q26 rearrangements (3q26-rearranged AML). Genomic DNA was fragmented using the Covaris shearing device (Covaris), and sample libraries were assembled following the TruSeq DNA Sample Preparation Guide (Illumina). After ligation of adapters and an amplification step, target sequences of chromosomal regions 3q21.1-q26.2 were captured using custom in-solution oligonucleotide baits (Nimblegen SeqCap EZ Choice XL). The design of target sequences was based on the human genome assembly hg19: chr3q21.1:126036241-130672290 - chr3q26.2:157712147-175694147. Amplified captured sample libraries were paired-end sequenced (2x100 bp) on the HiSeq 2500 platform (Illumina) and aligned against the hg19 reference genome using the Burrows-Wheeler Aligner (BWA).	Illumina HiSeq 2500	28
EGAD00001006821	ChIP-seq was conducted in blasts from patients with t(3;3) AML to assess differences of the GATA2 super-enhancer between the translocated allele and the non-translocated allele. The dataset includes 2x H3K27ac ChIP-seq and 1x MYB ChIP-seq. ChIP samples were processed according to the Illumina TruSeq ChIP Sample Preparation Protocol (Illumina) or Diagenode Library V3 preparation protocol (Diagenode) and either sequenced single-end (1x 50 bp) on the HiSeq 2500 platform (Illumina) or paired-end (2x100 bp) on the Novaseq 6000 platform (Illumina). Briefly, reads were aligned to the human reference genome build hg19 with bowtie for single-end runs and bowtie2 for paired-end runs.	Illumina HiSeq 2500 Illumina NovaSeq 6000	2
EGAD00001006822	In this study, we aimed to identify somatic structural variation of chronic lymphocytic leukemia (CLL) at the single-cell level and investigate its direct consequence on the nucleosome occupancy using the scNOVA approach. For this purpose, we performed strand-specific single-cell sequencing of primary leukemia samples from a 63-year-old female patient.	NextSeq 500	86
EGAD00001006823	Approximately 1000 trios with varying degrees of cognitive disorders. All samples have been sequenced for the AnkyrinG interactome using MIPS technology. Data is presented as BAM and unfiltered VCF files.	NextSeq 500	1
EGAD00001006824	Whole genome sequencing data (Illumina NovaSeq 6000) of clonal cultures derived from pediatric human bone marrow-derived hematopoietic stem and progenitor cells (in total 35 samples from 7 donors), bulk pediatric acute myeloid/lymphoid leukemia blasts (in total 2 samples from 1 patient) and bulk control mesenchymal stem cell cultures (4 samples from 4 patients) to study the mutation accumulation.	Illumina NovaSeq 6000	81
EGAD00001006825	The dataset consists of the BAM-files of 745 patients and 810 controls (retained after quality control) of a set of 34 candidate genes obtained after targeted enrichment via Molecular Inversion Probes (MIPS) technology. The sequencing was performed on the NextSeq 500 (Illumina, CA, USA) using custom sequencing and index primers in three 2 x 76 bp, dual indexed runs using a 150 cycles High-Output Illumina kit (Illumina, CA, USA). Alignment of the fastq reads to the human genome was performed using BWA (v0.7.4).	NextSeq 500	1555
EGAD00001006826	This dataset comprises RNA-seq expression profiles from 57 subjects, of which 39 are DMD patients and 18 healthy controls. The data are described in the following article: Signorelli, Ebrahimpoor et al. (in review). Peripheral blood transcriptome profiling enables monitoring disease progression in dystrophic mice and patients.	Illumina HiSeq 2500	57
EGAD00001006827	We performed single-cell RNA-sequencing of cells in the bronchoalveolar lavage (BAL) fluid of severe COVID-19. In addition, we performed single-cell RNA-sequencing of SARS-CoV-2 stimulated classical blood monocytes. This study provides detailed insights into the alveolar macrophage response to SARS-CoV-2 infection and reveals a profibrotic macrophage response in severe COVID-19 patients.	Illumina NovaSeq 6000	10
EGAD00001006828	In Coronavirus Disease 2019 (COVID-19), hypertension and cardiovascular diseases are major risk factors for critical disease progression. However, the underlying reasons and the effect of the main anti-hypertensive therapies—angiotensin-converting enzyme inhibitors (ACEIs) and angiotensin receptor blockers (ARBs)—remain unclear. Combining clinical data (n = 144) and single-cell sequencing data of airway samples (n = 48) with in vitro experiments, we observed a distinct inflammatory predisposition of immune cells in patients with hypertension that correlated with critical disease progression. ACEI treatment associated with dampened COVID-19-related hyperinflammation and with increased cell intrinsic anti-viral responses, whereas ARB treatment related to enhanced epithelial–immune cell interactions. Macrophages and neutrophils of patients with hypertension, in particular under ARB treatment, exhibited higher expression of the pro-inflammatory cytokines CCL3 and CCL4 and the chemokine receptor CCR1. Although the limited size of our cohort does not allow us to establish clinical efficacy, our data suggest that the clinical benefits of ACEI treatment in patients with COVID-19 who have hypertension warrant further investigation.	Illumina NovaSeq 6000	33
EGAD00001006829	We are presenting raw and processed data of our study where we analyze fine-needle aspirate (FNA) samples of primary cutaneous B-cell lymphoma patients undergoing oncolytic virotherapy (https://clinicaltrials.gov/ct2/show/NCT03458117). We are uploading single cell RNA-sequencing and immune repertoire profiling data of 29 FNA samples from four pCBCL patients. The four patients have three different subtypes of primary cutaneous B-cell lymphoma: pCDLBCL-LT, pCFCL and pCMZL. The samples are taken at different time points following T-VEC injection (from baseline up to 91 days after injection). We are uploading the sequences in BAM format and the outputs of the cellranger pipeline (count matrices and filtered VDJ contigs) in CSV format.	NextSeq 500	34
EGAD00001006830	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006831	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006832	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006833	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006834	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006835	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006836	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006837	RNA-exome	Illumina NovaSeq 6000	1
EGAD00001006838	The dataset contains 200 fastq files of Illumina 5'end RNA sequencing data of 50 PBMC samples. The paired-end data includes 100 fastq files (R1 and R2) of 50 full-length cDNA sequencing libraries and 100 fastq files (R1 and R2) of 50 HLA amplicon sequencing libraries.	NextSeq 550	100
EGAD00001006840	RNA-seq Phase Ib of olaparib and capivasertib	Illumina HiSeq 2000	74
EGAD00001006841	T200 sequencing (Phase Ib of olaparib and capivasertib)	Illumina HiSeq 2000	162
EGAD00001006842	Microglia were derived from iPSCs and treated with mimics and inhibitors of the miRNAs hsa-miR-150-5p, hsa-miR-193a-3p and hsa-miR-19b-3p. RNA-sequencing was then performed to examine the effects of up- and down-regulation of the respective miRNAs.	NextSeq 550	30
EGAD00001006843	CAGE-sequencing was performed on frontal post-mortem human brain tissue of patients with FTD caused by mutations in GRN, MAPT or C9orf72 and healthy controls.	Illumina HiSeq 2000	57
EGAD00001006844	iPSC-derived neurons were treated with mimics and inhibitors of the miRNAs miR-150-5p, hsa-mir-193a-3p and hsa-miR-19b-3p. RNA-sequencing was then performed to examine the effects of miRNA up-regulation and inhibition.	NextSeq 550	15
EGAD00001006845	This dataset contains smRNA-seq data from human post-mortem brain tissue of the frontal lobe of patients with FTD and healthy controls. These samples represent the data generated at the DZNE Göttingen and should be used together with the data generated at the DZNE Tübingen.	NextSeq 550	33
EGAD00001006846	This dataset contains smRNA-seq data from post-mortem human brain tissue of the frontal lobe of patients with FTD and healthy controls. The smRNA-sequencing was done in two parts; this dataset represents the data generated at the DZNE Tübingen.	NextSeq 550	9
EGAD00001006847		Illumina HiSeq 2500	12
EGAD00001006848	WGS of 17 GSC populations derived from patient tumours	HiSeq X Five HiSeq X Ten	27
EGAD00001006849	Cancer cells enter a reversible drug tolerant persister (DTP) state to evade death from both chemotherapies and targeted agents. It is increasingly appreciated that the DTP state is an important driver of therapy failure and tumor relapse. We combined cellular barcoding and mathematical modeling in patient-derived colorectal cancer xenograft models to identify and characterize the cancer cells capable of generating DTPs in response to standard-of-care chemotherapy. Barcode analysis revealed no loss in clonal complexity of tumors that entered the DTP state and recurred following treatment cessation. Our data fits a mathematical model in which all cancer cells, and not a small subpopulation, possess an equipotent capacity to enter the DTP state. Mechanistically, we determined that DTPs display remarkable transcriptional and functional similarities to diapause, a reversible state of suspended embryonic development triggered by unfavorable environmental conditions. Our study provides new insights into how cancer cells use a developmentally conserved mechanism to drive the DTP state pointing to novel therapeutic opportunities to target diapause-like DTPs.	Illumina HiSeq 2500	12
EGAD00001006850	Data from NABUCCO cohort 1 (NCT03387761). This dataset includes Whole exome DNA sequencing on bladder tumor samples (n=24) matched with blood samples (n=24). The data is pre-treatment.	Illumina HiSeq 2500	-
EGAD00001006851	Data from NABUCCO cohort 1 (NCT03387761). This dataset includes Tumor mutational burden (TMB) calculated on bladder tumor pre-treatment DNA sequencing data (n=24). Details about the Tumor Mutational Burden calculation can be found on the Methods section from the Nature Medicine paper (https://doi.org/10.1038/s41591-020-1085-z)		-
EGAD00001006852	Data from NABUCCO cohort 1 (NCT03387761). This dataset includes High coverage Whole exome DNA sequencing on pre-treatment bladder tumor samples (n=3) matched with post-treatment metastasised adjacent lymph nodes isolated with laser microdissection (n=3) for 3 unique patients	Illumina HiSeq 2500	-
EGAD00001006853	Data from NABUCCO cohort 1 (NCT03387761). This dataset includes the Response labels used for the analysis of the data. Details about the clinical definitions of Response can be found on the paper (https://doi.org/10.1038/s41591-020-1085-z)		-
EGAD00001006854	Data from NABUCCO cohort 1 (NCT03387761). This dataset includes the Transcript read counts derived from the RNA sequencing data. The samples are pre-treatment (n=18) and post-treatment (n=18), and not all samples are paired. The data processing pipeline can be found on the Methods section from the Nature Medicine paper (https://doi.org/10.1038/s41591-020-1085-z)		-
EGAD00001006855	NABUCCO cohort 1 sequencing data. The dataset includes RNA sequencing pre-treatment on tumor samples (n=18) and RNA sequencing post-treatment on bladder tumor samples (n=18).	Illumina HiSeq 2500	-
EGAD00001006856	Data from NABUCCO cohort 1 (NCT03387761). This dataset includes the pre-treatment PD-L1 staining on tumor samples (n=24). Details about the PD-L1 staining can be found on the paper (https://doi.org/10.1038/s41591-020-1085-z)		-
EGAD00001006857	A comprehensive RNA repository (both coding and non-coding) from 17 patients diagnosed with esophageal adenocarcinoma, high-grade dysplastic or non-dysplastic Barrett’s esophagus. Per patient, a blood plasma sample, and a healthy esophageal and disease tissue sample were collected. This dataset includes both mRNA and small RNA sequencing data (fastq.gz files) of all tissue and plasma samples. In total, 102 RNA-seq libraries from 51 samples (17 plasma and 34 tissue samples) were sequenced (plasma mRNA libraries were sequenced twice).	NextSeq 500	102
EGAD00001006858	Whole genome sequencing data of EBV associated DLBCL of 8 matched tumor-normal patients. Additionally, targeted resequencing data of 47 patients is provided.	Illumina HiSeq 2500	63
EGAD00001006859	Osteosarcoma, the most common primary malignant tumour of bone, affects children and adults alike. No fundamental biological differences between paediatric and adult osteosarcoma are known. Here, we apply multi-region whole genome sequencing to an index case of a four-year old child whose aggressive tumour harboured high level, focal amplifications of MYC and CCNE1 connected by translocations. We re-analysed copy number readouts of 258 cases of high-grade osteosarcoma from three different cohorts and identified an additional three cases with MYC and CCNE1 co-amplification, confined to children and associated with aggressive disease. Examining the age distribution of MYC and CCNE1 amplicons across all cases revealed a significant enrichment of focal MYC amplification in children, whereas CCNE1 amplification is not strictly restricted to children. Our findings indicate that amplification of the MYC oncogene, known to be associated with a poor outcome, delineates a variant of osteosarcoma specific to childhood. When co-amplified with CCNE1, it may herald an aggressive disease course.	HiSeq X Ten	8
EGAD00001006860	RNA-seq data (44 samples) from tumor tissue specimens pre and post fasting-mimicking diet from 22 early-stage breast cancer patients.	Illumina NovaSeq 6000	44
EGAD00001006861	Smart-seq2 single cell RNA sequencing reads from kidney glomerular single cells of healthy human individuals. The dataset contains single-end fastq files of 766 single cells.	Illumina HiSeq 3000	766
EGAD00001006862	7 RNA-seq samples in total: CD19n_IgAn (x2) , CD19n_IgAp (x2) , CD19p_IgAn (x1), CD19p_IgAp (x2)	Illumina HiSeq 2500	7
EGAD00001006863	This dataset contains 11 paired-end FASTQ sequences from mRNA-Seq on single human M-II stage oocytes that were collected from gonadotropin stimulated women undergoing fertility treatments. M-II stage oocytes were collected and flash frozen prior to lysis followed by RNA extraction, full length cDNA preparation and amplification using the Ultra-low-input SMART-Seq2 v4 kit from Takara Clontech. Further, these cDNA were used to prepare libraries for sequencing according to the Nextera XT DNA library preparation kit from Illumina.	NextSeq 500	11
EGAD00001006864	16S amplicon data of nasopharyngeal swabs in a COVID-19 cohort recruited at UZ Leuven. The dataset contains a single experiment, comprising 150 runs corresponding to 125 unique samples. Runs comprise paired fastq files (2*250 bases) obtained from an Illumina MiSeq instrument.	Illumina MiSeq	125
EGAD00001006865	22 RNA-seq samples of ex-vivo (TN and Treg), cultured Treg, TET1 and untreated mCherry-MOCK	Illumina HiSeq 2500	22
EGAD00001006866	RNA sequencing of 6 samples from individuals with multiple myeloma with selective elimination of immunosuppressive T cells	Illumina HiSeq 4000	6
EGAD00001006867	Sequence data (paired-end FASTQ format) for 209 samples from 73 sample sites, from 7 individuals. Samples include primary melanomas, metastatic tumours and ctDNA	Illumina HiSeq 2500	209
EGAD00001006868	Mutational signatures in esophageal squamous cell carcinoma from eight countries of varying incidence – sequence data (Mutographs)	Illumina NovaSeq 6000	1145
EGAD00001006869	Dataset contains plasma DNA whole genome sequencing on 4 breast cancer patients. It also includes matched germline and tumour whole genome sequencing data. Two benign cancer patients were also sequenced and their plasma DNA and matched germline whole genome sequencing data are included in the dataset. Samples were sequenced on Illumina HiSeqX Ten.	HiSeq X Ten	16
EGAD00001006871	TAPS data from 21 patients with HCC, 23 patients with PDAC, 30 non-cancer controls, 4 patients with cirrhosis, and 7 patients with pancreatitis.	Illumina NovaSeq 6000	255
EGAD00001006873	BAM files (aligned against the hg38 genome) from a targeted amplicon sequencing (139 genes) experiment (median depth 1000X) on 218 samples from Stage 1 epithelial ovarian cancer biopsies. Samples labeled "bis" or "tris" with the same ID are relapses; "left" or "right" samples indicate, in the case of bilateral tumor, from which ovary the sample was taken.	NextSeq 500	218
EGAD00001006874	Data supporting: “Deep molecular phenotyping reveals the identity of Barrett’s esophagus and its malignant transition.” Nowicki-Osuch, Zhuang et al. WGS (BAM files): 5 Barrett's samples, 5 normal oesophageal samples, 5 normal gastric cardia samples, 5 normal duodenal samples	Illumina HiSeq 2000 Illumina NovaSeq 6000	10
EGAD00001006875	In this study, we enhanced 5mC detection using SMRT sequencing by holistically analyzing kinetic signals of a DNA polymerase and sequence context for every base within a measurement window. We employed a convolutional neural network to train a methylation classification model.	NextSeq 500 Sequel II	42
EGAD00001006876	Bulk RNA-seq data of tumours in EGAS00001004572.	NextSeq 500	1
EGAD00001006877	RNA-seq dataset for Mutation-specific non-canonical pathway of PTEN as a distinct therapeutic target for glioblastoma	Illumina HiSeq 2500	42
EGAD00001006878	high depth WGS sequencing of 8 sites of a RET fusion tumour	Illumina NovaSeq 6000	9
EGAD00001006879	Three capture (Agilent’s SureSelectXT HS, Illumina’s Nextera Rapid Capture Custom, and New England Biolabs’ Next Direct Custom) and one amplicon-based (Qiagen’s Human Breast Cancer Panel) targeted sequencing methods on 6-8 paired blood and FFPE from the Malaysian Breast Cancer Cohort.	Illumina MiSeq	56
EGAD00001006880	This dataset includes high-coverage genomes (~36x) of 317 individuals from 20 populations of the Pacific (Taiwan, Philippines, Solomon Islands, Vanuatu archipelago), described in “Genomic insights into population history and biological adaptation in Oceania”, by Choin, Mendoza-Revilla, Arauna, and colleagues (Nature 2021). The data is made of 331 fastq files.	HiSeq X Five	317
EGAD00001006881	BAM files (aligned against the hg38 genome) from a shallow whole-genome sequencing experiment (median depth 0.5X) on 218 samples from Stage 1 epithelial ovarian cancer biopsies. Samples labeled "bis" or "tris" with the same ID are relapses; "left" or "right" samples indicate, in the case of bilateral tumor, from which ovary the sample was taken.	NextSeq 500	218
EGAD00001006882	Data supporting: “Deep molecular phenotyping reveals the identity of Barrett’s esophagus and its malignant transition.” Nowicki-Osuch, Zhuang et al. RNAseq (BAM files): 12 Barrett's samples, 12 normal oesophageal samples, 11 normal gastric cardia samples	Illumina HiSeq 2000	1
EGAD00001006883	The dataset contains FASTQ files referring to the study "Multi-omics analysis of Parkinson’s disease midbrains". For this project, RNA was isolated from human postmortem midbrain tissue (PD and Control samples). Libraries were prepared with the TruSeq Small RNA library prep (Small RNA Seq) and the TruSeq Stranded Total RNA Kit (for transcriptomics), both from Illumina. Sequencing for both experimental setups was conducted on the Illumina HiSeq 4000.	Illumina HiSeq 4000	31
EGAD00001006884	We show that lysosomes are antagonistically controlled by TFEB and MYC to balance catabolic and anabolic processes required for activating LT-HSC and guiding their lineage fate. TFEB-mediated induction of the endolysosomal pathway for membrane receptor degradation limits LT-HSC metabolic and mitogenic activation; this promotes quiescence and self-renewal and governs erythroid-myeloid commitment. By contrast, MYC engages biosynthetic processes while repressing lysosomal catabolism to drive LT-HSC activation. Collectively, our study identifies lysosomes as a central regulatory hub for proper and coordinated stem cell fate determination.	Illumina HiSeq 2500	89
EGAD00001006885	The data set comprises 48 samples from term and preterm infants. Expression profiles were generated using different stimuli (O2 3%, 21%, 65%; LPS stimulation).	Illumina HiSeq 2500	48
EGAD00001006886	Short RNA sequencing of post-mortem human hippocampi from the Calgary Brain Bank. The dataset includes patients with Alzheimer's disease (AD) and healthy control individuals (Ctrl).	Illumina HiSeq 4000	24
EGAD00001006887	Patient samples were sequenced by Foundation Medicine, Inc. (Cambridge, MA), using FoundationOne CDx, a comprehensive NGS-based in vitro diagnostic device designed to capture cancer genes.	Illumina HiSeq 2000	320
EGAD00001006888	Molecular cancer paper (https://doi.org/10.1186/s12943-021-01327-5): This dataset contains shallow whole-genome sequencing (sWGS) of plasma cell-free DNA from cancer patients and healthy subjects, obtained with both Nanopore and Illumina technology. A total of 6 cancer patients and 5 healthy subjects have been sequenced with Nanopore; 4 of the cancer patients have also been sequenced with Illumina. In addition, genomic DNA from white blood cells of one healthy subject, and genomic and 160bp DNA from HEK cells, have been sequenced with Nanopore. Genome Biology paper: 3 additional healthy samples have been sequenced (HU), and two different bioinformatic pipelines were applied. 2019: Fastqs from the molecular cancer paper were re-demultiplexed and adapter-trimmed (using guppy for multiplex samples, and porechop for singleplex) preserving 5' ends to allow fragmentomics analysis. HAC: All the samples were basecalled with the same updated High Accuracy model (the latest at the time of the analysis) and post-processed as the 2019 dataset. Raw FAST5 are currently available upon request, but will be uploaded soon.	GridION Illumina NovaSeq 6000	18
EGAD00001006889	Genotype of C3 SNPs in 140 LOTx donor and recipient pairs.		290
EGAD00001006893	Single-cell RNA sequencing of bronchoalveolar lavages from COVID-19 patients.	Illumina NovaSeq 6000	35
EGAD00001006894	Actinic keratoses (AK) are lesions of epidermal keratinocyte dysplasia and are precursors for invasive cutaneous squamous cell carcinoma (CSCC). Identifying the specific genomic alterations driving progression from normal skin-AK-invasive CSCC is challenging due to the massive ultraviolet radiation-induced mutational burden characteristic at all stages of this progression. Here, we report the largest AK whole exome sequencing study to date and perform mutational signature and candidate driver gene analysis on these lesions. We demonstrate in 37 AK, from both immunosuppressed and immunocompetent patients, that there are significant similarities to CSCC in terms of mutational burden, copy number alterations, mutational signatures and patterns of driver gene mutations. We identify 44 significantly mutated AK driver genes and confirm that these genes are similarly altered in CSCC. We identify the azathioprine mutational signature in all AK from patients exposed to the drug, providing further evidence for its role in keratinocyte carcinogenesis. CSCC differ from AK in having higher levels of intra-sample heterogeneity. Alterations in signaling pathways also differ, with immune-related signaling and TGF-β signaling significantly more mutated in CSCC. Integrating our findings with independent gene expression datasets confirms that dysregulated TGF-β signaling may represent an important event in AK-CSCC progression.	Illumina HiSeq 2500	74
EGAD00001006895	Paired end whole exome sequencing (WES) data of tumor/normal pairs (sorted malignant CD3+/Vb+ T-cells and CD19+ non-malignant B-cells) for the identification of somatic mutation.	NextSeq 550	12
EGAD00001006896	Initial WGS of plasma cell neoplasms in fire fighters exposed to the WTC attack	Illumina NovaSeq 6000	14
EGAD00001006897	Simple, Multiplexed, PCR-based barcoding of DNA for Sensitive mutation detection using Sequencing (SiMSen-Seq) of 11 PIK3CA hotspot mutations in plasma DNA of breast cancer patients.	Illumina MiSeq NextSeq 550	66
EGAD00001006898	Single cell sequencing of 12 ovarian cancer biopsies from 7 patients.	Illumina NovaSeq 6000	12
EGAD00001006899	Gluten reactive T-cells from blood samples from patients undergoing a 3 day gluten challenge. Samples were collected on day 6. Both gluten reactive and non-gluten reactive T-cells were sequenced.	NextSeq 500	12
EGAD00001006900	Paired T-cell receptor sequences sequenced from single cells, from intraepithelial CD8+ αβ T-cells. Sequences are from untreated and treated (on a gluten-free diet) celiac disease patients and controls.	Illumina MiSeq	19
EGAD00001006901	Paired end shallow whole genome sequencing (sWGS) data for the identification of somatic copy number alterations (SCNA) and the estimation of tumor fraction and ploidy in sorted malignant CD3+/Vb+ T-cells and corresponding CD19+ non-malignant B-cells	NextSeq 550	11
EGAD00001006902	Bulk WGS fastq files for germline and tumours in EGAS00001004572.	HiSeq X Ten	1
EGAD00001006903	This dataset includes 87 scRNA-seq samples of bone marrow aspirates of 20 relapsed/refractory patients generated with the 3´(v2) kit of the 10x Chromium platform. Bone marrow cells have been sorted into CD138 +/- fractions using magnetic beads for plasma cell enrichment and processed independently. For 14/20 patients multiple treatment timepoints are available, including samples before treatment and at relapse during treatment.	Illumina HiSeq 4000	87
EGAD00001006904	This dataset accompanies the publication of Sugita M et al. "Targeting the Epichaperome As an Effective Precision Medicine Approach in a Novel PML-SYK Fusion Acute Myeloid Leukemia" Npj Precision Oncology 2021	Illumina HiSeq 2500 Illumina HiSeq 4000	13
EGAD00001006905	Whole genome sequencing of 29 samples	Illumina NovaSeq 6000	29
EGAD00001006906	Bulk GRIDSS somatic sv vcfs from tumour-normal analysis in EGAS00001004572		1
EGAD00001006907	Bulk Strelka somatic snv vcfs from tumour-normal analysis in EGAS00001004572		1
EGAD00001006908	Bulk copy number segments from Purple analysis in EGAS00001004572		1
EGAD00001006909	Bulk methylation tumour profiles from infinium methylation epic bead kit in EGAS00001004572		1
EGAD00001006910	Bulk Germline snv vcfs from haplotypecaller analysis in EGAS00001004572		1
EGAD00001006911	Bulk RNAseq from HCV infected liver biopsies. Two fastq files per sample for paired-end sequencing. Some samples were sequenced on multiple plates.	Illumina HiSeq 4000	225
EGAD00001006913	Five commercially available parallel sequencing assays were evaluated for their ability to detect gene fusions in eight cell lines and 18 FFPE tissue samples carrying a variety of known gene fusions. Four RNA-based assays and one DNA-based assay were compared; two were hybrid capture-based, TruSight Tumor 170 Assay (Illumina) and SureSelect XT HS Custom Panel (Agilent), and three were amplicon-based, Archer FusionPlex Lung Panel (ArcherDX), QIAseq RNAscan Custom Panel (Qiagen) and Oncomine Focus Assay (Thermo Fisher Scientific).	Illumina MiSeq Ion Torrent S5 NextSeq 500	228
EGAD00001006914		Illumina HiSeq 2500	9
EGAD00001006915	Samples of nucleated cells found in peripheral blood from over 300 patients suffering from resectable pancreatic ductal adenocarcinoma, non-resectable pancreatic cancer, chronic pancreatitis, or none of these. Please cite this article when using data: Al-Fatlawi, A.; Malekian, N.; García, S.; Henschel, A.; Kim, I.; Dahl, A.; Jahnke, B.; Bailey, P.; Bolz, S.N.; Poetsch, A.R.; Mahler, S.; Grützmann, R.; Pilarsky, C.; Schroeder, M. Deep Learning Improves Pancreatic Cancer Diagnosis Using RNA-Based Variants. Cancers 2021, 13, 2654. https://doi.org/10.3390/cancers13112654	Illumina HiSeq 2500	311
EGAD00001006916	This dataset contains: 1) Raw FASTQ and BAM files for short reads. Here, DNA libraries were prepared using Nextera Rapid Capture Custom Enrichment kit (Illumina) and paired-end sequenced on a HiSeq2500 (Illumina). 2) Raw FASTQ and BAM files for long reads. Here, DNA libraries were prepared using 1D DNA ligation Sequencing Kit (SQK-LSK109, Oxford Nanopore) and single-end sequenced on a MinION device (Oxford Nanopore).	Illumina HiSeq 2500 MinION	86
EGAD00001006917	Project Neurodevelopmental Disorders 245 Samples	NextSeq 500	245
EGAD00001006918	We provide a diverse keratinocyte transcriptome signature between SFN and FMS patients, which may hint towards distinct pathomechanisms of small fiber sensitization and lay the basis for advanced diagnostics in both entities	NextSeq 500	29
EGAD00001006919	Whole exome sequencing data of 28 matched normal-tumor-relapse patients from Lübeck and Munich (Germany).	Illumina HiSeq 2000	84
EGAD00001006920	This dataset contains exome sequencing from replication repair deficient brain tumor samples.		7
EGAD00001006921	Hypermutant tumors which harbor many somatic mutations may obscure the interpretation of targetable genomic events. This dataset contains transcriptome sequencing from 21 replication repair deficient brain tumor samples as well as healthy controls.	Illumina HiSeq 2500	30
EGAD00001006922	Longitudinal single-cell RNA-seq data of prospectively collected tumor tissue samples before and after chemotherapy from 11 HGSOC patients.	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000	22
EGAD00001006923	These data were used in the following publication: Andradas, C.; Byrne, J.; Kuchibhotla, M.; Ancliffe, M.; Jones, A.C.; Carline, B.; Hii, H.; Truong, A.; Storer, L.C.D.; Ritzmann, T.A.; et al. Assessment of Cannabidiol and delta9-Tetrahydrocannabiol in Mouse Models of Medulloblastoma and Ependymoma. Cancers 2021, 13, 330. https://doi.org/10.3390/cancers13020330 There are 4 paired-end RNA-seq samples from paediatric brain cancer cell lines.	Illumina NovaSeq 6000	4
EGAD00001006924	Plasmodium vivax offers unique challenges for control and elimination, and may prove a tougher hurdle to overcome than Plasmodium falciparum. And yet compared to P. falciparum we know very little about the innate and adaptive immune responses that need to be harnessed to reduce disease and transmission. We recently generated a blood bank of a new clonal field isolate of P. vivax (PvW1) for human challenge studies and used systems immunology tools to track the host response throughout infection and convalescence. As part of this study, RNA-sequencing was used to resolve changes in whole blood gene expression through time in 6 volunteers (7-9 time-points per volunteer). In summary, these data show that P. vivax induces two distinct transcriptional programmes in whole blood during and after infection. During infection, transcriptional profiling reveals the rapid mobilisation of an emergency myeloid response, which leads to systemic inflammation and the recruitment of all major T cell subsets into lymphoid tissues. Six days after infection, this innate response subsides and a transcriptional signature of proliferation is revealed. This most likely represents widespread activation of lymphocytes, which return to the circulation after parasite clearance - transcriptional profiling of T cells at this time-point could therefore reveal the outcomes of critical cell-cell interactions that take place within the spleen during infection. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ .	Illumina HiSeq 2500	54
EGAD00001006925	This dataset contains raw .fastq files of a paired-end RNA-seq experiment on 15 PTCL-NOS samples. Samples were prepared with Truseq stranded mRNA library kit.	Illumina NovaSeq 6000	15
EGAD00001006926	This dataset contains subtype assignments for 271 tumor samples profiled by RNA-seq.		271
EGAD00001006927	This dataset contains log2(TPM + 1) for 271 tumor samples profiled by RNA-seq for the entire transcriptome.		271
EGAD00001006928	This dataset contains log2(TPM + 1) for 271 tumor samples profiled by RNA-seq for the subset of genes used for validation of the NMF cluster assignments.		271
EGAD00001006929	Sequencing of LCM-derived microbiopsies from an explanted lung from a COPD patient. The goal is to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Targeted sequencing will be conducted on samples to identify drivers of interest and clonality of the samples; well-performing samples will be sent for subsequent whole-genome sequencing. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. This dataset contains all the data available for this study on 2021-02-02.	HiSeq X Ten Illumina HiSeq 4000	30
EGAD00001006930	Sequencing of LCM-derived microbiopsies from an explanted lung from a COPD patient. The goal is to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. This dataset contains all the data available for this study on 2021-02-02.	Illumina NovaSeq 6000	24
EGAD00001006931	De- and transdifferentiation of melanoma is a rare histopathological phenomenon that has not been characterised genetically. In this project we plan to sequence the genomes of de- and transdifferentiated cases so as to define their genetic make-up. This dataset contains all the data available for this study on 2021-02-02.	Illumina HiSeq 4000	26
EGAD00001006932	Sequencing of LCM-derived microbiopsies from an explanted lung from a COPD patient. The goal is to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. This dataset contains all the data available for this study on 2021-02-02.	Illumina NovaSeq 6000	20
EGAD00001006933	The Mutographs project aims to advance our understanding of the causes of cancer through studies of mutational signatures. Led by Mike Stratton, together with Paul Brennan, Ludmil Alexandrov, Allan Balmain, David Phillips and Peter Campbell, this large-scale international research endeavour was awarded a Cancer Research UK Grand Challenge. Different patterns of somatic mutation are generated by the different environmental, lifestyle and genetic factors that cause cancer, many of which are still unknown. Within Mutographs, King's College London will characterise the mutational signatures induced by putative human carcinogens in order to identify the origins of mutational signatures found in human cancers. To achieve this, human organoid cell cultures will be exposed to a representative catalogue of known or suspected human carcinogens and mutagens and, using whole genome sequencing, the patterns of mutations induced by them will be determined. Somatic mutational signatures will be subsequently extracted by non-negative matrix factorisation methods and correlated with exposure data. Through an enhanced understanding of cancer aetiology, Mutographs' unprecedented effort is anticipated to outline modifiable risk factors, lead to new approaches to prevent cancer, and provide opportunities to empower early detection, refine high-risk groups and contribute to further therapeutic development. This dataset contains all the data available for this study on 2021-02-02.	HiSeq X Ten	6
EGAD00001006934	We study lymphocyte somatic evolution through the sequencing of normal healthy lymphocytes. We perform whole-genome sequencing of single-cell derived T and B cell colonies to identify somatic mutations, and perform targeted deep-sequencing of these mutations. The lineages of T and B cells, and the frequencies of these mutations, reveal the neutral and non-neutral evolutionary processes underlying lymphocyte growth and function. This dataset contains all the data available for this study on 2021-02-02.	HiSeq X Ten	20
EGAD00001006935	We study lymphocyte somatic evolution through the sequencing of normal healthy lymphocytes. We perform whole-genome sequencing of single-cell derived T and B cell colonies to identify somatic mutations, and perform targeted deep-sequencing of these mutations. The lineages of T and B cells, and the frequencies of these mutations, reveal the neutral and non-neutral evolutionary processes underlying lymphocyte growth and function. This dataset contains all the data available for this study on 2021-02-02.	HiSeq X Ten	9
EGAD00001006936	Single-cell RNA sequencing was performed for cells from five early-stage LUADs and fourteen multi-region normal lung tissues of defined spatial proximities from the tumors.	Illumina NovaSeq 6000	35
EGAD00001006937	Chromium V(D)J and 5' Gene Expression platform (10X Genomics) was used to study patients with aplastic anemia. CD45+ cells from two patients (patient AA-3: 3 longitudinal samples from bone marrow and patient AA-4: 3 longitudinal samples from peripheral blood) were analysed. The raw data was processed using Cell Ranger 3.0.1 pipelines.	Illumina NovaSeq 6000	192
EGAD00001006938	CD14+ monocytes from 4 African and 4 European individuals with varying degrees of ex vivo susceptibility to influenza were either stimulated with influenza A virus or left resting. Cells from all 16 samples were collected at 4 time points (0, 2, 4, 6h post infection), and pooled across 13 libraries. Samples were processed on the 10x Chromium with 3' reagent kits, V3 chemistry, and sequenced on the HiSeq X Ten.	HiSeq X Ten	13
EGAD00001006939	Whole genome sequencing for single cells for library A95629A 1023 cells; filetype=bam	HiSeq X Five	5
EGAD00001006940	Whole genome sequencing for single cells for library A95654B 1740 cells; filetype=bam	HiSeq X Five	5
EGAD00001006941	Whole genome sequencing for single cells for library A95673A 1446 cells; filetype=bam	NextSeq 550	9
EGAD00001006942	Whole genome sequencing for single cells for library A95703B 1267 cells; filetype=bam	HiSeq X Five	5
EGAD00001006943	Whole genome sequencing for single cells for library A95728A 876 cells; filetype=bam	HiSeq X Five	5
EGAD00001006944	Whole genome sequencing for single cells for library A96192B 1304 cells; filetype=bam	HiSeq X Five	5
EGAD00001006945	Whole genome sequencing for single cells for library A96217B 1616 cells; filetype=bam	HiSeq X Five	6
EGAD00001006946	Whole genome sequencing for single cells for library A96219B 1743 cells; filetype=bam	HiSeq X Five	7
EGAD00001006947	Whole genome sequencing for single cells for library A98269B 1609 cells; filetype=bam	HiSeq X Five	7
EGAD00001006948	In this proof of principle study, we performed whole genome sequencing of two cases with multiple relapses in order to investigate whether groups of mutations separated in time show distinct mutational signatures. In patient 1, who experienced two relapses, the analysis unraveled a continuous interplay of aberrant AID/APOBEC-associated activities. Patient 2 had three relapses. We identified episodic mutational processes at diagnosis and first relapse leading to mutations resembling UV light-driven DNA damage, and thiopurine-associated damage at first relapse.	Illumina NovaSeq 6000	10
EGAD00001006950	Paired-end DNA-seq FASTQ files from 16 patients affected by acute intermittent porphyria. Whole genome sequencing of these samples was performed in an Illumina HiSeq 4000 instrument. Libraries were prepared using the Fisher PE Kit (Kapa Biosystems). Each sample was multiplexed across flowcells and lanes, leading to a total number of 83 pairs of FASTQ files.	Illumina HiSeq 4000	16
EGAD00001006951	Paired-end BAM files from 16 patients affected by acute intermittent porphyria. Whole genome sequencing of these samples was performed in an Illumina HiSeq 4000 instrument. Libraries were prepared using the Fisher PE Kit (Kapa Biosystems). FASTQ files were processed at the CNAG (Barcelona) using the GEM short-read aligner on the human genome version hs37d5, producing a total of 16 BAM files.		16
EGAD00001006952	VCF file from 16 patients affected by acute intermittent porphyria. Whole genome sequencing of these samples was performed in an Illumina HiSeq 4000 instrument. Libraries were prepared using the Fisher PE Kit (Kapa Biosystems). BAM files were processed at the CNAG (Barcelona) with their pipeline, including GATK v3.6 for genotyping and other tools such as snpEff for annotating variants, to produce this VCF file with a total of 10,630,259 variants, out of which 8,731,523 are SNVs.		16
EGAD00001006953	Lifelines-DEEP plasma un-targeted metabolomics		-
EGAD00001006954	iAMP21 WGS, total of 224 samples	HiSeq X Ten Illumina HiSeq 2500 Illumina NovaSeq 6000	224
EGAD00001006955	This dataset contains single cell DNA amplicon sequencing of 12 B-ALL patients. For all patients a diagnosis sample was processed, while 4 patients were also followed up during treatment, summing up to a total of 23 samples. Mutations were called in the predefined set of amplicons.	Illumina NovaSeq 6000	23
EGAD00001006956	30 samples of 15 individuals with neuroblastoma tumor, whole genome sequencing	HiSeq X Ten	30
EGAD00001006957	We perform whole exome sequencing on 50 pairs of gastric cancer and matched normal samples.	unspecified	100
EGAD00001006959	Second round of follow-up of population-based LifeLines-DEEP cohort	Illumina HiSeq 2000	676
EGAD00001006960	RNAseq fastq files from 611 bulk pre-treatment tumors from two indications: metastatic urothelial bladder cancer patients (IMvigor210) and metastatic renal cell carcinoma (IMmotion150)	Illumina HiSeq 4000	611
EGAD00001006961	The sequencing data of the CTSC gene after whole genome sequencing of blood samples from two individuals with Papillon-Lefèvre Syndrome.	Illumina NovaSeq 6000	2
EGAD00001006962	The transcriptome of peripheral blood mononuclear cells (PBMCs) from controls or patients with an activating mutation in the STAT1 gene was analyzed. This analysis aimed to identify the major changes in the circulating immune cells of patients with STAT1 mutation and compare this with the result of the perturbation prediction tool huva (human variation, R).	AB 5500xl Genetic Analyzer	6
EGAD00001006963	Whole-genome sequencing of 135 tumor samples and 98 normal samples of gastric cancer with peritoneal metastasis	Illumina NovaSeq 6000	232
EGAD00001006964	This dataset contains the RNA and ChIP sequencing data from the study "Kalirin-RAC controls nucleokinetic migration in ADRN-type neuroblastoma". The data is organized in 7 experiments, which are divided by either sequencing technology or the application of siRNA or drug interventions (or lack thereof) on neuroblastoma cell lines. The experiment names and the file names have been chosen in each respective experiment to guide future users of the data to replicate the analyses in the manuscript.	Illumina HiSeq 2000	54
EGAD00001006965	The Genomic Diversity in Africa Project (GDAP) started with the plan to develop a genomic resource from African populations, characterise genomic diversity and population history, and facilitate clinical studies in Africa. Currently, 25 individuals from 24 ethnolinguistic groups have been whole-genome sequenced at high depth totalling 585 individuals. An additional 41 individuals have been sequenced with 10X Genomics libraries. At this stage, the initial curation of this dataset has been finished and we are performing the analysis in coordination with our collaborators. The current state of the GDAP represents a very diverse panel of African populations that maximizes geographical and ethnic variation and represents a great starting point to achieve the aforementioned goals. However, southern sub-Saharan countries, Bantu speakers and hunter gatherer groups are currently underrepresented, despite being crucial to understand the evolutionary history of the continent. After extensive effort to collate studies documentation, we finally have the opportunity to sequence 600 new individuals from these groups, including countries such as Gabon, Rwanda and Zambia, and address these deficiencies. We aim to proceed with the same strategy: to sequence at high depth 25 individuals with standard PCR free libraries, with 2 additional individuals with 10X Genomics Chromium libraries per ethnolinguistic group. The former allows a good representation of variants down to low frequency in any given population, and the latter allows accurate phasing and the analysis of structural variation. By including these new populations, we want to investigate three crucial questions in African history in addition to the initial objectives: the Bantu expansion, the evolutionary history of hunter gatherers and the transatlantic slave trade. Additionally, the expanded dataset will help us better discover the genetic variation present in Africa and characterize the African pangenome. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2021-02-12.	Illumina NovaSeq 6000	184
EGAD00001006966	The dataset contains raw miRNA sequencing data of plasma samples from 20 newly diagnosed colorectal cancer cases and 20 controls free of colorectal neoplasms matched by age and sex. It includes files in the FASTQ compressed (.gz) format.	NextSeq 500	40
EGAD00001006967	For this project about non-muscle invasive bladder cancer (NMIBC), we analysed total RNA-seq data from 47 patients used for validation. Sequencing of total RNA was performed using KAPA RNA HyperPrep Kit with RiboErase HMR (Roche). RNA input was 100 to 500 ng. The dataset is composed of 94 fastq files.	Illumina NovaSeq 6000	47
EGAD00001006968	BAM files, mapped to hg19 after dedup, recal, recalibration and clipping of overlapping reads	Illumina HiSeq 4000 Illumina MiSeq Illumina NovaSeq 6000	210
EGAD00001006969	NOTCH1 mutant clones occupy the majority of normal human esophagus by middle age, but are comparatively rare in esophageal cancers, suggesting NOTCH1 mutations may promote clonal expansion but impede carcinogenesis. Here we test this hypothesis. Visualizing and sequencing NOTCH1 mutant clones in aging normal human esophagus, reveals frequent biallelic mutations that block NOTCH1 signaling. In mouse esophagus, heterozygous Notch1 mutation confers a competitive advantage over wild type cells, an effect enhanced by loss of the second allele. Notch1 loss alters transcription but has minimal effects on epithelial structure and cell dynamics. In a carcinogenesis model, Notch1 mutations were less prevalent in tumors than normal epithelium. Deletion of Notch1 reduced tumor growth, an effect recapitulated by anti-NOTCH1 antibody treatment. We conclude that Notch1 mutations in normal epithelium are beneficial as wild type Notch1 promotes tumor expansion. NOTCH1 blockade has therapeutic potential in esophageal squamous tumors.	Illumina HiSeq 2500	-
EGAD00001006970	Globally, human populations show structured genetic diversity as a result of geographical dispersion, selection and drift. Understanding this genetic variation can provide insights into the evolutionary processes that shape both human adaptation and variation in disease. Populations from SSA have the highest levels of genetic diversity. This characteristic, in addition to historical genetic admixture, can lead to complexities in the design of studies assessing the genetic determinants of disease and human variation. However, such studies of African populations are also likely to provide new opportunities to discover novel disease susceptibility loci and variants and refine gene-disease association signals. A systematic assessment of genetic diversity within SSA would facilitate genomic epidemiological studies in the region. The Genome Diversity in Africa Project (GDAP) aims to produce a comprehensive catalogue of human genetic variation in SSA, including single nucleotide polymorphisms (SNPs), structural variants, and haplotypes. This resource will make a substantial contribution to understanding patterns of genetic diversity within and among populations in SSA, as well as providing a global resource to help design, implement and interpret genomic studies in SSA populations and studies comprising globally diverse populations, complementing existing genomic resources. Specifically, we plan to carry out high depth whole genome sequencing of up to 2000 individuals across Africa (25 individuals from each ethnolinguistic group). Our scientific objectives are to: 1) develop a resource that provides a comprehensive catalogue of genetic variation in populations from SSA accessible to the global scientific community; 2) characterise population genetic diversity, structure, gene flow and admixture across SSA; 3) develop a cost-efficient, next-generation genotype array for diverse populations across SSA; and 4) facilitate whole genome-sequencing association studies of complex traits and diseases by developing a reference panel for imputation and resource for enhancing fine-mapping disease susceptibility loci. These scientific objectives will be supported by cross-cutting operational activities, including network and management of the consortium, research ethics, and research capacity building in statistical genetics and bioinformatics. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2021-02-16.	HiSeq X Ten	53
EGAD00001006973	Clinical data corresponding to the patients with ovarian cancer studied using scRNA-seq and bulk RNA-seq. Variables include molecular subtype, predicted immune phenotype, reviewing pathologist comment, final immune phenotype, histology characterization, and tumor stage.		59
EGAD00001006974	Matrices of counts from single-cell RNA-seq data for 15 samples from patients with ovarian cancer, 5 samples for each of the 3 tumor immune phenotypes (Infiltrated, Excluded and Desert). Dissociated cells from each tumor sample have been sorted to isolate live cells from the 3 compartments: tumor, immune and stromal. Each of the compartments has been analyzed separately by scRNAseq, excluding some desert tumors for which cells from the stromal and immune compartments have been pooled. Sequencing was performed using 10X Genomics Chromium Single Cell platform (v2 Chemistry).		44
EGAD00001006975	RNAseq FASTQ files from 15 samples from patients with ovarian cancer, 5 samples for each of the 3 tumor immune phenotypes (Infiltrated, Excluded and Desert).	NextSeq 500	15
EGAD00001006976	Globally, human populations show structured genetic diversity as a result of geographical dispersion, selection and drift. Understanding this genetic variation can provide insights into the evolutionary processes that shape both human adaptation and variation in disease. Populations from SSA have the highest levels of genetic diversity. This characteristic, in addition to historical genetic admixture, can lead to complexities in the design of studies assessing the genetic determinants of disease and human variation. However, such studies of African populations are also likely to provide new opportunities to discover novel disease susceptibility loci and variants and refine gene-disease association signals. A systematic assessment of genetic diversity within SSA would facilitate genomic epidemiological studies in the region. The Genome Diversity in Africa Project (GDAP) aims to produce a comprehensive catalogue of human genetic variation in SSA, including single nucleotide polymorphisms (SNPs), structural variants, and haplotypes. This resource will make a substantial contribution to understanding patterns of genetic diversity within and among populations in SSA, as well as providing a global resource to help design, implement and interpret genomic studies in SSA populations and studies comprising globally diverse populations, complementing existing genomic resources. Specifically, we plan to carry out high depth whole genome sequencing of up to 2000 individuals across Africa (25 individuals from each ethnolinguistic group). Our scientific objectives are to: 1) develop a resource that provides a comprehensive catalogue of genetic variation in populations from SSA accessible to the global scientific community; 2) characterise population genetic diversity, structure, gene flow and admixture across SSA; 3) develop a cost-efficient, next-generation genotype array for diverse populations across SSA; and 4) facilitate whole genome-sequencing association studies of complex traits and diseases by developing a reference panel for imputation and resource for enhancing fine-mapping disease susceptibility loci. These scientific objectives will be supported by cross-cutting operational activities, including network and management of the consortium, research ethics, and research capacity building in statistical genetics and bioinformatics. This dataset contains all the data available for this study on 2021-02-17.	HiSeq X Ten Illumina MiSeq	27
EGAD00001006977	In order to characterize the T cell receptor (TCR) repertoire of gluten specific T cells, we performed high-throughput DNA sequencing of rearranged TCR-α and TCR-β genes of the single HLA-DQ2.5:DQ2.5-gluten tetramer binding CD4+ T cells isolated from blood, biopsies and T cell line from celiac disease patients.	Illumina MiSeq	44
EGAD00001006978	4 HPS1 patient monocyte-derived macrophages and 4 controls were RNA sequenced at baseline and after Salmonella Typhimurium infection. We used paired end sequencing on an Illumina HiSeq 4000. Each sample was run on 3 lanes for sequencing depth, which we combined for our analysis.	Illumina HiSeq 4000	48
EGAD00001006979	PacBio long-read circular consensus (CCS) sequencing data for individual HV31 generated on PacBio Sequel II instrument, using size-selected (10-15 kb) DNA from CD14+ monocytes, to a sequencing depth of ~12×. Sequencing was performed at the Wellcome Sanger Institute.	Sequel	1
EGAD00001006980	Temporal HER2-negative breast cancers WES	Illumina HiSeq 2500	94
EGAD00001006981	Single-cell RNA-Sequencing of five TNBC primary breast cancers from Wu et al. (2020) EMBO J study. Data was generated using the Chromium controller (10X Genomics) and sequenced on the NextSeq 500 platform.	NextSeq 500	5
EGAD00001006982	The dataset is composed of three sequenced tumor samples: (I) Meta-bone-557 (bone metastasis obtained from occipital lesion resection during treatment with Liposomal doxorubicin); (II) Meta-CNS-888 (brain metastasis obtained from surgical resection during treatment with Nivolumab); (III) Primary-liver-463 (primary hepatic tumor obtained from surgical resection during treatment with Nivolumab). Genomic DNA from tumor samples was extracted using GeneRead DNA FFPE kit (Qiagen), containing Uracyl-D Glycosylase, according to the manufacturer’s instructions. Whole-exome libraries were prepared using SureSelect XT Clinical Research Exome Target Enrichment kit (Agilent Technologies # 5190-7338). Sequences (150bp paired-end) were generated on a NextSeq 500 sequencing platform (Illumina).	NextSeq 500	3
EGAD00001006983	We combined samples from 1,469 inflammatory bowel disease (IBD) patients, consisting of 896 Crohn’s disease (CD) and 573 ulcerative colitis (UC) patients, and 4,041 controls used in our previously published GWAS with 1,726 additional IBD patients (725 CD and 1,001 UC) and 378 additional controls genotyped using the Asian Screening Array (ASA). We uploaded summary statistics of three meta-analyses in text files.		7614
EGAD00001006984	This study includes treatment-naïve fresh tissue samples from 4 HGSOC patients.	Illumina HiSeq 4000	4
EGAD00001006985	RNA-seq data from Korean CRC samples	Illumina HiSeq 2500	160
EGAD00001006986	We created three technical replicates of cell-free DNA from AML patient plasma to assess batch effects and utility of spike-in controls for the cfMeDIP-seq method. Each set of samples was given to one of three different technicians with slightly different protocols. Details can be found in Wilson et al. "Sensitive and reproducible cell-free methylome quantification with synthetic spike-in controls".	Illumina NovaSeq 6000 NextSeq 550	15
EGAD00001006987	To analyse genome wide DNA copy number changes in combination with mutation status of CRC-related genes, CIMP and MSI, in order to explore the biology of PCCRCs. Formalin-fixed, paraffin-embedded samples from 122 PCCRCs and 98 prevalent CRCs collected in 3 different hospitals in the region of South Limburg, the Netherlands, were used in this study. DNA was extracted for molecular analysis. Labels have been updated.	Illumina HiSeq 2500	203
EGAD00001006988	Whole exome sequencing of samples carrying an MBD4 mutation (n=9)	Illumina NovaSeq 6000	17
EGAD00001006989	Targeted sequencing of MBD4 of either tumor or germline DNA from Uveal Melanoma assembled by pool (germline pool or tumor pool).	Illumina MiSeq	186
EGAD00001006990	JAGuaR outputs from RNA-seq of 35 pancreatic neuroendocrine neoplasms.	Illumina HiSeq 2500	35
EGAD00001006991	STAR outputs from RNA-seq of 84 pancreatic neuroendocrine neoplasms, 10 normal islet samples and 4 cell line samples.	Illumina HiSeq 2500	98
EGAD00001006992	BWA outputs from whole-exome sequencing of 35 pancreatic neuroendocrine neoplasms.	Illumina HiSeq 2500	35
EGAD00001006994	We performed sequential scRNA-seq of 21 specimens (discovery cohort) collected at baseline, during treatment, and/or at disease remission/progression from 3 ibrutinib-responsive (R) patients (Pt-V, C and D) and 2 non-responsive (NR) patients (Pt-B and E). In addition, the PBMC samples from two healthy donors (N1 and N2) were included as the normal controls.	Illumina HiSeq 4000	31
EGAD00001006995	The data contains single-cell gene sequencing data (10x Genomics) from FACS-purified CD8 T lymphocytes from two Austrian patients. The cells were stimulated with one MHC class I peptides obtained from a common (wild type) variant and an emerging mutant variant of the SARS-CoV-2 virus. Then the samples were multiplexed using hashtag oligos. We provide the raw and aligned sequence data for: i. The single-cell experiments ii. The PCR-amplified samples for enrichment of the hashtag oligo multiplexing barcodes iii. The PCR-amplified samples for enrichment of the T Cell Receptor (TCR) VDJ region for immuno-profiling. The samples and libraries were processed and obtained in collaboration between St. Anna Children's Cancer Research Institute (CCRI), CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, and the Medical University of Vienna. The cell barcodes and processed data have been submitted to the GEO database with GEO accession GSE166651.	Illumina NovaSeq 6000	2
EGAD00001006996	Whole exome sequencing of 52 chronic phase/blast crisis pairs obtained from chronic myeloid leukemia	Illumina HiSeq 2500	104
EGAD00001006997	Single-cell RNA sequencing of 13 ‘mild-moderate’ and 10 ‘critical’ COVID-19 PBMC samples	Illumina NovaSeq 6000	15
EGAD00001006999	Malignant peripheral nerve sheath tumor (MPNST)-like melanoma is a rare malignancy with overlapping characteristics of both neural sarcoma and melanoma. The genomics of MPNST-like melanoma have not been previously described. In this study, we performed whole exome sequencing analysis in 8 samples from 6 patients diagnosed with MPNST-like melanoma. Our results demonstrate that, although MPNST-like melanoma shares oncogenic alterations common to both cutaneous melanoma and MPNST, it also presents unique genomic alterations not previously described in either of the malignancies.	unspecified	8
EGAD00001007000	Malignant peripheral nerve sheath tumor (MPNST)-like melanoma is a rare malignancy with overlapping characteristics of both neural sarcoma and melanoma. The genomics of MPNST-like melanoma have not been previously described. In this study, we performed whole transcriptome sequencing analysis in 8 samples from 6 patients diagnosed with MPNST-like melanoma. In correlation with deletion of SERPINB4 in all our samples, there was no SERPINB4 mRNA expression in our cohort, suggesting a potential tumor-suppressor role of SERPINB4 in MPNST-like melanomas. HRAS, a gene uncommonly mutated in cutaneous melanomas, was mutated in 2 patients, but with no increased mRNA expression. BRAF mRNA expression, resultant from an atypical BRAF mutation, was increased in association with an inactivating NF1 mutation. Our data demonstrate the role of alternative mechanisms of RAS pathway activation in MPNST-like melanomas and suggest the potential role of other molecular pathways in its carcinogenesis.	unspecified	7
EGAD00001007001	Anal SCC cell line and parent tumour comparative whole exome sequencing	Illumina NovaSeq 6000	13
EGAD00001007002	The aligned bam file of next generation sequencing performed on PSCCE.	Illumina NovaSeq 6000	64
EGAD00001007003	We collected peripheral blood mononuclear cells (PBMC) from 6 RA patients and 4 healthy controls, as well as synovial fluid (SF) from the same RA patients. We then sorted B cells, CD4+ and CD8+ T cells, regulatory T cells and monocytes using flow cytometry and profiled regions marked with H3K27ac using CUT&Tag.	Illumina MiSeq	59
EGAD00001007004	To identify genomic drivers present in limited-stage small cell lung cancer (LS-SCLC); To determine the overall tumor mutational burden in LS-SCLC; To determine genomic intratumor heterogeneity (ITH) in LS-SCLC.	Illumina HiSeq 2500	69
EGAD00001007005	Paired-end RNA-sequencing of tumour tissue samples (n=85) from primary urothelial bladder cancer patients. Sequencing was performed using either HiSeq (n=27) or NextSeq (n=58) Illumina platforms. Of the 85 samples, 78 are Non-muscle invasive (NMIBC) and 7 are Muscle invasive (MIBC).	Illumina HiSeq 2500 NextSeq 550	85
EGAD00001007006	Tumor exomes for 15 DLBCL samples with PMBL GE signature, with 6 matching normal exomes and 1 pooled normal exome.		22
EGAD00001007010	This dataset contains all sequencing data of the publication "Oncogenic cooperation between the TCF7-SPI1 fusion and NRAS(G12D) requires β-catenin activity to drive T-cell acute lymphoblastic leukemia." This is bulk RNA sequencing of 4 T-ALL patients (X09, XB37, XB41 and XB47) of which X09 has a TCF7-SPI1 fusion, single cell RNA sequencing of these 4 patients together with a PDX model of the X09 patient and two patients from another cohort (SJTALL030263 and SJTALL031201) which also have a TCF7-SPI1 fusion, and nanopore sequencing of all patients with the TCF7-SPI1 fusion. Moreover, these patient samples with the fusion were treated with PKF 118-310, and bulk RNA sequencing was performed in triplicate to determine the differentially expressed genes.	GridION Illumina HiSeq 4000 unspecified	38
EGAD00001007011	Shallow whole genome sequencing of 77 inflammatory myofibroblastic tumor samples.	Illumina HiSeq 4000	77
EGAD00001007012	Whole exome sequencing of 66 inflammatory myofibroblastic tumor samples.	Illumina HiSeq 4000	66
EGAD00001007013	Single-cell B-cell receptor sequencing (scBCR-seq) data of peripheral blood mononuclear cells (PBMCs) obtained from 30 ATL patients (34 samples including 4 sequential ones), 11 HTLV-1-infected asymptomatic carriers, and 4 healthy donors.	Illumina NovaSeq 6000	48
EGAD00001007014	Single-cell T-cell receptor sequencing (scTCR-seq) data of peripheral blood mononuclear cells (PBMCs) obtained from 30 ATL patients (34 samples including 4 sequential ones), 11 HTLV-1-infected asymptomatic carriers, and 4 healthy donors.	Illumina NovaSeq 6000	48
EGAD00001007015	Single-cell RNA sequencing (scRNA-seq) data of peripheral blood mononuclear cells (PBMCs) obtained from 30 ATL patients (34 samples including 4 sequential ones), 11 HTLV-1-infected asymptomatic carriers, and 4 healthy donors.	Illumina NovaSeq 6000	48
EGAD00001007016	Whole exome sequencing (WES) data of peripheral blood mononuclear cells (PBMCs) obtained from 2 ATL patients (3 samples).	HiSeq X Ten	3
EGAD00001007017	Single-cell antibody-derived tag sequencing (scADT-seq) data of peripheral blood mononuclear cells (PBMCs) obtained from 30 ATL patients (34 samples including 4 sequential ones), 11 HTLV-1-infected asymptomatic carriers, and 4 healthy donors.	Illumina NovaSeq 6000	48
EGAD00001007018	Bulk RNA sequencing (RNA-seq) data of peripheral blood mononuclear cells (PBMCs) obtained from 7 ATL patients (9 samples) and 9 HTLV-1-infected asymptomatic carriers.	HiSeq X Ten	18
EGAD00001007019	This dataset includes fastq files from sWGS and exome sequencing data derived from dsDNA and ssDNA libraries of plasma cfDNA samples extracted by a column- or bead-based DNA extraction method	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000 NextSeq 550	198
EGAD00001007020	This submission is of the sequencing data used in the CRISPR iPSC methods paper. Specifically it is 3 fastq files that each represent a replicate of an experiment to transduce the Toronto KnockOut CRISPR Library - Version 3 (TKOv3) into induced pluripotent stem cell (iPSC) derived macrophages. The sequencing is of the guide RNAs from the TKOv3 having been extracted from the transduced iPSC derived macrophages.	Illumina HiSeq 4000	3
EGAD00001007022	In the context of research, this dataset contains 423 IRD samples; 411 of them analyzed with Clinical Exome Sequencing solutions, and 12 with Whole Exome Sequencing.	NextSeq 500	423
EGAD00001007023	The dataset includes cram files from WGS of 115 tumor samples as well as 43 matched normal tissue or blood. The sequencing was done with a HiSeq X Five instrument.	HiSeq X Five	158
EGAD00001007024	The dataset includes fastq files from 109 tumor samples as well as RNA-seq gene expression R data, RNA-seq transcript expression R data, RNA-seq gene counts matrix, RNA-seq transcript counts matrix, RNA-seq gene FPKM matrix and RNA-seq transcript FPKM matrix for the 109 samples. The sequencing was done with a HiSeq 2000 instrument.	Illumina HiSeq 2000	110
EGAD00001007025	The BEACCON study aimed to address the lack of power of previous studies to identify novel BC predisposition genes by performing extensive sequencing in 12,000 women (11,511 analysed following exclusions) and further enhancing power by using an ‘extreme phenotype’ design with enrichment of familial non-BRCA1 and BRCA2 cases, compared with a control population of older women with ongoing confirmation of cancer-free status at June 2019. Three-quarters of the 1303 candidate genes screened were selected based on empiric evidence from local (69 multi-case BC families) or international whole exome sequencing studies, and the remainder were included to provide detailed coverage of functional pathways with established associations with BC.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina MiSeq	11505
EGAD00001007026	106 Whole Exome Sequencing (WXS) of CMML samples. Paired-end fastq are provided.	Illumina HiSeq 2500	106
EGAD00001007027	The Dutch Microbiome Project (DMP) data includes shotgun metagenomic sequencing of faecal samples from 8,208 Dutch individuals. Paired-end sequencing was performed using the Illumina HiSeq 2000 platform. Data is archived in two batches to facilitate easier data access and upload to EGA. Batch 1 of DMP includes 4396 samples.	Illumina HiSeq 2000	4396
EGAD00001007028	Nanoseq data from sperm from 2 individuals, including technical replicates from one individual (10 total sequences). 8 additional samples and 2 matched normals to call mutations in NanoSeq data (dataset EGAD00001006459).	HiSeq X Ten Illumina NovaSeq 6000	10
EGAD00001007029	Human Induced Pluripotent Stem Cells (hiPSC) are an established patient-specific model system where opportunities are emerging for cell-based therapies. We compared and contrasted hiPSCs derived from different tissues, skin and blood, in the same individual. We show extensive single-nucleotide mutagenesis in all hiPSC lines, although fibroblast-derived hiPSCs (F-hiPSCs) are particularly heavily mutagenized by ultraviolet (UV)-related damage. We utilized genome sequencing data on 454 F-hiPSCs and 44 blood-derived hiPSCs (B-hiPSCs) to gain further insights. Across 324 whole genome sequenced (WGS) F-hiPSCs derived by the Human Induced Pluripotent Stem Cell Initiative (HipSci), UV-related damage is present in ~72% of cell lines, sometimes causing substantial mutagenesis (range 0.25-15 per Mb). Furthermore, we find remarkable genomic heterogeneity between independent F-hiPSC clones derived from the same reprogramming process in the same donor, due to oligoclonal populations within fibroblasts. Combining WGS and exome-sequencing data of 452 HipSci F-hiPSCs, we identify 272 predicted pathogenic mutations in cancer-related genes, of which 21 genes were hit recurrently three or more times, involving 77 (17%) lines. Notably, 151 of 272 mutations were present in starting fibroblast populations suggesting that more than half of putative driver events in F-hiPSCs were acquired in vivo. In contrast, B-hiPSCs reprogrammed from erythroblasts show lower levels of genome-wide mutations (range 0.28-1.4 per Mb), no UV damage, but a strikingly high prevalence of acquired BCOR mutations in ~57% of lines, indicative of strong selection pressure. All hiPSCs had otherwise stable, diploid genomes on karyotypic pre-screening, highlighting how copy-number-based approaches do not have the required resolution to detect widespread nucleotide mutagenesis. This work strongly suggests that models for cell-based therapies require detailed nucleotide-resolution characterization prior to clinical application.	HiSeq X Ten	86
EGAD00001007030	Single-cell RNA-Sequencing of three primary breast cancers, two primary prostate cancers, and a metastatic melanoma sample from Wu et al. (2021) Genome Medicine study. Each tumour was sequenced across different cryopreservation conditions including Fresh Tissue (FT), cryopreserved single-cell suspensions (CCS), cryopreserved solid tissue fragments (CT) and cryopreserved after overnight cold storage (CO). Data was generated using the Chromium controller (10X Genomics) and sequenced on the NextSeq platform.	NextSeq 550	18
EGAD00001007031	Targeted capture sequencing data of peripheral blood mononuclear cells (PBMCs) obtained from 4 ATL patients (6 samples) and 10 HTLV-1-infected asymptomatic carriers.	HiSeq X Ten	15
EGAD00001007032	Human single cells were clonally expanded by culture and whole-genome sequenced. This dataset includes 334 clonal samples and 7 blood bulks from seven individuals (DB2, DB3, DB5, DB6, DB8, DB9, DB10). We extracted genomic DNA materials from clonally expanded cells and matched peripheral blood using DNeasy Blood and Tissue kits (Qiagen) according to the protocol. DNA libraries for WGS were generated by an Accel-NGS 2S Plus DNA Library Kit (Swift Biosciences) from 1 µg of genomic DNA materials. WGS was performed on either the Illumina HiSeq X platform or the NovaSeq 6000 platform to generate mean coverage of 25.2X for 374 clonally expanded cells and 94.8X for 7 matched blood tissues.	Illumina NovaSeq 6000	240
EGAD00001007033	De- and trans-differentiation is a rare and only poorly understood phenomenon in cutaneous melanoma. To study this disease more comprehensively we have retrieved 11 primary cutaneous melanomas from our pathology archives showing biphasic features characterized by a conventional melanoma and additional areas of de-/trans-differentiation as defined by a lack of immunohistochemical expression of all conventional melanocytic markers (S-100 protein, SOX10, Melan-A and HMB-45). The clinical, histologic and immunohistochemical findings were recorded and follow-up was obtained. The patients were mostly elderly (median: 81 years; range: 42-86 years) without significant gender predilection, and the sun-exposed skin of the head and neck area was most commonly affected. The tumors were deeply invasive with a mean tumor thickness of 7 mm (range: 4-80 mm). The dedifferentiated component showed atypical fibroxanthoma-like features in the majority (7), while additional rhabdomyosarcomatous and epithelial transdifferentiation was noted histologically and/or immunohistochemically in two tumors each. The background conventional melanoma component was of desmoplastic (4), superficial spreading (3), nodular (2), lentigo maligna (1) or spindle cell (1) types. For the 7 patients with available follow-up data (median follow-up period of 25 months; range: 8-36 months), 2 died from their disease and 3 developed metastases. Next-generation sequencing of the cohort revealed somatic mutation of established melanoma drivers including mainly NF1 mutations in the conventional component (5 cases), which were also detected in the corresponding de-/trans-differentiated components. In summary, the diagnosis of de-/trans-differentiated melanoma is challenging and depends on the morphologic identification of the conventional melanoma component. Molecular analysis is diagnostically helpful as the mutated gene profile is shared between the conventional and de-/trans-differentiated components. Importantly, de-/trans-differentiation does not appear to confer a more aggressive behavior.	Illumina HiSeq 4000	21
EGAD00001007034	De- and trans-differentiation is a rare and only poorly understood phenomenon in cutaneous melanoma. To study this disease more comprehensively we have retrieved 11 primary cutaneous melanomas from our pathology archives showing biphasic features characterized by a conventional melanoma and additional areas of de-/trans-differentiation as defined by a lack of immunohistochemical expression of all conventional melanocytic markers (S-100 protein, SOX10, Melan-A and HMB-45). The clinical, histologic and immunohistochemical findings were recorded and follow-up was obtained. The patients were mostly elderly (median: 81 years; range: 42-86 years) without significant gender predilection, and the sun-exposed skin of the head and neck area was most commonly affected. The tumors were deeply invasive with a mean tumor thickness of 7 mm (range: 4-80 mm). The dedifferentiated component showed atypical fibroxanthoma-like features in the majority (7), while additional rhabdomyosarcomatous and epithelial transdifferentiation was noted histologically and/or immunohistochemically in two tumors each. The background conventional melanoma component was of desmoplastic (4), superficial spreading (3), nodular (2), lentigo maligna (1) or spindle cell (1) types. For the 7 patients with available follow-up data (median follow-up period of 25 months; range: 8-36 months), 2 died from their disease and 3 developed metastases. Next-generation sequencing of the cohort revealed somatic mutation of established melanoma drivers including mainly NF1 mutations in the conventional component (5 cases), which were also detected in the corresponding de-/trans-differentiated components. In summary, the diagnosis of de-/trans-differentiated melanoma is challenging and depends on the morphologic identification of the conventional melanoma component. Molecular analysis is diagnostically helpful as the mutated gene profile is shared between the conventional and de-/trans-differentiated components. Importantly, de-/trans-differentiation does not appear to confer a more aggressive behavior.	Illumina HiSeq 4000	18
EGAD00001007035	The dataset contains data for n=7211 FINRISK 2002 participants who underwent fecal sampling. Demultiplexed shallow shotgun metagenomic sequences were quality filtered and adapter trimmed using Atropos (Didion et al., 2017), and human filtered using Bowtie2 (Langmead and Salzberg, 2012). The files are in FASTQ format.	Illumina HiSeq 4000	7231
EGAD00001007037	Germ cell tumours (GCTs) are a collection of benign and malignant neoplasms derived from primordial germ cells (PGCs). They are uniquely able to generate embryonic and extraembryonic tissues, which in malignant GCTs carries prognostic and therapeutic significance. The developmental pathways underpinning GCT initiation and histogenesis are incompletely understood. Here, we studied the phylogenetic and transcriptional diversity of 15 malignant gonadal GCTs and four normal testis biopsies by sequencing 131 whole genomes and 416 transcriptomes from 14 gonadal histologies, excised by laser capture microdissection. Our findings demonstrate that tumours were initiated by whole genome duplication likely in embryogenesis, within ~5-8 cell divisions post-PGC specification, followed by chromosome 12p gains associated with invasive disease. Of note, 12p imbalances were not only generated through GCT-typical isochromosomes, but also through non-isochromosomic configurations. Whilst tumours developed along homogenous phylogenetic pathways, they spawned manifold tissues independent of genetic subclonal diversification. A key feature of GCT tissues was the expression of fetal-specific genes. The transcriptional diversity notwithstanding, we found universal transcriptional elements correlated with hallmark 12p gains. Overall, our study reveals stereotyped phylogenies and transcriptomes underpinning the development of GCT that originate in fetal life and may lend themselves to therapeutic manipulation.		416
EGAD00001007038	Germ cell tumours (GCTs) are a collection of benign and malignant neoplasms derived from primordial germ cells (PGCs). They are uniquely able to generate embryonic and extraembryonic tissues, which in malignant GCTs carries prognostic and therapeutic significance. The developmental pathways underpinning GCT initiation and histogenesis are incompletely understood. Here, we studied the phylogenetic and transcriptional diversity of 15 malignant gonadal GCTs and four normal testis biopsies by sequencing 131 whole genomes and 416 transcriptomes from 14 gonadal histologies, excised by laser capture microdissection. Our findings demonstrate that tumours were initiated by whole genome duplication likely in embryogenesis, within ~5-8 cell divisions post-PGC specification, followed by chromosome 12p gains associated with invasive disease. Of note, 12p imbalances were not only generated through GCT-typical isochromosomes, but also through non-isochromosomic configurations. Whilst tumours developed along homogenous phylogenetic pathways, they spawned manifold tissues independent of genetic subclonal diversification. A key feature of GCT tissues was the expression of fetal-specific genes. The transcriptional diversity notwithstanding, we found universal transcriptional elements correlated with hallmark 12p gains. Overall, our study reveals stereotyped phylogenies and transcriptomes underpinning the development of GCT that originate in fetal life and may lend themselves to therapeutic manipulation.	HiSeq X Ten Illumina NovaSeq 6000	-
EGAD00001007039	This dataset includes bam files of WES of clonally related neuroblastoma and teratoma as well as peripheral blood samples as a control. Neuroblastoma and teratoma samples were formalin-fixed paraffin embedded.	Illumina HiSeq 2500	3
EGAD00001007040	The dataset contains high-throughput sequencing data derived from a cancer autopsy series of 10 patients. As part of this study, whole-exome sequencing and RNA-seq were performed for spatially distinct tissue biopsies from the patients. In addition, plasma samples from the patients were sequenced using a custom panel to profile ctDNA. There are 106 files containing whole-exome sequencing data, 107 files containing RNA-seq data, and 9 files containing plasma sequencing data.	Illumina HiSeq 2500	222
EGAD00001007041	We developed Genetic-Epigenetic Tissue Mapping (GETMap) to determine the tissue composition of plasma DNA carrying genetic variants not present in the constitutional genome through comparing their methylation profiles with relevant tissues.	Illumina HiSeq 4000 NextSeq 500	152
EGAD00001007042	Illumina PCR-free sequencing data for individual HV31 generated using DNA from peripheral blood mononuclear cells, to a sequencing depth of ~44×. Sequencing was performed at the Wellcome Centre for Human Genetics on the Illumina Novaseq platform.	unspecified	1
EGAD00001007043	Oxford Nanopore long-read sequencing data for individual HV31 generated using DNA from CD14+ monocytes, to a sequencing depth of ~63×. Sequencing was performed at the Wellcome Centre for Human Genetics using the Oxford Nanopore PromethION platform.	PromethION	1
EGAD00001007044	MGI standard short-read sequencing data for individual HV31 generated using DNA from peripheral blood mononuclear cells, to a sequencing depth of ~57×.	unspecified	1
EGAD00001007045	MGI single-tube long fragment read (stLFR) linked-read sequencing data for individual HV31 generated using DNA from CD14+ monocytes, to a sequencing depth of ~51×.	unspecified	1
EGAD00001007046	10x linked-read sequencing data for individual HV31 generated using DNA from CD14+ monocytes, to a sequencing depth of ~40×. Sequencing was performed at Bart’s and the London Genome Centre on the Illumina HiSeq platform.	unspecified	1
EGAD00001007047	PacBio continuous long read (CLR) sequencing data for individual HV31 generated on PacBio Sequel II instrument, using DNA from CD14+ monocytes, to a sequencing depth of ~35×. Sequencing was performed at the Wellcome Sanger Institute.	Sequel	1
EGAD00001007048	MGI CoolMPS short-read sequencing data for individual HV31 generated using DNA from peripheral blood mononuclear cells, to a sequencing depth of ~57×.	unspecified	1
EGAD00001007049	Bionano DLS optical mapping data for individual HV31 generated using DNA from peripheral blood mononuclear cells, to a molecule depth of ~153×. Optical mapping was performed at the Weatherall Institute of Molecular Medicine using the Bionano Saphyr platform.		1
EGAD00001007050	De novo assembly of eight immune system regions for individual HV31, generated using a multi-platform pipeline. A full description of the generation of these assemblies can be found at https://doi.org/10.1101/2021.02.03.429586.		1
EGAD00001007051	Exome libraries were prepared using 100ng DNA of tumor tissue or matched normal DNA. Exome capture was performed using Agilent SureSelect Human Exome Library Preparation V5 or V6 COSIMC + kits.	unspecified	184
EGAD00001007052	This contains H3K27ac ChIP-seq, RNA-seq and HiC fastq files.	Illumina Genome Analyzer Illumina NovaSeq 6000	5
EGAD00001007055	RNAseq of 55 melanoma tumors that were used as a validation dataset in Garg et al Nat Commun, 2021 Feb 18;12(1):1137. doi: 10.1038/s41467-021-21207-2.		-
EGAD00001007056	Low-coverage whole genome sequencing of 29 early breast cancer samples.	Illumina HiSeq 4000	29
EGAD00001007057	Whole-exome sequencing of 30 early breast cancer samples.	Illumina HiSeq 4000	30
EGAD00001007058	CITE-seq of early breast cancer samples.	Illumina HiSeq 4000 Illumina NovaSeq 6000	16
EGAD00001007060	The data are the aggregate results from an IGPP Consortium genome-wide survival study, showing overall risk for Parkinson disease progression associated with each variant in a longitudinal cohort study. 11.2 million deeply imputed variants in 3,821 PD patients who were prospectively tracked with 36,123 visits over a median of 6.7 years from disease onset (inter-quartile range, 4.2 years) were analyzed. Data include hazard ratio, SNP ID, and P value.		1
EGAD00001007061	Amplicon sequencing of (1) wildtype IPC298 cell line grown for 3-4 weeks with DMSO, amplified for ARAF exon 11 (2) IPC298 cells treated for 3-4 weeks with 10uM belvarafenib, isolated colony 9, amplified for ARAF exon 11 (3) MelJuso cell line grown with DMSO, amplified for ARAF exon 11	Illumina HiSeq 4000	3
EGAD00001007062	Whole Exome Sequencing of Belvarafenib resistant IPC-298 clones after treatment for 3-4 weeks with 10uM belvarafenib	Illumina HiSeq 4000	6
EGAD00001007063	Multiregional analysis of three cases of GBM. For each tumor, 9 portions were analyzed by whole exome sequencing. A total of 27 bam files are present in our dataset.	NextSeq 500	27
EGAD00001007064	Shotgun metagenomic sequencing data of a total 2,338 fecal DNA samples from adults of the Pinggu cohort.	unspecified	2338
EGAD00001007066	Epigenome profiling in tumor tissues and paired normal tissues of LUAD patients and transcriptome profiling in tumor tissues of LUAD patients.	HiSeq X Ten	83
EGAD00001007070	This study consists of over 200 data files from cfDNA and germline DNA from 69 patients and 32 healthy normal volunteers discussed in this publication.	Illumina MiSeq	71
EGAD00001007071	PopCol is a cohort study in Stockholm, Sweden that includes a data-rich set of individuals with data available from bowel symptoms questionnaires, gastroenterology visits and biospecimens (genotype and 16S sequencing from blood and stool samples, respectively). Genotyping was carried out using the Illumina HumanOmniExpressExome-8v1 arrays at the SciLifeLab NGI facility in Uppsala, Sweden. Fecal DNA was extracted from samples kept at -80°C using Qiagen 5 QIAamp DNA Stool Mini Kits and analyzed using 16S rRNA gene amplicon sequencing (in the V1-V2 hypervariable region). This was performed on the Illumina MiSeq platform at the Institute of Clinical Molecular Biology (IKMB) in Kiel, Germany. Of these, six PopCol participants were PPI users and 12 used antibiotics. The study was approved by the local Committee of Research Ethics (Forskningskommitté Syd) at Karolinska Institutet, Stockholm, in November 2001. Written informed consent was obtained from all participants	Illumina MiSeq	134
EGAD00001007072	16p11.2 CNV iPSC derived dopaminergic neuron transcriptional gene expression data.	NextSeq 500	17
EGAD00001007073	RNA-seq data of non-tumorous breast tissue. There are 32 samples in this cohort.	Illumina HiSeq 4000	32
EGAD00001007074	The Maastricht IBS cohort with biobank aims to identify subgroups of IBS according to phenotypical and genotypical characterization. This dataset represents 16S amplicon sequencing of the gut microbiome of case samples and matched controls. Fecal DNA was extracted using the Qiagen AllPrep kit with a bead-beating step. Sequencing of the bacterial 16S gene, domain V4, was performed using the Illumina MiSeq platform.	Illumina MiSeq	356
EGAD00001007075	This dataset includes linked-read whole-genome sequencing data (subfolder HFF7VCCXY) for multifocal ileal tumor samples from one patient. Samples were sequenced using the 10x Genomics linked-read whole-genome sequencing (WGS) approach.	Illumina HiSeq 2500	88
EGAD00001007076	We conducted whole-exome sequencing (WES) and microarray profiling on 19 micro-dissected tumor regions of different histologic subtypes from 9 patients with lung cancers of mixed histologic patterns including 6 LUAD, 6 LCNEC, 3 SCLC, 3 LUSC, and one poorly differentiated NSCLC-NOS.	Illumina HiSeq 2000	28
EGAD00001007078	Dataset contains WGS data from clonally expanded hematopoietic stem cells from 7 individual pediatric cancer patients. Samples were taken at diagnosis (DX) or at a later time point (DX2/REM/FU: diagnosis 2, remission or follow-up, respectively). In addition, cord blood clones (designated CB) treated with X-ray radiation, Cisplatin, Maphosphamide, Vincristine and Doxorubicin, as well as untreated cord blood hematopoietic stem/progenitor cells, were whole-genome sequenced (abbreviations RAD, CISPL, MAPH, VINC, DOX and CTRL, respectively).	Illumina NovaSeq 6000	115
EGAD00001007079	RNA-Seq data from vocal fold fibroblasts from controls and patients with Reinke’s edema	NextSeq 550	27
EGAD00001007080	This dataset contains: i) 241 deep (median 12x) whole-genome sequencing profiles of 95 patients with Ewing sarcoma, 31 patients with other pediatric sarcomas, and 22 additional profiles from healthy controls. Sequencing was performed on a NovaSeq 6000 instrument using the NovaSeq S4 2x100 bp configuration. In addition, pilot experiments for 18 cfDNA samples were performed using Illumina HiSeq 2000/2500 machines (2x75 bp configuration). Data is provided as .fastq.gz files (2 files, .R1 and .R2, per sample). ii) Low coverage whole-genome-sequencing on 43 tumor biopsy samples from patients with Ewing sarcoma with matched cfDNA samples. The samples were sequenced using a NovaSeq 6000 instrument with the NovaSeq S4 1x100 bp configuration. Data are provided as unmapped (raw) .bam files. iii) Reduced-representation bisulfite sequencing data for 38 tumor biopsy samples from patients with Ewing sarcoma with matched cfDNA samples, and 2 control samples. RRBS libraries were sequenced on Illumina HiSeq 2000/2500 machines. Data are provided as unmapped (raw) .bam files.	Illumina NovaSeq 6000 unspecified	346
EGAD00001007081	Paired single-cell sequencing dataset of T-cell receptors from both treated and untreated celiac patients. (Amplicon sequencing, paired-end fastq files).	Illumina MiSeq	62
EGAD00001007082	Through the Peruvian Genome Project we generate and analyze the high coverage genomes of 150 individuals where the majority have >90% Native American ancestry and explore questions at the interface of evolutionary genetics, history, anthropology, and medicine. This is the most extensive sampling of high-coverage Native American and mestizo whole genomes to date. Reference: https://doi.org/10.1073/pnas.1720798115	HiSeq X Ten	150
EGAD00001007083	The innate immune response of cells of hepatic origin (Huh7, Huh7.5, PH5CH and primary human hepatocytes (PHH), 66 samples) was analyzed by transcriptome analysis (RNAseq) upon supernatant delivery or transfection of synthetic dsRNA (poly(I:C)). Expression of TLR3 and RIG-I was reconstituted by lentiviral transduction in Huh7 and Huh7.5 cells. The sequencing is single RNA-Seq on an Illumina HiSeq 4000 with the Illumina TruSeq stranded mRNA kit.	Illumina HiSeq 4000	66
EGAD00001007085	The dataset is composed of the raw and processed sequencing data generated from 8 Australian patients and 13 Argentinian patients affected by a form of male infertility characterised by vital, but immotile sperm often in combination with a spectrum of structural abnormalities of the sperm flagellum.	Illumina NovaSeq 6000 NextSeq 500	21
EGAD00001007086	Whole genome sequencing for single cells for library A108757B 1644 cells; filetype=bam	HiSeq X Five	5
EGAD00001007087	Whole genome sequencing for single cells for library A98299B 991 cells; filetype=bam	HiSeq X Five	5
EGAD00001007088	Whole genome sequencing for single cells for library A108759B 1134 cells; filetype=bam	HiSeq X Five	5
EGAD00001007089	Whole genome sequencing for single cells for library A108768B 1126 cells; filetype=bam	HiSeq X Five	5
EGAD00001007090	Whole genome sequencing for single cells for library A108837A 1435 cells; filetype=bam	HiSeq X Five	5
EGAD00001007091	Whole genome sequencing for single cells for library A108846A 1670 cells; filetype=bam	HiSeq X Five	5
EGAD00001007092	Whole genome sequencing for single cells for library A108846B 1636 cells; filetype=bam	HiSeq X Five	5
EGAD00001007093	Whole genome sequencing for single cells for library A108879A 1567 cells; filetype=bam	HiSeq X Five	5
EGAD00001007094	Whole genome sequencing for single cells for library A110632A 1151 cells; filetype=bam	HiSeq X Five	5
EGAD00001007095	Whole genome sequencing for single cells for library A110632B 892 cells; filetype=bam	HiSeq X Five	5
EGAD00001007096	Whole genome sequencing for single cells for library A118833A 919 cells; filetype=bam	HiSeq X Five	5
EGAD00001007097	Whole genome sequencing for single cells for library A118833B 1147 cells; filetype=bam	HiSeq X Five	5
EGAD00001007098	Whole genome sequencing for single cells for library A118845B 1209 cells; filetype=bam	HiSeq X Five	5
EGAD00001007099	Whole genome sequencing for single cells for library A118869A 1165 cells; filetype=bam	HiSeq X Five	5
EGAD00001007100	Whole genome sequencing for single cells for library A118869B 1124 cells; filetype=bam	HiSeq X Five	5
EGAD00001007101	Whole genome sequencing for single cells for library A73044B 2085 cells; filetype=bam	Illumina HiSeq 2500	5
EGAD00001007102	Whole genome sequencing for single cells for library A73047D 2091 cells; filetype=bam	Illumina HiSeq 2500	5
EGAD00001007103	Whole genome sequencing for single cells for library A95626A 1040 cells; filetype=bam	HiSeq X Five	5
EGAD00001007104	Whole genome sequencing for single cells for library A95633B 2259 cells; filetype=bam	NextSeq 550	7
EGAD00001007105	Whole genome sequencing for single cells for library A95634B 1335 cells; filetype=bam	HiSeq X Five	5
EGAD00001007106	Whole genome sequencing for single cells for library A95646A 1071 cells; filetype=bam	HiSeq X Five	5
EGAD00001007107	Whole genome sequencing for single cells for library A95653B 1342 cells; filetype=bam	HiSeq X Five	5
EGAD00001007108	Whole genome sequencing for single cells for library A95663B 1364 cells; filetype=bam	HiSeq X Five	7
EGAD00001007109	Whole genome sequencing for single cells for library A95675A 668 cells; filetype=bam	HiSeq X Five	5
EGAD00001007110	Whole genome sequencing for single cells for library A95731B 1307 cells; filetype=bam	HiSeq X Five	5
EGAD00001007111	Whole genome sequencing for single cells for library A96115A 1635 cells; filetype=bam	HiSeq X Five	7
EGAD00001007112	Whole genome sequencing for single cells for library A96115B 1201 cells; filetype=bam	HiSeq X Five	5
EGAD00001007113	Whole genome sequencing for single cells for library A96118A 1193 cells; filetype=bam	HiSeq X Five	5
EGAD00001007114	Whole genome sequencing for single cells for library A96130B 1267 cells; filetype=bam	HiSeq X Five	5
EGAD00001007115	Whole genome sequencing for single cells for library A96178A 788 cells; filetype=bam	HiSeq X Five	5
EGAD00001007116	Whole genome sequencing for single cells for library A96178B 846 cells; filetype=bam	HiSeq X Five	5
EGAD00001007117	Whole genome sequencing for single cells for library A96181C 1216 cells; filetype=bam	HiSeq X Five	5
EGAD00001007118	Whole genome sequencing for single cells for library A96193B 2410 cells; filetype=bam	HiSeq X Five	4
EGAD00001007119	Whole genome sequencing for single cells for library A96225B 959 cells; filetype=bam	HiSeq X Five	4
EGAD00001007120	Whole genome sequencing for single cells for library A96225C 1034 cells; filetype=bam	HiSeq X Five	4
EGAD00001007121	Whole genome sequencing for single cells for library A96240A 1493 cells; filetype=bam	HiSeq X Five	5
EGAD00001007122	Whole genome sequencing for single cells for library A96240B 1683 cells; filetype=bam	HiSeq X Five	5
EGAD00001007123	Whole genome sequencing for single cells for library A98172A 898 cells; filetype=bam	HiSeq X Five	5
EGAD00001007124	Whole genome sequencing for single cells for library A98256A 1372 cells; filetype=bam	HiSeq X Five	5
EGAD00001007125	Whole genome sequencing for single cells for library A98256B 1437 cells; filetype=bam	HiSeq X Five	5
EGAD00001007126	Whole genome sequencing for single cells for library A98257B 1286 cells; filetype=bam	HiSeq X Five	5
EGAD00001007127	Whole genome sequencing for single cells for library A98258B 1357 cells; filetype=bam	HiSeq X Five	5
EGAD00001007128	Whole genome sequencing for single cells for library A98274B 1295 cells; filetype=bam	HiSeq X Five	5
EGAD00001007129	Whole genome sequencing for single cells for library A98282A 1198 cells; filetype=bam	HiSeq X Five	5
EGAD00001007130	Whole genome sequencing for single cells for library A98283A 1137 cells; filetype=bam	HiSeq X Five	5
EGAD00001007131	Whole genome sequencing for single cells for library A98285A 1422 cells; filetype=bam	HiSeq X Five	5
EGAD00001007132	Whole genome sequencing for single cells for library A98295B 1381 cells; filetype=bam	HiSeq X Five	5
EGAD00001007133	This dataset contains the BAM files of WGS data from the paper by He et al.	Illumina NovaSeq 6000	56
EGAD00001007134	This dataset contains the BAM files of WES data from the paper by He et al.	Illumina NovaSeq 6000	64
EGAD00001007135	This dataset contains the BAM files of RNA-seq data from the paper by He et al.	Illumina NovaSeq 6000	52
EGAD00001007136	ADMSC05 WGBS paired end data	HiSeq X Ten	1
EGAD00001007137	ADMSC06 WGBS paired end data	HiSeq X Ten	1
EGAD00001007138	ADMSC07 WGBS paired end data	HiSeq X Ten	1
EGAD00001007139	ADMSC08 WGBS paired end data	HiSeq X Ten	1
EGAD00001007140	Islet-derived-MSC01 WGBS paired end data	HiSeq X Ten	1
EGAD00001007141	Islet-derived-MSC02 WGBS paired end data	HiSeq X Ten	1
EGAD00001007142	Islet-derived-MSC03 WGBS paired end data	HiSeq X Ten	1
EGAD00001007143	Islet-derived-MSC04 WGBS paired end data	HiSeq X Ten	1
EGAD00001007144	Islet-derived-iPSC01 WGBS paired end data	HiSeq X Ten	1
EGAD00001007145	Islet-derived-iPSC02 WGBS paired end data	HiSeq X Ten	1
EGAD00001007146	Pancreas-Islet07 WGBS paired end data	HiSeq X Ten	1
EGAD00001007147	Fat-adipocyte03 WGBS paired end data	HiSeq X Ten	1
EGAD00001007148	Fat-adipocyte04 WGBS paired end data	HiSeq X Ten	1
EGAD00001007149	Fat-adipocyte05 WGBS paired end data	HiSeq X Ten	1
EGAD00001007150	ADMSC01 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007151	ADMSC02 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007152	ADMSC03 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007153	ADMSC05 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007154	ADMSC06 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007155	ADMSC07 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007156	ADMSC08 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007157	Fat-adipocyte03 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007158	Fat-adipocyte04 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007159	Fat-adipocyte05 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007160	Islet-derived-MSC01 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007161	Islet-derived-MSC02 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007162	Islet-derived-MSC03 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007163	Islet-derived-MSC04 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007164	Islet-derived-iPSC01 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007165	Islet-derived-iPSC02 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007166	Pancreas-Islet06 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007167	Pancreas-Islet07 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007168	Pancreas-Islet08 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007169	SMC01 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007170	SMC02 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007171	SMC03 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007172	SMC04 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007173	SMC05 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007174	SMC07 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007175	SMC08 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007176	SMC09 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007177	ADMSC05 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007178	ADMSC06 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007179	ADMSC07 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007180	ADMSC08 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007181	Islet-derived-MSC01 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007182	Islet-derived-MSC02 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007183	Islet-derived-MSC03 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007184	Islet-derived-MSC04 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007185	Islet-derived-iPSC01 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007186	Islet-derived-iPSC02 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007187	Pancreas-Islet06 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007188	Pancreas-Islet07 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007189	Pancreas-Islet08 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007190	Fat-adipocyte03 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007191	Fat-adipocyte04 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007192	Fat-adipocyte05 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007193	Pancreas-Islet02 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007194	Pancreas-Islet02 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007195	Pancreas-Islet02 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007196	Pancreas-Islet02 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007197	Pancreas-Islet02 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007198	Pancreas-Islet02 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007199	Pancreas-Islet02 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007200	Pancreas-Islet03 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007201	Pancreas-Islet03 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007202	Pancreas-Islet03 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007203	Pancreas-Islet03 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007204	Pancreas-Islet03 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007205	Pancreas-Islet03 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007206	Pancreas-Islet03 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007207	Pancreas-Islet04 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007208	Pancreas-Islet04 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007209	Pancreas-Islet04 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007210	Pancreas-Islet04 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007211	Pancreas-Islet04 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007212	Pancreas-Islet04 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007213	Pancreas-Islet04 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007214	Pancreas-beta01 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007215	Pancreas-beta01 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007216	Pancreas-beta01 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007217	Pancreas-beta01 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007218	Pancreas-beta01 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007219	Pancreas-beta01 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007220	Pancreas-beta01 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007221	Fat-Preadipocyte01 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007222	Fat-Preadipocyte01 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007223	Fat-Preadipocyte01 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007224	Fat-Preadipocyte01 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007225	Fat-Preadipocyte01 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007226	Fat-Preadipocyte01 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007227	Fat-Preadipocyte01 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007228	Kidney-Podocyte01 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007229	Kidney-Podocyte01 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007230	Kidney-Podocyte01 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007231	Kidney-Podocyte01 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007232	Kidney-Podocyte01 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007233	Kidney-Podocyte01 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007234	Kidney-Podocyte01 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007235	Kidney-Podocyte03 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007236	Kidney-Podocyte03 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007237	Kidney-Podocyte03 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007238	Kidney-Podocyte03 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007239	Kidney-Podocyte03 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007240	Kidney-Podocyte03 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007241	Kidney-Podocyte03 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007242	Kidney-Podocyte04 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007243	Kidney-Podocyte04 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007244	Kidney-Podocyte04 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007245	Kidney-Podocyte04 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007246	Kidney-Podocyte04 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007247	Kidney-Podocyte04 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007248	Kidney-Podocyte04 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007249	Kidney-mesangial01 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007250	Kidney-mesangial01 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007251	Kidney-mesangial01 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007252	Kidney-mesangial01 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007253	Kidney-mesangial01 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007254	Kidney-mesangial01 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007255	Kidney-mesangial01 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007256	Kidney-mesangial02 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007257	Kidney-mesangial02 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007258	Kidney-mesangial02 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007259	Kidney-mesangial02 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007260	Kidney-mesangial02 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007261	Kidney-mesangial02 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007262	Kidney-mesangial02 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007263	IPS-Fibroblast01 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007264	IPS-Fibroblast01 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007265	IPS-Fibroblast01 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007266	IPS-Fibroblast01 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007267	IPS-Fibroblast01 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007268	IPS-Fibroblast01 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007269	IPS-Fibroblast01 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007270	IPS-Fibroblast02 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007271	IPS-Fibroblast02 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007272	IPS-Fibroblast02 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007273	IPS-Fibroblast02 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007274	IPS-Fibroblast02 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007275	IPS-Fibroblast02 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007276	IPS-Fibroblast02 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007277	IPS-NPC01 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007278	IPS-NPC01 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007279	IPS-NPC01 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007280	IPS-NPC01 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007281	IPS-NPC01 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007282	IPS-NPC01 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007283	IPS-NPC01 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007284	IPS-NPC02 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007285	IPS-NPC02 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007286	IPS-NPC02 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007287	IPS-NPC02 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007288	IPS-NPC02 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007289	IPS-NPC02 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007290	IPS-NPC02 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007291	IPS-ENeuron01 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007292	IPS-ENeuron01 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007293	IPS-ENeuron01 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007294	IPS-ENeuron01 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007295	IPS-ENeuron01 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007296	IPS-ENeuron01 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007297	IPS-ENeuron01 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007298	IPS-ENeuron02 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007299	IPS-ENeuron02 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007300	IPS-ENeuron02 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007301	IPS-ENeuron02 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007302	IPS-ENeuron02 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007303	IPS-ENeuron02 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007304	IPS-ENeuron02 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007305	458 single cell samples of multiple colorectal cancer organoids	NextSeq 500	458
EGAD00001007306	Whole genome sequencing for single cells for library A95646B 507 cells; filetype=bam	HiSeq X Five	5
EGAD00001007307	Molecular profiling by exome sequencing of an AML case following treatment with a BCL2 inhibitor	Illumina HiSeq 2000	6
EGAD00001007308	SMC01 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007309	SMC01 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007310	SMC01 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007311	SMC01 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007312	SMC01 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007313	SMC01 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007314	SMC02 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007315	SMC02 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007316	SMC02 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007317	SMC02 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007318	SMC02 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007319	SMC02 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007320	SMC03 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007321	SMC03 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007322	SMC03 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007323	SMC03 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007324	SMC03 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007325	SMC03 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007326	SMC04 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007327	SMC04 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007328	SMC04 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007329	SMC04 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007330	SMC04 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007331	SMC04 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007332	SMC05 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007333	SMC05 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007334	SMC05 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007335	SMC05 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007336	SMC05 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007337	SMC05 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007338	SMC07 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007339	SMC07 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007340	SMC07 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007341	SMC07 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007342	SMC07 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007343	SMC07 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007344	SMC08 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007345	SMC08 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007346	SMC08 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007347	SMC08 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007348	SMC08 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007349	SMC08 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007350	SMC09 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007351	SMC09 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007352	SMC09 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007353	SMC09 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007354	SMC09 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007355	SMC09 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007356	ADMSC01 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007357	ADMSC01 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007358	ADMSC01 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007359	ADMSC01 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007360	ADMSC01 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007361	ADMSC01 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007362	ADMSC02 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007363	ADMSC02 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007364	ADMSC02 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007365	ADMSC02 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007366	ADMSC02 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007367	ADMSC02 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007368	ADMSC03 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007369	ADMSC03 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007370	ADMSC03 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007371	ADMSC03 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007372	ADMSC03 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007373	ADMSC03 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007374	ADMSC05 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007375	ADMSC05 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007376	ADMSC05 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007377	ADMSC05 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007378	ADMSC05 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007379	ADMSC05 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007380	ADMSC05 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007381	ADMSC06 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007382	ADMSC06 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007383	ADMSC06 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007384	ADMSC06 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007385	ADMSC06 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007386	ADMSC06 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007387	ADMSC06 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007388	ADMSC07 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007389	ADMSC07 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007390	ADMSC07 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007391	ADMSC07 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007392	ADMSC07 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007393	ADMSC07 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007394	ADMSC07 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007395	ADMSC08 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007396	ADMSC08 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007397	ADMSC08 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007398	ADMSC08 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007399	ADMSC08 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007400	ADMSC08 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007401	ADMSC08 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007402	Islet-derived-MSC01 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007403	Islet-derived-MSC01 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007404	Islet-derived-MSC01 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007405	Islet-derived-MSC01 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007406	Islet-derived-MSC01 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007407	Islet-derived-MSC01 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007408	Islet-derived-MSC01 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007409	Islet-derived-MSC02 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007410	Islet-derived-MSC02 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007411	Islet-derived-MSC02 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007412	Islet-derived-MSC02 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007413	Islet-derived-MSC02 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007414	Islet-derived-MSC02 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007415	Islet-derived-MSC02 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007416	Islet-derived-MSC03 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007417	Islet-derived-MSC03 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007424	Islet-derived-MSC03 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007425	Islet-derived-MSC03 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007426	Islet-derived-MSC03 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007427	Islet-derived-MSC03 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007428	Islet-derived-MSC03 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007429	Islet-derived-MSC04 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007430	Islet-derived-MSC04 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007431	Islet-derived-MSC04 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007432	Islet-derived-MSC04 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007433	Islet-derived-MSC04 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007434	Islet-derived-MSC04 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007435	Islet-derived-MSC04 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007436	Islet-derived-iPSC01 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007437	Islet-derived-iPSC01 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007438	Islet-derived-iPSC01 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007439	Islet-derived-iPSC01 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007440	Islet-derived-iPSC01 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007441	Islet-derived-iPSC01 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007442	Islet-derived-iPSC01 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007443	Islet-derived-iPSC02 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007444	Islet-derived-iPSC02 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007445	Islet-derived-iPSC02 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007446	Islet-derived-iPSC02 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007447	Islet-derived-iPSC02 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007448	Islet-derived-iPSC02 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007449	Islet-derived-iPSC02 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007450	Pancreas-Islet06 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007451	Pancreas-Islet06 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007452	Pancreas-Islet06 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007453	Pancreas-Islet06 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007454	Pancreas-Islet06 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007455	Pancreas-Islet06 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007456	Pancreas-Islet06 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007457	Pancreas-Islet07 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007458	Pancreas-Islet07 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007459	Pancreas-Islet07 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007460	Pancreas-Islet07 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007461	Pancreas-Islet07 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007462	Pancreas-Islet07 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007463	Pancreas-Islet07 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007464	Pancreas-Islet08 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007465	Pancreas-Islet08 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007466	Pancreas-Islet08 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007467	Pancreas-Islet08 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007468	Pancreas-Islet08 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007469	Pancreas-Islet08 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007470	Pancreas-Islet08 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007471	Fat-adipocyte03 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007472	Fat-adipocyte03 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007473	Fat-adipocyte03 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007474	Fat-adipocyte03 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007475	Fat-adipocyte03 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007476	Fat-adipocyte03 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007477	Fat-adipocyte03 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007478	Fat-adipocyte04 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007479	Fat-adipocyte04 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007480	Fat-adipocyte04 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007481	Fat-adipocyte04 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007482	Fat-adipocyte04 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007483	Fat-adipocyte04 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007484	Fat-adipocyte04 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007485	Fat-adipocyte05 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007486	Fat-adipocyte05 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007487	Fat-adipocyte05 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007488	Fat-adipocyte05 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007489	Fat-adipocyte05 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007490	Fat-adipocyte05 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007491	Fat-adipocyte05 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007493	Data supporting: "Genomic analysis of response to neoadjuvant chemotherapy in esophageal adenocarcinoma" Izadi et al. WGS for tumour and normal samples. RNAseq for tumour samples.	HiSeq X Five, Illumina HiSeq 2000, Illumina NovaSeq 6000	8
EGAD00001007494	RNA-sequencing of meningiomas for integrative molecular classification.	Illumina HiSeq 2500	124
EGAD00001007495	Single-cell RNA-sequencing of 26 primary breast cancers from the Wu et al. (2021) study. Data was generated using the Chromium controller (10X Genomics) and sequenced on the NextSeq 500 platform.	NextSeq 500	26
EGAD00001007496	Data supporting: "Widespread reorganisation of the regulatory chromatin landscape facilitates resistance to inhibition of oncogenic ERBB2 signalling" Ogden et al. WGS for tumour and normal samples. RNAseq for tumour samples.	HiSeq X Five, Illumina HiSeq 2000	1
EGAD00001007497	One retinoblastoma sample was studied by single-cell RNA-sequencing (10X Genomics Chromium).	Illumina NovaSeq 6000	1
EGAD00001007498		NextSeq 500	1
EGAD00001007499	38 PPTCs were sequenced using RNASeq to identify oncogenic variants in driver-unknown tumors and to explore gene expression patterns. RNA libraries were quantified by qPCR using the Kapa Library Quantification Illumina/ABI Prism Kit protocol (KAPA Biosystems). Libraries were pooled in equimolar quantities and paired-end sequenced on 2 lanes of a High Throughput Run Mode flowcell with the V4 sequencing chemistry on an Illumina HiSeq 2500 platform following Illumina’s recommended protocol to generate paired-end reads 126 bases in length.	Illumina HiSeq 2500	38
EGAD00001007501	The Dutch Microbiome Project (DMP) data includes shotgun metagenomic sequencing of faecal samples from 8,208 Dutch individuals. Paired-end sequencing was performed using the Illumina HiSeq 2000 platform. Data is archived in two batches to facilitate easier data access and upload to EGA. Batch 2 of DMP includes ~400 samples.	Illumina HiSeq 2000	3848
EGAD00001007502	Exome libraries from 47 blood and tissue samples were prepared using Agilent SureSelect Human Exome Library Preparation V5 kit and the Agilent Bravo Automation System fExome libraries were pooled and sequenced with the TruSeq SBS sequencing chemistry using a V4 high throughput flowcell on a HiSeq 2500 platform following Illumina’s recommended protocol. Approximately 6-8 gigabases of raw paired end data of 126-bases were generated per exome library.	Illumina HiSeq 2500	47
EGAD00001007503	Multiple metastatic sites were sampled at autopsy from four patients diagnosed with metastatic colorectal cancer and subjected to whole-genome sequencing using the Illumina HiSeq X Ten platform to identify somatic variants, structural rearrangements and mutational signatures. The number of tumour samples per patient ranged from 6 to 66.	HiSeq X Ten	88
EGAD00001007504		HiSeq X Ten	12
EGAD00001007505	This is the raw data obtained from shallow whole-genome sequencing of plasma DNA (plasma-seq) for calling of somatic copy number alterations as well as focal amplifications from patients with lung cancer.	Illumina MiSeq NextSeq 550	1
EGAD00001007506	This dataset combines single cell transcriptome data from fetal pancreas at 7-10 wpc, embryonic stem cell-derived pancreas progenitors and spheroids generated from both fetal pancreas and human pluripotent stem cell-derived pancreas progenitors.	NextSeq 500	4
EGAD00001007508	This dataset contains shallow whole genome sequencing data from paired cfDNA - tumor DNA samples of various pediatric cancer entities (total 215 samples). Files are provided in fastq format. Samples were sequenced on an Ion Proton sequencer (Thermo Fisher Scientific) or a Hiseq3000 (Illumina). Data analysis is available at https://github.com/rmvpaeme/sWGS_pediatric_cancer.	Illumina HiSeq 3000 Ion Torrent Proton	215
EGAD00001007509	Longitudinal genome-sequencing analysis of 12 patients with metastatic or refractory osteosarcoma. The study was approved at the University Hospital Basel, following the approval of the ethical committee for mutational analysis of anonymized samples (“Ethikkommission beider Basel” ref. 274/12). Informed consent was obtained from all 12 patients. All tumor samples were evaluated by an experienced bone pathologist to confirm the diagnosis. WES and low coverage WGS are aligned against the reference genome GRCh37. More details in the associated publication.	Illumina HiSeq 4000	72
EGAD00001007511	Dataset with 55 whole-exome sequences from Tunisian non-Imazighen samples and 20 whole-exome sequences from Tunisian Imazighen samples.	unspecified	75
EGAD00001007512	RNA sequencing data of anti-SARS-CoV-2 spike IgG (monoclonal or patient serum), poly(I:C) and Fostamatinib treated human primary IL10-M2 macrophages	Illumina HiSeq 4000	14
EGAD00001007515	This dataset contains different samples from a single patient with SEF. The dataset contains whole genome, whole exome and RNAseq information. Two of the DNA sequencing samples also contain matched normals.	HiSeq X Ten Illumina HiSeq 2500 unspecified	6
EGAD00001007516	This dataset contains different samples from a single patient with SEF. The dataset contains whole genome, whole exome and RNAseq information. Two of the DNA sequencing samples also contain matched normals.	HiSeq X Ten Illumina HiSeq 2500 unspecified	6
EGAD00001007520	In this project we performed targeted sequencing of known and suspected melanoma susceptibility genes in a cohort of melanoma patients and matched controls. Our aim was to identify variants that predispose to melanoma development. The melanoma cases used in this study were population ascertained.	Illumina HiSeq 2000	2731
EGAD00001007521	14 samples were processed for single cell DNA sequencing	NextSeq 500	14
EGAD00001007522	13 samples were processed by whole genome sequencing	Illumina NovaSeq 6000	13
EGAD00001007523	17 samples were processed for single cell RNA sequencing (SORT-seq)	NextSeq 500	17
EGAD00001007524	We collected 80 NASH-HCCs formalin fixed paraffin-embedded (FFPE) samples from 5 different institutions. NASH was diagnosed in FFPE samples by at least two expert pathologists following a described histological algorithm (Bedossa et al., Hepatology, 2012). All NASH patients included in the study were HBV- and HCV-negative. Patients reporting alcohol consumption ≥ 20 g/day for women and 30 g/day for men, as well as patients with a known liver disease superimposed to NASH were excluded. Tumour and paired non-tumour gDNA of NASH-HCC FFPE samples was submitted to Whole Exome Sequencing (WES). Exome capture and sequencing library preparation were performed using the SureSelect Human All Exon V5, no UTR hybridization capture kit from Agilent (Target Size 50 Mb). Libraries were sequenced on an Illumina HiSeq 4000 instrument with 100-bp paired-end reads.	Illumina HiSeq 4000	66
EGAD00001007525	This is 10X single cell RNA-seq data of esophageal adenocarcinoma organoids.	Illumina HiSeq 4000	5
EGAD00001007526	Novel optineurin frameshift insertion causing familial frontotemporal dementia and parkinsonism without amyotrophic lateral sclerosis	Illumina HiSeq 4000	4
EGAD00001007527	SPECTA Lung - RP1335 14MG cram files	Illumina HiSeq 2500	38
EGAD00001007528	RNAseq from 13 uveal melanoma patients	Illumina HiSeq 2500	13
EGAD00001007529	November 2020 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	unspecified	10
EGAD00001007530	RNAseq data of total TXVI samples	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000 NextSeq 550	227
EGAD00001007531	Comparative transcriptome of CD34+ hematopoietic progenitors from 4 myeloproliferative patients (MPN) and 4 control donors performed by RNA-Sequencing.	Illumina NovaSeq 6000	8
EGAD00001007532	A molecular signature for IL-10-producing Th1 cells in protozoan parasitic diseases	Illumina HiSeq 2500	18
EGAD00001007533	sn-RNAseq profiling of the impact of a cytokine storm model in human cardiac organoids	NextSeq 550	2
EGAD00001007563	We analyzed chromothripsis in 252 human breast cancers from two patient cohorts (149 metastatic breast cancers, 63 untreated primary tumors, 29 local relapses, 11 longitudinal pairs) using whole-genome and whole-exome (paired) sequencing. A lot of the WGS samples were sequenced on Illumina HiSeq X-Ten using Illumina TruSeq Nano DNA. For exome sequencing Agilent_SureSelect_V5+UTRs has been used (sequencing on Hiseq2000, Hiseq2500 and Hiseq4000).	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	516
EGAD00001007564	G3BP2-KIT drives leukemia amenable to kinase inhibition in Ph-like ALL: RNAseq for human sample	Illumina HiSeq 4000	1
EGAD00001007565	We performed genetic analysis of HLA and immune escape genes in samples from 44 patients sequenced by whole exome sequencing (34 tumor samples, 32 normal samples) and whole genome sequencing (10 tumor samples, 12 normal samples). We also performed HLA targeted sequencing in 26/44 patients (26 tumor samples, 26 normal samples).	Illumina HiSeq 2000 unspecified	139
EGAD00001007566	Matrix of gene x sample RNAseq read count data.		97
EGAD00001007567	Sample and clinical data from the Idiopathic Pulmonary Fibrosis Core Biopsy Study, including disease group, sex, diagnosis, and sample location.		97
EGAD00001007568	RNA-Seq fastq files for 97 Idiopathic Pulmonary Fibrosis samples.	Illumina HiSeq 2500	97
EGAD00001007569	RNAseq data set, Degradation of Janus kinases in CRLF2-rearranged acute lymphoblastic leukemia	Illumina NovaSeq 6000	11
EGAD00001007570	WGS data set of 11 xenograft samples, Degradation of Janus kinases in CRLF2-rearranged acute lymphoblastic leukemia	Illumina NovaSeq 6000	11
EGAD00001007571	Background and Aims: Homologous recombination deficiency (HRD) in pancreatic ductal adenocarcinoma (PDAC) remains poorly defined beyond germline (g) alterations in BRCA1, BRCA2 and PALB2. Methods: We interrogated whole genome sequencing (WGS) data on 391 patients including 49 carriers of pathogenic variants (PVs) in gBRCA and PALB2. HRD classifiers were applied to the dataset and included: 1) the genomic instability score (GIS) used by Myriad MyChoice HRD assay; 2) substitution base signature 3 (SBS3); 3) HRDetect; and, 4) Structural Variant (SV) burden. Clinical outcomes and responses to chemotherapy were correlated with HRD status. Results: Biallelic tumour inactivation of gBRCA or PALB2 was evident in 43/49 germline carriers identifying HRD-PDAC. HRDetect (score ≥0.7) predicted gBRCA1/PALB2 deficiency with highest sensitivity (98%) and specificity (100%). HRD genomic tumour classifiers suggested that 7-10% of PDAC that do not harbor gBRCA/PALB2 have features of HRD. Of the somatic HRDetect cases, 69% were attributed to alterations in BRCA1/2, PALB2, RAD51C/D and XRCC2, and a tandem duplicator phenotype. TP53 loss was more common in BRCA1- compared to BRCA2-associated HRD-PDAC. HRD status was not prognostic in resected PDAC. However in advanced disease the GIS (p=0.02), SBS3 (p=0.03) and HRDetect score (p=0.005) were predictive of platinum response and superior survival. PVs in gATM (n=6) or gCHEK2 (n=2) did not result in HRD-PDAC by any of the classifiers. In four patients, BRCA2 reversion mutations associated with platinum resistance. Conclusions: Germline and parallel somatic profiling of PDAC outperforms germline testing alone in identifying HRD-PDAC. An additional 7-10% of patients without gBRCA/PALB2 mutations may benefit from DNA damage response agents.		-
EGAD00001007572	This data set contains single cell transcriptomes generated using the chromium 10X platform from both fresh cells and nuclei. The samples measured are derived from children with Wilms tumour, Clear Cell Sarcoma of the Kidney (CCSK), or Malignant Rhabdoid Tumours.	Illumina HiSeq 4000 Illumina NovaSeq 6000	1
EGAD00001007573	Purpose Exploratory analyses of CheckMate 066 and 067 trials were conducted to investigate associations of tumor mutational burden (TMB), a 4-gene inflammatory gene expression signature, and BRAF mutation status with tumor response, progression-free survival (PFS), and overall survival (OS) in patients with advanced melanoma. Patients and Methods Patients with known programmed death ligand 1 (PD-L1) expression and BRAF mutation status received nivolumab (NIVO) or dacarbazine in CheckMate 066 and either NIVO, ipilimumab (IPI), or NIVO+IPI in CheckMate 067. Whole exome sequencing and RNA sequencing were used to determine TMB and inflammatory gene expression signature scores, respectively. These biomarkers were evaluated in terms of their association with PFS and OS. Results In the NIVO, NIVO+IPI, and IPI arms of CheckMate 067, longer survival was associated with high (> median) versus low (≤ median) TMB with hazard ratios (HRs) (95% confidence interval [CI]) for PFS of 0.45 (0.30–0.65), 0.55 (0.38–0.81), and 0.60 (0.43–0.82), and for OS of 0.46 (0.30–0.71), 0.53 (0.34–0.82), and 0.52 (0.36–0.74), respectively. For NIVO-treated patients, these results were confirmed in CheckMate 066. A survival benefit was observed with high TMB and absence of BRAF mutation. Survival was associated with high versus low inflammatory signature scores with HRs (95% CI) for PFS of 0.56 (0.34–0.94), 0.40 (0.23–0.72), and 0.43 (0.27–0.70), and for OS of 0.37 (0.20–0.66), 0.38 (0.19–0.74), and 0.46 (0.27–0.79), in the NIVO, NIVO+IPI, and IPI arms, respectively. Weak correlations were observed between PD-L1, TMB, and the inflammatory signature. Conclusions Combined assessment of TMB, inflammatory gene expression signature, and BRAF mutation status may be predictive for response to immunotherapy in advanced melanoma.	Illumina HiSeq 2500	122
EGAD00001007574	ctDNA data from IMvigor010: ctDNA data include TMB_status, cfDNA_extracted_ng, Plasma_Volume_Used_ml, Sample_Call, Sample_Number_Positive_Calls, Sample_Mean_VAF_In_Plasma (%), Sample_MTM_per_mL_In_Plasma for 581 patients across IMvigor010.		-
EGAD00001007575	RNAseq FASTq files from 728 bulk pre-treatment tumors from IMvigor010.	unspecified	728
EGAD00001007576	Clinical data from IMvigor010: Clinical data include race, sex, baseline ecog, tumor stage, node status, prior neoadjuvant, PD-L1 status, number lymph nodes resected, pridis, arm, overall survival and disease free survival for 809 patients across IMvigor010.		-
EGAD00001007577	RNAseq samples from the iAMP21 study	Illumina HiSeq 2500 Illumina HiSeq 4000 NextSeq 550	88
EGAD00001007578	ctDNA guiding adjuvant immunotherapy in urothelial carcinoma (Nature 2021). ABACUS clinical data cut (2020) used in 2021 paper: Clinical data include PCR, RFS_months, RFS_event, and OUTCOME for 40 patients in ABACUS with ctDNA data.		-
EGAD00001007579	ctDNA guiding adjuvant immunotherapy in urothelial carcinoma (Nature 2021). ABACUS ctDNA data used in 2021 paper: ctDNA data include TMB_(mut/Mb), cfDNA_extracted_ng, Plasma_Volume_Used_ml, Sample_Call, Sample_Number_Positive_Calls, Sample_Mean_VAF_In_Plasma (%), Sample_MTM_per_mL_In_Plasma across 40 patients in ABACUS.		-
EGAD00001007580	The Genomic DNA Clean & Concentrator kit (ZYMO Research) was used to remove EDTA from the DNA samples. Sample libraries were prepared using 100 ng of input according to the KAPA HyperPlus Kit (Roche) using Unique Dual Index adapters (Integrated DNA Technologies, Inc.). Exomes were captured using the SeqCap EZ MedExome (Roche Nimblegen) according to SeqCap EZ HyperCap Library v1.0 Guide (Roche) with the xGen Universal blockers – TS Mix (Integrated DNA Technologies, Inc.). The amplified captured sample libraries were paired-end sequenced (2x100 bp) on the Novaseq 6000 platform (Illumina) and aligned to the hg19 reference genome using the Burrows-Wheeler Aligner (BWA), v0.7.15-r1140.	Illumina NovaSeq 6000	209
EGAD00001007581	Sample libraries were prepped using 500 ng of input RNA according to the KAPA RNA HyperPrep Kit with RiboErase (HMR) (Roche) using Unique Dual Index adapters (Integrated DNA Technologies, Inc.). Amplified sample libraries were paired-end sequenced (2x100 bp) on the Novaseq 6000 platform (Illumina) and aligned against the human genome (hg19) using STAR v2.5.4b2.	Illumina NovaSeq 6000	209
EGAD00001007582	ChIP-seq data were generated for a number of selected patients to investigate changes in enhancer and promoter regions. ChIP was performed as described previously with slight modifications (ref. 27). Briefly, cells were crosslinked with 1% formaldehyde for 10 minutes at room temperature and the reaction was quenched with glycine at a final concentration of 0.125 M. Chromatin was sheared using the Covaris S220 focused-ultrasonicator to an average size of 250–350 bp. A total of 2.5 µg of antibody against H3K27ac (Abcam, ab4729) was added to sonicated chromatin of 2 × 10^6 cells and incubated overnight at 4 °C. Protein A sepharose beads (GE healthcare) were added to the ChIP reactions and incubated for 2 h at 4 °C. Beads were washed and chromatin was eluted. After crosslink reversal, RNase A and proteinase K treatment, DNA was extracted with the Monarch PCR & DNA Cleanup kit (NEB). Sequencing libraries were prepared with the NEBNext Ultra II DNA Library Prep Kit for Illumina (NEB) according to the manufacturer’s instructions. The quality of dsDNA libraries was analyzed using the High Sensitivity D1000 ScreenTape Kit (Agilent) and concentrations were assessed with the Qubit dsDNA HS Kit (Thermo Fisher Scientific). Libraries were single-end sequenced on a HiSeq 4000 (Illumina). ChIP-seq reads were aligned to the human reference genome build hg19 with bowtie	Illumina HiSeq 4000	72
EGAD00001007583	ATAC-seq data were generated for a number of selected patients to investigate changes in enhancer and promoter regions. ATAC-seq was essentially carried out as described in ref. 31. Briefly, prior to transposition the viability of the cells was assessed and 1 × 10^6 cells were treated in culture medium with DNase I (Sigma) at a final concentration of 200 U/ml for 30 minutes at 37 °C. After DNase I treatment, cells were washed twice with ice-cold PBS, and cell viability and the corresponding cell count were assessed. 5 × 10^4 cells were aliquoted into a new tube and spun down at 500 × g for 5 minutes at 4 °C, before the supernatant was discarded completely. The cell pellet was resuspended in 50 µl of ATAC-RSB buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2) containing 0.1% NP-40, 0.1% Tween-20, and 1% Digitonin (Promega), and was incubated on ice for 3 minutes to lyse the cells. Lysis was washed out with 1 ml of ATAC-RSB buffer containing 0.1% Tween-20. Nuclei were pelleted at 500 × g for 10 minutes at 4 °C. The supernatant was discarded carefully and the cell pellet was resuspended in 50 µl of transposition mixture (25 µl 2× tagment DNA buffer, 2.5 µl transposase (100 nM final; Illumina), 16.5 µl PBS, 0.5 µl 1% digitonin, 0.5 µl 10% Tween-20, 5 µl H2O) by pipetting up and down six times. The reaction was incubated at 37 °C for 30 minutes with mixing before the DNA was purified using the Monarch PCR & DNA Cleanup Kit (NEB) according to the manufacturer’s instructions. Purified DNA was eluted in 20 µl elution buffer (EB) and 10 µl purified sample was subjected to a ten-cycle PCR amplification using Nextera i7- and i5-index primers (Illumina). Purification and size selection of the amplified DNA were carried out with Agencourt AMPure XP beads. For purification the ratio of sample to beads was set to 1:1.8, whereas for size selection the ratio was set to 1:0.55. Purified samples were eluted in 15 µl of EB. Quality and concentration of the generated ATAC libraries were analyzed using the High Sensitivity D1000 ScreenTape Kit (Agilent) and libraries were sequenced paired-end on a NovaSeq (Illumina). ATAC-seq reads were aligned to the human reference genome build hg19 with bowtie2	NextSeq 500	64
EGAD00001007585	The file data.RData contains data objects needed to run the .rmd template that generates the manuscript's figures. These include ExpressionSets for the training and test sets, module content and annotations, PCA results, the lasso58 signature with coefficients, and Foundation Medicine CDKN2A copy-number alteration data.		1651
EGAD00001007586	This submission consists of 15 volunteer FASTQs, split by chemistry: Chemistry v1.1: 7 FASTQ sets, one set for each volunteer (RA1-7) Chemistry v2: 16 FASTQ sets, two for each volunteer (RA8-15; FASTQ set 1: Gene Expression (GEX); FASTQ set 2: TCR enrichment (VDJ))	NextSeq 550	7
EGAD00001007587	In this prospective study, targeted deep sequencing was performed on a total of 160 primary tumors (474 regions) and 112 lymph nodes from 125 patients with stage I-III lung cancer (LuCaTH). Progressive evolution at clonal divergence scale was observed while specific driver events were positively selected for clonal sweeps during tumor development. Between-region genetic divergence (BRGD) of tumors were assessed and positively correlated with tumor differentiation. A machine learning algorithm was employed to evaluate clinicopathological and molecular parameters of primary tumors underlying lymph node metastasis. By analyzing clonal lineages and metastatic trajectories across multiple nodal stations, we unraveled a common sequential LNM seeding pattern but with divergent modes of clonal spread.		760
EGAD00001007589	Contains 30x coverage whole genome sequence data from 40 HIV+ South Africans. Samples sequenced using Illumina HiSeq. BAM files have been uploaded. Sequenced at Edinburgh Genomics.	HiSeq X Ten	40
EGAD00001007591	Retinoblastoma is a rare childhood cancer of the retina. We studied retinoblastoma by Whole-Exome-Sequencing.	Illumina HiSeq 2000	182
EGAD00001007592	Whole genome sequencing for single cells for library A108732A 1139 cells; filetype=bam	HiSeq X Five	7
EGAD00001007593	Whole genome sequencing for single cells for library A108735A 714 cells; filetype=bam	NextSeq 550	5
EGAD00001007594	Whole genome sequencing for single cells for library A108833B 866 cells; filetype=bam	HiSeq X Five	7
EGAD00001007595	Whole genome sequencing for single cells for library A108842A 1477 cells; filetype=bam	HiSeq X Five	5
EGAD00001007596	Whole genome sequencing for single cells for library A108851B 1681 cells; filetype=bam	HiSeq X Five	5
EGAD00001007597	Whole genome sequencing for single cells for library A108863A 1301 cells; filetype=bam	HiSeq X Five	5
EGAD00001007598	Whole genome sequencing for single cells for library A108870A 1795 cells; filetype=bam	HiSeq X Five	5
EGAD00001007599	Whole genome sequencing for single cells for library A110618A 1184 cells; filetype=bam	HiSeq X Five	5
EGAD00001007600	Whole genome sequencing for single cells for library A110618B 1047 cells; filetype=bam	HiSeq X Five	5
EGAD00001007601	Whole genome sequencing for single cells for library A110621A 1137 cells; filetype=bam	HiSeq X Five	5
EGAD00001007602	Whole genome sequencing for single cells for library A110673A 1089 cells; filetype=bam	HiSeq X Five	5
EGAD00001007603	Whole genome sequencing for single cells for library A110673B 1153 cells; filetype=bam	HiSeq X Five	5
EGAD00001007604	Whole genome sequencing for single cells for library A118830A 912 cells; filetype=bam	HiSeq X Five	5
EGAD00001007605	Whole genome sequencing for single cells for library A118862A 1070 cells; filetype=bam	HiSeq X Five	5
EGAD00001007606	Whole genome sequencing for single cells for library A118862B 1152 cells; filetype=bam	HiSeq X Five	5
EGAD00001007607	Whole genome sequencing for single cells for library A95623A 1897 cells; filetype=bam	HiSeq X Five	5
EGAD00001007608	Whole genome sequencing for single cells for library A95623B 1363 cells; filetype=bam	HiSeq X Five	5
EGAD00001007609	Whole genome sequencing for single cells for library A95668B 1905 cells; filetype=bam	HiSeq X Five	5
EGAD00001007610	Whole genome sequencing for single cells for library A95697B 1233 cells; filetype=bam	HiSeq X Five	20
EGAD00001007611	Whole genome sequencing for single cells for library A95700A 1172 cells; filetype=bam	HiSeq X Five	5
EGAD00001007612	Whole genome sequencing for single cells for library A95703A 740 cells; filetype=bam	NextSeq 550	5
EGAD00001007613	Whole genome sequencing for single cells for library A95720A 1467 cells; filetype=bam	HiSeq X Five	5
EGAD00001007614	Whole genome sequencing for single cells for library A95730A 739 cells; filetype=bam	HiSeq X Five	5
EGAD00001007615	Whole genome sequencing for single cells for library A96114A 798 cells; filetype=bam	NextSeq 550	5
EGAD00001007616	Whole genome sequencing for single cells for library A96124B 1241 cells; filetype=bam	HiSeq X Five	5
EGAD00001007617	Whole genome sequencing for single cells for library A96155C 1316 cells; filetype=bam	HiSeq X Five	5
EGAD00001007618	Whole genome sequencing for single cells for library A96162A 1505 cells; filetype=bam	HiSeq X Five	5
EGAD00001007619	Whole genome sequencing for single cells for library A96183B 1191 cells; filetype=bam	HiSeq X Five	5
EGAD00001007620	Whole genome sequencing for single cells for library A96190A 1833 cells; filetype=bam	HiSeq X Five	5
EGAD00001007621	Whole genome sequencing for single cells for library A96226B 1274 cells; filetype=bam	HiSeq X Five	7
EGAD00001007622	Whole genome sequencing for single cells for library A96233A 637 cells; filetype=bam	HiSeq X Five	5
EGAD00001007623	Whole genome sequencing for single cells for library A96233B 1601 cells; filetype=bam	HiSeq X Five	5
EGAD00001007624	Whole genome sequencing for single cells for library A98168A 1294 cells; filetype=bam	HiSeq X Five	5
EGAD00001007625	Whole genome sequencing for single cells for library A98168B 1290 cells; filetype=bam	HiSeq X Five	5
EGAD00001007626	Whole genome sequencing for single cells for library A98234A 1240 cells; filetype=bam	HiSeq X Five	5
EGAD00001007627	Whole genome sequencing for single cells for library A98234B 1600 cells; filetype=bam	HiSeq X Five	5
EGAD00001007628	Whole genome sequencing for single cells for library A98244A 1271 cells; filetype=bam	HiSeq X Five	7
EGAD00001007629	Whole genome sequencing for single cells for library A98253A 1250 cells; filetype=bam	HiSeq X Five	5
EGAD00001007630	Whole genome sequencing for single cells for library A98253B 1388 cells; filetype=bam	HiSeq X Five	5
EGAD00001007631	Whole genome sequencing for single cells for library A98254A 1490 cells; filetype=bam	HiSeq X Five	7
EGAD00001007632	Whole genome sequencing for single cells for library A98255B 1394 cells; filetype=bam	HiSeq X Five	5
EGAD00001007633	Whole genome sequencing for single cells for library A98271A 1190 cells; filetype=bam	HiSeq X Five	5
EGAD00001007634	Whole genome sequencing for single cells for library A98289A 1066 cells; filetype=bam	HiSeq X Five	5
EGAD00001007635	Whole genome sequencing for single cells for library A98290B 869 cells; filetype=bam	HiSeq X Five	5
EGAD00001007636	Whole genome sequencing for single cells for library A98304A 1560 cells; filetype=bam	HiSeq X Five	7
EGAD00001007637	Whole genome sequencing for single cells for library A98305A 986 cells; filetype=bam	HiSeq X Five	5
EGAD00001007638	Sequencing of libraries enriched for viral particles	NextSeq 550	33
EGAD00001007639	Sequencing of fecal metagenomic libraries	Illumina HiSeq 2000	33
EGAD00001007640	Whole genome sequencing for single cells for library A96120A 1143 cells; filetype=bam	HiSeq X Five	11
EGAD00001007641	Whole genome sequencing for single cells for library A98172B 941 cells; filetype=bam	HiSeq X Five	5
EGAD00001007642	We determined intra- and inter-vascular transcriptional heterogeneity within the circulatory system using single-cell RNA-sequencing of 113 CTCs isolated from four key vascular sites along their dissemination route in ten HCC patients. The dataset consists of 146 paired-end fastq files from 113 single CTCs, HuH-7 cell line, Hep3B cell line, white blood cell, tumor and normal bulk tissue.	Illumina HiSeq 4000	146
EGAD00001007644	Whole exome sequencing data obtained from sorted leukemic clones and a buccal swab as germline reference. The dataset contains raw sequencing data (paired-end reads) for 3 samples (2 leukemic, 1 buccal swab), for a total of 6 fastq files.	unspecified	3
EGAD00001007645	The Genomic DNA Clean & Concentrator kit (ZYMO Research) was used to remove EDTA from the DNA samples. Sample libraries were prepared using 100 ng of input according to the KAPA HyperPlus Kit (Roche) using Unique Dual Index adapters (Integrated DNA Technologies, Inc.). Exomes were captured using the SeqCap EZ MedExome (Roche Nimblegen) according to SeqCap EZ HyperCap Library v1.0 Guide (Roche) with the xGen Universal blockers – TS Mix (Integrated DNA Technologies, Inc.). The amplified captured sample libraries were paired-end sequenced (2x100 bp) on the Novaseq 6000 platform (Illumina) and aligned to the hg19 reference genome using the Burrows-Wheeler Aligner (BWA), v0.7.15-r1140.	Illumina NovaSeq 6000	12
EGAD00001007646	Sample libraries were prepped using 500 ng of input RNA according to the KAPA RNA HyperPrep Kit with RiboErase (HMR) (Roche) using Unique Dual Index adapters (Integrated DNA Technologies, Inc.). Amplified sample libraries were paired-end sequenced (2x100 bp) on the Novaseq 6000 platform (Illumina) and aligned against the human genome (hg19) using STAR v2.5.4b2.	Illumina NovaSeq 6000	12
EGAD00001007647	Bam files of unmapped sequencing reads of samples in EGAD00001007002. When utilized in combination with corresponding aligned bam files in dataset EGAD00001007002, they contain all sequencing reads of samples.	Illumina NovaSeq 6000	64
EGAD00001007648	The dataset contains phenotypes of 14 melanoma biopsies sequenced in connection with the study UV1-hTERT-mm, where the therapeutic cancer vaccine UV1 is combined with ipilimumab in treatment of melanoma patients.		1
EGAD00001007649	Exome sequences of three unrelated individuals of south Asian ancestry from the EXCEED study	Illumina HiSeq 2000	3
EGAD00001007650	A dataset of ER+PR+ breast tumor samples that were analyzed in order to identify mutation enrichment in cis-regulatory elements and cistrome. The repository includes sequencing data from 88 patients. 26/88 were sequenced using ATAC-seq	Illumina HiSeq 2000	26
EGAD00001007651	Three primary adult glioblastoma specimens were dissociated and nuclei were extracted. A portion of the nuclei was used for single-cell ATAC seq and the remainder were submitted for whole genome sequencing, to provide orthogonal validation of copy number variations in the samples compared to single-cell ATAC seq. Samples were sequenced on the Illumina NovaSeq 6000 in paired-end mode.	Illumina NovaSeq 6000	3
EGAD00001007652	R code to run analyses on anonymized data from ABACUS for IMvigor010 ctDNA publication.		-
EGAD00001007653	TPM matrices of counts from RNAseq data for IMvigor010 from bulk pre-treatment tumors.		-
EGAD00001007654	R code to run analyses on anonymized data from IMvigor010 ctDNA publication.		-
EGAD00001007655	This dataset includes all FASTQ files for 11 samples where different capture-based methods for transcriptome profiling have been tested. Specifically, we have the 'traditional' RNA-seq experiment with fresh frozen (FF) material, and 3 different capture methods for the matching formalin-fixed paraffin-embedded samples: Agilent (AGI), IDT, and Twist Biosciences (TBS). In total, there are 43 samples with paired-end FASTQ files (1 sample did not have sufficient material to test all methods). Samples are identified by the R01-R11 IDs with a suffix that indicates the capture method used (or FF for fresh frozen).	Illumina HiSeq 2500 Illumina HiSeq 4000	43
EGAD00001007656	Single cell sequencing analysis for Dengue patients using Smart-seq2 and 10X platforms	Illumina HiSeq 4000	11
EGAD00001007657	Targeted sequencing was applied to an unselected population-based follicular lymphoma (FL) cohort (n=548) diagnosed in the UK's Haematological Malignancy Research Network catchment population of ~4 million (14 centres). DNA extracted from FL samples was sequenced with a 293-gene panel using the Illumina HiSeq 2500. All data are provided in the CRAM format.	Illumina HiSeq 2500	548
EGAD00001007658	Targeted sequencing was applied to an unselected population-based Burkitt lymphoma cohort (n=39) diagnosed in the UK's Haematological Malignancy Research Network catchment population of ~4 million (14 centres). DNA extracted from tumour samples was sequenced with a 293-gene panel using the Illumina HiSeq 2500. All data are provided in the CRAM format.	Illumina HiSeq 2500	39
EGAD00001007660	Treg and Tfh cells (8 samples) from the same donors were subjected to ATAC-seq processing. Paired-end fastq files are supplied.	NextSeq 550	8
EGAD00001007661	Regulatory T cells were FACS sorted and processed with 10x Genomics Chromium Next GEM Single Cell V(D)J Reagent Kits v1.1 sequencing. In total 17 samples were processed. Fastq files are supplied.	NextSeq 550	17
EGAD00001007662	Treg and Tfh cells (8 samples) from the same donors were subjected to RNA-seq (TruSeq Stranded Total RNA) processing. Single-end fastq files are supplied.	NextSeq 550	8
EGAD00001007663	T cells were isolated from tissues and FACS sorted. In total 50 samples were processed in 5 replicates with the Takara SmartSeq Stranded kit. Single-end fastq files are supplied.	NextSeq 550	50
EGAD00001007664	T cells were isolated from tissues and FACS sorted. In total 43 samples were processed in 5 replicates with the Takara SmartSeq Stranded kit. Single-end fastq files are supplied.	NextSeq 550	43
EGAD00001007665	Regulatory T cells were FACS sorted and processed with 10x Genomics Chromium Next GEM Single Cell 5' v2 sequencing. In total 17 samples were processed. Fastq files are supplied.	NextSeq 550	17
EGAD00001007666	Whole exome sequencing of upper urinary tract urothelial carcinoma	Illumina HiSeq 2000	467
EGAD00001007667	RNA sequencing of upper urinary tract urothelial carcinoma	Illumina HiSeq 2000	166
EGAD00001007669	This dataset contains 139 Tumor and Control WGS files for samples in Gerhauser et al., Cancer Cell, 2018, 34:996-1011. The WGS and sequencing protocol was described earlier in Weischenfeldt et al., Cancer Cell, 2013.	HiSeq X Ten Illumina HiSeq 2000	141
EGAD00001007670	RNAseq data set of BCL11B, 519 samples	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000 NextSeq 550	506
EGAD00001007671	Paired-end BAM files with associated index files of 15 AML samples were generated using the Agilent SureSelect XT library capture system with a custom panel, as described previously (Takahashi et al., Blood 2018). Libraries were sequenced using Illumina HiSeq 2500.	Illumina HiSeq 2500	15
EGAD00001007672	Paired-end fastq files of 22 AML and 2 normal bone marrow samples were generated using 10x Genomics 5' single cell RNA sequencing following manufacturer's protocol (CG000086 User Guide RevG) for chemistry version 1.0 and targeting 10000 cells per sample. Libraries were sequenced using Illumina HiSeq 4000 using recommended cycling parameters.	Illumina HiSeq 4000	24
EGAD00001007673	Paired-end fastq files of 22 AML and 2 normal bone marrow samples were generated using 10x Genomics single cell BCR sequencing following manufacturer's protocol (CG000086 User Guide RevG) for chemistry version 1.0 and targeting 10000 cells per sample. Libraries were sequenced using Illumina NovaSeq 6000 using recommended cycling parameters.	Illumina NovaSeq 6000	24
EGAD00001007674	Paired-end fastq files of 22 AML and 2 normal bone marrow samples were generated using 10x Genomics single cell TCR sequencing following manufacturer's protocol (CG000086 User Guide RevG) for chemistry version 1.0 and targeting 10000 cells per sample. Libraries were sequenced using Illumina NovaSeq 6000 using recommended cycling parameters.	Illumina NovaSeq 6000	24
EGAD00001007675	Paired-end fastq files of 22 AML and 2 normal bone marrow samples were generated using 10x Genomics single cell ATAC sequencing following manufacturer's protocol (CG000168 User Guide RevB) for chemistry version 1.0 and targeting 10000 nuclei per sample. Libraries were sequenced using Illumina HiSeq 4000 using recommended cycling parameters.	Illumina HiSeq 4000	24
EGAD00001007676	Complete sequence of the mitochondrial DNA of 87 Hessequa-descendant individuals	PacBio RS II	87
EGAD00001007677	Single-nucleus RNA-sequencing of meningiomas for integrative molecular classification	Illumina NovaSeq 6000	10
EGAD00001007678	Complete reference epigenome (as defined by IHEC) of a SaOS-2 cell line with osteosarcoma.	unspecified	1
EGAD00001007679	Complete reference epigenome (as defined by IHEC) of a lung epithelial cell line with non-small Cell Lung Adenocarcinoma	unspecified	4
EGAD00001007680	Complete reference epigenome (as defined by IHEC) of normal hTERT RPE1 cell line as well as hTERT RPE1 cell lines engineered to express EPC1-PHF1 and JAZF1-SUZ12 fusion proteins, found in Endometrial Stromal Sarcoma.	unspecified	3
EGAD00001007682	Deep targeted sequencing of 56 genes associated with clonal haematopoiesis and haematological malignancy in peripheral blood-derived DNA from 385 older adults, each sampled 2-5 times over ~13 years.	Illumina HiSeq 2500 Illumina MiSeq	1269
EGAD00001007683	Deep targeted sequencing of 56 genes associated with clonal haematopoiesis and haematological malignancy in peripheral blood-derived DNA from 11 older adults, each previously sampled 2-5 times over the preceding ~13 years.	Illumina HiSeq 2500	1
EGAD00001007684	Whole-genome sequencing of 288 single-cell-derived blood colonies from 3 elderly individuals with clonal haematopoiesis.	Illumina NovaSeq 6000	1
EGAD00001007685	HTG EdgeSeq fastq files of bulk baseline tumors from IMbassador250: a randomised phase 3 trial comparing atezolizumab with enzalutamide vs enzalutamide alone in patients with metastatic castration-resistant prostate cancer.	unspecified	400
EGAD00001007686	Samples encompass primary colorectal tumors, adjacent normal colonic mucosa and metastases of 12 patients, collected by medical pathologists from surgically removed specimens. Normal mucosa samples were taken more than 2 cm away from the tumor. Tissues were embedded in optimal cutting temperature (OCT) medium, snap-frozen in liquid nitrogen within 40 minutes of collection and preserved at -80°C. Samples were collected between June 2010 and October 2017 as part of a prospective biobanking project.	Illumina NovaSeq 6000	36
EGAD00001007687	We performed a prospective investigation in Sézary syndrome by the application of a standardized multiparameter flow cytometry, FACS-cell sorting, and RNA-sequencing for an in-depth immunophenotypic and transcriptional profiling of Sézary cells.	Illumina NovaSeq 6000	15
EGAD00001007688	mRNA capture sequencing data of 57 seminal plasma samples.	NextSeq 500	57
EGAD00001007689	Low-input RNA-Seq (~200 cells per sample) of CD4+ T cells in patients with kidney transplants, dialysis, or healthy controls after the second dose of Tozinameran COVID-19 vaccine. RNA-Seq was performed using the Smart-Seq v4 ultra-low input protocol. The dataset includes 4 healthy control samples, 3 kidney transplant recipients, and 4 patients undergoing dialysis.	NextSeq 500	11
EGAD00001007690	ATAC-seq	unspecified	6
EGAD00001007691	Follicular lymphoma (FL) is an indolent cancer of mature B-cells but carries increased risk of transformation to a more aggressive histology over time. We present here comprehensive profiling of both tumor and immune compartments in 6 diagnostic FL biopsies by single-cell RNA sequencing. This confirmed results from 155 FL tumors characterized by mass cytometry (CyTOF), which revealed two distinct evolutionary trajectories with disparate risk of transformation and alternate biologies.		6
EGAD00001007692	Dataset consists of 19 bam files from RNA sequencing experiments batch1 and batch2.	Illumina NovaSeq 6000 NextSeq 500	19
EGAD00001007693	64 left atrial appendages from patients without atrial fibrillation (AF) undergoing cardiac surgery, patients with paroxysmal AF and with persistent AF (~20 per group). Trizol RNA isolation, rRNA depletion, paired transcriptome sequencing on Illumina NovaSeq 6000. Provided are FastQ and BAM files. Additional data (e.g. clinical characteristics, RIN values etc.) can be provided upon reasonable request. The same RNA samples (62 out of 64) were used for miRNA sequencing. Results from miRNA sequencing are stored in the EGA database and are managed by the same DAC.	Illumina NovaSeq 6000	64
EGAD00001007694	64 left atrial appendages from patients without atrial fibrillation (AF) undergoing cardiac surgery, patients with paroxysmal AF and with persistent AF (~20 per group). Trizol RNA isolation, size selection, single end miRNA sequencing on 4 lanes of Illumina NextSeq 500. Provided are FastQ files, one per lane per sample. Additional data (e.g. clinical characteristics, RIN values etc.) can be provided upon request. The same RNA samples (62) were used for transcriptome sequencing. Results from transcriptome sequencing are stored in the EGA database and are managed by the same DAC.	NextSeq 500	62
EGAD00001007695	We analyzed the T-cell receptor (TCR) repertoires from twelve kidney transplant recipients. Six out of the twelve kidney transplant recipients experienced a cellular rejection after kidney transplantation. TCR repertoires of CD4+ and CD8+ T-cells were assessed prior to transplantation and after transplantation at time of allograft biopsy using RNA-based T-cell receptor beta next generation sequencing (NGS). In addition, the pre-formed alloreactive TCR repertoire for each kidney transplant recipient was identified using mixed lymphocyte reaction and donor reactive T-cells were subjected to TCR beta sequencing. In two out of the six patients with cellular rejection the TCR repertoire of graft infiltrating T-cells was additionally captured. This dataset comprises a total of 98 samples. NGS TCR beta libraries of all samples were sequenced on an Illumina NextSeq 500 and raw sequencing data (in the form of fastq files) as well as assembled clonotypes and their counts (in the form of clonotype tables) are provided.	NextSeq 500	98
EGAD00001007696	mRNA capture sequencing data (FASTQ files) of 28 FFPE sarcoma tumors	Illumina MiSeq NextSeq 500	28
EGAD00001007697	mRNA capture sequencing and small RNA sequencing data (FASTQ files) of the exRNAQC study phase 1.	Illumina NovaSeq 6000 NextSeq 500	276
EGAD00001007698	Profiling of 24 human anterior cingulate cortex samples by bulk-tissue RNA-sequencing. Samples were derived from 5 non-neurological control individuals and 19 individuals with Lewy body disease (Parkinson’s disease = 7 individuals; Parkinson’s disease with dementia = 6 individuals; dementia with Lewy bodies = 6 individuals). Paired-end FASTQ files for each of the human samples are provided and are denoted by the suffixes R1 (read 1) and R3 (read 2). Fastp (v 0.20.0), a fast all-in-one FASTQ pre-processor, was used for adapter trimming, read filtering and base correction. Fastp default settings were used for quality filtering and base correction. Further details on parameters used are available here: https://github.com/RHReynolds/RNAseqProcessing/blob/master/QC/prealignmentQC_fastp_PEadapters.R.	Illumina NovaSeq 6000	24
EGAD00001007699	Filtered somatic CNA calls detected using Battenberg from the CPCGGene 666PG study		302
EGAD00001007700	Lowpass whole genome sequencing of 43 single CTCs and one tumor biopsy	Illumina HiSeq 2500	44
EGAD00001007701			1
EGAD00001007702	Whole exome sequencing of FNH and two HCC compartments from a single patient, along with matched germline.	Illumina NovaSeq 6000	4
EGAD00001007703	This dataset includes full tumor transcriptomes from 891 advanced NSCLC tumors. These data originate from pre-treatment samples from two large randomized clinical trials for second-line non-small cell lung cancer (POPLAR and OAK). The patients in these trials were treated with either the PD-L1 inhibitor atezolizumab or chemotherapy.	unspecified	891
EGAD00001007704	To estimate the contribution of early embryogenic cell lineages in adult tissues, we performed deep targeted sequencing on 379 bulk tissues from various organs of the five individuals (DB3, DB6, DB8, DB9, DB10). Of the 441 early embryonic mutations targeted, 411 mutations could have high-quality baits designed for them. DNA libraries were prepared by SureSelectXT Library Prep Kit (Agilent), hybridized to the appropriate capture panel, multiplexed on flow cells, and subjected to paired-end sequencing (150-bp reads) on the NovaSeq 6000 platform (Illumina) with a mean ~2,900x depth of coverage for the early mutations. Sequence reads were trimmed and mapped to the human reference genome (GRCh37) using the BWA-MEM algorithm.	unspecified	379
EGAD00001007705	This is an oral microbiota amplicon dataset derived from adult participants aged over 65 years. It consists of a total of 491 samples, stored in 982 paired-end FASTQ files with sequence lengths of 200 nucleotides generated with an Illumina MiSeq. Of those 491 samples, 347 were used in analysis; the remaining 144 samples are control samples.	Illumina MiSeq	491
EGAD00001007706	WGS TAML samples with their remission controls obtained from bone marrow.	Illumina HiSeq 4000	6
EGAD00001007707	This dataset includes ChIP-seq data for H3K27ac and H3K4me1 on 20 paired samples of colorectal cancer and adjacent normal mucosa. One tumor sample that failed QC is not available.	Illumina HiSeq 2500	39
EGAD00001007708	We performed deep targeted DNA sequencing with a panel of 134 selected cancer-related genes previously identified to be recurrently mutated in BL/B-AL. A Nextera rapid capture custom kit was designed using the Illumina DesignStudio. For every gene, all regions in which previous mutations had been described were covered. For 40 of these genes, the entire coding sequence was covered. Targeted deep sequencing was performed on a MiSeq-Sequencer using MiSeq Reagent Kit v2 (300 cycle) with 24 samples per run. There are 396 samples in this targetedDNAseq-dataset with 298 tumors (288 patients and 10 cell lines) and 98 normals. There are 132 female, 262 male, and 2 unknown gender samples in this dataset.	Illumina MiSeq	396
EGAD00001007709	We collected saliva samples from three nuclear families having 4, 5, and 7 children, respectively. One child in each family had been diagnosed with a pediatric tumor, and neither parent had been diagnosed with cancer. Diagnoses included Wilms tumor, low-grade astrocytoma, and Burkitt’s lymphoma, respectively. We used whole-genome sequencing to profile normal cells from each family member and a linked-read technology.	HiSeq X Ten	22
EGAD00001007710	Transcriptomic sequencing on pre-immunotherapy melanoma patients.	Illumina HiSeq 2500	16
EGAD00001007711	Control cohort of lymphoma samples sequenced with a hybrid capture panel designed to detect translocations and mutations in lymphoma samples. Used in the paper "Robust detection of translocations in lymphoma FFPE samples using Targeted Locus Capture-based sequencing".	Illumina HiSeq 2500 Illumina HiSeq 4000	19
EGAD00001007712	This dataset contains two sets of samples. The reference sample set consists of a total of 669 samples that had been reported previously to be euploid by the NIPTIFY screening test. The validation sample set is based on a previously published validation study by Zilina et al. (1), consisting of 423 samples, of which 259 were high-risk pregnancies that had undergone diagnostic invasive prenatal analysis (1). All samples were sequenced with Illumina NextSeq 500 platform, producing 85 bp single-end reads with an average per-sample coverage of 0.32× at the University of Tartu, Institute of Genomics Core Facility, according to the manufacturer’s standard protocols, as described previously (1). This study was performed with the approval of the Research Ethics Committee of the University of Tartu (#315/T-13). 1. Zilina O, Rekker K, Kaplinski L, Sauk M, Paluoja P, Teder H, et al. Creating basis for introducing non‐invasive prenatal testing in the Estonian public health setting. Prenat Diagn [Internet]. 2019 Dec 6;39(13):1262-8. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1002/pd.5578	NextSeq 550	1092
EGAD00001007713	Paired fastq files (12 pairs, WES) of EGFR treated and untreated PDX models of mCRCs of 2 patients sequenced on Illumina HiSeq 2000; the enrichment kit was Agilent SureSelect V5+UTRs.	Illumina HiSeq 2000	6
EGAD00001007714	Mutations in cancer-associated genes drive tumour outgrowth. However, the timing of driver mutations and dynamics of clonal expansion that lead to human cancers are largely unknown. We used 580,133 somatic mutations from whole-genome sequencing of 1013 clonal haematopoietic colonies to reconstruct the phylogeny of haematopoiesis, from embryogenesis to clinical disease, in 12 patients with myeloproliferative neoplasms which are blood cancers more common in older age. JAK2V617F, the pathognomonic mutation driving the majority of these cancers, was acquired in utero or childhood, with upper estimates of age of acquisition from 33 weeks gestation to 10.8 years, in all 5 patients in whom JAK2V617F was either the only or the first driver event. Driver mutations associated with age-related clonal haematopoiesis occurred prior to or following JAK2V617F, as independent clonal expansions in JAK2V617F-mutated patients, and as large clonal expansions in JAK2V617F-unmutated patients. These mutations were also acquired in utero or childhood, with DNMT3A mutations occurring by 8 weeks of gestation to 7.6 years across 4 patients, and PPM1D mutation occurring by age 5.8yrs in a patient with MPN lacking phenotypic driver mutations. Sequential driver mutation acquisition was common, separated by decades across life, and often outcompeted ancestral clones. The mean latency between JAK2V617F acquisition and clinical presentation was 30 years (range 11-54 years). Rates of clonal expansion were inferred from phylogenetic trees and varied substantially (3% to 190% expansion/year), were affected by additional driver mutations, and were predictive of latency to clinical presentation. Driver mutations and rates of expansion would have been detectable in blood one to four decades before clinical presentation. This study reveals how driver mutation acquisition early in life with life-long growth and evolution underlie adult myeloproliferative neoplasms, providing opportunities for early detection and intervention and a new paradigm for cancer development.	HiSeq X Ten Illumina HiSeq 2000	1029
EGAD00001007715	Mutations in cancer-associated genes drive tumour outgrowth. However, the timing of driver mutations and dynamics of clonal expansion that lead to human cancers are largely unknown. We used 580,133 somatic mutations from whole-genome sequencing of 1013 clonal haematopoietic colonies to reconstruct the phylogeny of haematopoiesis, from embryogenesis to clinical disease, in 12 patients with myeloproliferative neoplasms which are blood cancers more common in older age. JAK2V617F, the pathognomonic mutation driving the majority of these cancers, was acquired in utero or childhood, with upper estimates of age of acquisition from 33 weeks gestation to 10.8 years, in all 5 patients in whom JAK2V617F was either the only or the first driver event. Driver mutations associated with age-related clonal haematopoiesis occurred prior to or following JAK2V617F, as independent clonal expansions in JAK2V617F-mutated patients, and as large clonal expansions in JAK2V617F-unmutated patients. These mutations were also acquired in utero or childhood, with DNMT3A mutations occurring by 8 weeks of gestation to 7.6 years across 4 patients, and PPM1D mutation occurring by age 5.8yrs in a patient with MPN lacking phenotypic driver mutations. Sequential driver mutation acquisition was common, separated by decades across life, and often outcompeted ancestral clones. The mean latency between JAK2V617F acquisition and clinical presentation was 30 years (range 11-54 years). Rates of clonal expansion were inferred from phylogenetic trees and varied substantially (3% to 190% expansion/year), were affected by additional driver mutations, and were predictive of latency to clinical presentation. Driver mutations and rates of expansion would have been detectable in blood one to four decades before clinical presentation. This study reveals how driver mutation acquisition early in life with life-long growth and evolution underlie adult myeloproliferative neoplasms, providing opportunities for early detection and intervention and a new paradigm for cancer development.	Illumina NovaSeq 6000	57
EGAD00001007716	The data consist of 3 BAM files. Two of the three BAMs are tumour FFPE samples (1 repaired-FFPE; 1 unrepaired-FFPE); the other BAM file is sequenced from normal colon tissue.	Illumina NovaSeq 6000	3
EGAD00001007717	The data contain whole transcriptome sequencing of 499 Greenlanders.	unspecified	499
EGAD00001007718	Single-cell gene expression dataset (5’ Chromium 10X) of healthy paediatric volunteers, and paediatric and adult COVID-19 patients. Gene expression was determined from samples of nasal, tracheal and bronchial brushings and blood (PBMCs). In addition to gene expression, PBMCs were assayed by CITE-seq. A subset of samples have VDJ sequencing data for T cell receptors (TCR) and B cell receptors (BCR).	Illumina NovaSeq 6000	268
EGAD00001007722	DNA was extracted from fresh frozen LMS material for 29 untreated tumors (24 primary tumors, 5 metastatic relapses) and 13 tumors treated with radiation (7 primaries, 6 metastatic relapses). DNA from matched blood was used as a normal reference. Whole-genome sequencing was performed using established protocols on Illumina instruments.	Illumina HiSeq 2500	87
EGAD00001007723	We included 3 BAM files of the genome sequencing data: 2 of 3 are from tumour samples, namely 1 repaired-FFPE and 1 unrepaired FFPE; the third BAM file is from normal tissue of FFPE block. There is also a VCF file containing all somatic mutations in the dataset.	Illumina NovaSeq 6000	3
EGAD00001007724	Full clinical data for a cohort of 199 individuals with acute coronary syndrome. Untargeted serum metabolomics using the Metabolon platform for individuals with ACS (n=156). Serum metabolomics using the Nightingale Health (NMR) platform for individuals with ACS and controls (ACS, n=191; controls, n=961).		1
EGAD00001007725	16S rRNA gene V3-V4 region sequenced from 21 saliva samples of BaYaka hunter-gatherer from Congo.	Illumina MiSeq	21
EGAD00001007726	16S rRNA gene V3-V4 region sequenced from 148 saliva samples of Agta hunter-gatherers from Philippines.	Illumina MiSeq	148
EGAD00001007727	16S rRNA gene V3-V4 region sequenced from 15 saliva samples of farmers from Palanan (Philippines)	Illumina MiSeq	15
EGAD00001007728	This dataset includes 406 samples from non-small cell lung cancer patients treated with neoadjuvant anti-PD-1. The Single Cell 5’ V(D)J and 5’ DGE kits (10X Genomics) were used to capture immune repertoire information and gene expression from the same cell in an emulsion-based protocol at the single cell level. Libraries were generated and sequenced on an Illumina NovaSeq instrument using 2x150bp paired end sequencing.	Illumina NovaSeq 6000	406
EGAD00001007729	Sex, age at recruitment (2014-2018), and birthdate of GCAT Cohort individuals.		1
EGAD00001007730	First 20 principal components of 4988 GCAT Cohort individuals genotyped with the Infinium Multi-Ethnic Global (MEGAEX2) array, with data for Chr1-22. Plink files with QC and imputed (SHAPEIT+IMPUTE).		1
EGAD00001007731	Disease diagnoses of GCAT Cohort participants obtained from electronic health records (EHR), mainly including the time period from 2012 to 2017. Disease diagnoses are codified in ICD-9, and the position of diagnosis refers to primary/secondary diagnoses (up to 14 secondary diagnoses per visit). The date and origin of the visit are also specified (AP: primary care, UGR: emergency, AH: hospital care, SMA: outpatient medical service, SMH: hospital medical service).		1
EGAD00001007732	Two DNA samples extracted from GM09237 cell line cultured with either normal medium or medium with no folic acid for 5 days were sequenced using the BGISEQ500 platform (BGI whole genome 100 bp paired-end sequencing 60x) as well as PacBio Sequel long read sequencing.	Sequel unspecified	1
EGAD00001007733	High-precision human leukocyte antigen (HLA) genotyping is crucial for anti-cancer immunotherapy, but existing tools predicting HLA genotypes using next-generation sequencing (NGS) data are insufficiently accurate. We compared the availability and accuracy of eight HLA genotyping tools (OptiType, HLA-HD, PHLAT, seq2HLA, arcasHLA, HLAscan, HLA*LA, and Kourami) using 1,275 cases from the 1000 Genomes Project data and created a new HLA-genotyping algorithm combining tools. Then, we assessed the new algorithm’s performance in 39 in-house samples with normal whole-exome sequencing (WES) data and polymerase chain reaction–sequencing-based typing (PCR-SBT) results.	Illumina HiSeq 2500	39
EGAD00001007734	Whole exome sequencing of cord bloods with activated IL7RA leading to leukemia outgrowth in NSG mice. Four leukemia and two corresponding normal controls were sequenced on an Illumina NextSeq 550 sequencer (paired end 2x 150 bp).	Illumina HiSeq 2500 NextSeq 550	6
EGAD00001007735	Metagenomic sequencing data of human gut microbiome	Illumina NovaSeq 6000	130
EGAD00001007736	Our probands A and B are boys, monozygotic twins, with the clinical diagnosis of severe intellectual impairment, developmental stagnation, and dysphasia. They were diagnosed at the Department of Medical Genetics and Genomics (University Hospital Brno). Parents provided written informed consent, which was approved by the Research Ethics Committee of Masaryk University and Ethics Committee of University Hospital Brno. Peripheral blood samples were collected in sterile heparinized tubes for cytogenetic analysis. Genomic DNA samples were obtained from 1 ml peripheral blood in EDTA, according to the standard DNA isolation process using the MagNaPure system (Roche Diagnostics, Basel, Switzerland). Quality and quantity were checked using a DeNovix DS-11 Spectrophotometer (DeNovix Inc., Wilmington, DE, USA) and Qubit® 2.0 (Thermo Fisher Scientific, Inc., Waltham, MA, USA).	Illumina NovaSeq 6000	4
EGAD00001007737	RNA sequencing from MSTO-211H cell line cultures treated for 72h with vehicle solution, palbociclib 250nM, or abemaciclib 250nM (N = 3, each). RNA-seq prepared using TruSeq Stranded mRNA libraries and sequenced with Illumina HiSeq 4000. Data is in raw fastq format, paired end. Some samples have been split in two lanes, with a final count of 24 fastq files.	Illumina HiSeq 4000	12
EGAD00001007738	Whole exome sequencing of 3 patient-derived cell lines and patient blood (N = 6 samples), performed using Agilent SureSelect All Exon V5 and Illumina HiSeq 4000. Data is in raw fastq format (N = 10 fastq pairs, 20 files total), as some samples were split between two lanes.	Illumina HiSeq 4000	10
EGAD00001007739	RNA sequencing from 12 xenografts implanted using MSTO-211H cell line and treated with vehicle solution, cisplatin + pemetrexed, or palbociclib (N = 4, each). RNA-seq prepared using TruSeq Stranded mRNA libraries and sequenced with Illumina HiSeq 4000. Data is in raw fastq format, paired end. Some samples have been split in two lanes, with a final count of 34 fastq files.	Illumina HiSeq 4000	17
EGAD00001007740	Approximately 200 ng of high-quality genomic DNA was used for library preparation. DNA libraries were prepared using the Human Core Exome Kit according to manufacturer’s recommendations (Twist Bioscience, San Francisco, CA, USA) and then sequenced on Illumina NovaSeq 6000 (Illumina, Inc., San Diego, CA, USA). A detailed protocol of WES data processing and variant analysis is available in Supplementary data.	Illumina NovaSeq 6000	3
EGAD00001007741	Samples from 17 NHL patients who received CD19 CAR-T treatment were sequenced using 10x Genomics. Refer to the manuscript supplementary tables and "https://github.com/hwanglab/hwanglab_2021_tigitCarT" for the sequencing sample sheets, the patient clinical information, processed data, and source code.	Illumina NovaSeq 6000	109
EGAD00001007742	The cytogenetic analysis of probands was performed with a result of normal male karyotypes (46,XY). The microarray analysis on oligonucleotide 180K CGH+SNP microarray platform was then indicated, resulting in the detection of an 8q24.23q24.3 duplication (694 kb) in both probands. The family-based real-time PCR confirmed this CNV in both probands and their unaffected mother. Based on the information obtained from databases mentioned above it was classified as likely benign.	Illumina Genome Analyzer	2
EGAD00001007743	DNA samples were extracted from peripheral blood lymphocytes using a commercially available kit (Puregene Core Kit A, Qiagen) according to manufacturer's protocol. Agilent SurePrint G3 Human CGH Microarray 180K platform was used for screening of copy number aberrations (CNAs) using the array-CGH protocol recommended by the manufacturer (Agilent Technologies). Data mining and interpretation of array-CGH results were performed in the same manner as in our previously published results.	unspecified	4
EGAD00001007744	Five edited and two unedited organoid clones with one clone prior to editing were paired-end whole genome sequenced using Illumina Novaseq 6000 system. The reads were mapped to hg38 genome assembly and data is provided as BAM files.	Illumina NovaSeq 6000	8
EGAD00001007745	35 samples from individuals with colorectal tumors, exome sequencing	Illumina HiSeq 2000 Illumina HiSeq 2500	35
EGAD00001007746	36 samples from individuals with colorectal tumors, whole genome sequencing	HiSeq X Ten	36
EGAD00001007747	48 samples of individuals with rare germline variants in familial multiple myeloma, whole genome sequencing	HiSeq X Ten	1
EGAD00001007748	This dataset includes FASTQ files of low coverage whole genome sequencing of cell free DNA from plasma samples. The samples include 271 plasma samples of patients with an adnexal mass and 125 plasma samples of healthy individuals.	Illumina HiSeq 2500	396
EGAD00001007749	WES: (7 samples, BAM files), scRNA-Seq (4 samples, BAM files), TruSight (56 samples, BAM or FASTQ files)	Illumina NovaSeq 6000 unspecified	67
EGAD00001007750	The dataset contains 2x75bp paired-end sequencing data in DNASE1L3-deficient human subjects. We performed bisulfite sequencing of plasma samples from three DNASE1L3-deficient subjects and one heterozygous parent to investigate how nuclease deficiencies alter plasma cell-free DNA methylation profiles.	Illumina HiSeq 1500 Illumina HiSeq 4000	7
EGAD00001007751	The dataset contains sequencing data in wildtype, Dnase1-deficient and Dnase1l3-deficient mice. We performed 2 x 75bp paired-end whole genome bisulfite sequencing of pooled plasma cell-free DNA (cfDNA) and buffy coat genomic DNA. The effects of DNASE1L3 or DNASE1 deficiency on cfDNA methylation were explored in plasma of mice deficient in these nucleases.	Illumina HiSeq 4000 NextSeq 500	36
EGAD00001007752	This dataset contains two experiments. 1) Single cell RNA-seq of peripheral blood diagnostic samples from patients with MLL-rearranged infant ALL that underwent relapse or not (samples ending in R relapsed, samples ending in N did not), sequenced with SORT-seq (see Cell Systems, 2016, doi:10.1016/j.cels.2016.09.002). For some of the patients, multiple independent plates were produced (each plate is a sample). Barcode-well correspondence can be found here: https://bitbucket.org/princessmaximacenter/sharq/src/master/data/celseq2_bc384-v4.csv . 2) Single cell RNA-seq of peripheral blood diagnostic samples from patients with MLL-rearranged infant ALL that underwent relapse or not (samples ending in R relapsed, samples ending in N did not), sequenced with 10x Genomics Version 2.	NextSeq 500	26
EGAD00001007753	This data set includes bam files of WGS of 36 paired lymphomas in immune-privileged sites and normal controls.	HiSeq X Ten	72
EGAD00001007754	RNA-sequencing on neuroblastoma PDX model COG-N-519 treated with control miR-1283 and test miR-99b-5p mimics. Three samples from each treatment condition were analysed. The NextSeq platform was used for sequencing.	NextSeq 500	1
EGAD00001007755	Dataset from patients with retinal dystrophies.	Illumina MiSeq	93
EGAD00001007756	OV2295-052021 dataset	Illumina HiSeq 2000	1
EGAD00001007758	Shallow WGS of neuroblastoma cell lines with large-scale deletions induced through CRISPR-Cas9 and matching controls. Deletion of 11q was induced in the cell line SKNSH and loss of 6q was induced in the cell line NMB.	Illumina HiSeq 2000	13
EGAD00001007759	Raw sequencing reads of ATAC-seq of spermatogonia in FASTQ format, comprising 6 samples sequenced on the Illumina HiSeq 4000 platform.	Illumina HiSeq 4000	6
EGAD00001007760	This dataset contains BAM files, mapped to hg19, from either primary bone marrow cells or sorted human cells after long-term engraftment in NSG mice.	Illumina NovaSeq 6000	229
EGAD00001007761	This file contains read identifiers for local CCS, CLR, ONT and MGI reads for each of the eight selected genomic regions (HLA, KIR, IGH, IGK, IGL, TRA, TRD, and TRG). We extracted these reads by aligning whole-genome sequencing data to a draft whole-genome de novo assembly, and selecting reads that map to contigs representing each region. These reads were involved in the polishing and validation of the HV31 assembly. Please refer to the relevant manuscript (https://doi.org/10.1101/2021.02.03.429586) for additional details. Read identifiers are stored in JSON format. Along with the full FASTQ files, this file enables convenient re-analysis of the HV31 sequencing data in the eight selected regions.		1
EGAD00001007762		Illumina HiSeq 4000 Illumina MiSeq	82
EGAD00001007763	This dataset includes genome-wide autosomal array data and whole mtDNA sequences for 24 Merchero individuals.	Illumina MiSeq	24
EGAD00001007764	Here we present the 1M-scBloodNL study, in which we performed single-cell RNA-seq on 120 individuals of the Northern Netherlands population cohort Lifelines. For each individual peripheral blood mononuclear cells (PBMCs) were sequenced in an unstimulated condition, and after 3 and 24 hour in vitro stimulation with C. albicans (CA), M. tuberculosis (MTB) and P. aeruginosa (PA), totalling approximately 1.3 million cells. scRNA-seq was conducted with the 10X Genomics 3'-end v2 (72 libraries) and v3 (33 libraries) technology. In general, each library contains PBMCs from 8 donors and 2 different stimulation-timepoint combinations. Donors were demultiplexed using a combination of SoupOrCell (https://www.nature.com/articles/s41592-020-0820-1) and genotype information to assign the correct donor to a donor-specific cell cluster.	Illumina NovaSeq 6000	988
EGAD00001007765	Cellular suspensions (∼15000 cells, with expected recovery of ∼7500 cells) of sorted CD45+ HLA-DR+ CD14+ macrophages from colonic mucosa and muscularis propria were loaded on the 10X Chromium Controller instrument (10X Genomics) according to the manufacturer’s protocol using the 10X GEMCode proprietary technology. All samples from individual patients were loaded in one batch. The Chromium Single Cell 3´ v2 Reagent kit (10X Genomics) was used to generate the cDNA and prepare the libraries, according to the manufacturer’s protocol. The libraries were then equimolarly pooled and sequenced on an Illumina NextSeq500 using HighOutput flow cells v2.5. A coverage of 400M reads per sample was targeted, in order to obtain 50 000 reads per cell. The raw data were then demultiplexed and processed with the Cell Ranger software (10X Genomics) v2.1.1.	NextSeq 500	8
EGAD00001007766	52 human placenta samples: 5 first-trimester, 7 second-trimester, and 40 term placentas. Data are uploaded as BAM files.	Illumina HiSeq 2000	52
EGAD00001007767	Exome sequencing was carried out in a tall male (height 3.5 SDS) and his parents (3 samples). The data were sequenced on an Illumina HiSeq 2000 and the library was prepared with Agilent SureSelect V4.	Illumina HiSeq 2000	3
EGAD00001007768	We profiled transcriptome and epigenome of BMP signaling effects on H3.3K27M DIPG.	HiSeq X Ten	49
EGAD00001007769	This dataset includes 914 BAM files from 6 IDH-mutant, 5 IDH-wild-type glioma patient samples of unmatched initial and recurrent timepoints profiled using single-cell reduced-representation bisulfite sequencing.	Illumina HiSeq 4000	914
EGAD00001007770	This dataset includes 60 BAM files from HF2354, HF3016 glioblastoma cell lines subjected to continuous stress (hypoxia, 3-day and 9-day), stress followed by recovery (irradiation, 4-day stress exposure and 5-day recovery), and no stress/normoxia controls and profiled using reduced-representation bisulfite sequencing.	Illumina NovaSeq 6000	60
EGAD00001007771	This dataset includes 22 BAM files for tumor tissue and matched normal blood from 6 IDH-mutant, 5 IDH-wild-type glioma patient samples of unmatched initial and recurrent timepoints profiled using whole genome sequencing.	Illumina NovaSeq 6000	22
EGAD00001007772	This dataset includes paired-end fastq files from 6 IDH-mutant, 5 IDH-wild-type glioma patient samples of unmatched initial and recurrent timepoints profiled using single-cell RNA sequencing.	Illumina HiSeq 4000 Illumina NovaSeq 6000	11
EGAD00001007773	This dataset includes genome-wide autosomal array data for 11 Iberian Roma individuals used in the Merchero project.		11
EGAD00001007774	This dataset contains genotypes (35.4M of SNVs, Indels and SVs), from 785 samples, after QC filtering, from the 808 WGS GCAT cohort.		1
EGAD00001007775	Intrahepatic cholangiocyte organoid clone from patients with chronic alcohol consumption, NASH (nonalcoholic steatohepatitis), and PSC (primary sclerosing cholangitis)	HiSeq X Ten	19
EGAD00001007776	This dataset contains whole blood transcriptome data generated from 93 patients with COVID-19 across a range of severities and 23 healthy controls. All patients were PCR positive for SARS-CoV-2 and disease severity ranged from asymptomatic to severe disease requiring ventilation. Individuals without symptoms, or with mild symptoms, were recruited from routine screening of healthcare workers, while COVID-19 patients were recruited at or soon after admission to Addenbrooke’s or Royal Papworth hospitals. Blood samples were taken at recruitment and then again four weeks later. Further details of the cohort and the generation of the RNA-Sequencing data can be obtained from Bergamaschi, L. et al. Longitudinal analysis reveals that delayed bystander CD8+ T cell activation and early immune pathology distinguish severe COVID-19 from mild disease. Immunity 54, 1257-1275 e8 (2021).	Illumina HiSeq 4000	768
EGAD00001007777	This dataset contains multiplexed fastq files containing raw BCR repertoire data	Illumina MiSeq	11
EGAD00001007778	This dataset contains samples from 9 patients with alveolar rhabdomyosarcoma. 9 samples have whole exome data (one has multiple). 6 samples have RNAseq data (one has multiple). 6 samples have matched normal DNA sequence data.	Illumina HiSeq 2000	16
EGAD00001007780	Whole genome sequencing of sick children in neonatal and paediatric intensive care units, aligned to reference assembly GRCh37.	Illumina HiSeq 2000	-
EGAD00001007782	Glioblastoma Patient samples were acquired before and after standard chemoradiation or standard chemoradiation+TTFields (Novo-TTF100) treatment. The set includes paired before-after samples of 6 control patients (chemoradiation) and 6 TTFields+chemoradiation patients.	Illumina HiSeq 2500	24
EGAD00001007783	The PGDP dataset includes 58 whole genome sequences for Papua New Guinean individuals from different locations. DNA was extracted from saliva samples (Oragen kit). Sequencing libraries were prepared using the TruSeq DNA PCR-Free HT kit. 150 bp paired-end sequencing was performed on the Illumina HiSeq X5 sequencer. The PGDP dataset provides Fastq and BAM files.	HiSeq X Five	58
EGAD00001007785	Data from 496 OCCAMS (Oesophageal Cancer Clinical And Molecular Stratification) cases. WGS BAM files 496x oesophageal adenocarcinoma samples 496x normal samples	HiSeq X Five Illumina HiSeq 2000 Illumina NovaSeq 6000	1
EGAD00001007786	This dataset contains the genotypes jointly-called from whole genome sequencing data of 177 self-reported Peranakans in Singapore. Reads were aligned to GRCh37 reference genome and jointly-called with other WGS samples. Basic quality control measures and population phasing without reference were performed on the called genotypes. The data are stored in VCF v 4.3 format, and one .vcf.gz file stored the genotypes from one of the 23 chromosomes (22 autosomes+X chromosome).		177
EGAD00001007787	The study prospectively enrolled patients admitted for HF with LV ejection fraction (LVEF) ≥ 50% and LV wall thickness <12 mm. TTR cardiac amyloidosis was diagnosed according to accepted criteria, which include positive cardiac 99-Tc-DPD scintigraphy in the absence of monoclonal protein expansion in blood. In a cohort of patients with HFpEF without LVH, the prevalence of TTR cardiac amyloidosis was 5%. Transthyretin gene sequencing was performed in positive patients.	NextSeq 500	2
EGAD00001007788	Patient-derived samples were profiled using 10X genomics single-cell CNV and single-cell ATAC kits.	Illumina NovaSeq 6000 NextSeq 500	10
EGAD00001007789	SmMIP libraries using cord blood DNA were generated in replicates and were sequenced on the NovaSeq SP platform (Illumina)		16
EGAD00001007790	SmMIP libraries using bulk cell line DNA and DNA mixes were generated in replicates and were sequenced on the NovaSeq SP platform (Illumina)		44
EGAD00001007791	SmMIP libraries using DNA from patients diagnosed with myeloid malignancies were generated in replicates and were sequenced on the NovaSeq SP platform (Illumina)		336
EGAD00001007792	Gallbladder carcinoma is the most common cancer of the biliary tract with dismal survival largely due to delayed diagnosis. Biliary tract intraepithelial neoplasia (BilIN) is a common benign tumor that is suspected to be a precancerous lesion. However, the genetic and evolutionary relationships between BilIN and carcinoma remain unclear. Here we performed whole-exome sequencing of coexisting low-grade BilIN (adenoma), high-grade BilIN, and carcinoma lesions, and normal tissues from the same patients.	HiSeq X Ten	44
EGAD00001007793	Somatic mutations of RUNX1, which encodes the myeloid and lymphoid transcriptional factor RUNX1, are common in both B- and T- acute lymphoid leukemia (ALL) and are associated with poor prognosis of T-ALL. However, there has been no comprehensive investigation of the pattern or prevalence of RUNX1 germline mutation in both B- and T-ALL. Here we report germline RUNX1 variants in 1.23% of B-ALL and 2.11% of T-ALL, identifying 31 unique variants in 62 B-ALL and 18 unique variants in 26 T-ALL children. The majority of frameshift and nonsense variants affected RUNX1 function in transcriptional regulation, hematopoiesis, and cellular proliferation. We identified JAK3 as the most frequent somatic mutation in T-ALL with RUNX1 variants. These results not only identify RUNX1 as a leukemia predisposition gene but also further underline the importance of germline genetic variants to the development of ALL	Illumina NovaSeq 6000	16
EGAD00001007794	scGBS is a single-cell sequencing-based methodology to haplotype and copy-number profile single cells. Genomic size and complexity is reduced through restriction enzyme digestion and DNA is genotyped through sequencing of the restriction fragments. scGBS data serves as the input for haplarithmisis, an algorithm we previously developed for SNP array-based single-cell haplotyping (Zamani Esteki et al., 2015). We established technical parameters and developed an analysis pipeline enabling accurate concurrent haplotyping and copy-number profiling of single cells with the use of a HapMap cell line pedigree (7 single cells). For clinical validation of the methodology, a total of 14 single blastomeres and 3 trophectoderm biopsies from human preimplantation embryos for 6 PGT-M families were processed with scGBS; these had previously been haplotyped via SNP array.	Illumina HiSeq 2500 NextSeq 500	49
EGAD00001007796	The dataset for Detection and characterization of lung cancer using cell-free DNA fragmentomes includes 872 bam files from whole genome next-generation sequencing on the Illumina HiSeq2500. The samples analyzed include plasma samples from healthy individuals and patients with cancer.	Illumina HiSeq 2500	872
EGAD00001007799	Analysis of RAD51C promoter methylation using targeted bisulfite sequencing (amplicon sequencing) in ovarian cancer pre-clinical models and patient samples.	Illumina MiSeq	20
EGAD00001007800	To better understand variation in metastatic prostate cancer behaviour, we assembled and analyzed longitudinal clinical and autopsy records in 33 men. The dataset is contained in a self-explanatory Excel Workbook, with each patient identified as A1, A2, etc. as listed in the "Combined longitudinal clinical and autopsy phenomic assessment in lethal metastatic prostate cancer: recommendations for advancing precision medicine" publication in European Urology Open Science. Please see Jasu J, Tolonen T, Antonarakis ES, Beltran H, Halabi S, Eisenberger MA, Carducci MA, Loriot Y, Van der Eecken K, Lolkema M, Ryan CJ, Taavitsainen S, Gillessen S, Högnäs G, Talvitie T, Taylor RJ, Koskenalho A, Ost P, Murtola TJ, Rinta-Kiikka I, Tammela T, Auvinen A, Kujala P, Smith TJ, Kellokumpu-Lehtinen PL, Isaacs WB, Nykter M, Kesseli J, Bova GS. Combined Longitudinal Clinical and Autopsy Phenomic Assessment in Lethal Metastatic Prostate Cancer: Recommendations for Advancing Precision Medicine. Eur Urol Open Sci. 2021 Jul 2;30:47-62. doi: 10.1016/j.euros.2021.05.011. PMID: 34337548; PMCID: PMC8317817. for more details.		33
EGAD00001007801	Fastq files generated during target sequencing of 10MB genomic region surrounding top hits in GWAS in a subset of 86 individuals in case families. Paired end sequencing performed on Illumina NextSeq.	NextSeq 500	86
EGAD00001007803	Whole-exome sequencing of IMFT tumor samples from 24 participants in the clinical phase II trial EORTC 90101 “CREATE” (CREATE IMFT cohort)	Illumina HiSeq 4000	24
EGAD00001007804	Whole-genome sequencing of IMFT tumor samples from 24 participants in the clinical phase II trial EORTC 90101 “CREATE” (CREATE IMFT cohort)	Illumina HiSeq 4000	24
EGAD00001007805	Mutational landscape of high-grade B-cell lymphoma with MYC-, BCL2 and/or BCL6 rearrangements characterized by whole-exome sequencing and panel sequencing.	Illumina MiSeq Illumina NovaSeq 6000	73
EGAD00001007806	26 Tumor/Control pairs of WGS data of PCNSL tumors, sequenced on either Illumina HiSeq2000/2500 instruments or HiSeq X Ten. The controls are blood or buffy coat samples in most cases.	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500	52
EGAD00001007807	Paired-end WGS data of 27 neuroblastoma patient samples (10 obtained at diagnosis, 6 at relapse and 11 matched blood samples as controls) used for detection of complex "seismic" amplification. Mean coverage is 24-55x per sample. The remaining patient samples of the dataset can be found under accession number EGAS00001001308.	HiSeq X Ten Illumina HiSeq 2000 Illumina NovaSeq 6000	27
EGAD00001007808	Data supporting "Interplay of processes shapes structural variations undergoing selection in oesophageal adenocarcinoma" Ng, Contino et al. WGS (BAM files) 383 oesophageal adenocarcinoma samples 383 normal samples	Illumina HiSeq 2000	1
EGAD00001007809	Data supporting: "Interplay of processes shapes structural variations undergoing selection in oesophageal adenocarcinoma" Ng, Contino et al. RNAseq (BAM files) 214 oesophageal adenocarcinoma samples	Illumina HiSeq 2000	1
EGAD00001007810	Paired WGS samples, 24 tumor/control pairs of primary CNS lymphoma, sequenced on HiSeq X Ten using Illumina TruSeq Nano DNA for library preparation.	HiSeq X Ten	24
EGAD00001007811	Primary lymphomas of the central nervous system (PCNSL) are diffuse large B-cell lymphomas (DLBCLs) which are confined to the central nervous system (CNS). Paired RNA-Seq sequencing was done on Illumina HiSeq2000 machines using Illumina TruSeq RNA library preparation kit. About 36 tumor samples were sequenced.	Illumina HiSeq 2000	38
EGAD00001007812	We analyzed 34 AGCTs (19 primary and 15 recurrent) and the KGN cell line by RNA-Seq. Our cohort comprised 3 AGCTs WT for FOXL2, 28 heterozygous and 3 homo/hemizygous for the pathogenic variant. Fresh-frozen AGCTs were selected from OVCARE’s Gynecological Tissue Bank in Vancouver, Canada for bulk RNA-seq. RNA was extracted from frozen tissue and sections adjacent to the scrolls submitted for RNA-seq were stained with hematoxylin and eosin (H&E) to evaluate tumour cell purity. Cases with >80% tumour cell purity were selected for sequencing with the majority of cases (29 of 34 patients) containing >90% tumour cells. Ribodepleted RNA libraries were constructed and paired-end sequencing (125 base pair reads) was performed.	Illumina HiSeq 2500	35
EGAD00001007813	RNAseq on 20 samples of multiple myeloma patients and 3 normal plasma cells. RNAseq was performed using 200 ng of total RNA by GATC Biotech. Directional libraries were performed after mRNA selection by polyA selection using UTP method. RNA-seq libraries were sequenced on HiSeq2500 Illumina machine using 100bp paired-end reads. Reads alignment was performed using the STAR aligner (version 2.4.0f1) and human genome hg19 as reference.	Illumina HiSeq 2500	20
EGAD00001007814	This dataset contains ATAC sequencing of plasma cells from multiple myeloma (MM) patients. The data was used to investigate genotype-specific chromatin accessibility quantitative trait locus (caQTL) using caQTLseg (https://github.com/abhisheknrl/caQTLseg). This dataset contains 161 bam files.	unspecified	161
EGAD00001007815	Genotyped data for 28,022 British individuals with South Asian ancestry from the Genes and Health cohort (Feb2020), which were imputed with the GenomeAsia pilot reference panel.		28022
EGAD00001007816	This dataset contains whole exome sequencing (WES) data (various enrichment methods) from tumor DNA samples of various pediatric cancer entities. Files are provided in fastq format. Samples were sequenced on a Novaseq6000 or Hiseq2500 (Illumina).	Illumina HiSeq 2500	10
EGAD00001007817	This dataset contains 538 Tumor and Control WGS and WES files for samples already submitted and published in study EGAS00001004276	Illumina HiSeq 4000 NextSeq 500	400
EGAD00001007818	Some data were previously submitted under study number EGAS00001004276. In this new dataset we provide additional WGS and Avenio Surveillance Panel data. We utilized 43 ALK+ NSCLC patients receiving targeted ALK therapy to evaluate ctDNA levels based on matched panel-based targeted next generation sequencing (tNGS) and untargeted shallow whole genome sequencing (sWGS). For the Avenio panel the sequencing was done on Illumina NextSeq 550 paired end 150 bp, for WGS the sequencing was done on Illumina HiSeq 4000, partly with KAPA_Hyper_Prep_Kit. In this dataset there are 132 WGS tumor samples and 134 panel sequencing data of plasma.	Illumina HiSeq 4000 NextSeq 550	266
EGAD00001007819	Capture lncRNA and totalRNA sequencing of various sample types (including plasma, FFPE and high quality RNA).	Illumina NovaSeq 6000	8
EGAD00001007820	RNAseq of liver organoids with a dG genotype (B20, nt115, U15) and with a TT genotype (nt5, U16, U19) was performed, which was used to study the impact of IFNλ4 on the cellular response to Sendai viral infection.	NextSeq 550	18
EGAD00001007821	Whole genome bisulfite sequencing on 10 multiple myeloma cases. Data quality control and adaptor trimming were performed with the Trimmomatic tool. Paired reads were mapped to the hg19 human reference with the methylCtools aligner.	Illumina Genome Analyzer	1
EGAD00001007822	Enhanced reduced representation bisulfite sequencing (eRRBS) on 45 multiple myeloma samples and 3 normal plasma cell samples. Libraries were sequenced on an Illumina HiSeq 4000 machine using 75bp paired-end reads.	Illumina HiSeq 4000	24
EGAD00001007824	Whole exome sequencing data (bam files) of 55 samples of myxofibrosarcoma and 44 matched pairs.	Illumina HiSeq 2500	99
EGAD00001007825	Myxofibrosarcoma (MFS) is a rare subtype of sarcomas in the elderly, whose genetic basis is poorly understood. To elucidate it, whole genome sequencing was performed.	HiSeq X Ten	10
EGAD00001007826	Myxofibrosarcoma (MFS) is a rare subtype of sarcomas in the elderly, whose genetic basis is poorly understood. To elucidate it, targeted-capture sequencing was performed.	Illumina HiSeq 2500	108
EGAD00001007827	Cryopreserved PBMCs from 10 individuals before and after vaccination were used to perform single cell RNA sequencing. Equal number of cells per individual were pooled together (5 individuals per pool) and single-cell RNA sequencing was performed in paired-end mode on NovaSeq 6000 (Illumina) with a depth of 50,000 reads per cell. DNA was isolated from PBMCs and then used for genotyping by Illumina GSA Beadchip. This dataset contains the fastq sequence files, genotypes of the donors used for demultiplexing the pools and files indicating the linkages between individuals, pools and fastq files. The number of samples listed by EGA does not match the actual number of samples due to limitations on the upload scheme used.	Illumina NovaSeq 6000	4
EGAD00001007828	661 bam files generated from high-throughput RNAseq of tumour biopsies from colorectal cancer patients	NextSeq 500	661
EGAD00001007829	This data set contains BAM files of the RNAseq analysis for three SCCOHT patient tumors. Total mRNA was isolated from fresh frozen tumor samples. RNA sequencing was performed using Illumina HiSeq 4000, paired end 150 bp.	Illumina HiSeq 4000	3
EGAD00001007830	Total collection of Samples. Exome sequencing and RNAseq from Mongolia and Western HCC samples.	Illumina HiSeq 2500 Illumina NovaSeq 6000	550
EGAD00001007831	Samples are from patients enrolled in an international multicentric study aimed to define the genetic determinants of recurrence of membranous nephropathy in the kidney graft. They include 248 samples from patients with MN including 105 patients who received a graft, their 105 graft donors, and 192 controls all of Caucasian origin. Files from targeted-capture of HLA and PLA2R loci are available as fastq files.	Illumina HiSeq 4000	545
EGAD00001007832	Basic phenotypes for BRACOVID cohort.		348
EGAD00001007833	Lab values for BRACOVID cohort.		234
EGAD00001007834	Basic phenotypes for BelCovid2 cohort.		392
EGAD00001007835	Lab values for BelCovid2 cohort.		262
EGAD00001007836	Basic phenotypes for GEN_COVID cohort.		1141
EGAD00001007837	Lab values for GEN_COVID cohort.		739
EGAD00001007838	Basic phenotypes for Hostage1 cohort.		847
EGAD00001007839	Lab values for Hostage1 cohort.		847
EGAD00001007840	Basic phenotypes for Hostage2 cohort.		306
EGAD00001007841	Lab values for Hostage2 cohort.		306
EGAD00001007842	Basic phenotypes for Hostage3 cohort.		71
EGAD00001007843	Lab values for Hostage3 cohort.		71
EGAD00001007844	Basic phenotypes for Hostage4 cohort.		121
EGAD00001007845	Lab values for Hostage4 cohort.		121
EGAD00001007846	Basic phenotypes for INMUNGEN_CoV2 cohort.		367
EGAD00001007847	Lab values for INMUNGEN_CoV2 cohort.		37
EGAD00001007848	Basic phenotypes for SPGRX cohort.		364
EGAD00001007851	Age-related loss of function in the human haematopoietic system is well documented, manifesting as reduced regenerative capacity, age-related cytopenias and immune dysfunction. However, the cellular and population level changes that underpin both this functional decline and the increased risk of clonal haematopoiesis and blood cancer in the elderly remain elusive. Here we performed whole genome sequencing on >3350 single haematopoietic stem cell / multipotent progenitors (HSC/MPP) derived colonies across 10 haematologically normal subjects aged 0 to 81. We found that HSC/MPPs accumulated 17 single nucleotide variants per year post birth and had a reduction in telomere length of 50bp per year throughout young adult life. We reconstructed phylogenies of the sampled HSC/MPPs to interrogate changes in clonal dynamics through life. Haematopoiesis in adults aged less than 65 was predominantly polyclonal, with few known driver mutations. In contrast, individuals aged over 75 displayed a profound change in clonal structure, with frequent clonal expansions, many unexplained by known driver mutations. The ratio of non-synonymous to synonymous mutations revealed widespread positive selection, estimating around 1000 driver mutations in the dataset (10-fold more than the number of known drivers). We identified novel genes ZNF318 and HIST2H3D as being under positive selection, despite not being enriched in myeloid malignancies. Our data show that HSC clonal dynamics is more complex than previously thought. One implication is that by old age, the majority of HSCs carry at least one of a number of largely undescribed driver mutations, which may underlie aspects of their functional decline.	HiSeq X Ten Illumina NovaSeq 6000	-
EGAD00001007853	We have performed single cell RNA-sequencing for infant and childhood B-cell acute lymphoblastic leukemias as well as infant acute myeloid leukemias at diagnosis. The sequencing was performed with 10X Chromium single cell 3’ and 5’ chemistry.	HiSeq X Ten	3
EGAD00001007854	We have performed single cell RNA-sequencing for infant and childhood B-cell acute lymphoblastic leukemias as well as infant acute myeloid leukemias at diagnosis. The sequencing was performed with 10X Chromium single cell 3’ and 5’ chemistry.	Illumina HiSeq 4000 Illumina NovaSeq 6000	5
EGAD00001007856	The dataset consists of - 126 whole exome sequencings (SAMD9/9Lmut: 64; GATA2mut 24, MDS wildtype 38/471) performed using SureSelect Human All Exon V6 enrichment (Agilent, cat# 5190-8863). The generated libraries were sequenced on the Illumina Hiseq 2500 with 150bp paired-end reads. FASTQ files were processed using SeqNext platform (JSI medical system, Germany), with gene-based alignment to a virtual panel of 300 genes (including 28 MDS-associated genes, SAMD9, and SAMD9L), consisting of genes relevant to bone marrow failure, MDS predisposition, and hematological cancers as per the Pan-Cancer studies with cohorts of >10,000 cancers. The respective BAM files are provided. - Custom panel targeting SAMD9, SAMD9L, and 22 single nucleotide polymorphisms (SNP) on chromosome 7q (allele frequency >35% in all ethnic sub-populations in gnomAD) (Ampliseq #IAD104171) were performed in 666/669 cases. And Custom panel targeting 28 MDS-associated genes (GATA2, RUNX1, HOXA9, CEBPA, GATA1, KRAS, NRAS, CBL, PTPN11, ASXL1, EZH2, SETBP1, FLT3, KIT, JAK2, JAK3, CSF3R, MPL, SH2, BCOR, BCORL1; RAD21, STAG2, CTCF, TP53, PTEN, CALR, VPS45) was performed in 544 cases (Ampliseq #IAD51150). Both custom panel libraries were prepared using NEBNext Ultra II DNA library prep kit (New England BioLabs, cat#E7645S/L) per manufacturer’s instruction and samples were sequenced on an Illumina Miseq 2000 with 2 x 150 bp reads. The respective BAM files are provided - 4 SAMD9/9L patients were subjected to MissionBio custom single-cell panel (CO-112) targeting 250 heterozygous gnomAD population polymorphisms on 7q arm and 69 amplicons in SAMD9/9L and other cancer genes. All libraries were sequenced on an Illumina NovaSeq6000 with 150 base-paired ending multiplexed runs. Fastq files were processed using the Tapestri Pipeline V2 and python-based Mosaic package (multi-omics analysis, data visualization). The derived BAM and loom files are provided.	Illumina HiSeq 2500 Illumina MiSeq Illumina NovaSeq 6000	437
EGAD00001007858		HiSeq X Five Illumina HiSeq 4000 Illumina NovaSeq 6000	234
EGAD00001007859	This dataset contained raw sequencing fastq data of the article "A body map of somatic mutagenesis in morphologically normal human tissues". We sampled morphologically normal tissue biopsies from 5 donors. We performed low-depth WGS on 1,764 samples, high-depth WGS on 48 samples and WES on 1,772 samples.	HiSeq X Ten Illumina NovaSeq 6000	1792
EGAD00001007860	Collection of mostly matching primary and recurrent glioblastoma RNA-seq sample pairs, also matching with an earlier DNA sequencing study	Illumina NovaSeq 6000	346
EGAD00001007861	This dataset contains 318 Tumor and Control WGS files submitted in another EGA box for samples for Gerhauser et al.,Cancer Cell, 2018, 34:996-1011. WGS and sequencing protocol was earlier described in Weischenfeldt et al, Cancer Cell, 2013.	Illumina HiSeq 2000	320
EGAD00001007862	The dataset is composed of the raw and processed sequencing data generated from 185 patients affected by azoospermia or severe oligozoospermia recruited from the Netherlands and the UK.	Illumina NovaSeq 6000 NextSeq 500	555
EGAD00001007863	BCL11B PacBio data set, 4 samples	Sequel	4
EGAD00001007864	Part of the project: The INFORM Precision Medicine Study for High-Risk Pediatric Malignancies resulted in the publication of this study: Radiation-induced gliomas represent H3-/IDH-wild type pediatric gliomas with recurrent PDGFRA amplification and loss of CDKN2A/B. This dataset contains the subset of 17 patient exome sequencing data.	Illumina HiSeq 2500 Illumina HiSeq 4000	17
EGAD00001007865	Single-cell multi-omic profiling of COVID19 patients recruited from University College London. Data represent RNA-seq, surface protein measurements (CITE-seq) of 192 antibody targets, along with VDJ-seq profiling of single T cell and B cell receptors. Samples are pooled, with 4 donors per pool. Germ-line genotypes derived from previous single-cell RNA-sequencing are provided (VCF) to aid demultiplexing of single-cell and assignment to specific patient donor samples.	Illumina NovaSeq 6000	21
EGAD00001007866	Single-cell multi-omic profiling of healthy controls, asymptomatic and hospital-admitted COVID19 patients recruited from Newcastle University hospitals. These data also include healthy control volunteers treated with IV-LPS as inflammatory controls. Data represent RNA-seq, surface protein measurements (CITE-seq) of 192 antibody targets, along with VDJ-seq profiling of single T cell and B cell receptors.	Illumina NovaSeq 6000	73
EGAD00001007867	Single-cell multi-omic profiling of healthy controls, asymptomatic and hospital-admitted COVID19 patients recruited from Addenbrookes and Royal Papworth hospitals, in collaboration with the NIHR Cambridge Bioresource. Data represent RNA-seq, surface protein measurements (CITE-seq) of 192 antibody targets, along with VDJ-seq profiling of single T cell and B cell receptors. Samples are pooled, with 4 donors per pool. Germ-line genotypes are provided (VCF) to aid demultiplexing of single-cell and assignment to specific patient donor samples.	Illumina NovaSeq 6000	96
EGAD00001007868	Whole genome sequencing of sick children in neonatal and paediatric intensive care units, aligned to reference assembly GRCh38.	Illumina HiSeq 2000	449
EGAD00001007870	This dataset contains the 22 bam files corresponding to the scRNAseq done in PDX models and cell lines.	Illumina NovaSeq 6000	24
EGAD00001007871	Filtered somatic MT SNV (heteroplasmy) calls detected using mitoCaller (v1.0)		304
EGAD00001007872	To QC the TraCe-seq strategy, single-cell RNA-seq libraries were generated from a variety of human cancer cell lines transduced with the TraCe-seq library. Specifically, 5 different cell lines (PC9, MCF-10A, MDA-MB-231, NCI-H358, and NCI-H1373) were each transduced with a unique TraCe-seq barcode. The transduced cells were selected with puromycin only, dissociated to single cell suspensions, and then mixed together. The complex mixture of the 5 cell lines was profiled by 10X scRNA-seq. Furthermore, transduced NCI-H1373 cells were sorted by FACS to enrich for the top 50% of eGFP positive cells, and sorted cells were cultured briefly and used to construct scRNA-seq libraries and profiled by 10x scRNA-seq. To carry out the full TraCe-seq experiment, ~600 PC9 cells carrying unique TraCe-seq barcodes were expanded over 12 doublings to establish the barcoded population. A subset of the barcoded PC9 population was used to generate scRNA-seq libraries and profiled by 10x scRNA-seq prior to treatment to establish a baseline transcription profile for each barcoded clone. The rest of the cells were then treated for four days with 1 µM erlotinib, 1 µM GNE-069, or 1 µM GNE-104 respectively. scRNA-seq libraries were then generated from the treated cells and profiled by 10x scRNA-seq.	Illumina HiSeq 4000	6
EGAD00001007873	This dataset contains 26 whole-genome sequencing (13 paired tumor and normal), 106 whole-exome sequencing (53 paired tumor and normal), and 43 targeted sequencing data. Sequencing was performed using an Illumina platform. The data are BAM files aligned to the hg19 reference genome.	Illumina HiSeq 2500	175
EGAD00001007874	RNA-seq data from paired tumour and germline samples from mesothelioma patients for study EGAS00001005196	Illumina HiSeq 2500 Illumina NovaSeq 6000	42
EGAD00001007875	Islet-derived_MSC06 WGBS paired end data	HiSeq X Ten	1
EGAD00001007876	Islet-derived_MSC08 WGBS paired end data	HiSeq X Ten	1
EGAD00001007877	Islet-derived_iPSC04 WGBS paired end data	HiSeq X Ten	1
EGAD00001007878	Islet-derived_MSC06 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007879	Islet-derived_MSC08 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007880	Islet-derived_iPSC04 mRNA-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007881	Islet-derived_MSC06 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007882	Islet-derived_MSC08 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007883	Islet-derived_iPSC04 miRNA-Seq single end data	Illumina HiSeq 2500	1
EGAD00001007884	Pancreas-Islet06 WGBS paired end data	HiSeq X Ten	1
EGAD00001007885	Short read whole genome sequencing (WGS) VCF files for the NIHR BioResource Rare Diseases WGS project – Participants from the Hypertrophic Cardiomyopathy (HCM) Rare Disease domain		-
EGAD00001007886	Short Description: This study contains 7 RRBS samples, including 3 ex vivo CD4+ Trm (2 x spleen and 1 x bone marrow) and 4 blood tetanus (TT) and measles (Me) antigen-reactive memory CD4+ cells before and one day post DTaP (diphtheria-tetanus-pertussis) and MMR (measles-mumps-rubella) vaccination, respectively (1 x tetanus D0, 1 x tetanus D1, 1 x measles D0, 1 x measles D1). Technology: Illumina HiSeq 2500 Filetype: fastq format	Illumina HiSeq 2500	7
EGAD00001007887	Pancreas-Islet08 WGBS paired end data	HiSeq X Ten	1
EGAD00001007888	Islet-derived_iPSC04 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007889	Islet-derived_iPSC04 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007890	Islet-derived_iPSC04 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007891	Islet-derived_iPSC04 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007892	Islet-derived_iPSC04 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007893	Islet-derived_iPSC04 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007894	Islet-derived_iPSC04 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007895	Islet-derived_MSC06 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007896	Islet-derived_MSC06 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007897	Islet-derived_MSC06 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007898	Islet-derived_MSC06 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007899	Islet-derived_MSC06 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007900	Islet-derived_MSC06 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007901	Islet-derived_MSC06 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007902	Islet-derived_MSC08 h3k27ac ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007903	Islet-derived_MSC08 h3k27me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007904	Islet-derived_MSC08 h3k36me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007905	Islet-derived_MSC08 h3k4me1 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007906	Islet-derived_MSC08 h3k4me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007907	Islet-derived_MSC08 h3k9me3 ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007908	Islet-derived_MSC08 input ChIP-Seq paired end data	Illumina HiSeq 2500	1
EGAD00001007909	HCA Endometrium_LM The endometrium regenerates monthly and its transformation is executed through dynamic changes in states and interactions of multiple cell types. Using transcriptomics methods we seek to profile changes of the endometrium across the menstrual cycle. Our map will have implications in women's health and cancer, by enabling the interpretation of GWAS analyses or the studying functional consequences of somatic mutations. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina NovaSeq 6000	7
EGAD00001007910	Open chromatin regions in the MYC super-enhancer region were investigated by ATAC-seq in t(3;8) AML. ATAC-seq was performed as described (Buenrostro et al, 2013) with a modification in the lysis buffer (0.30 M sucrose, 10 mM Tris pH 7.5, 60 mM KCl, 15 mM NaCl, 5 mM MgCl2, 0.1 mM EGTA, 0.1% NP40, 0.15 mM Spermine, 0.5 mM Spermidine, 2 mM 6AA) to reduce mitochondrial DNA contamination.	Illumina HiSeq 2500 Illumina NovaSeq 6000	5
EGAD00001007911	DNA (exome) sequencing of uveal melanoma metastases.	Illumina NovaSeq 6000 NextSeq 500	107
EGAD00001007912	RNA sequencing of uveal melanoma metastases.	Illumina NovaSeq 6000 NextSeq 500	22
EGAD00001007913	Single-cell RNA and TCR sequencing of PBMC from patients with uveal melanoma.	Illumina MiSeq NextSeq 500	16
EGAD00001007914	Hi-C (n=72) data from a variety of pediatric brain tumors including ependymoma (PFA, PFB, Ste, spinal), medulloblastoma (G3, G4, SHH), high grade glioma (H3K27 and H3-WT), pilocytic astrocytoma, and more. Raw data provided as FASTQ. Data generated on Illumina HiSeq2500.	Illumina HiSeq 2500	70
EGAD00001007915	RNA-seq (n=52) data from a variety of pediatric brain tumors including ependymoma (PFA, PFB, Ste, spinal), medulloblastoma (G3, G4, SHH), high grade glioma (H3K27 and H3-WT), pilocytic astrocytoma, and more. Raw data provided as FASTQ. Data generated on Illumina HiSeq2500.	Illumina HiSeq 2500	52
EGAD00001007916	NovaSeq whole exome raw sequence files (FASTQ) for breast cancer tumor core biopsies and blood normal control.	Illumina NovaSeq 6000	6
EGAD00001007917	This data contains the TCR-beta sequences of 10 head and neck squamous carcinomas and 19 nasopharyngeal carcinomas. The library preparation method is a customised targeted amplification of the VDJ regions and is sequenced on the Illumina MiSeq.	Illumina MiSeq	29
EGAD00001007918	Targeted sequencing of non-small cell lung cancer samples. BAM files of paired end reads aligned to hg19 using BWA MEM v0.7.1573. This targeted panel covers 370 genes of clinical relevance in non-small cell lung cancer.	Illumina NovaSeq 6000	140
EGAD00001007919	February 2021 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	unspecified	5
EGAD00001007920	This paper describes the work by Akbari V. et al. on detection of allele-specific methylation using Oxford Nanopore sequencing data. They have developed a set of tools, SNVoter and NanoMethPhase, and a workflow which enable the detection of allele-specific methylation even in samples with sparse coverage of nanopore sequencing data.	PromethION	1
EGAD00001007921	Two sections of cryopreserved prostate cancer tissue from one untreated prostate cancer patient were profiled for spatial transcriptomics using the Visium Spatial library preparation protocol from 10x Genomics. The GRCh38 aligned sequencing reads from the two prostate cancer tissue sections are provided as BAM files.	Illumina NovaSeq 6000	2
EGAD00001007922	Raw FASTQ files for 77 RS + DLBCL + CLL samples. RNA-sequencing with single-end 50 nt reads.	Illumina HiSeq 4000	77
EGAD00001007923	PromethION-based whole genome sequencing of endothelial cells differentiated from patient derived induced pluripotent stem cells (iPSCs) of a hemophilic donor, transiently treated with a Cre recombinase, a RecF8 recombinase, and untreated cells. The dataset contains fastq files with all sequencing reads passing the standard quality filtering.	PromethION	3
EGAD00001007930	Mutation analysis of 77 frequently mutated genes in NSCLC in plasma DNA and corresponding PBMCs of NSCLC patients under ICI using the AVENIO Expanded Kit.	Illumina HiSeq 4000 NextSeq 550	516
EGAD00001007931	Anonymised patient metadata and associated data dictionary. For further information regarding this dataset, please contact Alexander Mentzer at contact@combat.ox.ac.uk.		611
EGAD00001007932	SmartSeq2 RNAseq data from 16 samples. For further information regarding this dataset, please contact Julian Knight and Alexander Mentzer at contact@combat.ox.ac.uk.	NextSeq 500	16
EGAD00001007933	mRNA capture seq of uterotubal lavage samples	Illumina NovaSeq 6000	74
EGAD00001007934	Whole exome and RNASeq raw sequencing data for a cohort of 24 patients with non-small cell lung cancer, 15 adenocarcinoma (8 female, 7 male) and 9 squamous cell carcinoma (5 female, 4 male). Median age at diagnosis was 69. Tumour tissue and PBMCs were used for whole exome sequencing and RNA sequencing. This data was generated as part of a study funded by a Cancer Research UK Centres Network Accelerator Award Grant (A21998).	Illumina NovaSeq 6000	83
EGAD00001007935	Whole-exome sequencing (~250X coverage) of primary GBM tumours and matched patient-derived organoids and normal blood. Samples from two spatially distinct regions of seven tumours from five patients (five primary, two recurrent).	Illumina HiSeq 2500	29
EGAD00001007936	Single-cell RNA-seq of primary GBM tumours and matched patient-derived organoids and gliomasphere lines. Obtained using the 10X Genomics single-cell 3' expression solution (v2 chemistry). Primary samples and PDOs from 12 tumours from 10 patients (10 primary, two recurrent), and gliomasphere lines from a subset of five tumours. Samples were obtained from two spatially distinct regions of each tumour.	Illumina HiSeq 2500	99
EGAD00001007937	Single-cell whole-genome sequencing of primary GBM tumours and matched patient-derived organoids. Obtained using the 10X Genomics single-cell CNV solution. Samples from two spatially distinct regions of five tumours from three patients (three primary, two recurrent).	HiSeq X Ten	16
EGAD00001007938	CM214 - Biomarker Analysis From the Phase 3 CheckMate 214 Trial of Nivolumab Plus Ipilimumab (N+I) or Sunitinib (S) in Advanced Renal Cell Carcinoma (aRCC)	Illumina HiSeq 2500 Illumina NovaSeq 6000	213
EGAD00001007939	This dataset contains samples from 9 patients with embryonal rhabdomyosarcoma. 9 samples have whole exome tumor data (one has multiple). 7 samples have tumor RNAseq data. 1 sample has matched normal DNA sequence data.	Illumina HiSeq 2000	10
EGAD00001007940	Whole exome sequencing (WES) data of paired (germline and leukemic) samples of 60 adult patients affected by acute myeloid leukemia.	Illumina HiScanSQ Illumina HiSeq 1000	120
EGAD00001007941	Whole exome sequencing (WES) data of paired (germline and leukemic) samples of 100 adult patients affected by acute myeloid leukemia. Targeted sequencing data of myeloid-related genes of 21 leukemia (not paired) samples from adult patients affected by acute myeloid leukemia.	Illumina HiScanSQ Illumina HiSeq 1000 Illumina HiSeq 2500 Illumina MiSeq NextSeq 550	221
EGAD00001007942	This is raw sequencing data, analysis of which is presented in the paper "Sensitivity to Immune Checkpoint Blockade and Progression-Free Survival is associated with baseline CD8+ T cell clone size and cytotoxicity", DOI: https://doi.org/10.1101/2020.11.15.383786	Illumina HiSeq 4000	134
EGAD00001007943	Intellance-2: rRNA-minus RNA-seq	Illumina NovaSeq 6000	224
EGAD00001007944	Intellance-2: TruSight Tumor 170 panel based RNA-seq	NextSeq 500	222
EGAD00001007945	Intellance-2: TruSight Tumor 170 panel based DNA-seq	NextSeq 500	216
EGAD00001007946	57 Bone marrow specimens for 5 healthy bone marrow and 24 CML samples profiled with 10X scRNA-seq 5' upon separation using MACS for CD34.	Illumina HiSeq 2000 Illumina HiSeq 3000 Illumina HiSeq 4000	57
EGAD00001007947		Illumina HiSeq 4000	3
EGAD00001007948	Transcriptome sequencing of rhabdoid tumor tissue, organoids and SMARCB1-reconstituted organoids	Illumina HiSeq 4000 Illumina NovaSeq 6000	6
EGAD00001007949	Cancer RNA-seq consisting of FASTQ single-end reads from 1 colon-cancer individual. RNA-seq was performed on Illumina. This dataset contains reads from a single region.	Illumina HiSeq 3000	1
EGAD00001007950	Cancer and germline exomes consisting of FASTQ reads from 6 individuals (4 melanoma, 1 lung and 1 colon cancer). Exome sequencing was performed on Illumina with a depth of 100x to 200x. 2 melanoma datasets contain reads from 2 different tumor regions. 2 melanoma datasets contain reads from 1 tumor region and from a tumor-derived cell line. 1 melanoma dataset contains reads from 2 healthy tissues. Colon and lung datasets each contain 1 matched germline-tumor pair.	Illumina HiSeq 4000	17
EGAD00001007951	Cancer RNA-seq consisting of FASTQ paired-end reads from 6 individuals (4 melanoma, 1 lung cancer). RNA-seq was performed on Illumina, TruSeq capture kit, 40M-80M clusters. 2 melanoma datasets contain reads from 1 tumor region and from a tumor-derived cell line. 2 melanoma, 1 colon and 1 lung datasets each contain reads from a single region.	Illumina HiSeq 4000	6
EGAD00001007952	This dataset consists of RNA-seq data from human monocyte-derived macrophages that were subjected to siRNA treatment targeting RAD21 and either left untreated, or stimulated with LPS. In total, it includes 24 samples.	NextSeq 550	24
EGAD00001007953	This dataset consists of ATAC-seq data from human monocytes, monocyte-derived dendritic cells or monocyte-derived macrophages as well as monocyte-derived cells that were subjected to siRNA treatment targeting CTCF or RAD21. In total, it includes 39 samples.	NextSeq 550	39
EGAD00001007954	This dataset consists of ChIP-seq data from human monocytes, monocyte-derived dendritic cells as well as monocyte-derived cells that were subjected to siRNA treatment targeting CTCF or RAD21. ChIP-sequencing was done for H3K27, RAD21 and CTCF. In total, the data set includes 120 samples.	NextSeq 550	120
EGAD00001007955	This dataset consists of in situ HiC-seq data from human monocytes, monocyte-derived dendritic cells as well as monocyte-derived cells that were subjected to siRNA treatment targeting CTCF or RAD21. In total, the data set includes 42 samples.	NextSeq 550	42
EGAD00001007956	This dataset consists of RNA-seq data from human monocytes, monocyte-derived dendritic cells or monocyte-derived macrophages as well as monocyte-derived cells that were subjected to siRNA treatment targeting CTCF or RAD21. In total, it includes 63 samples.	NextSeq 550	63
EGAD00001007957	Bulk RNAseq data from whole blood. For further information regarding this dataset, please contact Katie Burnham and Andrew Kwok at contact@combat.ox.ac.uk.	Illumina NovaSeq 6000	144
EGAD00001007958	Cellular DNA damage caused by reactive oxygen species is repaired by the base excision repair (BER) pathway which includes the DNA glycosylase MUTYH. Inherited biallelic MUTYH mutations cause predisposition to colorectal adenomas and carcinoma. However, the mechanistic progression from germline MUTYH mutations to MUTYH-Associated Polyposis (MAP) is incompletely understood. Here, we sequenced normal cell DNAs from 10 individuals with MAP and studied the somatic mutation burden and mutational signatures.	Illumina NovaSeq 6000	1
EGAD00001007959	gVCF file per patient obtained from the bulk/mini-bulk RNAseq data. For further information regarding this dataset, please contact Stephen Sansom and Alexander Mentzer at contact@combat.ox.ac.uk.		228
EGAD00001007960	fastq and filtered fasta files for B-cell receptor sequencing. For further information regarding this dataset, please contact Rachael Bashford-Rogers at contact@combat.ox.ac.uk.	Illumina MiSeq	96
EGAD00001007961	fastq and filtered fasta files for T-cell receptor sequencing. For further information regarding this dataset, please contact Rachael Bashford-Rogers at contact@combat.ox.ac.uk.	Illumina MiSeq	91
EGAD00001007962	Raw Illumina sequencing data and CellRanger BAM output files. For further information regarding this dataset, please contact Stephen Sansom at contact@combat.ox.ac.uk.	Illumina NovaSeq 6000	10
EGAD00001007963	Raw Illumina sequencing data from single-cell ATACSeq experiments. For further information regarding this dataset, please contact Julian Knight and Tatjana Sauka-Spengler at contact@combat.ox.ac.uk.	Illumina NovaSeq 6000	1
EGAD00001007964	Raw Illumina sequencing data. For further information regarding this dataset, please contact Rachael Bashford-Rogers at contact@combat.ox.ac.uk.	Illumina NovaSeq 6000	10
EGAD00001007965	Raw Illumina sequencing data. For further information regarding this dataset, please contact Benjamin Fairfax and Rachael Bashford-Rogers at contact@combat.ox.ac.uk.	Illumina NovaSeq 6000	10
EGAD00001007966	WGS data set used in the study, 2 samples	Illumina HiSeq 2500	2
EGAD00001007967	RNAseq data set used in the study, 10 samples	Illumina HiSeq 2500	10
EGAD00001007968	WGBS data set used in the study, 96 samples	Illumina HiSeq 2000 Illumina NovaSeq 6000	96
EGAD00001007969	We investigated 10 female and 14 male SARS-CoV-2 positive children (age range: 0.8 to 18 years). Based on the WHO guidelines, 15 patients were classified as having mild COVID-19, while 7 children were classified as moderate COVID-19 cases. Two children were asymptomatic. 8 female and 10 male SARS-CoV-2 negative children were included as controls (age range: 4 to 16). 12 SARS-CoV-2 positive female and 9 male adults were included (age range: 27 - 76) together with 13 female and 10 male SARS-CoV-2 negative adult controls (age: 24 - 77). 10 adult COVID-19 patients had mild disease, while 12 had moderate COVID-19. We performed single-cell RNA sequencing experiments.	Illumina NovaSeq 6000	86
EGAD00001007970	The dataset contains transcriptomic information of 36 oral potentially malignant disorders (OPMD), 14 fibroepithelial polyps (FEP), and 6 early stage oral squamous cell carcinoma (OSCC) from the Asian population. Total RNA was extracted from formalin-fixed paraffin embedded (FFPE) tissue sections. RNA libraries were prepared using the NEB NextUltra RNA kit with Illumina Ribo-Zero rRNA removal as per manufacturer’s instructions. RNA sequencing was performed on the HiSeq2500 platform to generate paired-end 150 nucleotides reads and with a coverage of 50 million reads per sample. Uploaded bam files have been mapped to the GRCh38 human genome using TopHat2. Clinical and demographic data for these patients are available from the associated publications or by request.	Illumina HiSeq 2500	10
EGAD00001007971	The study will use RNA sequencing to aid in benchmarking different culture conditions in a set of genetically annotated human organoid lines. The data will be used to assess whether there are any clonal differences introduced when culturing these lines in different conditions.	Illumina HiSeq 4000	1
EGAD00001007972	We analyzed the cell free DNA methylomes using 30 plasma samples from patients with localized prostate cancer in the CPC-GENE project. Methylation was profiled using the methylated DNA immunoprecipitation coupled to next generation sequencing (MeDIP) technology.	HiSeq X Ten	30
EGAD00001007973	Exome sequencing on HiSeq platform of 36 brain metastases with matched normal samples, 32 having matched RNA seq. Published Saunus et al J Path, (2015); https://doi.org/10.1002/path.4583		106
EGAD00001007975	The data contains paired-end fastq files of 1440 single cells transcriptome sequencing data from 4 Celiac disease patients. CD4+ T cells were sorted by HLA-DQ-gluten tetramers carrying four immunodominant gluten epitopes. All single cell libraries were constructed following SmartSeq2 and sequenced on Illumina NextSeq 500.	NextSeq 500	1440
EGAD00001007976	DNA methylation sequencing profiles of 1538 breast tumors and 244 normal breast tissues. Libraries were prepared using a custom Reduced Representation Bisulfite Sequencing pipeline. Sequencing was performed on the Illumina HiSeq 2500 (v4 chemistry), with single-end reads of 125 bp length. Multiplexing was conducted at the level of 8 samples per lane. FASTQ files are provided for 1538 breast tumors and 244 normal breast tissues. Reference: Batra et al. (2021). DNA methylation landscapes of 1538 breast cancers reveal a replication-linked clock, epigenomic instability and cis-regulation.	Illumina HiSeq 2500	1782
EGAD00001007977	There are 64 NSCLC samples, including pre-treatment, post-treatment, and normal samples, sequenced by whole genome sequencing technology and archived in bam files.	Illumina HiSeq 2500	64
EGAD00001007978	Neurofibromatosis type 1 (NF1) is caused by loss-of-function variants in the NF1 gene. Approximately 10% of these variants affect RNA splicing and are either missed by conventional DNA diagnostics or are misinterpreted by in silico splicing predictions. Therefore, a targeted RNAseq-based approach was designed to detect pathogenic RNA splicing and associated pathogenic DNA variants. An in-house developed tool (QURNAS) was used to calculate the enrichment score (ERS) for each splicing event. RNA enrichment of NF1 and SPRED1 was done using SPET (NUGEN - NF1 only) and using SureSelect (Agilent - NF1 and SPRED1).	NextSeq 500	47
EGAD00001007979	Biomarkers to identify patients without benefit from adding everolimus to endocrine treatment in metastatic breast cancer (MBC) are needed. We report the results of the Pearl trial conducted in five Belgian centers assessing 18F-FDG-PET/CT non-response (n=45) and ctDNA detection (n=46) after 14 days of exemestane-everolimus (EXE-EVE) to identify MBC patients who will not benefit. Metabolic non-response rate was 66.6%. Median PFS in non-responding patients (using as cut-off 25% for SUVmax decrease) was 3.1 months compared to 6.0 months in those showing response (HR: 0.77, 95% CI: 0.40-1.50, p=0.44). Difference was significant when using a “post-hoc” cut-off of 15% (PFS 2.2 months vs 6.4 months). ctDNA detection at D14 was associated with PFS: 2.1 months vs 5.0 months (HR: 2.5, 95% CI: 1.3-5.0, p=0.012). Detection of ctDNA and/or the absence of 18F-FDG-PET/CT response after 14 days of EXE-EVE identifies patients with a low probability of benefiting from treatment. Independent validation is needed.	Ion Torrent S5 XL	126
EGAD00001007980	The dataset includes 144 BAM files of WGS, WES, and RNA-seq data from primary and PDOX samples analyzed in Smith et al, Acta Neuropathologica, 2020 (PMID: 32519082).	unspecified	144
EGAD00001007981	Patients with idiopathic, heritable, or drug-induced pulmonary arterial hypertension (referred to throughout as PAH) were recruited from expert centers across the UK as part of the PAH Cohort study (www.ipahcohort.com). In each case, diagnosis was confirmed by right heart catheterization following established international guidelines, which remained unchanged for the duration of this study. Healthy volunteers were recruited at the same centers and samples processed using the same standard operating procedure at all sites. All individuals gave written, informed consent with local ethical committee approval. Whole blood (3 ml) was collected in Tempus Blood RNA Tubes, and RNAseq was performed using established Illumina methodologies (see online supplement for further details).	Illumina HiSeq 4000	359
EGAD00001007982	Single-cell analysis of the transcriptome, T cell immune receptors, and surface proteome (CITE-seq) from peripheral blood mononuclear cells (PBMCs) of COVID-19 patients with pre-existing autoimmune diseases (rheumatoid arthritis n = 5, psoriasis n = 4, or multiple sclerosis n = 3), as well as COVID-19 patients without pre-existing autoimmunity as controls (n = 10) to investigate altered immune responses.	Illumina NovaSeq 6000	27
EGAD00001007983	Whole genome sequence of Philippine Ayta Magbukon. A total of 5 individuals.	Illumina NovaSeq 6000	5
EGAD00001007989	WXS sequence data from 112 samples, RNA-seq sequence data from 117 samples, all sequence data are raw sequence data in fastq format, sequenced by Illumina platform.	Illumina HiSeq 2500	121
EGAD00001007990	The TIGER samples dataset contains PISA cohort samples which consist of paired RNA-seq and genotyping array data. It contains 127 paired-end RNA-seq samples in fastq format and genotypes for 127 individuals in PLINK format.	unspecified	127
EGAD00001007991	BAM files containing paired-end mtDNA sequencing data from morphologically normal human liver. Clonal CCO-deficient patches of hepatocytes were identified in human liver samples, and samples were taken along a line spanning approximately from the portal triad to the central hepatic vein. Individual BAM files are named according to their patch, line and cut, where cut 1 is nearest to the portal triad, and cuts 2, 3 etc. lie further from the portal triad. Other file types include "Bulk" samples, which contain sequencing data of the remaining CCO-deficient cells that were not sampled as part of the line of cuts, and "Stroma" control samples (used for identifying germ-line variants). Sequenced on the NextSeq 500 platform.	NextSeq 500	319
EGAD00001007992	We studied 44 rectal cancer patients enrolled onto a prospective population-based biomarker study, who were planned for curative-intent radiation therapy before definitive surgery, yet at high risk of metastatic progression beyond the pelvic cavity. The patients had full-length mtDNA sequencing of whole blood (WB) and peripheral blood mononuclear cells (PBMC), sampled at the time of diagnosis. Metastatic events were recorded up to 60 months of follow-up after completion of the multimodal treatment.	Illumina MiSeq	66
EGAD00001007993	This dataset contains 10 fastq files from 10 cell lines (4 cell lines from 3 patients and 6 cell lines from 4 controls) that have undergone 50bp single end sequencing with PolyA enrichment strategy (BGI project number HUMpcsN). *please note one of the samples (Patient_4) was named in error and should be corrected to Patient_3 during analysis	unspecified	10
EGAD00001007995	COVID-19 scRNA-seq, TCR-seq and BCR-seq for 291 samples collected from 109 patients. Among 291 samples, 249 of them have two libraries (sequencing runs) for each assay, while 42 have only one library.	Illumina NovaSeq 6000	37
EGAD00001007996	scRNAseq data of scrambled and siRNA-mediated knock-down (96h) of the minor spliceosome snRNA U6atac in androgen-sensitive LNCaP cells and in patient derived neuroendocrine organoids (PM154). Three replicates for each cell line.	Illumina NovaSeq 6000	12
EGAD00001007997	Cellular DNA damage caused by reactive oxygen species is repaired by the base excision repair (BER) pathway which includes the DNA glycosylase MUTYH. Inherited biallelic MUTYH mutations cause predisposition to colorectal adenomas and carcinoma. However, the mechanistic progression from germline MUTYH mutations to MUTYH-Associated Polyposis (MAP) is incompletely understood. Here, we sequenced normal cell DNAs from 10 individuals with MAP and studied the somatic mutation burden and mutational signatures.	Illumina NovaSeq 6000	24
EGAD00001007998	ATACseq FASTQ files from RT4 cells treated with KDM5i C70	Illumina HiSeq 4000	4
EGAD00001007999	RNAseq FASTQ files from RT4 cells treated with FGFRi Erdafitinib	Illumina NovaSeq 6000	6
EGAD00001008000	RNAseq FASTQ files from RT4 cells treated with KDM5i C70	Illumina NovaSeq 6000	6
EGAD00001008001	Single-cell RNAseq FASTQ files for three muscle-invasive bladder tumors	Illumina HiSeq 2500	12
EGAD00001008002	RNAseq FASTQ files from 31 post-treatment tumors from PURE01	Illumina HiSeq 2500	31
EGAD00001008003	RNAseq FASTQ files from 82 pre-treatment tumors from PURE01	Illumina HiSeq 4000	82
EGAD00001008004	Retinoblastoma is a rare childhood cancer of the retina. We studied retinoblastoma by Targeted Sequencing.	Illumina MiSeq	51
EGAD00001008005	Human skin samples were obtained from HS patients after informed consent (Ethical vote, University of Würzburg; No. 306/12). Lesional and perilesional were taken and epidermis and dermis separated. Isolated epidermal keratinocytes were further processed for RNA isolation. mRNA was extracted from five pairwise-matched lesional and perilesional epidermal HS pellets and RNA sequencing was performed.	NextSeq 500	10
EGAD00001008006	Study metadata, containing the clinical information on samples and patients		125
EGAD00001008007	Raw Illumina sequencing data and CellRanger BAM output files. For further information regarding this dataset, please contact Stephen Sansom at contact@combat.ox.ac.uk.	Illumina NovaSeq 6000	10
EGAD00001008008	Linker file for COMBAT CITEseq sequencing data. Links COMBAT sample IDs with sequencing pools and their associated raw sequence data. Sequence data can be found in the following datasets: ADT data: EGAD00001007962 GEX data: EGAD00001008007 VDJ (B-cell): EGAD00001007964 VDJ (T-cell): EGAD00001007965		140
EGAD00001008009	Other raw and processed phenotype data generated by the COMBAT consortium.		611
EGAD00001008010	This dataset contains exome and RNA fastq files from Renal Cell Carcinoma patients and belongs to "Integrated genomic analysis of tumor thrombus".	Illumina HiSeq 2500 Illumina NovaSeq 6000	600
EGAD00001008011	This is a set of 20 10X Genomics Chromium WGS	Illumina HiSeq 2500	1
EGAD00001008012	Genome and transcriptome sequence data from a poorly differentiated chordoma of C1-C2 spine patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008013	Genome and transcriptome sequence data from an unspecified tissue chordoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008014	RNA-sequencing dataset of post-mortem human brain tissue of FTD patients with mutations in GRN, MAPT and C9orf72 and healthy controls.	Illumina HiSeq 2500 NextSeq 550	47
EGAD00001008015	We performed bulk RNA-sequencing on peripheral blood collected from 4,732 blood donors recruited as part of the INTERVAL study. Using these data, we mapped gene expression and splicing quantitative trait loci (QTLs). Then, we integrated these data with protein, metabolite and lipid QTLs in the same individuals. The study aimed to identify the shared genetic etiology across transcriptional phenotypes, molecular traits and health outcomes in humans.	Illumina NovaSeq 6000	1
EGAD00001008016	The dataset comprises 84 bam files from exome sequencing data, including 40 tumor-normal pairs and 4 normal files. Each sample is numbered by the patient case ID such as 135, 156 and so on. The filenames are suffixed with "_tumor" and "_normal" to indicate tumor and normal bam files respectively.	Illumina HiSeq 2500	84
EGAD00001008020	To investigate intratumour heterogeneity and to better understand tumour evolution in neuroblastoma, we performed multi-region whole-exome sequencing on a total of 51 spatially separated tumour samples from 9 primary neuroblastomas (2 low-risk, 1 intermediate-risk and 6 high-risk) and 1 relapsed neuroblastoma. We also assessed the impact of chemotherapy on clonal expansion by sequencing tumour regions from one medium-risk and one high-risk tumour for which we had matched samples obtained at diagnosis and after chemotherapy.	Illumina HiSeq 4000	61
EGAD00001008021	Dataset contains whole mitochondrial DNA sequencing data in fastq format (Illumina MiSeq paired-end) of 62 samples, in total. Those samples include sequencing data of the endothelial cell populations of 10 different donors and of 26 early-passage iPSC clones derived thereof. Moreover, the dataset contains the data of 7 of those iPSC clones sequenced additionally in passage 30 and 50, each. Lastly, 4 iPSC clones were sequenced during directed cardiomyocyte differentiation, each at day 0, 5, and 15 of differentiation.	Illumina MiSeq	62
EGAD00001008022	This dataset was conceived to characterize the genomic differences among different types of follicular-like thyroid lesions. To do so, we performed whole exome sequencing experiments on human biopsies corresponding to nodular hyperplasias, follicular thyroid adenomas, follicular thyroid carcinomas and Follicular Variant Thyroid Gland Papillary Carcinomas.	Illumina NovaSeq 6000	54
EGAD00001008024	This dataset contains the raw sequencing data, in FASTQ format, for Hi-C assays from 17 primary prostate tissue samples. The sequencing data is paired-end, 150 bp sequencing data from an Illumina NovaSeq 6000 machine, and contains 5 benign tissue samples and 12 primary tumour samples from the Canadian Prostate Cancer Genome Network (CPC-GENE) project. Tumour samples have IDs starting with the "CPCG" prefix, and benign tissue samples have IDs starting with the "BP" prefix.	Illumina NovaSeq 6000	1
EGAD00001008026	Applying a refined m6A RNA immunoprecipitation method, we profiled the m6A epitranscriptome on 10 non-neoplastic lung (NL) tissues and 53 lung adenocarcinoma tumors.	NextSeq 500	126
EGAD00001008027	41 breast cancer patients with known functional homologous recombination status (matched normal and tumor genomes, n=82)		1
EGAD00001008029	The dataset comprises whole exome sequences from laser capture micro-dissected biopsies of 10 patients diagnosed with clear cell renal cell carcinoma. In total over 100 regions are sampled to allow 'focally exhaustive' sequencing and explore the limits of intra-tumoural heterogeneity.	Illumina HiSeq 4000	117
EGAD00001008030	The dataset comprises 5' single-cell RNA sequencing with TCR enrichment, using 10x Genomics' Chromium technology, of multiregional biopsies of human renal cell carcinomas. Biopsies from different tumour regions, the tumour-normal interface, normal kidney, normal adrenal, metastatic regions, peri-nephric fat, and peripheral blood were sequenced from 12 patients with kidney tumours.	Illumina HiSeq 4000 Illumina NovaSeq 6000	18
EGAD00001008031	Spatial transcriptome sequence data from HER2-positive human breast tumors obtained from the first generation of Spatial Transcriptomics arrays. The dataset contains 8 different tumors with 3 or 6 sections taken from each with paired-end sequencing.	NextSeq 550	36
EGAD00001008032	The rates and patterns of somatic mutation in normal tissues are largely unknown outside of humans. Comparative analyses can shed light on the diversity of mutagenesis across species and on long-standing hypotheses regarding the evolution of somatic mutation rates and their role in cancer and ageing. Here, we used whole-genome sequencing of 208 intestinal crypts from 56 individuals to study the landscape of somatic mutation across 16 mammalian species. We found somatic mutagenesis to be dominated by seemingly endogenous mutational processes in all species, including 5-methylcytosine deamination and oxidative damage. With some differences, mutational signatures in other species resembled those described in humans, although the relative contribution of each signature varied across species. Remarkably, the somatic mutation rate per year varied greatly across species and exhibited a strong inverse relationship with species lifespan, with no other life-history trait studied displaying a comparable association. Despite widely different life histories among the species surveyed, including ~30-fold variation in lifespan and ~40,000-fold variation in body mass, the somatic mutation burden at the end of lifespan varied only by a factor of ~3. These data unveil common mutational processes across mammals and suggest that somatic mutation rates are evolutionarily constrained and may be a determinant of lifespan.	HiSeq X Ten	36
EGAD00001008033	Whole exome sequencing from 51 patients with brain metastases from prostate cancer	Illumina NovaSeq 6000	235
EGAD00001008034	Bulk RNAseq data of scrambled and siRNA-mediated knock-down of the minor spliceosome snRNA U6atac in androgen-sensitive LNCaP cells (L), androgen-insensitive C4-2 (C) and 22Rv1 (R) cells and in patient derived neuroendocrine organoids PM154 (P).	Illumina NovaSeq 6000	32
EGAD00001008035	RNA-sequencing on neuroblastoma PDX model COG-N-519 treated with control miR-1283 and test miR-99b-5p mimics. Three samples from each treatment condition were analysed. Next-Seq platform was used for sequencing.	Illumina NovaSeq 6000	1
EGAD00001008036	This dataset contains raw data from polyA RNAseq, hybrid capture target TCR panel data, and bam files from whole exome sequencing on 39 tumors with matched germline blood.	Illumina HiSeq 2500 Illumina NovaSeq 6000	79
EGAD00001008037		Illumina HiSeq 2500	1
EGAD00001008038		Illumina HiSeq 2500	1
EGAD00001008039		Illumina HiSeq 2500	1
EGAD00001008040		Illumina HiSeq 2500	1
EGAD00001008041		Illumina HiSeq 2500	1
EGAD00001008042		Illumina HiSeq 2500	1
EGAD00001008043		Illumina HiSeq 2500	1
EGAD00001008044		Illumina HiSeq 2500	1
EGAD00001008045		Illumina HiSeq 2500	1
EGAD00001008046		Illumina HiSeq 2500	1
EGAD00001008047		Illumina HiSeq 2500	1
EGAD00001008048		Illumina HiSeq 2500	1
EGAD00001008049		Illumina HiSeq 2500	1
EGAD00001008050		Illumina HiSeq 2500	1
EGAD00001008051		Illumina HiSeq 2500	1
EGAD00001008052		Illumina HiSeq 2500	1
EGAD00001008053		Illumina HiSeq 2500	1
EGAD00001008054		Illumina HiSeq 2500	1
EGAD00001008055		Illumina HiSeq 2500	1
EGAD00001008056		Illumina HiSeq 2500	1
EGAD00001008057		Illumina HiSeq 2500	1
EGAD00001008058		Illumina HiSeq 2500	1
EGAD00001008059		Illumina HiSeq 2500	1
EGAD00001008060		Illumina HiSeq 2500	1
EGAD00001008061		Illumina HiSeq 2500	1
EGAD00001008062		Illumina HiSeq 2500	1
EGAD00001008063		Illumina HiSeq 2500	1
EGAD00001008064		Illumina HiSeq 2500	1
EGAD00001008065		Illumina HiSeq 2500	1
EGAD00001008066		Illumina HiSeq 2500	1
EGAD00001008067		Illumina HiSeq 2500	1
EGAD00001008068		Illumina HiSeq 2500	1
EGAD00001008069		Illumina HiSeq 2500	1
EGAD00001008070		Illumina HiSeq 2500	1
EGAD00001008071		Illumina HiSeq 2500	1
EGAD00001008072		Illumina HiSeq 2500	1
EGAD00001008073		Illumina HiSeq 2500	1
EGAD00001008074		Illumina HiSeq 2500	1
EGAD00001008075		Illumina HiSeq 2500	1
EGAD00001008076		Illumina HiSeq 2500	1
EGAD00001008077		Illumina HiSeq 2500	1
EGAD00001008078		Illumina HiSeq 2500	1
EGAD00001008079		Illumina HiSeq 2500	1
EGAD00001008080		Illumina HiSeq 2500	1
EGAD00001008081		Illumina HiSeq 2500	1
EGAD00001008082		Illumina HiSeq 2500	1
EGAD00001008083		Illumina HiSeq 2500	1
EGAD00001008084		Illumina HiSeq 2500	1
EGAD00001008085		Illumina HiSeq 2500	1
EGAD00001008086		Illumina HiSeq 2500	1
EGAD00001008087		Illumina HiSeq 2500	1
EGAD00001008088		Illumina HiSeq 2500	1
EGAD00001008089		Illumina HiSeq 2500	1
EGAD00001008090		Illumina HiSeq 2500	1
EGAD00001008091	We applied this signature to a 567-patient GC cohort to establish genomic-based molecular subtypes and then used a support vector machine to build a molecular subtype-based risk-scoring model. Both source code and supplementary datasets for risk score prediction are available at https://github.com/hwanglab/Yonsei_gastric_cancer_32genes.	Illumina NovaSeq 6000	45
EGAD00001008092	Lynch Syndrome (LS) is an autosomal dominant disease conferring a high risk of colorectal cancer due to germline heterozygous mutations in a DNA mismatch repair (MMR) gene. Although cancers in LS patients show elevated somatic mutation burdens, information on mutation rates in normal tissues and understanding of the trajectory from normal to cancer cell is limited. Here we whole-genome sequenced 152 crypts from normal and neoplastic epithelial tissues from LS patients. In normal tissues the repertoire of mutational processes and mutation rates were similar to those found in wild type individuals. A morphologically normal colonic crypt with an increased mutation burden and mutational signatures consistent with MMR deficiency was identified, which may represent a very early stage of LS pathogenesis. Phylogenetic trees of tumour crypts indicated that the most recent ancestor cell of each tumour was already MMR deficient and had experienced multiple clonal evolution cycles. This study demonstrates the genomic stability of epithelial cells with heterozygous germline MMR gene mutations and highlights important differences in the pathogenesis of LS from other colorectal cancer predisposition syndromes.	Illumina NovaSeq 6000	1
EGAD00001008094	Single-end bulk RNA sequencing results of cell lines derived from patients described with NGLY1 deficiency as well as parent and CRISPR edited controls. The cell lines represent 4 different cell types: fibroblasts, lymphoblastoid cells, induced pluripotent stem cells (iPSCs) and neural progenitor cells (NPCs).	NextSeq 500	136
EGAD00001008095	This dataset contains whole genome sequencing data, as BAM files, of three trio members. These BAM files contain data for chromosomes 21, X, Y and mitochondrial DNA.		3
EGAD00001008096	This dataset contains whole genome sequencing data, as paired-end FASTQ files, of three trio members.	Illumina HiSeq 2500	3
EGAD00001008097	This dataset contains whole genome sequencing data, as VCF files, of three trio members.		3
EGAD00001008098	The dataset contains rearranged TCR‐α and TCR‐β genes of Ttet+/Tpat+, Ttet-/Tpat+ and Ttet-/Tpat- CD4+ cells from gut biopsies (ex vivo) or of T cell clones generated from gut biopsies (in vitro) from 12 CeD patients.	Illumina MiSeq	36
EGAD00001008099	This dataset consists of 116 tumor and normal samples analyzed with whole exome sequencing on the HiSeq2500 instruments with 100bp paired-end reads as well as 760 tumor and normal samples analyzed with the PGDx elio tissue complete assay. The PGDx elio tissue complete assay is a hybrid capture approach targeting 500+ genes with sequencing on the NextSeq instruments with 150bp paired-end reads. The bam files provided have been adapter masked and contain duplicate reads.	Illumina HiSeq 2500 NextSeq 500	876
EGAD00001008100	May 2021 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	unspecified	17
EGAD00001008101	August 2021 data update (fastq) for reference epigenomes generated at Centre for Epigenome Mapping Technologies (Canadian Epigenetics, Environment and Health Research Consortium), Genome Sciences Center, B.C. Cancer Agency, Vancouver, Canada as part of the International Human Epigenome Consortium.	unspecified	13
EGAD00001008105	Exome sequencing samples from the Acute Care Flagship, Illumina sequencing.	Illumina HiSeq 4000	85
EGAD00001008106	Patients in IA cohort with PPIL4 mutations (Please see Supplementary Table 3 for clinical characteristics of the patients)	Illumina HiSeq 2000 Illumina HiSeq 2500	12
EGAD00001008107	A lymphocyte suffers many threats to its genome, including programmed mutation during differentiation, antigen-driven proliferation and residency in diverse microenvironments. After developing protocols for single-cell lymphocyte expansions, we sequenced whole genomes from 717 normal naive and memory B and T lymphocytes and hematopoietic stem cells. All lymphocyte subsets carried more point mutations and structural variants than haematopoietic stem cells – the extra mutations were mostly acquired during differentiation, with burdens higher in memory than naive lymphocytes, although T cells also had a higher rate of mutation accumulation throughout life. Off-target effects of immunological diversification accounted for most of the additional differentiation-associated mutations in lymphocytes. Memory B cells acquired, on average, 18 off-target mutations genome-wide for every one on-target IGV mutation during the germinal centre reaction. Structural variation was 16-fold higher in lymphocytes than stem cells, with ~15% of deletions being attributable to off-target RAG activity. Mutational processes associated with ultraviolet light exposure and other sporadic mutational processes generated hundreds to thousands of mutations in some memory lymphocytes. The mutation burden and signatures of normal B lymphocytes were broadly comparable to those seen in many B-cell cancers, suggesting that malignant transformation of lymphocytes arises from the same mutational processes active across normal ontogeny. The mutational landscape of normal lymphocytes chronicles the off-target effects of programmed genome engineering during immunological diversification and the consequences of differentiation, proliferation and residency in diverse microenvironments.	HiSeq X Ten	1
EGAD00001008108	This dataset contains single-cell RNA sequencing data from patients with thyroid cancer (n=7), multinodular goiter (n=3) and healthy individuals (n=5). Mononuclear cells were taken from both the peripheral blood and the bone marrow compartments. We used a pooled single-cell design where multiple individuals were pooled in a single sample for sequencing (NextSeq 500-V2) and later demultiplexed using their genotype data. Associated metadata contains information on the phenotypes per individual, the pooling design and the linkage between the supplied files and sequenced pools. Due to limitations from EGA in uploading single-cell data, the raw fastq files were processed as follows: (i) I1/I2/R1/R2 fastq files were concatenated over the different lanes. (ii) Concatenated I1 and I2 files were interleaved, as were the concatenated R1 and R2 files, to generate two fastq files per pool containing all the information. To interleave the fastq files, the BBmap tool bbmap/reformat.sh was used, which can also be used to de-interleave the files.	NextSeq 500	7
EGAD00001008109	Full information about the T cell receptor (TCR) variable regions found in the sequences of the vdj region. Columns: barcode is_cell contig_id high_confidence length chain v_gene d_gene j_gene c_gene full_length productive cdr3 cdr3_nt reads umis raw_clonotype_id raw_consensus_id		2
EGAD00001008113	Pancreatic cancer biopsies and matching normal controls from 10 patients were exome sequenced. The same biopsies and PDX models derived from these were also subject to RNA sequencing.	Illumina NovaSeq 6000 NextSeq 500	57
EGAD00001008114	This dataset contains CLL2 data used in FLTseq paper. The dataset contains the data of CITEseq, FLTseq, RaCHseq, Exonseq and bulk RNAseq.	NextSeq 500 PromethION	1
EGAD00001008115	Source data in VCF format for a cohort of 46 patients with primary malignant glioma in a Chinese population		45
EGAD00001008117	This dataset contains samples from 13 patients with osteosarcoma. 13 samples have whole exome tumor data. 12 samples have tumor RNAseq data. 3 samples have matched normal DNA sequence data.	Illumina HiSeq 2000	16
EGAD00001008118	This dataset contains 60 .bam files of shallow WGS data (~0.1X) from ovarian cancer cell lines. Sequencing reads were aligned to the 1000 Genomes Project GRCh37-derived reference genome using the BWA aligner (v.0.07.17; CRUK-CI alignment pipeline).	Illumina HiSeq 4000	60
EGAD00001008119	This dataset contains 148 .bam files of shallow WGS data (~0.1X) from OV04 PDX samples. Sequencing reads were aligned to the 1000 Genomes Project GRCh37-derived reference genome using the BWA aligner (v.0.07.17; CRUK-CI alignment pipeline).	Illumina HiSeq 4000	148
EGAD00001008120	This dataset contains tumor and normal whole exome DNA sequence data for a patient with neuroblastoma	Illumina HiSeq 2000	1
EGAD00001008121	This dataset contains 142 .bam files of shallow WGS data (~0.1X) from OV04 patient samples. Sequencing reads were aligned to the 1000 Genomes Project GRCh37-derived reference genome using the BWA aligner (v.0.07.17; CRUK-CI alignment pipeline).	Illumina NovaSeq 6000	142
EGAD00001008122	Contains the raw data from scRNA, scATAC, and genotyping.	Illumina NovaSeq 6000	4
EGAD00001008123	Paired tumor and normal WGS of primary neuroblastomas. This is an update of the „Berlin Neuroblastoma Dataset” (EGAS00001004022). This data was used for the analysis of circular RNA expression and regulation in neuroblastoma.	HiSeq X Ten	25
EGAD00001008124	Tumor Total RNA Seq data of primary neuroblastomas. This is an update of the „Berlin Neuroblastoma Dataset” (EGAS00001004022). This data was used for the analysis of circular RNA expression and regulation in neuroblastoma.	Illumina HiSeq 4000	105
EGAD00001008125	Hi-C sequencing data includes 5 samples collected from 4 B-ALL patients.	NextSeq 500	5
EGAD00001008126	Bulk RNAseq of human skeletal muscle; RNAseq of FACS-sorted human skeletal muscle cells; scRNAseq of human skeletal muscle	Illumina HiSeq 2000 Illumina HiSeq 4000 NextSeq 500	41
EGAD00001008127	RNA sequencing of 32 primary head and neck squamous cell carcinoma (HNSCC) samples prior to treatment with neoadjuvant anti-PD-1 (n=6) or anti-PD-1 + anti-CTLA-4 (n=26) immunotherapy, and 30 paired on-treatment HNSCC samples (i.e. after neoadjuvant immunotherapy). RNA quantity used: 10ng. Library Preparation Kit: SMART Stranded Total RNA Seq Kit (Takara). Sequencing parameters: NovaSeq 6000, 2x 100 bp. File type: fastQ	Illumina NovaSeq 6000	62
EGAD00001008128	RNAseq FASTq files of 181 bulk pre-treatment and 14 post-treatment tumors from GO30140 Ph1b group A and F and 177 bulk pre-treatment tumors of IMbrave150 PhIII	Illumina HiSeq 2500	372
EGAD00001008129	WES FASTq files of 76 bulk pre-treatment tumors and 76 matched peripheral blood mononuclear cells from GO30140 group A	Illumina HiSeq 4000	152
EGAD00001008130	Clinical data from GO30140 group A and group F and IMBrave150 biomarker populations including gender, confirmed RECIST response by independent review forum (IRF), overall survival (OS), progression survival by IRF, treatment group and treatment		1
EGAD00001008131	Standard RNA-Seq datasets. Check the associated paper for more details.	Illumina HiSeq 4000	584
EGAD00001008132	NuGen 99-Gene-Panel Targeted Sequencing of 574 DLBCL Cases of Non-China Cohort from Phoenix Clinical Trial. Check the Associated Publication for More Experimental Details.	Illumina HiSeq 4000	574
EGAD00001008133	To investigate intratumour heterogeneity and to better understand tumour evolution in neuroblastoma, we have performed a multi-region RNA sequencing on a total of 51 spatially separated tumor samples from 9 primary neuroblastomas (2 low-risk, 1 intermediate-risk and 6 high-risk) and 1 relapsed neuroblastoma. We also assessed the impact of chemotherapy on the clonal expansion by sequencing tumour regions from one medium risk and one high-risk tumour for which we had matched samples obtained at diagnosis and after chemotherapy.	Illumina HiSeq 4000	50
EGAD00001008134	RNAseq data set, panALL study, 16 samples	Illumina HiSeq 2500 Illumina NovaSeq 6000	16
EGAD00001008135	Oxidative bisulfite sequencing (oxBS-Seq) for APL	Illumina HiSeq 2500	1
EGAD00001008136	Whole genome bisulfite sequencing (WGBS) for APL	Illumina HiSeq 2500	1
EGAD00001008137	This dataset includes mutation profiling by Whole-exome sequencing of 3 upper urinary tract urothelial tumours (UTUC).	Illumina HiSeq 2000	6
EGAD00001008138	This dataset includes transcription profiling by RNA-seq of 3 upper urinary tract urothelial tumours (UTUC).	Illumina HiSeq 2000	3
EGAD00001008139	Whole-exome sequencing of 32 primary head and neck squamous cell carcinoma samples prior to treatment with neoadjuvant anti-PD-1 (n=6) or anti-PD-1 + anti-CTLA-4 (n=26) immunotherapy. DNA quantity used: 50ng. Library Preparation Kit: Twist Human Core Exome Plus (Twist Bioscience). Sequencing parameters: NovaSeq 6000, 2x 100 bp. File type: fastQ.	Illumina NovaSeq 6000	64
EGAD00001008140	ATAC-seq profiling bam files from colorectal carcinoma and adenoma.	NextSeq 500	1207
EGAD00001008141	Transcriptomic data for five patients with breast cancer undergoing neoadjuvant chemotherapy and hyperpolarised 13C-MRI for early response assessment	Illumina HiSeq 2500	10
EGAD00001008142	Metagenomics data for "Combined Metabolic Activators Reduces Liver Fat in Nonalcoholic Fatty Liver Disease Patients". Samples were sequenced on NovaSeq6000(NovaSeq Control Software 1.7.0/RTA v3.4.4) with a 151nt (Read1)-10nt(Index1)-10nt(Index2)-151nt(Read2) setup using ‘NovaSeqXp’ workflow in ‘S4’ mode flow cell.	Illumina NovaSeq 6000 unspecified	189
EGAD00001008143	SNP data for Ovarian cancer PRS (cases)		217
EGAD00001008144	SNP data for 313 loci required for calculation of the Breast cancer PRS		-
EGAD00001008145	SNP data of 28 sites required for the Ovarian cancer PRS (controls)		-
EGAD00001008146	APL nanopore sequencing data are deposited into 2 data formats: 1. CRAM files 2. h5 files	GridION	1
EGAD00001008147	This dataset contains BAM files for 9 samples from individuals involved in a retrospective IVF trial. The BAM files were derived from whole genome sequencing. The 9 individuals consist of three trios containing mother, father and child samples.	Illumina NovaSeq 6000	9
EGAD00001008149	This dataset comprises complete exome data from the study PMID27216186 (Harbst & Lauss et al, Cancer Research 2016). These data are from 49 samples (tumor and matched normal) from 8 patients representing multi-region sequencing of human melanoma. Files are in the BAM format and contain aligned and processed data used for e.g. somatic variant calling. The sequencing libraries were constructed using SureSelect target enrichment with Clinical Research Exome Panel (Agilent) and sequenced on a HiSeq2500 (Illumina).		1
EGAD00001008150	Four PAIRED WGS samples, tumor and control, were sequenced on a HiSeq X Ten and the library preparation kit used was Illumina TruSeq Nano DNA. The tumor was multiple myeloma from bone marrow.	HiSeq X Ten	4
EGAD00001008151	Raw fast5 file of Oxford Nanopore sequencing for an APL patient sample	GridION	1
EGAD00001008152	RNA-Seq data for systematic gene fusion detection in Pediatric Cancer		1
EGAD00001008153	smallRNA sequencing from healthy individuals and MCI patients, along with phenotypic information.	Illumina HiSeq 2000	145
EGAD00001008155	Intratumoral heterogeneity is a critical frontier in understanding how the tumor microenvironment (TME) propels malignant progression. Here, we deconvolute the human pancreatic TME through large-scale integration of histology-guided regional multiOMICs with clinical data and patient-derived preclinical models. We discover subTMEs, histologically definable tissue states anchored in fibroblast plasticity, with regional relationships to tumor immunity, subtypes, differentiation, and treatment response. Reactive subTMEs rich in complex but functionally coordinated fibroblast communities were immune-hot and inhabited by aggressive tumor cell phenotypes. The matrix-rich deserted subTMEs harbored less activated fibroblasts and tumor-suppressive features yet were markedly chemoprotective and enriched upon chemotherapy. SubTMEs originated in fibroblast differentiation trajectories and transitory states were notable both in single cell transcriptomics and in situ. The intratumoral co-occurrence of subTMEs produced patient-specific phenotypic and computationally predictable heterogeneity tightly linked to malignant biology. Therefore, heterogeneity within the plentiful, notorious pancreatic TME is not random but marks fundamental tissue organizational units.	Illumina HiSeq 2500 unspecified	29
EGAD00001008156	To investigate intratumour heterogeneity and to better understand tumour evolution in neuroblastoma, we have performed a multi-region targeted re-sequencing on a total of 140 samples from 9 primary neuroblastomas (2 low-risk, 1 intermediate-risk and 6 high-risk) and 2 relapsed neuroblastoma.	Illumina HiSeq 4000	141
EGAD00001008157	This dataset contains targeted sequencing data of 204 surgical samples from resected NSCLC. Genomic profiling identifies five predictive biomarkers, which is then integrated into the Multiple-gene INdex to Evaluate the Relative benefit of Various Adjuvant therapies (MINERVA) score. The MINERVA score categorizes patients into three subgroups with relative disease-free survival and overall survival benefits from either adjuvant gefitinib or chemotherapy. This study demonstrates that predictive genomic signatures could potentially stratify resected EGFR-mutant NSCLC patients and provide precise guidance towards future personalized adjuvant therapy.		204
EGAD00001008158	Illumina-based RNA-Seq analysis of 93 liver samples. Biopsies of tumors and non-tumor tissues are included. Samples are stratified by response and non-response to TACE treatment.	Illumina HiSeq 2500	93
EGAD00001008159	Single cell RNA sequence generated from human primary nasal epithelium differentiated at air-liquid interface. This dataset comprises tissue from two donors, with cultures either unexposed or exposed to SARS-CoV2. Libraries were prepared using the 10X Genomics pipeline as per manufacturer's instructions.	NextSeq 500	24
EGAD00001008160	RNA sequences of a total of 24 samples, including 13 unrelated ASD patients (13 males) and 11 adult individuals of Spanish origin as controls (4 males, 7 females). The RNAseq study was conducted on a HiSeq 4000 (Illumina) and paired-end sequences were obtained (fastq files).	Illumina HiSeq 4000	24
EGAD00001008161	We performed single-cell RNA-sequencing of cells in the bronchoalveolar lavage (BAL) fluid of late severe COVID-19. This study provides detailed insights into the alveolar macrophage response to SARS-CoV-2 infection and reveals a profibrotic macrophage response in severe COVID-19 patients.	Illumina NovaSeq 6000	5
EGAD00001008162	Whole exome sequencing data from 90 diagnostic lymphoma samples run and published as part of the Leukemia manuscript. 133 total exomes were sequenced including tumour and normal controls. Copy number array data was also generated for 95 patients.	Illumina HiSeq 2500	91
EGAD00001008163	ADAPTeR study RNAseq from multi-region samples taken pre and post nivolumab.	Illumina HiSeq 4000	52
EGAD00001008164	ADAPTeR study WES from multi-region samples taken pre and post nivolumab	Illumina HiSeq 4000	72
EGAD00001008165	ADAPTeR study multi-region TCRseq of pre and post nivolumab tumour and PBMC samples	NextSeq 500	234
EGAD00001008166	ADAPTeR study scRNA and scTCR data from TILs from two ccRCC patients treated with nivolumab	NextSeq 550	64
EGAD00001008182	All 122 HCC biopsies and 115 non-tumoral tissues from 114 patients were subjected to whole-exome sequencing. Whole-exome capture was performed using the SureSelectXT Clinical Research Exome (Agilent Technologies) or SureSelect Human All Exon V6+COSMIC (Agilent Technologies) platforms according to the manufacturer’s guidelines. Sequencing was performed on an Illumina HiSeq 2500 at the Genomics Facility Basel according to the manufacturer’s guidelines. Paired-end 101-bp reads were generated.	Illumina HiSeq 2500	237
EGAD00001008183	This dataset includes TruSeq paired-end, total RNA sequencing data from primary B-precursor acute lymphoblastic leukaemia (B-ALL) xenografts. It comprises 43 pairs of matched bone marrow (BM) and central nervous system (CNS) human leukaemia cells from individual immunodeficient mice. Xenografts were generated from 6 patients with B-ALL and include samples taken at diagnosis and relapse from 3 of 6 patients.	Illumina HiSeq 2500 Illumina HiSeq 4000	86
EGAD00001008184	Targeted sequencing using BD Rhapsody with 462 mRNA of healthy young adult bone marrow mononuclear cells from iliac crest aspirations (BM3/Young3).	Illumina HiSeq 4000	1
EGAD00001008185	Shallow targeted sequencing with 462 mRNA and 97 antibodies of AML patient’s bone marrow mononuclear cells from iliac crest aspirations. Please note raw and integrated gene expression data, cell type annotation, metadata and dimensionality reduction are available as Seurat v3 objects through figshare. Access link is https://doi.org/10.6084/m9.figshare.14780127.v1 Samples: AMLQ4_SMK1 (AML314, male); AMLQ1_SMK2 (AML116, female); AMLQ3_SMK3 (AML127, female); AMLQ6_SMK4 (AML183, male); AMLQ2_SMK5 (AML327, female); AMLQ5_SMK6 (AML334, male); APLQ5_SMK7 (APL124, male); APLQ3_SMK8 (APL142, male); APLQ6_SMK9 (APL218, female); APLQ4_SMK10 (APL147, male); APLQ2_SMK11 (APL223, female); APLQ1_SMK12 (APL224, female)	Illumina NovaSeq 6000	1
EGAD00001008186	Whole transcriptome sequencing using BD Rhapsody with 97 antibodies of a healthy young adult bone marrow (Young3/BM3) mononuclear cells from iliac crest aspirations. Please note raw and integrated gene expression data, cell type annotation, metadata and dimensionality reduction are available as Seurat v3 objects through figshare. Access link is https://figshare.com/s/901bcddb9ee18e226031.	Illumina HiSeq 4000	1
EGAD00001008187	SmartSeq2 read out of index cultured cell sorted with the classification and erythroid-myeloid panel developed in main Figure 6.	Illumina HiSeq 4000	10
EGAD00001008188	Targeted sequencing using BD Rhapsody with 462 mRNA and 97 antibodies of healthy young and aged adult as well as AML bone marrow mononuclear cells from iliac crest aspirations. Please note raw and integrated gene expression data, cell type annotation, metadata and dimensionality reduction are available as Seurat v3 objects through figshare. Access link is https://figshare.com/s/0fda29b169c719223ee3.	NextSeq 500	9
EGAD00001008189	Targeted sequencing using BD Rhapsody with 462 mRNA and 197 antibodies of healthy young adult bone marrow mononuclear cells from iliac crest aspirations. Please note raw and integrated gene expression data, cell type annotation, metadata and dimensionality reduction are available as Seurat v3 objects through figshare. Access link is https://figshare.com/s/313b5739ff469dac8c01	Illumina HiSeq 4000	2
EGAD00001008190	Targeted sequencing with 462 mRNA and 97 antibodies of fresh, frozen and stored on ice (6h) healthy adult blood cells (multiplexed sample fresh/thawed/ice: SMK1-Frozen, SMK2-thawed, SMK3-fresh), and targeted sequencing with 462 mRNA and 197 antibodies of CD34+ immature cells (multiplexed sample CD34+ immature Abseq: SMK4-CD38+CD45RA-, SMK5-CD38+CD45RA+, SMK6-CD38-CD45RA+/-)	Illumina NovaSeq 6000	1
EGAD00001008191	Dataset consists of Oncomine Comprehensive Cancer Panel v3 sequencing of 16 tumor-normal mucosa pairs. Tumors include 8 sessile serrated lesions (SSL), 3 sessile serrated lesions with dysplasia (SSL/D), 2 traditional serrated adenomas (TSA) and 3 tubular adenomas (TA).	Ion Torrent S5	32
EGAD00001008192	Samples prepared using TruSeq Stranded Total RNA library kit. Samples sequenced on a HiSeq 2000.	Illumina HiSeq 2000	32
EGAD00001008193	Exome sequencing of panALL exome data set, total of 1948 samples	Illumina HiSeq 2500	598
EGAD00001008194	This dataset contains paired-end RNA sequencing data of blood CD34+ cells from random blood donors (155 paired-end FastQ files). The data were used to perform expression quantitative trait locus (eQTL) analysis.	Illumina NovaSeq 6000	155
EGAD00001008195	Variation and transmission of the human gut microbiota across generations - 16S sequencing data	Illumina HiSeq 2500	102
EGAD00001008196	In this study, we sequenced multiple stages of the B-lineage in elderly individuals and patients with lymphoplasmacytic lymphoma to study whether mutations are accumulated in normal-cell counterparts prior to lymphoma	Illumina NovaSeq 6000	73
EGAD00001008197	We isolated naive and memory CD4+ T cells from 119 healthy individuals and stimulated the cells using anti-CD3/anti-CD28 coated beads. We profiled gene expression using single cell RNA-seq (10X-Genomics 3’ v2 kit) at resting state and three time points of activation (16h, 40h and 5 days post stimulation) and mapped expression quantitative trait loci.	Illumina HiSeq 4000 Illumina MiSeq	167
EGAD00001008198		Illumina NovaSeq 6000	1
EGAD00001008199		Illumina NovaSeq 6000	1
EGAD00001008200	This dataset contains the exome sequencing data (BAMs, VCFs and CNVs) from 5 schwannoma tumors from the same patient.	Illumina NovaSeq 6000	5
EGAD00001008201	This dataset includes FASTQ files of 808 samples from the GCAT cohort. Technology used: HiSeq 4000, read length 150 bp, inner mate distance 300 bp. For each sample, the paired ends are provided in separate files. Each FASTQ is split into multiple lanes and grouped by the multiplex index.	Illumina HiSeq 4000	1
EGAD00001008202	This dataset includes BAM files of 808 samples from the GCAT cohort. Technology used: HiSeq 4000, read length 150 bp, inner mate distance 300 bp. For each sample, the paired ends are provided in separate files. Each FASTQ is split into multiple lanes and grouped by the multiplex index.		1
EGAD00001008203			-
EGAD00001008204	RNA-seq dataset (BAM files) of 28 HCCs and 19 non-tumor livers derived from 8 patients undergoing sorafenib treatment.	Illumina HiSeq 2500	47
EGAD00001008205	RNA-sequencing of 122 hepatocellular carcinoma biopsies and 15 normal liver biopsies. RNA-seq library prep was performed with 200 ng total RNA using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Gold (Illumina) according to manufacturer’s specifications. Single-end 126-bp sequencing was performed on an Illumina HiSeq 2500 using v4 SBS chemistry at the Genomics Facility Basel according to the manufacturer’s guidelines. Primary data analysis was performed with the Illumina RTA version 1.18.66.3.	Illumina HiSeq 2500	137
EGAD00001008207	Variation and transmission of the human gut microbiota across generations - shotgun sequencing data	Illumina HiSeq 2500	102
EGAD00001008208	The dataset is composed of processed whole genome sequencing data generated from 53 children and their respective parents, forming 49 trios (mother, father and child) and 2 quartets (mother, father and two siblings). A total of 18 children were born after spontaneous conception; 17 children were born after in vitro fertilization (IVF) and another 18 children were born after intracytoplasmic sperm injection combined with testicular sperm extraction (ICSI-TESE).	unspecified	155
EGAD00001008210	This dataset contains raw genotypes (SNVs, indels and SVs) from 785 samples, without applying any filter, from the 808 WGS GCAT cohort.		1
EGAD00001008211	Whole exome sequencing of Diffuse intrinsic pontine gliomas, primary patient derived DIPG cell cultures and isogenic trametinib resistant clones	Illumina NovaSeq 6000	56
EGAD00001008212	RNA sequencing of Diffuse intrinsic pontine gliomas, primary patient derived DIPG cell cultures and isogenic trametinib resistant clones	NextSeq 500	28
EGAD00001008213	79 RNA sequencing libraries were prepared from samples of osteosarcoma tumors biopsied at diagnosis, with the TruSeq Stranded mRNA kit following recommendations: the key steps consist of polyA mRNA capture with oligo-dT beads using 1 µg total RNA, fragmentation to approximately 400 bp, double-stranded DNA synthesis, ligation of Illumina adaptors, and amplification of the library by PCR for sequencing. Library sequencing was performed using Illumina sequencers (NextSeq 500 or HiSeq 2000/2500/4000) in 75 bp paired-end mode.	Illumina HiSeq 4000	79
EGAD00001008214	This is the RNAseq data from mucosal biopsies.	Illumina HiSeq 2000	440
EGAD00001008215	This is the dataset of 16S data from mucosal biopsies.	Illumina MiSeq	833
EGAD00001008216	Dataset consists of 25 HCCs and 9 non-tumor livers from 8 patients.	Illumina HiSeq 2500	34
EGAD00001008218	Whole genome sequencing for single cells for library A96228B 1125 samples; filetype=bam	HiSeq X Five	6
EGAD00001008219	Whole genome sequencing for single cells for library A90679 1034 samples; filetype=bam	Illumina HiSeq 2500	4
EGAD00001008220	Whole genome sequencing for single cells for library A95618A 876 samples; filetype=bam	HiSeq X Five	9
EGAD00001008221	Whole genome sequencing for single cells for library A95628B 1335 samples; filetype=bam	HiSeq X Five	5
EGAD00001008222	Whole genome sequencing for single cells for library A95632D 644 samples; filetype=bam	Illumina HiSeq 2500	5
EGAD00001008223	Whole genome sequencing for single cells for library A95635B 593 samples; filetype=bam	Illumina HiSeq 2500	4
EGAD00001008224	Whole genome sequencing for single cells for library A95635D 367 samples; filetype=bam	Illumina HiSeq 2500	5
EGAD00001008225	Whole genome sequencing for single cells for library A95654A 901 samples; filetype=bam	HiSeq X Five	5
EGAD00001008226	Whole genome sequencing for single cells for library A95662A 637 samples; filetype=bam	Illumina HiSeq 2500	5
EGAD00001008227	Whole genome sequencing for single cells for library A95664B 630 samples; filetype=bam	Illumina HiSeq 2500	5
EGAD00001008228	Whole genome sequencing for single cells for library A95707A 1359 samples; filetype=bam	HiSeq X Five	9
EGAD00001008229	Whole genome sequencing for single cells for library A95724A 503 samples; filetype=bam	NextSeq 550	5
EGAD00001008230	Whole genome sequencing for single cells for library A95724B 581 samples; filetype=bam	NextSeq 550	5
EGAD00001008231	Whole genome sequencing for single cells for library A95732B 655 samples; filetype=bam	HiSeq X Five	5
EGAD00001008232	Whole genome sequencing for single cells for library A96145A 642 samples; filetype=bam	HiSeq X Five	9
EGAD00001008233	Whole genome sequencing for single cells for library A96146A 1195 samples; filetype=bam	HiSeq X Five	5
EGAD00001008234	Whole genome sequencing for single cells for library A96149A 751 samples; filetype=bam	HiSeq X Five	5
EGAD00001008235	Whole genome sequencing for single cells for library A96149B 1792 samples; filetype=bam	HiSeq X Five	5
EGAD00001008236	Whole genome sequencing for single cells for library A96155B 1028 samples; filetype=bam	HiSeq X Five	5
EGAD00001008237	Whole genome sequencing for single cells for library A96157C 938 samples; filetype=bam	HiSeq X Five	7
EGAD00001008238	Whole genome sequencing for single cells for library A96162B 1316 samples; filetype=bam	HiSeq X Five	7
EGAD00001008239	Whole genome sequencing for single cells for library A96165A 779 samples; filetype=bam	HiSeq X Five	7
EGAD00001008240	Whole genome sequencing for single cells for library A96172A 1476 samples; filetype=bam	HiSeq X Five	7
EGAD00001008241	Whole genome sequencing for single cells for library A96172B 1694 samples; filetype=bam	HiSeq X Five	5
EGAD00001008242	Whole genome sequencing for single cells for library A96174B 1447 samples; filetype=bam	HiSeq X Five	7
EGAD00001008243	Whole genome sequencing for single cells for library A96175C 1473 samples; filetype=bam	HiSeq X Five	5
EGAD00001008244	Whole genome sequencing for single cells for library A96177B 683 samples; filetype=bam	HiSeq X Five	9
EGAD00001008245	Whole genome sequencing for single cells for library A96179B 1396 samples; filetype=bam	HiSeq X Five	7
EGAD00001008246	Whole genome sequencing for single cells for library A96180B 1189 samples; filetype=bam	HiSeq X Five	5
EGAD00001008247	Whole genome sequencing for single cells for library A96183C 1036 samples; filetype=bam	HiSeq X Five	5
EGAD00001008248	Whole genome sequencing for single cells for library A96184A 1733 samples; filetype=bam	HiSeq X Five	7
EGAD00001008249	Whole genome sequencing for single cells for library A96186A 850 samples; filetype=bam	HiSeq X Five	5
EGAD00001008250	Whole genome sequencing for single cells for library A96186C 525 samples; filetype=bam	HiSeq X Five	5
EGAD00001008251	Whole genome sequencing for single cells for library A96199B 1170 samples; filetype=bam	HiSeq X Five	5
EGAD00001008252	Whole genome sequencing for single cells for library A96201A 536 samples; filetype=bam	HiSeq X Five	5
EGAD00001008253	Whole genome sequencing for single cells for library A96205A 1907 samples; filetype=bam	HiSeq X Five	7
EGAD00001008254	Whole genome sequencing for single cells for library A96210C 1191 samples; filetype=bam	HiSeq X Five	5
EGAD00001008255	Whole genome sequencing for single cells for library A96211C 1397 samples; filetype=bam	HiSeq X Five	5
EGAD00001008256	Whole genome sequencing for single cells for library A96215A 1312 samples; filetype=bam	HiSeq X Five	5
EGAD00001008257	Whole genome sequencing for single cells for library A96216A 1001 samples; filetype=bam	HiSeq X Five	9
EGAD00001008258	Whole genome sequencing for single cells for library A96220B 1267 samples; filetype=bam	HiSeq X Five	5
EGAD00001008259	Whole genome sequencing for single cells for library A96244A 1374 samples; filetype=bam	HiSeq X Five	5
EGAD00001008260	Whole genome sequencing for single cells for library A98176A 1003 samples; filetype=bam	HiSeq X Five	5
EGAD00001008261	Whole genome sequencing for single cells for library A98176B 1389 samples; filetype=bam	HiSeq X Five	5
EGAD00001008262	Whole genome sequencing for single cells for library A98284A 1265 samples; filetype=bam	HiSeq X Five	4
EGAD00001008263	Whole genome sequencing for single cells for library A98284B 1582 samples; filetype=bam	HiSeq X Five	2
EGAD00001008264	Whole genome sequencing for single cells for library A98289B 1032 samples; filetype=bam	HiSeq X Five	5
EGAD00001008265	Whole genome sequencing for single cells for library A98293B 703 samples; filetype=bam	HiSeq X Five	5
EGAD00001008266	Whole genome sequencing for single cells for library A98294A 994 samples; filetype=bam	HiSeq X Five	5
EGAD00001008267	Whole exome sequencing of a Chinese girl with congenital cataract. The dataset contains one sample with two fastq files. The novel PAX6 mutation (c.221G>A) is associated with congenital cataract, and the WFS1 mutation (c.2070_2079del) interactively aggravates this process.	Illumina HiSeq 2000	1
EGAD00001008268	panALL exome sequencing, data set2, 700 samples	Illumina HiSeq 2500	700
EGAD00001008269	Whole exome, shallow whole genome, and RNA-sequencing data from a cohort of 168 women with breast cancer receiving neoadjuvant chemotherapy.	Illumina HiSeq 4000	336
EGAD00001008270	This dataset contains raw count data (10X CellRanger) for 28 Hodgkin lymphoma samples and 5 reactive lymph nodes, and merged data from all samples (RData object) including cell cluster assignments.		34
EGAD00001008271	Targeted sequencing of candidate regions on chromosome 22q predisposing to multiple schwannomas	NextSeq 550	51
EGAD00001008272			1
EGAD00001008273	Hypertensive disorders in pregnancy, of which the multisystem syndrome pre-eclampsia is the most severe, lead to preterm delivery, maternal mortality, and life-long complications. To elucidate early disease dynamics, we present the first spatio-temporal study comparing single-nuclei transcriptomes of human preterm pre-eclamptic placentae and healthy controls, contextualizing this in a comprehensive study including early and late gestational placentae. This study includes early placentae samples from the fetal part (villi; n=10) and maternal part (decidua; n=3), late placentae samples from healthy pregnancies, villi (n=6) and decidua (n=4), and late placentae samples from early-onset pre-eclamptic pregnancies, villi (n=5) and decidua (n=5).	Illumina HiSeq 4000	22
EGAD00001008274	Whole exome sequencing of localized prostate cancer patients in this study contained paired tumor-normal samples and validated tumor content.	Illumina NovaSeq 6000	6
EGAD00001008275	Fastq files for the 16S rDNA amplicon library of 714 fecal samples of 20 time series (as described in Vandeputte et al. 2021, Nature Communications)	Illumina MiSeq	714
EGAD00001008276	This data set contains whole exome sequencing (WXS) and RNA-Seq on germline BRCA-mutant tumors from 18 patients. BAM files are provided for WXS on tumor and germline samples. FASTQ files are provided for the RNA-Seq samples. Sequencing was done on an Illumina HiSeq 2500.	Illumina HiSeq 2000	38
EGAD00001008277	Whole genome sequencing of cell free DNA from CSF across timepoints from medulloblastoma clinical trial patients.	Illumina NovaSeq 6000	534
EGAD00001008278	We performed RNA-Seq in DIPG and hemispheric HGG.	Illumina HiSeq 2500	40
EGAD00001008279	We performed whole exome sequencing in DIPG and hemispheric HGG.	Illumina HiSeq 2500 Ion Torrent Proton	30
EGAD00001008280		Illumina HiSeq 4000	287
EGAD00001008281	Activating mutations in PIK3CA generate large clones in the aging human esophagus. Here we investigate the underlying cellular mechanisms regulating their expansion by lineage tracing. Expression of an activating heterozygous Pik3caH1047R mutation in single progenitor cells of the mouse esophagus tilts cell fate towards proliferation, generating mutant clones that outcompete their wild type neighbours. The mutation leads to increased aerobic glycolysis through the activation of Hif1α transcriptional targets. In vitro and in vivo interventions that level out differences in activation of the PI3K/HIF1α/aerobic glycolysis axis between wild type and Pik3caH1047R cells attenuate the competitive advantage of the mutants. In contrast, metabolic conditions that alter Insulin/PI3K signalling, such as type-1 diabetes or diet-induced insulin resistance, further increase Pik3caH1047R mutant competitiveness in mice. Consistently, the density of activating PIK3CA mutations in human esophagus is increased in overweight individuals. We conclude that the metabolic environment influences the mutational landscape of normal epithelia. Clinically feasible interventions that even out signalling imbalances between wild type and mutant cells may therefore limit the expansion of oncogenic mutants in normal tissues.	Illumina HiSeq 2500	157
EGAD00001008282	This dataset contains samples from 5 patients with Ewing's sarcoma. 5 samples have whole exome tumor data. 1 sample has tumor RNAseq data. 4 samples have matched normal DNA sequence data.	Illumina HiSeq 2000	9
EGAD00001008283	This dataset contains samples from 5 patients with Wilms tumor. 5 samples have whole exome tumor data. 4 samples have tumor RNAseq data. 2 samples have matched normal DNA sequence data.	Illumina HiSeq 2000	7
EGAD00001008284	16 DS and Control brain samples were prepared using the 10X Single Cell 3' v3 kit. Pre-fragmented libraries were selectively amplified for APP using custom designed primers.	Sequel	16
EGAD00001008285	16 DS and Control brain samples were prepared using the 10X Single Cell 3' v3 kit. Pre-fragmented libraries were selectively amplified for SPP1 using custom designed primers.	Sequel	16
EGAD00001008286	16 DS and Control brain samples were prepared using the 10X Genomics Single Cell 3' v3 kit. Pre-fragmented cDNA libraries were loaded onto a Pacific Biosciences Sequel II to sequence single-nucleus isoforms.	Sequel	16
EGAD00001008287	29 DS and Control brain samples were prepared using the 10X Genomics Single Cell 3' v3 kit. The cDNA libraries were sequenced on an Illumina Novaseq 6000.	Illumina NovaSeq 6000	29
EGAD00001008288	16 DS and Control brain samples were prepared using the 10X Single Cell 3' v3 kit. Pre-fragmented libraries were selectively amplified for BIN1 using custom designed primers.	Sequel	16
EGAD00001008289	Whole-genome sequence (WGS) data of tumor-normal pairs from 139 ATL patients and RNA sequence (RNA-seq) data of tumors from 28 ATL patients.	HiSeq X Ten Illumina HiSeq 2000 Illumina NovaSeq 6000	139
EGAD00001008290	panALL exome data set3, 650 samples	Illumina HiSeq 2500	650
EGAD00001008291	This dataset includes the whole exome sequencing of the tumor from a sinonasal glomangiopericytoma case together with the matching blood. The whole exome sequencing revealed somatic PIK3CA and CTNNB1 mutations.	Illumina NovaSeq 6000	2
EGAD00001008297	Oligodendroglioma (WHO gr. 2)	Illumina NovaSeq 6000	1
EGAD00001008298	Oligodendroglioma, Anaplastic (WHO gr. 3)	Illumina NovaSeq 6000	1
EGAD00001008299	Oligodendroglioma (WHO gr. 2)	Illumina NovaSeq 6000	1
EGAD00001008300	Oligodendroglioma (WHO gr. 2)	Illumina NovaSeq 6000	1
EGAD00001008301	Oligodendroglioma, IDH-mutant, 1p19q codeleted	Illumina NovaSeq 6000	1
EGAD00001008302	Oligodendroglioma, Anaplastic (WHO gr. 3)	Illumina NovaSeq 6000	1
EGAD00001008303	Unknown	Illumina NovaSeq 6000	1
EGAD00001008304	Anaplastic Astrocytoma, IDH-mutant	Illumina NovaSeq 6000	1
EGAD00001008305	Astrocytoma (WHO gr. 2)	Illumina NovaSeq 6000	1
EGAD00001008306	Oligodendroglioma (WHO gr. 2)	Illumina NovaSeq 6000	1
EGAD00001008307	Astrocytoma, Anaplastic (WHO gr. 3)	Illumina NovaSeq 6000	1
EGAD00001008308	Oligodendroglioma (WHO gr. 2)	Illumina NovaSeq 6000	1
EGAD00001008309	Sequencing data (BAM/CRAM) of diagnosis-relapse pairs of 12 children who relapsed very early, followed by a deep-sequencing validation of all identified mutations.	Illumina HiSeq 4000 NextSeq 500 unspecified	88
EGAD00001008310	BAM files containing paired-end mtDNA sequencing data from human esophageal samples of individuals that had progressed to dysplasia or developed Barrett's esophagus (BE) post-esophagectomy. BE biopsies and the background mucosa were analysed. Each patient (JE*) has associated mtDNA sequencing data from biopsies of stroma, BE and squamous and cardia tissue. Two technical replicates, denoted "A" and "B", were analysed for each biopsy. Libraries were sequenced via the Illumina MiSeq platform v2 (Illumina, San Diego, CA, USA) 300 cycles (150 nt paired-end).	Illumina MiSeq	80
EGAD00001008311	Single-cell count data generated by the Cellranger (10X Genomics).	HiSeq X Ten	57
EGAD00001008314	This study is a multi-omics study of an Asian longitudinal metastatic breast cancer (MBC) cohort treated with palbociclib plus endocrine therapy. It contains NGS of baseline (BL) and progressive disease (PD) from 70 patients, consisting of 79 tumor/normal matched whole exome sequencing (WES) samples from 62 patients and 90 tumor whole transcriptome sequencing (WTS) samples from 70 patients. There were 56 BL biopsies profiled by WES and 64 by WTS; 23 PD biopsies were profiled by WES and 26 by WTS. Twenty and 23 patients had paired BL and PD biopsies profiled by WES and WTS, respectively.	Illumina HiSeq 2500	228
EGAD00001008315	This dataset contains FASTQ files generated from MT amplicon sequencing of 159 Sudanese individuals.	HiSeq X Ten	159
EGAD00001008316	The dataset contains 295 plasma cfDNA samples from various stages of resectable esophageal adenocarcinoma from the PERFECT cohort and the nCRT cohort. Shallow WGS was performed on an Illumina Novaseq S4 PE150bp. Samples are provided as raw reads without any prior processing.	Illumina NovaSeq 6000	295
EGAD00001008317	Bisulfite sequencing of a 3 kb region within the CpG island of NR3C1 exon 1 was performed with Illumina MiSeq. 24 samples from major hepatic or pancreatic surgery with complications (cases) and without complications (controls). The files are in FASTQ format.	Illumina MiSeq	24
EGAD00001008318	Total RNA sequencing (SMARTer Stranded Total RNA-Seq Kit v2) data of extracellular RNA (exRNA) from liquid biopsies of a BRC0004PR PDX and SK-N-BE(2C) CDX mouse model, and total RNA sequencing profiles of the matching PDX tumors.	NextSeq 500	60
EGAD00001008319	The dataset contains sequencing data generated for the publication 'In utero origin of myelofibrosis presenting in adult monozygotic twins after a prolonged disease latency'.	Illumina NovaSeq 6000	5
EGAD00001008320	This dataset includes Illumina RNA Sequencing Data for 59 chronic lymphocytic leukemia patient samples. 57 samples are single end, 2 samples are paired end sequencing.	NextSeq 500	59
EGAD00001008321	The dataset contains 106 lung cancer, 12 healthy control and 11 non-cancerous lesion plasma cfDNA samples. Shallow WGS was performed on an Illumina Novaseq S4 PE150bp. Samples are provided as raw reads without any prior processing.	Illumina NovaSeq 6000	129
EGAD00001008322	The dataset contains 6 lung cancer and 60 healthy control plasma cfDNA samples collected in EDTA, PAXGene and Norgen blood collection tubes at various locations. Shallow WGS was performed on an Illumina Novaseq S4 PE150bp. Samples are provided as raw reads without any prior processing.	Illumina NovaSeq 6000	66
EGAD00001008323	Illumina sequencing data (fastq files) representing single-nucleus (sn) ATAC-seq, snRNA-seq, bulk ATAC-seq, and snATACseq+snRNAseq multiomics data from human and rat skeletal muscle samples (19 libraries total). Includes a README file that describes the relationship between libraries, samples, and files.	Illumina NovaSeq 6000	15
EGAD00001008324	RNA-sequencing analysis was carried out on mesothelial cells isolated from peritoneal adhesion biopsies and compared to control human peritoneal mesothelial cells to identify a mesothelial-to-mesenchymal transition (MMT)-related gene signature in patients.	Illumina NovaSeq 6000	7
EGAD00001008325	In this study, we profiled the single-cell transcriptome (10x Genomics) of patient-derived xenograft (PDX) T-ALL relapse samples from patient P1. Primary human T-ALL cells were recovered from cryopreserved bone marrow aspirates of patients enrolled in the ALL-BFM 2009 study. Patient-derived xenografts (PDX) were generated as previously described by intrafemoral injection of 1 million viable primary ALL cells in NSG mice. PDX-derived (P1) cells were frozen until processing. For scRNA-seq library preparation, cryopreserved cells were thawed rapidly at 37 ℃ and resuspended in 10 ml warm Roswell Park Memorial Institute (RPMI) medium with 100 μg/ml DNase I. Cells were centrifuged for 5 mins at 300 g, and resuspended in ice-cold phosphate buffered saline (PBS) with 2% foetal bovine serum (FBS) and 5 mM EDTA. Cells were stained on ice with anti-murine-CD45-PE (mCD45) (clone 30-F11; BioLegend; 1:20) in the dark for 30 mins. 1:100 DAPI was added and incubated in the dark for 5 mins before sorting. Triple negative cells (DAPI-mCD45-GFP-) were sorted (Fig. S27) using a BD FACSAria™ Fusion Cell Sorter into ice cold 0.03% bovine serum albumin (BSA) in PBS. All isolated cells were immediately used for scRNA-seq libraries, which were generated as per the standard 10x Genomics Chromium 3′ (v.3.1 Chemistry) protocol. Completed libraries were sequenced on a NextSeq5000 sequencer (HIGH-mode, 75 bp paired-end).	NextSeq 500	1
EGAD00001008326	Whole genome and Whole exome sequencing of patient-derived xenograft models of endometrial cancer	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000	50
EGAD00001008327	This dataset contains .fastq files generated by targeted DNA sequencing of 542 cancer-associated and candidate genes (52 individuals), and targeted duplex sequencing of PIK3CA and TP53 genes (4 individuals).	HiSeq X Ten NextSeq 550	156
EGAD00001008329	Exome sequencing and amplicon-based single-cell sequencing dataset on the patients and family members that were analyzed in this study.	Illumina HiSeq 2500	11
EGAD00001008330	Set of 8 bam files from patients affected with Lupus. BAM alignments for exonic variants present in P2RY8 gene. VCF file describing the variants.	HiSeq X Ten	5
EGAD00001008331	To model recovery dynamics, using severe COVID-19 as the example, we align heterogeneous recovery trajectories via a novel computational scheme applied to longitudinally sampled blood transcriptomes. We thus generate pseudotime trajectories, which we then link to cellular and molecular mechanisms based on cell deconvolution analysis and molecular pathway prediction, thus presenting a unique framework for studying recovery processes over time.	NextSeq 500	258
EGAD00001008332	Tumor-blood paired whole-exome sequencing of 58 pairs of non-muscle-invasive bladder cancer samples (stage T1). Targeted sequencing of 112 non-muscle-invasive bladder cancer samples (34 stage T1; 78 stage Ta). Please note the following files have been removed: EGAR00003025153, EGAR00003025435, EGAR00003025294, EGAR00003025262, EGAR00003025224.	Illumina HiSeq 3000	339
EGAD00001008333	Small variants in mtDNA of several Canary Islanders sequenced with Illumina WGS and WES and Oxford Nanopore Technologies WGS.		36
EGAD00001008334	Genomic data from a cohort of 19 MMR deficient colorectal cancers and 1 MMR proficient colorectal cancer. All cases were target gene DNA sequenced using multiple primary and where available metastatic tumour regions from surgical resection samples.	Illumina NovaSeq 6000	91
EGAD00001008335	The dataset contains raw RNA-seq data of human adipocytes from 13 individuals.	Illumina HiSeq 3000	13
EGAD00001008336	The dataset includes sequencing data from 23 patients diagnosed with metastatic melanoma. The 23 metastatic melanoma subtypes consisted of cutaneous melanoma (CM, n=10); head and neck melanoma (HNM, n=7); uveal melanoma (UM, n=4); acral lentiginous melanoma (AM, n=1) and mucosal melanoma (MM, n=1).	unspecified	23
EGAD00001008337	This project includes RNA-sequencing data from human FSHD and control skeletal muscle biopsies. This project includes data from 28 FSHD patients (total 37 samples, including vastus lateralis and tibialis anterior muscles) and 12 control individuals (total 24 samples, including vastus lateralis and tibialis anterior muscles).	Illumina HiSeq 4000 Illumina NovaSeq 6000 NextSeq 500	65
EGAD00001008338	38 samples with DCIS and matched recurrences sequenced with a targeted mutation panel on IonTorrent.	Ion Torrent PGM	76
EGAD00001008339	Mutational signatures in esophageal squamous cell carcinoma from eight countries of varying incidence – filtered vcf files		551
EGAD00001008340	Single-cell RNAseq dataset of paired normal and tumor human prostate biopsies from n=10 participants. Fastq files corresponding to R1, R2 and I1 are uploaded and were generated from cellranger mkfastq. Data was sequenced on Illumina HiSeq 4000.	Illumina HiSeq 4000	24
EGAD00001008341	Whole genome sequencing from paired tumour and germline malignant pleural mesothelioma samples	HiSeq X Ten	74
EGAD00001008342	Acne meta-analysis		1
EGAD00001008343	Patient neuroblastoma hybrid capture sequencing panel. 5 samples from 2 donors (BAM files). For each donor, we obtained neuroblastoma tumor samples and neuroblastoma ALKi resistant samples. This dataset was used to study ALKi resistance in neuroblastoma.	Illumina MiSeq	5
EGAD00001008344	Enriched tumor epithelium, tumor-associated stroma, and whole tissue were collected by laser microdissection from thin sections across spatially separated levels of ten high-grade serous ovarian carcinomas (HGSOCs) and analyzed by mass spectrometry, reverse phase protein arrays, and RNA sequencing. Unsupervised analyses of protein abundance data revealed independent clustering of an enriched stroma and enriched tumor epithelium, with whole tumor tissue clustering driven by overall tumor “purity.” Comparing these data to previously defined prognostic HGSOC molecular subtypes revealed protein and transcript expression from tumor epithelium correlated with the differentiated subtype, whereas stromal proteins (and transcripts) correlated with the mesenchymal subtype. Protein and transcript abundance in the tumor epithelium and stroma exhibited decreased correlation in samples collected just hundreds of microns apart. These data reveal substantial tumor microenvironment protein heterogeneity that directly bears on prognostic signatures, biomarker discovery, and cancer pathophysiology and underscore the need to enrich cellular subpopulations for expression profiling.	Ion Torrent S5 XL	49
EGAD00001008345	Using the chromium 3' expression assay, we generated an atlas of neuroblastoma and the human fetal adrenal gland. These data were complemented with whole genome sequencing of normal and tumour DNA from the neuroblastoma samples.	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina MiSeq Illumina NovaSeq 6000	15
EGAD00001008346	Raw RNAseq paired end fastq files of MCL control samples (3 samples) and MCL samples transduced with a retrovirus expressing mutated NOTCH1 (3 samples) or NOTCH2 (3 samples). Instrument used: Illumina NovaSeq 6000	Illumina NovaSeq 6000	1
EGAD00001008347	We profiled 4 high-grade glioma patient brain tumor samples by single-cell ATAC-seq using the 10X Chromium 3' technology. The raw fastq and index files are provided.	Illumina NovaSeq 6000	4
EGAD00001008348	We profiled 9 high-grade glioma patient tumor samples by bulk RNA-seq. The raw fastqs are provided.	Illumina HiSeq 2000 Illumina HiSeq 4000 Illumina NovaSeq 6000 unspecified	9
EGAD00001008349	We profiled 10 high-grade glioma patient brain tumor samples by single-cell multiome ATAC + gene expression, using the 10X Chromium technology. 3 sets of fastq files are provided for each sample: R1 and R2 for gene expression, R1 and R2 for ATAC-seq, as well as index1 and index2 for ATAC-seq.	Illumina NovaSeq 6000	10
EGAD00001008350	We profiled 15 patient brain tumor samples by ChIP-seq. Inputs are provided for 16 samples, H3K27ac is provided for 15 samples, H3K27me3 is provided for 10 samples and H3K27me3 is provided for 5 samples. The raw bam files are provided.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000	52
EGAD00001008351	We profiled 7 high-grade glioma patient brain tumor samples by single-cell RNA-seq and 18 by single-nuclei RNA-seq using 10X Chromium 3' technology. The raw fastq files are provided.	Illumina HiSeq 4000 Illumina NovaSeq 6000	25
EGAD00001008352	WGS data of buffy coat from CRC patients	Illumina HiSeq 4000	7
EGAD00001008353	The data contained in this dataset is ChipSeq BAM files aligned to reference genome hg38. The ChipSeq was based on a combination of six histone modifications as follows: H3K4me1, H3K4me3, H3K9me3, H3K27me3, H3K27Ac and H3K36me3. The samples are patient-derived xenografts generated by passaging primary patient CD138+ selected cells through the SCID-rab myeloma mouse model.	unspecified	42
EGAD00001008354	50 Whole genome sequences from 50 Mexican individuals with a high proportion of Native American ancestry.	HiSeq X Ten	50
EGAD00001008356	RNA sequencing data of in vitro differentiated megakaryocyte cells transduced with E527K and WT SRC. CD34+ hematopoietic stem cells (HSC) were isolated from healthy controls before transduction with WT-SRC and E527K-SRC lentiviral vectors in triplicate and differentiation to MK. Three replicates each of two pools were generated for both WT and E527K SRC transduced cells, resulting in 3 WT pool 1 samples, 3 WT pool 2 samples, 3 E527K pool 1 samples and 3 E527K pool 2 samples for a total of 12 samples. RNA was extracted and sequenced with following parameters: Platform: Illumina HiSeq4000, Library Prep Kit: TruSeq stranded mRNA, Sequencing Kit: Illumina HiSeq4000 100 cycles (76-8-8-7), Fragments: single end / fr-firststrand.	Illumina HiSeq 4000	12
EGAD00001008357	Briefly, twenty paired tumor and germline DNAs were extracted from patients’ BM and from buccal mucosa, respectively. Samples were subjected to massively parallel sequencing using the HiSeq 2000, HiSeq2500, HiSeq X Ten, and/or NovaSeq 6000 according to the manufacturer’s instructions. Sequencing reads were aligned to NCBI Human Reference Genome Build 37 (hg19) by Burrows−Wheeler Aligner, version 0.7.10, with default parameters (http://bio-bwa.sourceforge.net/). PCR duplicates were eliminated using Picard tools version 1.39 (GATK).	Illumina HiSeq 2500	40
EGAD00001008358	In vitro and in vivo drug screens of tumor cells identify novel therapies for high-risk child cancer	HiSeq X Ten NextSeq 500	1
EGAD00001008359	WGS and WGBS data from monocyte-derived macrophages that were infected with Influenza A virus strain PR8WT, or a matching non-infected control.	HiSeq X Ten	70
EGAD00001008360	Mutational burden and profiles to be studied in approx. 500 human primary melanomas with matched normal samples, part of the Leeds melanoma cohort. New custom design targeted capture panel covering melanoma-specific copy number alterations, promoter mutations, gene fusions, coding genes, HLA regions and IFNg/JAK/STAT pathway genes.	Illumina HiSeq 4000	1
EGAD00001008361	RNA-seq transcriptomics of whole blood samples from longitudinal follow-up of a cohort of visceral leishmaniasis (VL) patients with and without HIV coinfection, from active disease through apparent cure and potential relapse. Analysis will identify potential correlates of relapse to identify immune mechanisms underlying the high rate of relapse in HIV/VL coinfection. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ .	Illumina HiSeq 4000	249
EGAD00001008362	27 fresh frozen tissue specimens were crushed by mortar and pestle, homogenized using the QIAShredder kit (Qiagen), and genomic DNA and total RNA were extracted using the AllPrep DNA/RNA Mini kit (Qiagen), according to the manufacturer’s instructions. RNA libraries were synthesized using 200 ng of total RNA using the Illumina TruSeq Stranded RNA LT Sample Prep Kit (Illumina), and subsequently sequenced on the NextSeq550 platform to a read depth of 80 million clusters and 160 million paired end reads (75 bp X 75 bp) using V2 chemistry.	NextSeq 550	27
EGAD00001008363	Two tables containing RNA-Seq expression values for patients with RNA-Seq data in the study "Comprehensive genomic characterization of refractory multiple myeloma (HIPO_067)". Gene expression was calculated from the BAM files with the annotation of Gencode.v19. Raw counts and TPM values are given in one table; the other contains filtered TMM-normalized CPM values (genes < 1 CPM omitted).		-
EGAD00001008364	Genomic profiling of effusion-based fluid samples from 8 HHV8-negative effusion-based lymphoma patients.	Illumina HiSeq 2500	8
EGAD00001008365	Captured single-cell long-read data of a cohort of CLL patients receiving VEN treatment for resistance study	PromethION	25
EGAD00001008366	CITEseq data of CLL patients receiving VEN treatment for resistance study	NextSeq 500	25
EGAD00001008367	Single-cell Long read data of a cohort of CLL patients receiving Venetoclax treatment for VEN resistance study.	PromethION	25
EGAD00001008368		Illumina MiSeq	56
EGAD00001008370	ATAC-seq dataset on a patient (P) presenting with defects of immunity and two (C5, C6) healthy donors. This dataset contains raw and processed files from ATAC-seq chromatin accessibility analysis. There are 3 single-read (50 bp) fastq files (1 per patient/ donor). Processed files consist of narrowPeak files (1 per patient/ donor) and one file that contains read counts in consensus peaks.	Illumina HiSeq 4000	3
EGAD00001008371	RNA-seq dataset on a patient (P) presenting with defects of immunity and three healthy donors (C1, C5, C6). This dataset contains raw and processed files from RNA-seq transcriptome analysis performed according to the Smart-seq2. There are 24 single-read (50 bp) fastq files, 6 per patient/donor consisting of 2 cell types and 3 replicates per cell type. There is one count matrix file generated using featureCounts against Ensembl v98 gene models.	Illumina HiSeq 4000	24
EGAD00001008372	scRNA-seq dataset on a patient (P) presenting with defects of immunity and four healthy donors (C1, C2, C3, C4). This dataset contains raw and processed files from scRNA-seq performed on samples using the 10x Genomics Chromium Controller with the Chromium Single Cell 3′ Reagent Kit (v3 chemistry). There are 15 paired-end fastq files (3 per patient/donor - I1, R1, R2) and 15 processed files generated with 10x Genomics Cell Ranger v3.0.2 software against GRCh38 human reference transcriptome (3 per patient/donor - barcodes.tsv.gz, features.tsv.gz, matrix.mtx.gz).	Illumina HiSeq 4000	5
EGAD00001008373	To gain insight into the clonal heterogeneity of diagnosis (Dx) and relapse (Re) pairs, we employed single-cell RNA-seq (SORT-seq) to longitudinally profile two t(8;21) (AML1-ETO = RUNX1-RUNX1T1), and four FLT3-ITD AML cases. All the samples are bone marrow aspirates.	NextSeq 500	30
EGAD00001008374	To gain insight into the clonal heterogeneity of diagnosis (Dx) and relapse (Re) pairs, we employed RNA-seq to longitudinally profile two t(8;21) (AML1-ETO = RUNX1-RUNX1T1), and four FLT3-ITD AML cases. All the samples are bone marrow aspirates.	NextSeq 500	12
EGAD00001008375	To gain insight into the clonal heterogeneity of diagnosis (Dx) and relapse (Re) pairs, we employed whole exome sequencing to longitudinally profile two t(8;21) (AML1-ETO = RUNX1-RUNX1T1), and four FLT3-ITD AML cases. All the samples are bone marrow aspirates.	Illumina MiSeq	18
EGAD00001008376	105 Normal, DCIS and recurrences samples target-sequenced	Illumina HiSeq 2500	105
EGAD00001008377	The raw fastq files for 30 whole exome and 30 whole genome sequencing for normal endometrial glands. The paired-end sequencing data sets (R1 and R2) are deposited.	Illumina NovaSeq 6000	60
EGAD00001008379	KCL SNP array samples for copy number analysis	unspecified	96
EGAD00001008380	KCL lpWGS samples for copy number analysis	Illumina HiSeq 2500	33
EGAD00001008381	We used single-cell transcriptomics to study cells from the developing human cerebellum, and show that different molecular subgroups of medulloblastoma resemble distinct glutamatergic progenitors.	Illumina NovaSeq 6000	13
EGAD00001008382	Multi-region WES from 4 NSCLC patients, totaling 12 tumor samples and 4 matched control samples. The files were submitted as bam files.	Illumina HiSeq 2000	16
EGAD00001008383	This dataset consists of 60 mRNA sequencing runs from full blood of 31 myotonic dystrophy type 1 patients, of which for 27 patients reliable data is available before and after 10 months of cognitive behavioural therapy. >30 million 150 bp paired end reads were obtained with UMI-labeled adapters to facilitate filtering of PCR duplicates. Via UMI-analysis we found samples with the aliases sample_01 and sample_02 to contain a very high number of PCR duplicates and recommend the use of these samples only with highest caution or not at all.	Illumina NovaSeq 6000	60
EGAD00001008384	RNA Sequencing upon shRNA mediated depletion of RAF kinases or treatment with Cobimetinib (GDC-0973, 250nM, 6hrs) or with pan RAFi (AZ-628, 10uM, 6hrs)	Illumina HiSeq 2500	24
EGAD00001008385	Stage I and stage III/IV Follicular lymphoma samples, shallow whole genome sequencing for copy number analysis and targeted capture sequencing for mutation and translocation analysis.	Illumina HiSeq 4000	269
EGAD00001008386	Shallow whole genome sequencing and targeted sequencing of DLBCL patients treated in the PETAL trial	Illumina HiSeq 4000 Illumina NovaSeq 6000	216
EGAD00001008387	Shallow whole genome sequencing for copy number analysis and targeted capture sequencing data for translocation and mutation analysis of paired primary and relapse PCNSL and PTL samples	Illumina HiSeq 4000	335
EGAD00001008389	Shallow whole genome sequencing and targeted sequencing of DLBCL patients treated in the HOVON84 trial	Illumina HiSeq 4000	220
EGAD00001008390	This dataset contains log2(TPM + 1) for 192 tumor samples profiled by RNA-seq for the entire transcriptome for samples originating from POPLAR (GO28753).		-
EGAD00001008391	This dataset contains log2(TPM + 1) for 699 tumor samples profiled by RNA-seq for the entire transcriptome for samples originating from OAK (GO28915).		-
EGAD00001008392	The purpose of this project is to provide public human datasets for the study of rare diseases. The use of public human genomic background combined with the in-silico insertion of real disease-causing variants provides a representative dataset for testing purposes without the ethical and legal issues associated with the use of sensitive human data. This project aims to support the development of technical implementations for rare disease data integration, analysis, discovery, and federated access.	Illumina HiSeq 2000	18
EGAD00001008393	Raw FASTQ files obtained by RNA sequencing of tumor samples from patients (age 12-29) with newly diagnosed, recurrent intermediate or high-grade sarcoma.	Illumina HiSeq 4000	26
EGAD00001008394	Raw FASTQ files obtained from whole exome sequencing (WES) of tumor samples from patients with newly diagnosed, recurrent intermediate or high-grade sarcoma.	Illumina HiSeq 4000	51
EGAD00001008396	Targeted next-generation sequencing (NGS) of 93 frequently mutated genes in breast cancer using the QIAseq Human Breast Cancer Targeted Panel (QIAGEN), which uses digital sequencing by incorporating unique molecular barcodes (UMI).	Illumina MiSeq NextSeq 550	187
EGAD00001008397	Paired end shallow whole genome sequencing (sWGS) data for the identification of genomewide somatic copy number alterations (SCNA) and the estimation of tumor fractions.	NextSeq 550	185
EGAD00001008398	Exome sequencing data of 24 Brugada syndrome individuals	NextSeq 500	23
EGAD00001008399	42 NGS libraries of a 13y/o FFPE sample, a tissue-and-patient-matched FF sample, and a GIAB sample (NA12878). In technical replicates (untreated DNA, treated DNA, two different library types, at least library duplicates for each case). Illumina NextSeq, HiSeq and NovaSeq paired-end sequencing.	Illumina HiSeq 4000 Illumina NovaSeq 6000 NextSeq 500	42
EGAD00001008400	116 Whole Genome Sequencing (WGS) samples from the TB-DAR study, based on a cohort of adult pulmonary tuberculosis patients recruited in Dar es Salaam, Tanzania. WGS was performed at the Health2030 Genome Center in Geneva on the Illumina NovaSeq 6000 instrument (Illumina Inc, San Diego CA, USA), starting from 1 μg of whole blood genomic DNA and using Illumina TruSeq DNA PCR-Free reagents for library preparation and the 150nt paired-end sequencing configuration. Average coverage was above 30X for 75 samples, between 10X and 30X for 40 samples, and approximately 8X for a single sample. Sequencing reads were aligned to the GRCh38 (GCA_000001405.15) reference genome using bwa (Version 0.7.17).	Illumina NovaSeq 6000	116
EGAD00001008401	128 samples with DCIS and matched recurrences sequenced with lpWGS	Illumina HiSeq 2500 Ion Torrent PGM	52
EGAD00001008402	small RNA next generation sequencing in head and neck cancer	Illumina HiSeq 2000	51
EGAD00001008403	This dataset contains raw sequencing reads in FASTQ format from single-nuclei (30 samples) and bulk tissue (40 samples) transcriptome sequencing of pheochromocytoma and paraganglioma tissue specimens. Additionally, data from single-nuclei sequencing of two normal adrenal medulla specimens is included.	Illumina HiSeq 2000 Illumina NovaSeq 6000	70
EGAD00001008405	Raw FASTQ files obtained from whole exome sequencing (WES) of normal samples from patients with newly diagnosed, recurrent intermediate or high-grade sarcoma.	Illumina HiSeq 4000	1
EGAD00001008407	RNAseq files for Klco RPAML study titled "Genomics of pediatric myeloid neoplasms"	Illumina HiSeq 2000	173
EGAD00001008408	RNA sequencing was performed on 15 T-LGLL patients and five control samples. The raw data is provided as fastq files.	unspecified	20
EGAD00001008409	Single-cell RNA sequencing was performed on viably frozen cells from 11 T-LGLL samples from 9 T-LGLL patients and 6 age-matched healthy samples. The raw data is available as fastq files.	Illumina NovaSeq 6000	92
EGAD00001008410	Nascent transcriptome (GRO-seq) data representing bone marrow mononuclear cells of two diagnostic T-ALL samples.	Illumina HiSeq 2000	2
EGAD00001008411	Organoid cultures derived from normal colon and/or colorectal adenomas and/or colorectal carcinomas. RNA and DNA was isolated from these cultures for genome wide profiling.	Illumina HiSeq 2500	6
EGAD00001008412	In this study, we identified miR-130a as a regulator of HSC self-renewal and differentiation. To characterize gene expression changes following enforced expression of miR-130a OE, we performed RNA-seq in CD34+ cord blood (CB) cells transduced with control and miR-130a OE lentiviruses. To capture miRNA targets in an unbiased, transcriptome-wide manner, we performed enhanced CLIPseq protocol in 2 replicates of CD34+ CB cells and Kasumi-1 cell line, which represent a model system for t(8;21) AML. We chose this cell line, as we found miR-130a to be highly expressed in this AML subtype where it is critical for maintaining the oncogenic molecular program mediated by AML1-ETO. Chimeric Ago2 eCLIPseq in CD34+ CB cells combined with Mass Spectrometry data analysis identified TBL1XR1 as a principal target of miR-130a. To elucidate gene expression changes associated with TBL1XR1 loss of function, we performed RNA-seq in CD34+CD38- CB cells transduced with control and shRNA targeted against TBL1XR1. To determine the functional significance of high miR-130a expression levels in Kasumi-1 cells on the molecular network controlled by AML1-ETO, we performed CUT&RUN assay and RNA-seq in Kasumi-1 cells following miR-130a knock-down (KD). Collectively, our findings reveal a unique role of miR-130a in regulating normal hematopoietic stem cell self-renewal and how elevated levels of miR-130a in t(8;21) AML contribute to the leukemogenesis of this AML subtype.	Illumina HiSeq 2500 NextSeq 500	36
EGAD00001008413	WGS files for Klco RPAML study titled "Genomics of pediatric myeloid neoplasms"	Illumina HiSeq 2000	158
EGAD00001008415	This batch is a subset of the full DETECT-A dataset, containing 43963 fastq files generated from 3024 subjects. All sequencing was conducted using Illumina HiSeq 4000 and Illumina MiSeq platforms. Note that the division into batches follows no specific criteria and that the sequencing data for each subject has multiple files which may span multiple datasets. Thus, for a comprehensive analysis, it is recommended to request access to all datasets that comprise this study.	Illumina HiSeq 4000 Illumina MiSeq	4127
EGAD00001008416	WGS (tumor and germline samples) was performed to identify structural variants in the UBTF/CDX2 subgroup. RNA-seq was performed to detect gene fusion in the UBTF/CDX2 subgroup. HiChIP was performed to investigate 3D chromatin architecture and enhancer landscapes of representative patient samples and cell lines harboring Type I and II FLT3-PAN3 deletions and amplifications.	Illumina HiSeq 3000 Illumina NovaSeq 6000	15
EGAD00001008417	Transcriptomic profiling of skin biopsies from psoriasis patients following treatment with deucravacitinib		120
EGAD00001008418	To understand the impact of enzymatic treatments on gene expression and epitope preservation on major immune cell populations, skin dissociation (SkinD) and solid soft tumor dissociation (TumorD) were tested on three healthy PBMC samples in triplicate (D1, D2, D3), against an untreated control. CITE-seq performance was assessed on a solid biopsy cohort of 11 samples (5 healthy skin samples, 3 primary melanoma samples, 3 melanoma metastasis samples) as well as on a liquid biopsy PBMC cohort consisting of three healthy donors and three immunotherapy-treated melanoma patients. This dataset contains the GEX data for each sample. Data is provided in the form of pooled BAM files. Linkage between samples, BAM files and hashtags is provided in a separate linkage file.	unspecified	10
EGAD00001008419	To understand the impact of enzymatic treatments on gene expression and epitope preservation on major immune cell populations, skin dissociation (SkinD) and solid soft tumor dissociation (TumorD) were tested on three healthy PBMC samples in triplicate (D1, D2, D3), against an untreated control. CITE-seq performance was assessed on a solid biopsy cohort of 11 samples (5 healthy skin samples, 3 primary melanoma samples, 3 melanoma metastasis samples) as well as on a liquid biopsy PBMC cohort consisting of three healthy donors and three immunotherapy-treated melanoma patients. This dataset contains the ADT/SPEX data for each sample. Data is provided in the form of pooled BAM files. Linkage between samples, BAM files and hashtags is provided in a separate linkage file.	unspecified	10
EGAD00001008420	Exome sequencing study on 4 individuals from a pedigree with CHH and cerebellar hypoplasia.	Illumina HiSeq 2500	8
EGAD00001008421	This dataset contains RNAseq data of 20 paired pre-post neoadjuvant chemotherapy breast cancer samples. In total the set contains n=20 biopsies, n=20 surgery specimens. Each sample has 2 fastq files, so n=80 fastq files are uploaded in total.	Illumina HiSeq 2000	40
EGAD00001008422	RNA-seq, ATAC-seq and ChIPmentation data from monocyte-derived macrophages that were infected with Influenza A virus strain PR8WT, or a matching non-infected control.	Illumina NovaSeq 6000	8
EGAD00001008423	3 control iPSC lines differentiated into iPSC-derived motor neurons transduced with either EGFP or NOVA1 lentivirus.	Illumina NovaSeq 6000	6
EGAD00001008424	iPSC-derived motor neurons from sporadic ALS and Controls. 4 sALS iPSC lines and 4 Ctrl iPSC lines.	Illumina HiSeq 4000	8
EGAD00001008425	iPSC-derived motor neurons from familial ALS and Controls. 2 fALS iPSC lines and 3 Ctrl iPSC lines.	Illumina NovaSeq 6000	5
EGAD00001008426	eCLIP of TDP-43 from iPSC-derived motor neurons in 2 control lines. Per line input and IP samples and analysis including bigWig files and peak files.	Illumina HiSeq 4000	4
EGAD00001008427	iPSC-derived motor neurons from 5 NOVA1 knock out and 5 NOVA1 wt lines in the CVB background.	Illumina NovaSeq 6000	10
EGAD00001008428	eCLIP of NOVA1, NOVA2 and RBFOX2 from iPSC-derived motor neurons in 2 control lines. Per line and RNA-binding protein input and IP samples and analysis including bigWig files and peak files.	Illumina NovaSeq 6000	12
EGAD00001008429	Set of FASTQ sequences generated from Urine Liquid Biopsy in 12 Bladder Cancer Patients using the IDT PanCancer Panel, Illumina Nextera Flex for Enrichment libraries (aka DNA Prep libraries) and Illumina NovaSeq 2x150bp sequencing.	Illumina NovaSeq 6000	22
EGAD00001008430	WES data of a HCC with neuroendocrine differentiation (HCC-NED), normal and organoid from a 74-year-old man.	Illumina NovaSeq 6000	3
EGAD00001008431	Single-cell RNA-seq of first-, second-, and third-generation patient-derived organoids. Obtained using the 10X Genomics single-cell 3' expression solution (v3 chemistry). First- and second-generation PDOs from one patient and first-, second-, and third-generation PDOs from three additional patients.	Illumina NovaSeq 6000	11
EGAD00001008432	Targeted panel sequencing data from PanNEN samples. Sample ID is annotated in the following manner: each patient is given a number and "P" is appended to the patient number if it is a primary tumor, "M" if it is metastasis and "N" if it is normal (healthy tissue) sample. All NETG1 and NETG2 samples underwent panel sequencing using a custom panel (in-house PanNEN panel). All NEC and NETG3 samples (except PNET2, PNET77P and PNET77M) underwent panel sequencing using a commercial CCP panel.		103
EGAD00001008433	This dataset contains RNAseq data of n=87 pre-treatment biopsies of triple negative and luminal-type breast cancer patients, all scheduled to receive neoadjuvant chemotherapy. Gene expression data is linked with treatment response and survival.	Illumina HiSeq 2000	1
EGAD00001008434	This dataset includes 23 specimens from osteosarcoma patients (primary, relapsed, metastatic). It contains bam files from RNA sequencing using a library in which coding regions of cDNA are captured and short-read, paired-end sequencing.	Illumina HiSeq 2000	19
EGAD00001008435	This dataset contains 86 osteosarcoma samples and their matched normals that underwent RNA sequencing using size fractionation, NuGEN Ovation Ultralow Library System V2 preparation, and paired-end sequencing on Illumina HiSeq 2000.	Illumina HiSeq 2000	1
EGAD00001008436	This dataset contains 86 osteosarcoma samples and their matched normals that underwent RNA sequencing using size fractionation, NuGEN Ovation Ultralow Library System V2 preparation, and paired-end sequencing on Illumina HiSeq 2000.	Illumina HiSeq 2000	86
EGAD00001008437	Aggregated VCF file from cancer genes panel seq for the initial (n=500) cohort of solid tumors screened for the Basket of Baskets study		1
EGAD00001008438	This dataset contains fastq files from four tumours that underwent targeted sequencing on panel for suspected VHL disease. The samples contained within the dataset and their corresponding sample ID are: ccRCC - M19-12422, Pheochromocytoma - M19-13800, Expelled lung tissue- M19-13801, and Liver biopsy- M19-13802.	NextSeq 500	4
EGAD00001008441	This dataset contains multi-region sequencing of 16 RCC patients with venous tumor thrombus (VTT), 11 of which were either metastatic on diagnosis or recurred with metastasis. Whole exome sequencing is available for 94 samples across all 16 patients, including 1 matched normal sample per patient, 2-3 primary tumor samples per patient, 1-2 VTT samples per patient, and 0-3 metastasis samples per patient (metastatic lesions were only sampled for 8 of the 11 metastatic patients). RNAseq, generated by exome capture, is available for 67 samples across 12 patients, including 0-1 matched normal samples per patient, 3 primary tumor samples per patient, 0-1 VTT samples per patient, and 0-3 metastasis samples per patient (RNAseq was only available for 4 of the 8 patients from whom metastatic lesions were sampled).	Illumina HiSeq 2500	94
EGAD00001008442	This dataset contains whole exome sequencing data (WES) of 20 paired pre- and post neoadjuvant chemotherapy breast cancer samples. From every patient a pre-treatment biopsy (B) and a post-treatment surgery (S) specimen has been sequenced. From most patients a paired normal blood sample (N) has been sequenced as a reference control.	Illumina HiSeq 2500	1
EGAD00001008443	scRNA-seq dataset on a patient (P_IKZF2-het) presenting with immune dysregulation. This dataset contains raw and processed files from scRNA-seq performed on samples using the 10x Genomics Chromium Controller with the Chromium Single Cell 3′ Reagent Kit (v3 chemistry). There are three paired-end (75 bp) fastq files (I1, R1, R2) and three processed files generated with 10x Genomics Cell Ranger v3.0.2 software against GRCh38 human reference transcriptome (scrnaseq_P_IKZF2-het_barcodes.tsv.gz, scrnaseq_P_IKZF2-het_features.tsv.gz, scrnaseq_P_IKZF2-het_matrix.mtx.gz).	Illumina HiSeq 4000	1
EGAD00001008444	Long-range sequencing with low error rate has been challenging. Sequence assembly and phasing usually require a high-quality reference genome for mapping, so working on highly-variable genomic regions or regions with no reference genome information would be difficult. In this study, we describe novel bench protocols and algorithms to obtain ultra-low-error-rate haplotype-phased sequence assemblies of regions 10 KB in length using a short-read sequencing platform that simultaneously solves the above two problems. We accomplish this by imprinting each template strand from a target region with a dense and unique mutation pattern. The mutation process randomly and independently converts ~50% of cytosines to uracils. Short-read sequencing libraries are made from both mutated and unmutated templates. A conservative de Bruijn graph approach seeds an assembly of the mutated templates, which we then extend by mapping paired-end reads. We next partition the template assemblies into two or more haplotypes after using the unmutated sequence library to recover almost all of the mutated bases. The final haplotype is assembled and corrected for residual template mutations and PCR errors. We obtain per-base-error rates below 10^-9. We apply this method to a human family, correctly assembling and phasing three genomic intervals, including the highly polymorphic HLA-B gene.	Illumina MiSeq	4
EGAD00001008445	Functional screening on patient-derived organoids identifies a therapeutic bispecific antibody that triggers EGFR degradation in LGR5+ tumor cells	Illumina HiSeq 2000	131
EGAD00001008446	Remaining WGS files for study titled "Genomics of pediatric myeloid neoplasms"	Illumina HiSeq 2000	10
EGAD00001008447	Whole genome sequence from paired tumour and germline samples from mesothelioma patients	HiSeq X Ten	42
EGAD00001008448	Stool samples were collected from 2,509 Estonian Biobank participants. The shotgun metagenomic paired-end sequencing was performed by Novogene Bioinformatics Technology Co., Ltd. using the Illumina NovaSeq6000 platform, resulting in 4.62 ± 0.44 Gb of data per sample (insert size, 350 bp; read length, 2 × 250 bp). A total of 2,513 samples belonging to 2,509 individuals were sequenced, including 4 biological replicates from one individual. First, the reads were trimmed for quality and adapter sequences. The host reads that aligned to the human genome were removed using SOAP2.21 (parameters: -s 135 -l 30 -v 7 -m 200 -x 400).	Illumina NovaSeq 6000	2513
EGAD00001008449	RNA-seq 10 subsets; 5 donors. ATAC-seq 9 subsets, 4 donors; Histone modification profiling 10 subsets, 2 donors all using human NK cell and T cell subsets. TF ChIP-seq Bcl11b, Bach2, Runx2, Gata3, PLZF. Illumina sequencing platform, ATAC-seq is Paired-end, RNA-seq/ChIP-seq is Single-end	Illumina HiSeq 2500 Illumina HiSeq 3000	1
EGAD00001008450	This study contains whole genome sequencing data and whole exome sequencing data of IMPC tumor and normal tissue samples.	unspecified	460
EGAD00001008452	We extracted DNA from whole blood or lymphoblast-derived cell lines and assessed the DNA quality with PicoGreenTM and gel electrophoresis. Whole genome sequencing was performed (Illumina HiSeq2000 and Illumina HiSeq X). WGS reads were mapped to the human reference genome assembly hg19 (GRCh37) using Burrows-Wheeler Aligner v.0.7.12 (TCAG) or Isaac v.2.0.13 (Macrogen). For each genome, we performed local realignment and quality recalibration and detected SNVs and small indels using GATK Haplotype Caller v.3.4.6 without genotype refinement. We detected CNVs using ERDS (estimation by read depth with single nucleotide variants) and CNVnator. We detected structural variants using Manta v.0.29.6. When available by the variant caller (i.e. GATK and Manta), trio-based joint variant calling was conducted for each family.	HiSeq X Five Illumina HiSeq 2000	112
EGAD00001008453	Raw, unfiltered fastq files obtained through RNA-seq of endometrial organoids from MRKH patients and controls. The dataset is divided into three parts, depending on the growth conditions of the organoids, i.e. expansion medium or treated with hormones. Each sample consists of two paired-end fastq files.	Illumina NovaSeq 6000	33
EGAD00001008454	We also collected samples from 8 NSCLC patients and 4 ovarian cancer patients. For all 8 NSCLC patients, a tumor biopsy sample, a WBC sample, and three plasma samples were collected. For all 4 ovarian cancer patients, a WBC sample and two serum samples were collected. We collected a tumor tissue sample from one ovarian cancer patient (OV4). The cfDNA was extracted from their plasma samples using the QIAamp circulating nucleic acid kit from QIAGEN (Germantown, MD). For serum cfDNA, ampure XP beads size selection was further performed to eliminate gDNA contamination. In brief, 0.5 volume of beads were first added to the cfDNA samples. After incubation, the supernatant was transferred to a new tube and an additional 2.0 volume of beads were added. After 80% ethanol wash, cfDNA was eluted from the beads. FA assays (Agilent Technologies) were performed to rule out the contamination of gDNA in the size selected samples. The cfDNA WES library of all patients and the genomic DNA WES library of the 4 ovarian cancer patients were constructed with the SureSelect XT HS kit from Agilent Technologies (Santa Clara, CA) according to the manufacturer’s protocol. In brief, 10ng of cfDNA was used as input material. After end repair/dA-tailing of cfDNA, the adaptor was ligated. The ligation product was purified with Ampure XP beads (Beckman-Coulter, Atlanta, GA) and the adaptor-ligated library was amplified with index primer in 10-cycle PCR. The amplified library was purified again with Ampure XP beads, and the amount of amplified DNA was measured using the Qubit 1xdsDNA HS assay kit (ThermoFisher, Waltham, MA). 700-1000 ng of DNA sample was hybridized to the capture library and pulled down by streptavidin-coated beads. After washing the beads, the DNA library captured on the beads was re-amplified with 10-cycle PCR. The final libraries were purified by Ampure XP beads. The library concentration was measured by Qubit, and the quality was further examined with Agilent Bioanalyzer before the final step of 2x150bp paired-end sequencing at an average coverage of 200. Whole-exome capture libraries of genomic DNA from the 8 NSCLC patients were constructed via Roche SeqCap EZ Exome V6 (Roche). Enriched exome libraries were sequenced on the Illumina HiSeq 3000 platform (Illumina) to generate 2x100bp paired-end reads at an average coverage of 200.	HiSeq X Ten Illumina HiSeq 3000	53
EGAD00001008455	54 samples consisting of COAD, ESCC, GA and OSCC	Illumina NovaSeq 6000	107
EGAD00001008456	Computationally reconstructed B-cell receptor sequences (using BraCeR) from scRNA-seq data for all cells passing quality control.		1
EGAD00001008457	Single cell multiomics from 2 donor controls, expression and chromatin accessibility. Samples belong to gray matter tissue from the brain.	Illumina NovaSeq 6000	2
EGAD00001008458	We used single-cell transcriptomics to study cells from the developing human cerebellum, and show that different molecular subgroups of medulloblastoma resemble distinct glutamatergic progenitors.	Illumina HiSeq 2000 Illumina HiSeq 2500	391
EGAD00001008460	Circulating tumor DNA (ctDNA) in blood plasma is an emerging tool for clinical cancer genotyping and longitudinal disease monitoring. We performed deep whole-genome sequencing of serial plasma and synchronous metastases in patients with aggressive prostate cancer. We comprehensively assess all classes of genomic alterations and demonstrate that ctDNA harbors multiple dominant populations whose evolutionary histories frequently indicate whole-genome doubling and shifts in mutational processes. Although tissue and ctDNA showed concordant clonally-expanded cancer driver alterations, each individual metastasis contributed only a minor share of total ctDNA. By comparing serial ctDNA before and after clinical progression on potent androgen receptor (AR) pathway inhibitors, we reveal population restructuring converging solely on AR augmentation as the dominant genomic driver of acquired treatment-resistance. Finally we leverage nucleosome footprints in ctDNA to infer mRNA abundance in synchronously biopsied metastases, including treatment-induced changes in AR pathway transcriptional activity.	unspecified	117
EGAD00001008462	This dataset consists of genome-wide 5hmC methylomes at various stages of prostate cancer, including not only 93 metastases from castration-resistant prostate cancer (mCRPC) patients, but also 5hmC patterns in cell-free DNA (cfDNA). There are 2000 runs in total as fastq files.	Illumina HiSeq 2500 NextSeq 550	596
EGAD00001008463	Exome (_{N,T}{1,2}) RNAseq (polyA - _PolyA, and RiboZero - _RibZ) Methylation (SeqCapEpi - MAPD).	Illumina HiSeq 2500	1
EGAD00001008464	We recruited 98 hospitalised patients displaying severe COVID-19 symptoms from the first wave of infection. Stringent exclusion criteria based on non-genetic factors such as age, blood oxygen, radiologic findings and other typical COVID-19 signs were applied. Gingival or peripheral blood samples were taken for 98 individuals and whole exome sequencing performed using ExomeCapture-Seq capture KAPA HyperExome on Illumina machines.	unspecified	100
EGAD00001008465	The raw fastq files from targeted sequencing of 112 genes for 1,298 endometrial glands and matched blood samples. The paired-end sequencing data sets (R1 and R2) are deposited. The genes are: ABCC1, ACRC, ANK3, ARHGAP35, ARID1A, ARID5B, ATCAY, ATM, ATR, BARD1, BCOR, BRCA1, BRCA2, BRD4, BRIP1, CAMTA1, CDC23, CDYL, CFAP54, CHD4, CHEK1, CHEK2, CTCF, CTNNB1, CUX1, DGKA, DISP2, DYNC2H1, EMSY, FAAP24, FAM135B, FAM175A, FAM65C, FANCA, FANCB, FANCC, FANCD2, FANCE, FANCF, FANCG, FANCI, FANCL, FANCM, FAT1, FAT3, FBN2, FBXW7, FGFR2, FRG1, GPR50, HEATR1, HIST1H4B, HNRNPCL1, HOOK3, KIAA1109, KIF26A, KMT2B, KMT2C, KRAS, LAMA2, LRP1B, MLH1, MON2, MRE11A, MSH2, MSH6, MTOR, NBN, PALB2, PHEX, PIK3CA, PIK3R1, PLXNB2, PLXND1, PMS2, POLE, POLR3B, PPP2R1A, PTEN, PTPN13, RAD50, RAD51, RAD51B, RAD51C, RAD51D, RAD52, RAD54B, RAD54L, RICTOR, SACS, SIGLEC9, SLC19A1, SLX4, SPEG, STT3A, TAF1, TAF2, TAS2R31, TFAP2C, TNC, TONSL, TP53, TTC6, UBA7, VNN1, WT1, XIRP2, ZBED6, ZC3H13, ZFHX3, ZFHX4, ZMYM4.	Illumina HiSeq 2500	1334
EGAD00001008466			1
EGAD00001008467			1
EGAD00001008468	The dataset includes 6 FASTQ files with single cell transcriptome sequencing data of normal breast myoepithelial cells from ducts and TDLUs derived from reduction mammoplasties from three patients. Chromium Single Cell 3’ Reagent Kit v2 or v3 (10x Genomics) were used for processing of cells, whereafter sequencing was performed using the Illumina® NextSeq500/550 High Output Kit v2. Cell Ranger was used for generating FASTQ files and files from different lanes were concatenated prior to uploading the data to EGA.	NextSeq 550	6
EGAD00001008469	SDH deficient renal cell carcinomas are a rare and recently defined subtype of kidney cancer, often associated with an inherited mutation in one of the SDH gene subunits. This dataset sought to understand the genomic events that underpin tumour formation, from putative cell of origin, characterisation of the tumour microenvironment, to the genomic evolution of these rare tumours. We performed whole genome and RNA sequencing of 4 patients with SDH deficient renal cell carcinomas, including one patient who had an additional paraganglioma. An additional patient in this cohort had the initial diagnosis revised to a clear cell renal cell carcinoma.	Illumina NovaSeq 6000	-
EGAD00001008470	SDH deficient renal cell carcinomas are a rare and recently defined subtype of kidney cancer, often associated with an inherited mutation in one of the SDH gene subunits. This dataset sought to understand the genomic events that underpin tumour formation, from putative cell of origin, characterisation of the tumour microenvironment, to the genomic evolution of these rare tumours. We performed whole genome and RNA sequencing of 4 patients with SDH deficient renal cell carcinomas, including one patient who had an additional paraganglioma. An additional patient in this cohort had the initial diagnosis revised to a clear cell renal cell carcinoma.	Illumina HiSeq 4000	10
EGAD00001008473	This dataset includes RNAseq data of the fetal ISCs and of the iPSCs derived from the fetal ISCs, to confirm successful reprogramming.	unspecified	6
EGAD00001008474	This data set includes deep targeted re-sequencing of fetal bulk tissues of the 4 fetuses (T21=2, D21=2). The tissues include fetal skin and intestinal organoid cultures (passage 0) of all 4 fetuses, and spleen of fetus N01 (T21).	NextSeq 500	9
EGAD00001008475	This data set includes WGS data of the in vivo acquired mutations in fetal ISCs and HSPCs of 4 fetuses (T21=2, D21=2). In addition, this data set includes sub-clonal fetal ISCs to determine the culture-associated mutations of fetal ISCs. It also includes clone and subclone WGS data of iPSCs derived from the fetal ISCs.	Illumina NovaSeq 6000	47
EGAD00001008476	Data about copy number aberrations was obtained from primary CRC (n=90). DNA was collected from second primary CRC (in HL survivors, n = 39), and primary SBA (n=14). For second primary SBA (in HL and TC survivors), DNA was isolated for molecular analyses (n=7). Copy number aberrations were evaluated after low-coverage whole genome sequencing.	Illumina HiSeq 4000	60
EGAD00001008477	Whole genome sequencing of 209 pediatric probands with primary cardiomyopathy and their family members. All samples were sequenced using Illumina short read platform.	HiSeq X Ten	114
EGAD00001008478	We analyzed the T-cell receptor (TCR) repertoires from ten kidney transplant recipients. Five out of the ten kidney transplant recipients received ATLG while the other five recipients received basiliximab as induction therapy. TCR repertoires of CD4+ and CD8+ T-cells were assessed prior to transplantation and within the first month after transplantation as well as at three and 12 months post-transplant. In addition, the pre-formed alloreactive TCR repertoire for each kidney transplant recipient was identified using mixed lymphocyte reaction and donor reactive T-cells were subjected to TCR beta sequencing. This dataset comprises a total of 106 samples. NGS TCR beta libraries of all samples were sequenced on an Illumina NextSeq 500 and raw sequencing data (in the form of fastq files) as well as assembled clonotypes and their counts (in the form of clonotype tables) are provided.	NextSeq 500	84
EGAD00001008479	Organoid cultures derived from normal colon and/or colorectal adenomas and/or colorectal carcinomas. RNA and DNA was isolated from these cultures for genome wide profiling.	Illumina HiSeq 2500 Illumina MiSeq	37
EGAD00001008480	Organoid cultures derived from colorectal adenomas were transduced with a miR-17-92 expressing vector. RNA from miR17-92-overexpressing organoids and respective non-transduced organoids (controls) was isolated for expression analysis.	Illumina HiSeq 2500	12
EGAD00001008481	scRNA-seq data of B-lineage cells from the cerebrospinal fluid of 21 patients with multiple sclerosis. The data was generated with the Smart-seq2 protocol and sequenced on Illumina NextSeq500.	NextSeq 500	21
EGAD00001008482	Multi-region RNAseq from 4 NSCLC patients, totaling 12 tumor samples and 7 matched control samples.	Illumina HiSeq 2000	19
EGAD00001008484	Whole transcriptome RNA-sequencing of purified bone marrow blasts of 136 de novo, treatment naive AML patients. For further details, we refer to the manuscript "The Proteogenomic Landscape of AML" by Jayavelu, Wolf, Buettner et al. mRNA extraction and whole transcriptome sequencing: For transcriptome analysis the TruSeq Total Stranded RNA kit was used, starting with 250ng of total RNA, to generate RNA libraries following the manufacturer’s recommendations (Illumina, San Diego, CA, USA). 100bp paired-end reads were sequenced on the NovaSeq 6000 (Illumina) with a median of 57 mio. reads per sample. RNA data analysis: Data quality control was performed with FastQC v0.11.9. Reads were aligned to the human reference genome (Ensembl GRCh38 release 82) using STAR v2.6.1. Gene count tables were generated while mapping, using Gencode v31 annotations. All downstream analyses were carried out using R v4.0 and BioConductor v3.12 (Huber et al., 2015; R Core Team, 2020). Size-factor based normalization was performed using DESeq2 v1.28.1 (Love et al., 2014).	Illumina NovaSeq 6000	177
EGAD00001008485	Sequencing data from targeted myeloid DNA panel sequencing at the MLL Dx, Munich lab. Targeted sequencing was performed using the Nextera DNA Flex library preparation kit, starting with 100ng of genomic DNA (Illumina, San Diego, CA, USA). The target regions were enriched by a custom xGen Lockdown panel using a hybridization capture workflow (IDT Integrated DNA Technologies, Coralville, IA, USA). All libraries were sequenced with 100bp paired-end reads on a NovaSeq6000 (Illumina) with a mean coverage of 3206x. Somatic variant calling was performed with Pisces and a sensitivity cut off of 2%. Large deletions and medium-sized insertions, as they are for example found in CALR and FLT3, were called with Pindel. Variant annotation considered the publicly available data bases Cosmic (v91), ClinVar (2020-03), gnomAd (non-cancer, v2.1.1), dbNSFP (v3.5) and UMD TP53 (2017_R2). Variants that are described as somatic, protein truncating or affecting splice sites were considered as mutations while variants with no or discrepant data base information were considered as variants of uncertain significance.	Illumina NovaSeq 6000	15
EGAD00001008487	224 pairs of FASTQ files from metastatic Castration-Resistant Prostate Cancer (mCRPC) sequenced on HiSeq 4000 instruments. Patients were enrolled in the West Coast Dream Team study. Biopsies include various tissue sites including bone, soft tissue, and lymph node.	Illumina HiSeq 4000	224
EGAD00001008488	To infer the proteomic Mito signature in the LSC subcompartment, myeloid blasts for 10 patients from the discovery cohort were FACS-sorted into CD34-GPR56+NKG2DLigands- (CD34-), alias 61dc5fb798e2520001702c03, and CD34+GPR56+NKG2DLigands- (CD34+), alias 61dc5fb798e2520001702c03. Detailed gating strategy will be described in Donato, Correia, Andresen and Trumpp et al. (manuscript in preparation).	NextSeq 550	1
EGAD00001008489	Whole exome and RNASeq raw sequencing data for a cohort of 7 male patients with oesophageal adenocarcinoma. Median age at diagnosis was 68. Tumour tissue and PBMCs were used for whole exome sequencing and RNA sequencing. This data was generated as part of a study funded by a Cancer Research UK Centres Network Accelerator Award Grant (A21998).	Illumina NovaSeq 6000	21
EGAD00001008491	This dataset includes linked-read whole-genome sequencing data from the normal ileal tissue of the patient. The normal sample was sequenced using the 10x Genomics linked-read whole-genome sequencing (WGS) approach.	Illumina HiSeq 2500	18
EGAD00001008492	This dataset includes linked-read whole-genome sequencing data (subfolder HF3FKCCXY) for multifocal ileal tumor samples from one patient. Samples were sequenced using the 10x Genomics linked-read whole-genome sequencing (WGS) approach.	Illumina HiSeq 2500	176
EGAD00001008493	This dataset includes linked-read whole-genome sequencing data (subfolder HF3J5CCXY) for multifocal ileal tumor samples from one patient. Samples were sequenced using the 10x Genomics linked-read whole-genome sequencing (WGS) approach.	Illumina HiSeq 2500	176
EGAD00001008494	This dataset includes linked-read whole-genome sequencing data (subfolder HF3NYCCXY) for multifocal ileal tumor samples from one patient. Samples were sequenced using the 10x Genomics linked-read whole-genome sequencing (WGS) approach.	Illumina HiSeq 2500	176
EGAD00001008495	This dataset includes linked-read whole-genome sequencing data (subfolder HFFWLCCXY) for multifocal ileal tumor samples from one patient. Samples were sequenced using the 10x Genomics linked-read whole-genome sequencing (WGS) approach.	Illumina HiSeq 2500	176
EGAD00001008496	This dataset includes linked-read whole-genome sequencing data (subfolder HFG3FCCXY) for multifocal ileal tumor samples from one patient. Samples were sequenced using the 10x Genomics linked-read whole-genome sequencing (WGS) approach.	Illumina HiSeq 2500	176
EGAD00001008497	CRAM files and VCF for DDD_1 and their parents. Also de novo mutations file for hypermutated DDD_1 child as described in the manuscript ‘Genetic and chemotherapeutic influences of germline hypermutation’ by Kaplanis et al. which will be published in Nature shortly.	HiSeq X Ten	3
EGAD00001008498	Exome sequencing data from transformed Follicular Lymphoma samples that express a PMBL-like gene expression signature		1
EGAD00001008499	The dataset includes 12 paired FASTQ files (6 samples) with single cell transcriptome sequencing data of normal breast luminal cells from ducts and TDLUs derived from reduction mammoplasties from three patients. Chromium Single Cell 3’ Reagent Kit v2 or v3 (10x Genomics) were used for processing of cells, whereafter sequencing was performed using the Illumina® NextSeq500/550 High Output Kit v2. Cell Ranger was used for generating FASTQ files and files from different lanes were concatenated prior to uploading the data to EGA. "R1" files include the feature barcodes and UMIs, while "R2" files include the reads.	NextSeq 550	6
EGAD00001008501	Targeted myeloid DNA panel sequencing from purified bone marrow blasts of 104 treatment naive AML patients from the discovery cohort. For more details, we refer to Jayavelu, Wolf, Buettner et al. Libraries were prepared from 40 ng DNA using the QIASeq Human Myeloid Neoplasms Panel (Qiagen) according to the manufacturer’s protocol. Samples were tagged with the QIAseq 96-Unique Dual Index Set A for Illumina platforms (Qiagen) to yield unique combinations of i5 and i7 barcodes for each sample. Sample fragment size distribution and concentration were estimated using the Agilent High Sensitivity DNA kit on a 2100 Bioanalyzer (Agilent). Samples were pooled in an equimolar fashion, denatured, and diluted to 1.5 pM according to Illumina’s recommendations. The diluted library was sequenced on a NextSeq 500 benchtop sequencer (Illumina) using NextSeq High Output cartridges. Demultiplexing was performed using the BaseSpace cloud platform (Illumina).	NextSeq 500	19
EGAD00001008504	Paired tumor-normal exome data from 47 microsatellite-stable, early-onset sporadic rectal cancer cases from the Indian population	Illumina HiSeq 2500 Illumina NovaSeq 6000	94
EGAD00001008505	Dataset consists of fastq files from bulk RNA-seq done on peripheral blood acquired from two well characterised hospitalised cohorts, a cohort of patients infected with influenza and a cohort of patients infected with SARS-CoV-2 during the first wave of the pandemic and prior to availability of COVID-19 treatments and vaccines.	Illumina NovaSeq 6000	160
EGAD00001008506	Profiling of co-mutations was done by targeted resequencing using the TruSight Myeloid assay (Illumina, Chesterford, UK) covering 54 genes recurrently mutated in AML: BCOR, BCORL1, CDKN2A, CEBPA, CUX1, DNMT3A, ETV6, EZH2, IKZF1, KDM6A, PHF6, RAD21, RUNX1, STAG2, ZRSR2, ABL1, ASXL1, ATRX, BRAF, CALR, CBL, CBLB, CBLC, CDKN2A, CSF3R, FBXW7, FLT3, GATA1, GATA2, GNAS, HRAS, IDH1, IDH2, JAK2, JAK3, KIT, KRAS, MLL, MPL, MYD88, NOTCH1, NPM1, NRAS, PDGFRA, PTEN, PTPN11, SETBP1, SF3B1, SMC1A, SMC3, SRSF2, TET2, TP53, U2AF1 and WT1. For each reaction, 50 ng of genomic DNA was used. Library preparation was done as recommended by the manufacturer (TruSight Myeloid Sequencing Panel Reference Guide 15054779 v02, Illumina). Samples were sequenced paired-end (150 bp PE) on NextSeq- (Illumina) or (300 bp PE) MiSeq-NGS platforms, with a median coverage of 3076 reads (range 824–30565). Sequence data alignment of demultiplexed FastQ files, variant calling and filtering was done using the Sequence Pilot software package (JSI medical systems GmbH, Ettenheim, Germany) with default settings and a 5% variant allele frequency (VAF) mutation calling cut-off. Human genome build HG19 was used as reference genome for mapping algorithms.		-
EGAD00001008507	RPPA analysis from FAIRLANE Trial of Neoadjuvant Ipatasertib Plus Paclitaxel for Triple-Negative Breast Cancer.		1
EGAD00001008508	Whole transcriptome and 850k methylome profiling of human intraoperative or snap frozen and FFPE MBM.	Illumina NovaSeq 6000	21
EGAD00001008510	Mice with medulloblastoma (Group 3) were treated with sham (isoflurane and imaging X-ray) or CSI as described by Abbas et al (2022). Total RNA was isolated with RNeasy Plus Mini Kit (Qiagen); library preparation (SureSelect, Agilent), rRNA depletion (Ribo-Zero Plus, Illumina) and sequencing were carried out by GenomicsWA or Australian Genome Research Facility. Libraries were sequenced on NovaSeq 6000 S1 flow cells as paired-end 150bp reads (Illumina).	Illumina NovaSeq 6000	24
EGAD00001008511	Mice with medulloblastoma (Group 3) were treated with saline, cyclophosphamide, or gemcitabine as described by Abbas et al (2020). Total RNA was isolated with RNeasy Plus Mini Kit (Qiagen); library preparation (SureSelect, Agilent), rRNA depletion (Ribo-Zero Plus, Illumina) and sequencing were carried out by GenomicsWA or Australian Genome Research Facility. Libraries were sequenced on NovaSeq 6000 S1 flow cells as paired-end 150bp reads (Illumina).	Illumina NovaSeq 6000	20
EGAD00001008512	RNA-seq libraries were prepared using the KAPA Stranded RNA-Seq Kit with RiboErase (Kapa Biosystems, Wilmington, MA) and sequenced to a target depth of 200-M reads on the Illumina HiSeq platform (Illumina, San Diego, CA).	Illumina HiSeq 4000	162
EGAD00001008514	We performed single cell RNA sequencing (scRNA-seq) from bone marrow on 11 pediatric (0-14 years-old) and adolescent and young adult (AYA) (15-39 years-old) de novo AML samples (Dx) (4 inv(16), 3 t(8;21) and 4 rMLL). In addition, for some patients a relapse sample was also sequenced (2 inv(16), 2 t(8;21) and 3 rMLL). Cells were sorted into CD34+/CD38- and CD34-/CD38+ and sequenced separately.	Illumina NovaSeq 6000	18
EGAD00001008515	This deposit consists of DNA and RNA sequencing data from 32 EPS patients. 28 samples had tumor DNA sequencing data. 2 had matched normal sequencing data. 27 samples had tumor RNA sequencing data.	Illumina HiSeq 2000	32
EGAD00001008516	WGS and RNA-Seq data from a GBM patient PT-BM8772	Illumina HiSeq 2500	1
EGAD00001008517	WGS and RNA-Seq data from a GBM patient PT-CM3220	Illumina HiSeq 2000 Illumina HiSeq 2500	2
EGAD00001008518	WGS and RNA-Seq data from a GBM patient PT-DM9089	Illumina HiSeq 2000 Illumina HiSeq 2500	2
EGAD00001008519	WGS and RNA-Seq data from a GBM patient PT-GE7528	Illumina HiSeq 2000	1
EGAD00001008520	WGS and RNA-Seq data from a GBM patient PT-GI2070	Illumina HiSeq 2000	1
EGAD00001008521	WGS and RNA-Seq data from a GBM patient PT-JR9883	Illumina HiSeq 2000	1
EGAD00001008522	WGS and RNA-Seq data from a GBM patient PT-LC6372	Illumina HiSeq 2500	1
EGAD00001008523	WGS and RNA-Seq data from a GBM patient PT-ML9537	Illumina HiSeq 2500	1
EGAD00001008524	WGS data from a GBM patient PT-MS8478		-
EGAD00001008525	WGS and RNA-Seq data from a GBM patient PT-PR5617	Illumina HiSeq 2000 Illumina HiSeq 2500	2
EGAD00001008526	WGS and RNA-Seq data from a GBM patient PT-PV2594	Illumina HiSeq 2000 Illumina HiSeq 2500	5
EGAD00001008527	WGS and RNA-Seq data from a GBM patient PT-RV2286	Illumina HiSeq 2500	1
EGAD00001008528	WGS data from a GBM patient PT-SB3465		-
EGAD00001008529	WGS and RNA-Seq data from a GBM patient PT-SJ5453	Illumina HiSeq 2000 Illumina HiSeq 2500	3
EGAD00001008530	WGS and RNA-Seq data from a GBM patient PT-SS3647	Illumina HiSeq 2000 Illumina HiSeq 2500	3
EGAD00001008531	WGS and RNA-Seq data from a GBM patient PT-WR7927	Illumina HiSeq 2500	1
EGAD00001008532	WGS and RNA-Seq data from a GBM patient PT-WT4796	Illumina HiSeq 2000	1
EGAD00001008533	This dataset comprises 2 pairs of experiments: (1) RNA-seq of control versus formate-treated colorectal cancer T18 cells; (2) Humix device, RNA-seq of control versus co-culture colorectal cancer T18 cells with Fusobacterium nucleatum.	NextSeq 500	12
EGAD00001008534	Set of 2 bam files from patients affected with Lupus. Fastq alignments for exonic variants present in TLR7 gene.	HiSeq X Ten	1
EGAD00001008535	WGS data relative to 36 triple negative breast cancer PDX models.	Illumina NovaSeq 6000	36
EGAD00001008537	RNA-seq dataset of high-grade serous ovarian cancer (HGSC) tumours from long-term survivors performed as part of the Multidisciplinary Ovarian Cancer Outcomes Group (MOCOG) study. The dataset includes fastq files from 56 HGSC tumours (53 primary tumours and 3 recurrent tumours) from 53 long-term survivor patients. Libraries were generated using the Illumina Stranded mRNA Prep and 150 bp paired-end sequencing was performed to a minimum of 100 million reads on Illumina NovaSeq 6000 instruments.	Illumina NovaSeq 6000	56
EGAD00001008538	We monitored patients' anti-SARS-CoV-2 immune responses using an in vitro cross presentation assay. The goal of this study was to identify immune correlates of clinical protection against SARS-CoV-2 infection. Briefly, peripheral blood mononuclear cells of each patient were divided into a monocyte and a lymphocyte fraction. Monocytes were differentiated into monocyte derived dendritic cells (mo-DC) using GM-CSF and Interferon alpha. Mo-DC were then loaded with SARS-CoV-2 culture lysates or VeroE86 lysates. SARS-CoV-2 loaded mo-DC were then used to stimulate their autologous lymphocytes, and T cell cytokine secretion was monitored in the supernatant. We discriminated patients producing IL-2 and patients producing IL-5. RNA sequencing was performed for 18 patients, to identify gene profiles associated with IL-2 or IL-5 production.	Illumina NovaSeq 6000	36
EGAD00001008541	In other analyses in the current manuscript, we find that a similar gene signature (to dissociation based artifacts in mouse and human tissue) is present in post-mortem microglia and astrocytes, across all snRNA-seq datasets analyzed, although it is highly variable between subjects. Using acutely-resected neurosurgical tissue, we performed single-nucleus RNA-seq and reveal that a similar signature can be detected in microglia following prolonged exposure to room temperature. Tissue handling and methods details (as well as sequencing and analysis details) can be found in the methods section of the related manuscript (Marsh et al., 2022). Together, these results suggest that the presence of this signature in post-mortem brain samples may be the result of a combination of acute pre-mortem (agonal state, cause of death, comorbidities, etc.) and post-mortem (post-mortem interval (PMI), storage time, RNA quality, etc.) variables and may not represent a normally present cell state.	Illumina NovaSeq 6000	4
EGAD00001008542	RNAseq data relative to 41 triple negative breast cancer patients.	Illumina HiSeq 4000	41
EGAD00001008543	DNA-seq libraries were captured to exome regions using xGen Exome Research Panel v1.0 (IDT), and libraries were prepared using the KAPA Hyper prep kit. DNA libraries were sequenced to a target depth of ×200 for tumor sample, ×100 for normal samples on the Illumina HiSeq platform.	Illumina HiSeq 4000	256
EGAD00001008544	Intrahepatic cholangiocarcinomas (iCCs) are characterized by their rarity, difficulty in diagnosis, and overall poor prognosis. We performed comprehensive transcriptomic characterization of treatment-naive iCC. Whole transcriptome analyses identified two prognostic subtypes, concordant with previous reports. The findings could assist in patient stratification with iCCs and in developing rational therapeutic strategies.	Illumina HiSeq 2500	91
EGAD00001008545	The compressed file contains a PLINK-format file for the Illumina MEGA SNP array data of 255 individuals generated and analyzed in the Liu et al. study of genome-wide variation of the Massim region.		1
EGAD00001008546	This is a prospective study with 100 participants. The enzymatic digestion profiles after conventional PCR allowed the identification of different haplotypes of hemoglobin in Abidjan.		-
EGAD00001008547	RNAseq data relative to 56 primary and treatment-naive ovarian carcinomas, from independent donors.	Illumina NovaSeq 6000	56
EGAD00001008548	Relevant clinical data for POPLAR including treatment arm, histology, overall survival, progression-free survival, and best confirmed overall response.		-
EGAD00001008549	Relevant clinical data for OAK including treatment arm, histology, overall survival, progression-free survival, and best confirmed overall response.		-
EGAD00001008550	Additional relevant biomarker data for OAK including PD-L1 tumor cell IHC by the 22C3 assay, tumor mutational burden status, and STK11, KEAP1, and EGFR mutation status.		-
EGAD00001008551	Clinical data from IMblaze370: Clinical data include disease, treatment arm, MSI status, KRAS oncogenic mutation status, sex, and overall survival (1=dead, 0=alive)		-
EGAD00001008552	RNA-seq count matrix for 296 bulk pre-treatment tumors from IMblaze370		-
EGAD00001008553	RNA-seq FASTQ files from 296 bulk pre-treatment tumors from IMblaze370	unspecified	296
EGAD00001008554	WGS and WES data for manuscript titled: ctDNA as a biomarker of progression in oesophageal adenocarcinoma	HiSeq X Ten Illumina NovaSeq 6000	44
EGAD00001008555	Raw sequencing reads were processed as single end sequencing, aligned to the human reference genome GRCh38 and processed using CellRanger 3.1.	Illumina HiSeq 4000	89
EGAD00001008556	Whole exome sequencing data of 224 Chinese Clear Cell Renal Cell Carcinoma patients.		1
EGAD00001008557	Intrahepatic cholangiocarcinomas (iCCs) are characterized by their rarity, difficulty in diagnosis, and overall poor prognosis. We performed comprehensive genomic characterization of treatment-naive iCC. This study reports a large-scale genomic analysis of iCC. The findings could assist in patient stratification with iCCs and in developing rational therapeutic strategies.	Illumina HiSeq 2500	10
EGAD00001008558	For the cohort of 59 samples, we performed TruSeq DNA PCR-Free whole-genome sequencing library preparation according to manufacturer’s instructions (Illumina, ILMN, San Diego, CA) on the automated NGS Star liquid handling platform (Hamilton, Bonaduz, Switzerland) followed by 2x150 bp paired-end sequencing on the HiSeqX or NovaSeq6000 (ILMN). An average coverage of >100x was achieved. For whole transcriptome analysis, the TruSeq Total Stranded RNA kit was used, starting with 250 ng of total RNA, to generate RNA libraries following the manufacturer’s recommendations (ILMN). 2x100bp paired-end reads were sequenced on the NovaSeq 6000 with a median of 50 mio. reads per sample (ILMN).	Illumina NovaSeq 6000	59
EGAD00001008559	Libraries were prepared from RNA-extracted cell lines using Illumina RNA library prep kit. Samples were sequenced on Illumina HiSeq 4000 or HiSeq 2500.	Illumina HiSeq 2500 Illumina HiSeq 4000	103
EGAD00001008560	The samples AD_Library_1, AD_Library_2 and Control_Library were run on a Chromium Chip B with the Chromium Single Cell 3′ Library & Gel Bead Kit v3 (10x Genomics, CA, USA). The 3’ gene expression libraries were sequenced at an approximate depth of 50,000 reads per cell using NovaSeq 6000 S1 (Illumina, San Diego, CA, USA) flow cells. Cell Ranger v.3.0.2 was used to analyze the raw base call files. FASTQ files and raw gene-barcode matrices were generated and aligned to the human genome GRCh37 (hg19). The samples were integrated in R v.4.0.3, and the generated Seurat objects, two related to AD samples and one to control samples, were analyzed using the Seurat package v.4.0.3 to perform downstream analysis, clustering of the cells and differential expression.		1
EGAD00001008562	ChIP-seq for AR, FOXA1 and H3K27ac in primary prostate tumors before and after 3 months of neoadjuvant enzalutamide treatment. RNA-seq expression data of primary prostate tumors before and after 3 months of neoadjuvant enzalutamide treatment.	Illumina HiSeq 2500	245
EGAD00001008564	Targeted DNA sequencing data of paired primary and relapse tumor material taken from a pediatric patient with neuroblastoma.	Illumina MiSeq	10
EGAD00001008566	Whole Genome sequencing of colorectal cancer patients (SG-BULK-1)	Illumina HiSeq 4000	69
EGAD00001008567	We have assessed the molecular profile of a cohort of 70 patients with MDS by next-generation sequencing (NGS) using cfDNA and compared the results to paired bone marrow (BM) DNA.	NextSeq 500	140
EGAD00001008568	Whole-exome sequencing data (Agilent SureSelectXT Human All Exon V7). Retrospective study of matched pairs of initial and post-therapeutic GBM cases treated with temozolomide+radiotherapy with a recurrence period greater than one year. Matched normal, initial and post-therapeutic samples for 27 patients and 1 patient (GBM046) with a matched normal and two post-therapeutic samples.	Illumina NovaSeq 6000	84
EGAD00001008569	Whole genome sequencing data (bam) of tetralogy of Fallot study, including data derived from iPSCs of two controls and four patients with tetralogy of Fallot (two with DiGeorge syndrome (DG), two without DiGeorge syndrome (ND)).	Illumina NovaSeq 6000	6
EGAD00001008571	RNA-seq was performed from 3 separate GINS3 patient fibroblast cultures and 1 replicate of fibroblasts derived from each of the two parents. RNA-seq libraries were generated with NEBNext Ultra II Directional RNA library prep for Illumina with NEBNext Poly(A) mRNA Magnetic Isolation Module (New England Biolabs) and sequenced on Illumina NextSeq500 with paired-end 150 bp read length. SIRV Set 3 (Lexogen) spike-ins were added. Two fastq files are provided for each RNAseq sample.	NextSeq 500	5
EGAD00001008572	The PYDP dataset includes 26 bam files of Y chromosome sequences for Papua New Guinean individuals from different locations, extracted from whole genome sequences. DNA was extracted from saliva samples (Oragen kit). Sequencing libraries were prepared using the TruSeq DNA PCR-Free HT kit. 150 bp paired-end sequencing was performed on the Illumina HiSeq X5 sequencer.	HiSeq X Five	24
EGAD00001008573	The IYDP dataset includes BAM files of 126 Y chromosomes extracted from whole genome sequences. These are from individuals from a broad range of Indonesian islands - communities close to mainland Asia through to New Guinea. The original whole genome sequencing libraries were prepared using TruSeq DNA PCR-Free and TruSeq Nano DNA HT kits depending on DNA quantity. 150 bp paired-end sequencing was performed on the Illumina HiSeq X sequencer. Individuals were sequenced to expected mean depth of 30x, with an achieved median depth of raw reads across samples of 43x.	HiSeq X Ten	126
EGAD00001008574	Whole Genome sequencing of colorectal cancer patients (SG-BULK-2)	Illumina HiSeq 4000	66
EGAD00001008575	Common variable immunodeficiency (CVID) is the most prevalent primary immunodeficiency. Here the authors perform single cell omics analyses in CVID discordant monozygotic twins and show epigenetic and transcriptional alterations associated with activation in memory B cells.	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000	2838
EGAD00001008576	Contains 14 control samples and 26 case samples.	HiSeq X Ten	26
EGAD00001008577	Joint called VCF for whole genome sequence data from 410 samples described in the paper: PMID:33116287. It includes 314 high coverage (average 30X) samples sequenced on the Illumina X-Ten, also available as individual datasets under the H3Africa Chip study (EGAS00001002976) and 112 medium coverage (average 10X) samples from the TrypanGen study (EGAS00001002602) sequenced on Illumina HiSeq 2500. Supplementary table 3 of the paper describes the geographic breakdown of the samples. 16 samples from the Southern African Human Genome project have been removed from this VCF.		410
EGAD00001008578	Dataset includes fastq files for RNA-Seq experiments for tumor samples of PPGL patients. Single end reads fastq files are available for 102 different samples	Illumina NovaSeq 6000	102
EGAD00001008579	Dataset includes fastq files for WES experiments for tumor samples of PPGL patients. Paired end reads fastq files are available for 87 different samples	Illumina HiSeq 2500 Illumina NovaSeq 6000	87
EGAD00001008580	Includes: Finnish -THL Finrisk	Illumina Genome Analyzer IIx	1175
EGAD00001008581	WGS data relative to 63 primary and treatment-naive ovarian carcinomas, from independent donors.	Illumina NovaSeq 6000	63
EGAD00001008583	Sanger sequencing and RT-qPCR data for validation used in Primary lymphomas of the central nervous system (PCNSL).		1
EGAD00001008584	All libraries were sequenced on Illumina HiSeq4000 until sufficient saturation was reached.	Illumina HiSeq 4000	9
EGAD00001008585	All libraries were sequenced on Illumina NextSeq or NovaSeq6000 until sufficient saturation was reached.	unspecified	26
EGAD00001008586	BAM files from RNAseq study from regions of in situ and invasive human mammary ductal disease		1
EGAD00001008587	This database contains 46 samples for early stage ovarian high grade serous carcinoma project. Amplicon sequencing on 37 tumour samples from early stage ovarian high grade serous carcinoma as well as 5 adjacent normal tissue samples and 4 whole blood samples.	Illumina HiSeq 4000	46
EGAD00001008588	Shallow whole genome sequencing dataset containing 44 samples. All the samples are early stage ovarian high grade serous carcinoma.	Illumina HiSeq 4000	44
EGAD00001008589	This study compared different assays for the detection of circulating tumour DNA (ctDNA) in serial plasma from stage IA-IV breast cancer patients, targeting structural variants (SVs), single nucleotide variants (SNVs) and/or somatic copy-number aberrations (SCNAs). SV-multiplex PCR, SNV-/SV-hybrid capture, and different depths of whole-genome sequencing (WGS) were used to evaluate ctDNA levels, demonstrating concordant results. SNV-hybrid capture targeting 1,347-7,491 mutations was the most sensitive assay, detecting 67% (36/54) of samples down to an allele fraction (AF) of 0.00024%. SV-multiplex PCR, targeting 21-47 mutations, detected 63% (34/54) of samples down to 0.00047% AF and has potential as a clinical assay.	HiSeq X Ten Illumina HiSeq 4000 Illumina MiSeq Illumina NovaSeq 6000	1284
EGAD00001008590	This dataset contains 10x Genomics Single Cell 3’ Solution (version 2) scRNA-seq data from peripheral blood leukocytes of a single healthy donor. Data from 20939 cells were collected over 8 lanes and 2 sequencing runs.	Illumina HiSeq 4000	32
EGAD00001008591	This batch is a subset of the full DETECT-A dataset, containing 498 fastq files generated from 71 subjects. All sequencing was conducted using Illumina HiSeq 4000 and Illumina MiSeq platforms. Note that the division into batches follows no specific criteria and that the sequencing data for each subject has multiple files which may span multiple datasets. Thus, for a comprehensive analysis, it is recommended to request access to all datasets that comprise this study.	Illumina HiSeq 4000	81
EGAD00001008592	Whole Genome sequencing of colorectal cancer patients (SG-BULK-3)	Illumina HiSeq 4000	64
EGAD00001008593	Whole genome sequencing from two resectable patients with pancreatic cancer for both normal and tumour tissue samples; whole exome sequencing from the two resectable patients and five unresectable patients of peripheral blood mononuclear cells and blood plasma (1-5 time points per patient), and whole exome sequencing of plasma samples from three chronic pancreatitis patients.	Illumina HiSeq 4000 Illumina NovaSeq 6000	34
EGAD00001008594	Consists of 17 cases: 7 control cases and 10 cancer cases.	NextSeq 500	17
EGAD00001008595	Fastq files for the single cell RNAseq data of Follicular lymphoma study. This dataset includes the paired single cell RNA sequencing data for 23 samples.	Illumina HiSeq 4000	23
EGAD00001008597	This batch is a subset of the full DETECT-A dataset, containing 29294 fastq files generated from 3653 subjects. All sequencing was conducted using Illumina HiSeq 4000 and Illumina MiSeq platforms. Note that the division into batches follows no specific criteria and that the sequencing data for each subject has multiple files which may span multiple datasets. Thus, for a comprehensive analysis, it is recommended to request access to all datasets that comprise this study.	Illumina HiSeq 4000 Illumina MiSeq	5111
EGAD00001008598	This dataset contains the FASTQ files for a portion of the samples in Tang F. et al. “Chromatin accessibility profiles of castration-resistant prostate cancers reveal novel subtypes and therapeutic vulnerabilities” published in Science. It contains 51 samples sequenced with Illumina HiSeq 2500 or HiSeq 4000. The remaining samples can be found at dbGaP: phs000909.v1.p1	Illumina HiSeq 2500 Illumina HiSeq 4000	51
EGAD00001008600	WGS files for Genomic Landscape ALL paper titled "The genomic landscape of pediatric acute lymphoblastic leukemia"	Illumina HiSeq 2000	278
EGAD00001008601	Non-small cell lung cancer (NSCLC) is the leading cause of cancer deaths worldwide. Only a fraction of NSCLC harbour actionable driver mutations and there is an urgent need for patient-derived model systems that will enable the development of new targeted therapies. We generated NSCLC patient-derived xenografts (PDXs) that recapitulate the histology and molecular features of primary NSCLC. Here, we completed whole exome sequencing on 122 NSCLC PDXs.	Illumina HiSeq 2000	122
EGAD00001008608	The Genomics of MPNST (GeM) Consortium dataset includes de-identified whole genome sequencing data (.bam) for germline samples (DNA primarily derived from blood) sequenced at standard (30x) coverage (n=88) and for tumor samples (DNA derived from fresh frozen tissue) sequenced at 90x coverage (n=105). This dataset also includes transcriptome profiling data (.fastq) for paired normal nerve samples (n=7) and for tumor samples (n=132).	Illumina HiSeq 4000 Illumina NovaSeq 6000	332
EGAD00001008609	For scRNA-Seq, single live cells were suspended in 0.4% BSA in DPBS buffer (1000 cells/µL) and subjected to GEM generation and barcoding. Library preparation was performed according to the manufacturer's recommended procedures for the Chromium Single Cell 3’ Reagent Kit v3.1 chemistry. 10,000 cells were targeted for capture; 9 cycles were used for cDNA amplification and 12 cycles for library formation, and sequencing was performed on an Illumina NovaSeq 6000 sequencer. For bulk RNA-Seq, RNA was purified using the miRNeasy™ RNA MiniPrep (Qiagen) and RNA-seq libraries were generated either using the Illumina TruSeq RNA Library Preparation Kits and sequenced on the Illumina HiSeq 2500 sequencer as 76 bp paired-end reads, or using the NEBNext Ultra Directional RNA Library Preparation Kits after rRNA depletion using the NEBNext rRNA depletion kit and sequenced on an Illumina HiSeq 2500 sequencer using 50 cycles of single-end sequencing.	Illumina HiSeq 2500 Illumina NovaSeq 6000	21
EGAD00001008610	Genomic DNA of 81 cases from Japanese gastric cancer was extracted from tumor and matched normal tissues, and libraries with an insert size of 350–550 bp were prepared. The libraries were sequenced on a HiSeq 2500 instrument (Illumina) with paired-end reads of 101 bp. The read data are stored as FASTQ formatted files.	Illumina HiSeq 2500	161
EGAD00001008611	This deposit consists of DNA and RNA sequencing data from 67 CCS samples. 55 samples had tumor DNA sequencing data. 6 had matched normal sequencing data. 34 samples had tumor RNA sequencing data.	Illumina HiSeq 2000 Illumina HiSeq 4000	67
EGAD00001008616	Targeted sequencing was applied to an unselected population-based diffuse large B-cell lymphoma cohort (n=928) diagnosed in the UK's Haematological Malignancy Research Network catchment population of ~4 million (14 centres). DNA extracted from tumour samples was sequenced with a 293-gene panel using the Illumina HiSeq 2500. All data are provided in the CRAM format.	Illumina HiSeq 2500	928
EGAD00001008617	WGS data relative to 42 primary and treatment-naive triple negative breast cancers, from independent donors.	HiSeq X Ten	42
EGAD00001008618	Whole genome paired sequencing of multiple myeloma CD138-positive bone marrow plasma cells and saliva control samples (6 triplets: tumor1, tumor2, saliva control) from 6 patients. WGS was done on HiSeq X Ten or NovaSeq 6000 with Illumina TruSeq Nano DNA.	HiSeq X Ten Illumina NovaSeq 6000	18
EGAD00001008619	RNA-Seq data on multiple myeloma CD138-positive bone marrow plasma cells, 11 samples, sequenced on HiSeq 4000 and HiSeq X Ten, using mostly the TruSeq Stranded mRNA Kit.	HiSeq X Ten Illumina HiSeq 4000	11
EGAD00001008621	The dataset consists of whole exome sequencing data (fastq format) of 100 non-syndromic autism spectrum disorder patients from India. Whole exome sequencing data is generated using Agilent SureSelect v6 capture kit and Illumina HiSeq sequencing platform. Paired end fastq files are available.	Illumina HiSeq 2500	100
EGAD00001008624	This dataset includes 10X Genomics 3' single cell-RNA-seq profiles from 24 human rashes and 7 healthy controls. BAM files are provided for each sample.	Illumina HiSeq 4000	31
EGAD00001008625	Whole Genome sequencing of colorectal cancer patients (SG-BULK-4)	Illumina HiSeq 4000	64
EGAD00001008626	PacBio HiFi sequencing was performed on 48 barcoded patients' genomic DNA after a telobait-capture protocol to enrich for telomeric regions. The sequencing reads of each patient were de-multiplexed and presented as patient-specific PacBio CCS BAM files.	Sequel	48
EGAD00001008627	The dataset consists of shallow whole genome sequencing from plasma DNA of 1002 individuals, including 1048 samples. Raw fastq files from Illumina HiSeq series are available.	Illumina HiSeq 4000	1048
EGAD00001008628	This dataset contains counts for 699 tumor samples profiled by RNA-seq for the entire transcriptome for samples originating from OAK (GO28915).		-
EGAD00001008629	This dataset contains counts per million for 699 tumor samples profiled by RNA-seq for the entire transcriptome for samples originating from OAK (GO28915).		-
EGAD00001008630	This dataset contains counts for 192 tumor samples profiled by RNA-seq for the entire transcriptome for samples originating from POPLAR (GO28753).		-
EGAD00001008631	This dataset contains counts per million for 192 tumor samples profiled by RNA-seq for the entire transcriptome for samples originating from POPLAR (GO28753).		-
EGAD00001008632	This dataset contains 8 samples, each of which has paired-end WXS fastq files for Tumour and Normal samples, as well as RNA-Seq fastq file.	Illumina HiSeq 4000	8
EGAD00001008633	Bone marrow or peripheral blood samples were collected from adult patients at first diagnosis of B-precursor acute lymphoblastic leukemia. RNA was isolated from mononuclear cells and subjected to mRNA library prep using poly-A selection and sequencing on a NovaSeq 6000 system. The resulting gene expression profiles and gene fusion calls were used to allocate samples to molecular disease subtypes.	Illumina NovaSeq 6000	560
EGAD00001008634	We performed whole genome sequencing (WGS) in an ASD cohort of 68 individuals from 22 families enriched for recent shared ancestry. Samples were sequenced using the Illumina HiSeq X platform, and variants (single nucleotide variants (SNVs) and insertions or deletions (indels)) were detected using GATK with HaplotypeCaller. Quality control checks for (i) duplicate samples, (ii) samples per platform, (iii) genome call rate, (iv) missingness rate, (v) singleton rate, (vi) heterozygosity rate, (vii) homozygosity rate, (viii) Ti/Tv ratio, (ix) inbreeding coefficient, and (x) sex inference were performed as previously described. Variant call format (VCF) files for SNVs and indels were annotated with ANNOVAR using allele frequencies from the 1000 Genomes project (2015; 1000G), the Genome Aggregation Database (gnomAD), and the Greater Middle East Variome Project (GME).		1
EGAD00001008635	17 scRNA-seq and 16 scATAC-seq datasets on bone marrows derived from 16 patients. The sequencing dataset consists of 5 samples without bone marrow infiltration, defined as controls (C1, C2, C3, C4, and C5) and 11 neuroblastoma infiltrated bone marrow samples from patients with MYCN amplification (M1, M2, M3, M4), ATRX mutations (A1, A2), and cases lacking either alteration (S1, S2, S3, S4, S5).	Illumina NovaSeq 6000	2
EGAD00001008636	Spatial transcriptome sequencing data from prostate cancer needle biopsies. Dataset contains biopsies from before and after androgen deprivation therapy of 3 patients with 8 biopsies per patient in total. 2 sections (replicates) are taken from each biopsy.	NextSeq 550	24
EGAD00001008637	Whole Genome sequencing of colorectal cancer patients (SG-BULK-5)	Illumina HiSeq 4000	66
EGAD00001008638	FPKM expression values of the CUP/reference/validation cohort used for tissue-of-origin prediction based on transcriptomic data		-
EGAD00001008639	The dataset comprises total RNA sequencing data obtained from two samples of testicular tissue from the individual M1911, who was diagnosed with meiotic arrest.	unspecified	2
EGAD00001008640	Whole-genome sequencing BAM files of a census-based elderly cohort of Brazilians (n=1171)	HiSeq X Ten	1
EGAD00001008641	This dataset includes DNA methylation profiles before and after GH treatment (with a duration of ~18 months on average) in 47 healthy children using customized methylC-seq capture sequencing. It includes 360 fastq files (i.e. 180 paired-end fastq files), of which 40 fastq files were generated with HiSeq and 320 fastq files were generated with NovaSeq.	Illumina HiSeq 4000 Illumina NovaSeq 6000	94
EGAD00001008642	WGS profiling bam files from colorectal carcinoma and adenoma.	Illumina NovaSeq 6000	527
EGAD00001008643	9 tumor biopsies & 84 blood samples	Illumina MiSeq	93
EGAD00001008644	Spatial transcriptome sequence data from two tumour containing prostates. Entire cross section of organ divided into cubes to fit spatial transcriptomics arrays. The dataset contains paired-end sequences from 21 sections of 1k array sections and 9 sections of 10x Visium sections for patient 1 as well as 28 sections of 10x Visium sections for patient 2.	Illumina NovaSeq 6000 unspecified	58
EGAD00001008645	Results of scRNA-seq analysis of a PBMC collected from a male with a mosaic 45,X/48,XYYY karyotype	NextSeq 550	1
EGAD00001008646	WGS data for manuscript titled: Multi-omic features of oesophageal adenocarcinoma in patients treated with preoperative neoadjuvant therapy	HiSeq X Ten	178
EGAD00001008647	This dataset includes WGS data of samples from our paper titled "Dynamic phenotypic heterogeneity and the evolution of multiple RNA subtypes in Hepatocellular Carcinoma: The PLANET study." (National Science Review, nwab192. https://doi.org/10.1093/nsr/nwab192)	HiSeq X Ten Illumina HiSeq 4000	-
EGAD00001008648	This dataset includes RNA-seq data of samples from our paper titled "Dynamic phenotypic heterogeneity and the evolution of multiple RNA subtypes in Hepatocellular Carcinoma: The PLANET study." (National Science Review, nwab192. https://doi.org/10.1093/nsr/nwab192)	Illumina HiSeq 4000	-
EGAD00001008649	We performed single-cell RNA-sequencing (scRNA-seq) of cells in the bronchoalveolar lavage (BAL) fluid at 3-month follow-up of a multiple myeloma patient experiencing sarcoidosis-like pulmonary reactions after anti-BCMA CAR T-cell therapy (Sample alias: A8_3, A9_3; technical replicates). In addition, we performed scRNA-seq of an extramedullary relapse lesion at 6-month follow-up (Sample alias: B10_3).	Illumina NovaSeq 6000	3
EGAD00001008650	This dataset contains 8 .bam files of shallow WGS (~0.1×) from fresh frozen tumour tissues of four matched patients and first generation PDX models. Sequencing reads were aligned to the 1000 Genomes Project GRCh37-derived reference genome using the BWA aligner (v.0.07.17; CRUK-CI alignment pipeline).	Illumina HiSeq 4000	8
EGAD00001008651	This dataset contains paired-end fastq sequencing files (n=212) from shallow WGS of 106 dried blood spot (DBS) samples, containing 91 DBS collected from OV04 ovarian cancer PDX mice, 10 DBS collected from healthy non-tumour bearing NSG mice, and 5 DBS generated from whole blood samples from 4 OV04 ovarian cancer patients.	Illumina NovaSeq 6000	106
EGAD00001008652	To study global transcriptional dynamics during human spermatogenesis we sequenced total RNA of human testicular biopsies with 5 specific histological phenotypes: Sertoli cell-only (SCO, n=3), spermatogenic arrests at the spermatogonial (SPG, n=4), spermatocyte (SPC, n=3), and spermatid (SPD, n=3) level, as well as normal spermatogenesis (Normal, n=3).	NextSeq 500	16
EGAD00001008653	Single cell RNA-seq from 6 and single nuclei ATAC-seq from 3 human fetal tissue samples. Samples are from 8 to 11 weeks. Includes an 8.5-week sample with matching ATAC-seq and RNA-seq runs. Data were sequenced using 10x Genomics Chromium technology; the scRNA-seq samples belong to v2 and v3.	Illumina HiSeq 2500 Illumina NovaSeq 6000	9
EGAD00001008656	RNA-Seq data for manuscript titled: Multi-omic features of oesophageal adenocarcinoma in patients treated with preoperative neoadjuvant therapy	Illumina HiSeq 2500 Illumina HiSeq 4000 NextSeq 2000	79
EGAD00001008657	We generated a large transcriptome atlas of human skeletal muscles by collecting biopsies from 6 different muscles to determine molecular signatures that may be distinct between leg muscles. The biopsies were collected from gracilis (GR), semitendinosus (ST), vastus lateralis (VL), vastus medialis (VM), rectus femoris (RF), and gastrocnemius lateralis (GL) muscles. We also investigated molecular differences within the muscle by including two biopsies from the middle and distal sides of the semitendinosus muscle (STM and STD, respectively). In total, 128 samples from 20 individuals (aged 25 ± 3.6 yr) were analyzed.	Illumina NovaSeq 6000	128
EGAD00001008658	Dataset includes whole genome and transcriptomic sequencing data from five T-cell acute lymphoblastic leukemia (T-ALL) patients. Whole genome sequencing was performed on both diagnostic (T-ALL sample) and control (remission sample) samples. RNA sequencing was performed on diagnostic samples. Samples were taken from the bone marrow or peripheral blood.	HiSeq X Ten Illumina NovaSeq 6000	10
EGAD00001008659	Whole-genome sequencing data from 38 leukemia patients and 12 leukemia cell lines; Containing 100 fastq files; Two files for each sample.	unspecified	11
EGAD00001008660	Iron accumulation in microglia has been observed in Alzheimer’s disease and other neurodegenerative disorders and is thought to contribute to disease progression through various mechanisms including neuroinflammation. To study the interaction between iron accumulation and inflammation, we treated human induced pluripotent stem cell-derived microglia (iPSC-MG) with an increasing concentration of iron, in combination with inflammatory stimuli such as interferon gamma and amyloid β, and performed RNA sequencing.	Illumina NovaSeq 6000	24
EGAD00001008661	Whole-genome sequencing (WGS) genotype data generated as part of the Interval project. The data are reported, separately per chromosome, in variant call format (VCF). The genotypes are denoted in diploid format (for chrY the genotype 1 denoted as 1/1 and 0 denoted as 0/0). Note that multi-allelic variants are present in the data, but encoded to appear on separate, consecutive lines. The data are reported in following versions - unphased, phased, phased with imputation, sites only. Note: the unphased version has additional genotype information, while the phased versions only contain the genotypes.		1
EGAD00001008662	Despite extensive studies on the chromatin landscape of exhausted T cells, the transcriptional wiring underlying functional and dysfunctional states of human tumor infiltrating lymphocytes (TILs) is incompletely understood. Here, we identify tissue-specific and general gene-regulatory landscapes in the wide breadth of CD8+ TIL functional states covering four cancer entities using single-cell chromatin profiling. We map enhancer-promoter interactions in human TILs by integrating single-cell chromatin accessibility with single-cell RNA-seq data from tumor entity-matching samples and prioritize key elements by super enhancer analysis. Our analysis reveals a human core chromatin trajectory to TIL dysfunction and identifies key enhancers, transcriptional regulators, and deregulated target genes involved in this process. Finally, we validate enhancer regulation at loci encoding PD1, TCF1, and TIM3 by targeting non-coding regulatory elements with potent CRISPR activators and repressors. In summary, our study advances the understanding of molecular regulation of TIL (dys-)function and provides a framework for modulating immunotherapeutic relevant TIL genes via their enhancers.	NextSeq 550	49
EGAD00001008663	To investigate the influence of lifelong exercise training on the response of skeletal muscle to a bout of acute exercise we generated global transcriptomic data from long-term endurance (8 men, 8 women) and strength (8 men, 8 women) trained individuals and healthy age-matched untrained controls (8 men, 8 women). Skeletal muscle biopsies were taken from M. vastus lateralis before, directly after, and after 1h and 3hrs following acute exercise. All subjects completed one bout of acute endurance exercise and one bout of acute resistance exercise, separated by 4-8 weeks. All 384 samples were multiplexed in 4 lanes and sequenced (2x250bp paired end) on the Illumina NovaSeq 6000.	Illumina NovaSeq 6000	1
EGAD00001008664	Whole exon sequencing data of CLL patients	NextSeq 500	27
EGAD00001008665	ChIPseq data for CLL patients	NextSeq 500	3
EGAD00001008666	The dataset contains 90 lung cancer and 5 non-cancerous lung lesion plasma cfDNA samples collected in EDTA blood collection tubes. Shallow WGS was performed on an Illumina Novaseq S4 PE150bp. Samples are provided as raw reads without any prior processing.	Illumina NovaSeq 6000	95
EGAD00001008667	Groups of cells belonging to different ploidy populations (based on PI staining) were collected from an undifferentiated soft tissue sarcoma. The different ploidy populations underwent RRBS, after which copy number signatures for the ploidy-sorted populations and the bulk population were extracted.	Illumina HiSeq 2500	6
EGAD00001008668	We investigated the impact of ploidy heterogeneity on copy number inference at a single cell level using fluorescence-activated cell sorted (FACS) nuclei from an undifferentiated soft tissue sarcoma. FACS revealed the presence of three aberrant subpopulations, including a haploid, a near diploid and a whole genome doubled population. Once sorted, single cell nuclei underwent whole genome sequencing using the chromium CNV single cell DNA library kit (10X Genomics). We sequenced single normal nuclei (2n) and single aberrant / tumour nuclei (1n, 2n and 2n+).	Illumina HiSeq 4000	4
EGAD00001008669	Beta values of methylation data of the CUP/reference/validation cohort (H021) used for the validation cohort described in the publication		1
EGAD00001008670	RNA-Seq data for manuscript titled: A sporadic Alzheimers blood-brain barrier model for developing ultrasound-mediated delivery of Aducanumab and anti-Tau antibodies	NextSeq 550	24
EGAD00001008671	This study used whole exome sequencing on 21 patients with cholesteatoma from 10 families in order to identify variants that may contribute to cholesteatoma aetiology. Exomes were enriched using hybridisation selection and subjected to DNA sequencing. This dataset is formed of two batches, as they were sequenced at different times. Batch-1 exomes were selected using Nimblegen capture (4-plex) and sequenced on the Illumina HiSeq 4000, and batch-2 exomes were selected using Agilent SureSelect Human All Exon v6 and sequenced on the Illumina NovaSeq 6000. All samples within the same family were processed within the same batch. This dataset comprises BAM files mapped to GRCh38 using cgpMAP v3.2.0 (bwa-mem).	Illumina HiSeq 4000 Illumina NovaSeq 6000	46
EGAD00001008672	The genomic VCF data of the Integrative proteogenomic characterization of early esophageal cancer project. This dataset contains 90 VCF files.		90
EGAD00001008673	This study contains methyl-binding domain sequencing and shallow whole genome sequencing from circulating free DNA (cfDNA) for 79 patients with small cell lung cancer (SCLC) and 78 non-cancer controls. We also sequenced genomic DNA (both methyl-binding domain sequencing and shallow whole genome sequencing) from 30 circulating-tumour-cell derived explant models (CDXs, from 23 unique patients with SCLC), 20 patient derived explant models (PDXs, from 10 unique patients with SCLC) and 13 lung tissue samples.	NextSeq 550	1
EGAD00001008674	To investigate the mode of action and potential side-effects we analyzed differential gene expression in postmitotic C9orf72 iPSC-Neurons by RNAseq. The cells were treated in a 2-dose regimen at 10 µM in 0.1% DMSO for 9 days. Compound-treated iPSC-Neurons were washed with PBS, frozen on dry ice and stored at -80°C until RNA isolation. The RNA was isolated using miRNA Mini Kit (Qiagen) using 700 μl of Qiazol. A total of 250 ng of RNA per sample was processed for mRNA library preparation as per the manufacturer’s instructions for Illumina® Stranded mRNA Prep Ligation to be used with the IDT® for Illumina® RNA UD Indexes Set B and sequenced using NextSeq 500/550 High Output Kit v2.5 (Illumina) on NextSeq 550 (Illumina). The data was processed and analyzed using CLC Genomics Workbench (Version 21.0.3, Qiagen).	NextSeq 550	27
EGAD00001008675	Whole-exome sequencing data in fastq format of matched tumour and germline DNA from 8 patients with metastatic basal cell carcinoma. Samples are labeled as Primary, Local or Metastasis: Primary: Sample was obtained from primary tumour. Local: Sample was obtained from local recurrence of primary tumour. Metastasis: Sample was obtained from a metastatic site. Germline: Sample was obtained from normal adjacent tissue. DNA was isolated from FFPE tissue sections of the tumor biopsies using the AllPrep DNA/RNA FFPE Kit (Qiagen) and quality controls conducted using the Qubit fluorometer (Thermo Fisher Scientific). Library preparation was performed using the Agilent SureSelect Human All Exon v7 XTHS2 probes and sequenced on a NovaSeq 6000 S2 2x100bp	Illumina NovaSeq 6000	19
EGAD00001008676	We detected a uniparental paternal isodisomy event of chromosome 4 in a child. DNA was extracted from the blood. HiSeq X generated the sequence data.	HiSeq X Ten	3
EGAD00001008677	We generated whole genome sequence data from a family of monozygotic twins. DNA was obtained from blood, buccal epithelial cells, placenta, and umbilical cords of monozygotic twins. DNA from the parents was also obtained. Libraries were generated using TruSeq PCR-free, TruSeq Nano, and NEBNext Ultra II depending on the availability of DNA. Data was generated on the Illumina NovaSeq platform. Raw sequence data was aligned to the human reference genome GRCh38 using the bwa mem aligner.	Illumina NovaSeq 6000	3
EGAD00001008683	This dataset contains 278 miRNA and 20 mRNA transcriptomes generated as part of the study "miR-374a-5p regulates inflammatory genes and monocyte function in inflammatory bowel disease."	Illumina HiSeq 4000 NextSeq 500	298
EGAD00001008684	The dataset encompasses 208 Runs from the WGSPD Project 3 - Genomic Strategies to Identify High-impact Psychiatric Risk Variants Project	Illumina Genome Analyzer IIx	208
EGAD00001008685	The dataset contains whole exome sequencing data (libraries prepared using the Agilent SureSelect Human All Exon V6 kit, and paired-end sequenced on Illumina HiSeq4000 (2 x 150bp)) of 6 samples taken from peripheral blood mononuclear cells (PBMCs) (Samples 1-5) and bone marrow (BM). Data are provided as fastq files. Sample 1 was taken prior to initial venetoclax treatment. Sample 2 was taken as disease progression on venetoclax. Sample 3 was taken during response to BTK and PI3K inhibitor therapy. Sample 4 and the BM sample were taken at disease progression/prior to venetoclax re-treatment. And Sample 5 was taken during venetoclax re-treatment response.	Illumina HiSeq 4000	6
EGAD00001008686	This dataset contains RNA-seq, ChIP-seq, WGBS and ATAC-seq of 1 human muscle stem cell sample. NCAM+ITGB1+ CD31−CD45−CD34− was used as the sorting strategy for the sample. H3K27ac, H3K27me3, H3K4me1, H3K4me3, H3K36me3 and H3K9me3 are the targets of ChIP-seq.	NextSeq 500 unspecified	1
EGAD00001008687	The mutagenicity of bacteria was assessed by serially exposing human small intestinal organoids to various bacterial species or isolated toxins. We have used the following abbreviations: EWT: Organoids exposed to E. coli described in PMID: 32106218; EKO: Organoids exposed to isogenic E. coli as EWT, with knockout of the deltaClbQ gene, rendering them unable to produce colibactin; DYE: Organoids exposed to FastGreen injection control dye; NIS: Organoids exposed to E. coli Nissle; ETBF: Organoids exposed to the protease toxin BTF produced by ETBF-bacteria.	Illumina NovaSeq 6000	15
EGAD00001008688	ChIP-seq and matched input data of AR, FOXA1 and H3K27ac for mCRPC patient samples taken prior to and after treatment with AR targeted therapy	Illumina HiSeq 2500	68
EGAD00001008689	372 samples consisting of 185 patient paired CD138+ tumor and non-involved DNA pairs, plus 5 Horizon Diagnostic known mutation standards (HD). Samples were processed using the KAPA HyperCap protocol and hybridized onto a targeted panel for multiple myeloma and associated diseases. Reference Sudha et al Clinical Cancer Research, 2022.		372
EGAD00001008690	Long-read genome sequencing performed on the Oxford Nanopore Technologies' PromethION to resolve variants underlying breast cancer susceptibility in sixteen individuals with pathogenic germline SVs in BRCA1, BRCA2, CHEK2 or PALB2.	PromethION	16
EGAD00001008691	TCRab sequencing was performed on viably frozen cells from 11 T-LGLL samples from 9 T-LGLL patients and 6 age-matched healthy samples. The raw data is available as fastq files.	Illumina HiSeq 2500	68
EGAD00001008692	This dataset contains whole exome and transcriptome data from 47 cases with BPDCN. For exome data, BAM files are provided (mapped against GRCh38); for transcriptome data, raw fastq files (paired-end data) are provided.	Illumina NovaSeq 6000	97
EGAD00001008693	In this study we employed Laser Capture Microdissection (LCM) for the multimodal profiling of lung macrophage cell populations as a function of location within the healthy tissue. In detail, macrophage mini-bulks (~100 cells) were collected from 4 healthy human donors in 5 different locations of the airways (a total of 20 biopsies), including parenchyma (L1 – lower left lobe (LLL); L6 – 80% distance from LLL tip), trachea (L2), bronchi (L3 – 1st/2nd generation; L5 – 3rd/4th generation), and processed for ATAC-seq.	Illumina NovaSeq 6000	39
EGAD00001008694	In the reported study, we employed Laser Capture Microdissection (LCM) for the transcriptome profiling of lung macrophage cell populations as a function of location within the healthy tissue. In detail, macrophage mini-bulks (100 cells each) were collected by LCM from 4 healthy human donors in 5 different locations of the airways (a total of 20 biopsies), including parenchyma (L1 – lower left lobe (LLL); L6 – 80% distance from LLL tip), trachea (L2), bronchi (L3 – 1st/2nd generation; L5 – 3rd/4th generation) and processed for RNA-seq.	Illumina NovaSeq 6000	38
EGAD00001008695	This dataset contains BAM files mapped to hg19 (exome and panel) or hg38 (RNA), derived from either primary bone marrow cells or sorted human cells after long-term engraftment in NSG mice treated with LOXL inhibitor.	Illumina NovaSeq 6000	147
EGAD00001008696	Circulating cell-free methyl-DNA (mcfDNA) contains promising cancer markers but its low abundance and possibly diverse origin pose challenges toward the accurate diagnosis of early stage cancers. By whole-genome bisulfite sequencing (WGBS) of cell-free DNA (cfDNA) from about 0.5 mL plasma of mice xenografted with human tumors, we obtained and aligned the reads to the human genome, filtered out the mouse and carrier bacterial sequences, and confirmed the tumor origin of methyl-cfDNA (mctDNA) by methylation-sensitive restriction enzyme digestion prior to species-specific PCR. We estimated that human tumor-specific reads (ctDNA) or mctDNA comprised about 0.29 or 0.01%, respectively of the xenograft mouse cfDNA, and about 0.029 or 0.001% of the cfDNA of human early stage cancer patients. Similar WGBS of early stage (0-II, node- and metastasis-free) breast, lung or colorectal cancer samples identified hundreds of specific DMRs (differentially methylated regions) compared to healthy controls. Their association with tumourigenesis was supported by stage-dependent methylation, tumor suppressor or oncogene clusters, and genes also identified in the xenograft samples. Using 20 three-cancer-common and 17 colorectal cancer-specific DMRs in combination (top 0.0018% of the WGBS methylation clusters) was sufficient to distinguish the stage I colorectal cancers from breast and lung cancers and healthy controls. Our data thus confirmed the tumor origin of mctDNA by sequence specificity, and provide a selection threshold for authentic tumor mctDNA markers toward precise diagnosis of early stage cancers solely by top DMRs in combination.	HiSeq X Ten	24
EGAD00001008697	This dataset includes genome-wide autosomal array data and whole mtDNA sequences for 12 Resande and 17 Swedish individuals.	Illumina MiSeq	27
EGAD00001008699	Transcriptome sequencing of nine patients diagnosed with chondrosarcoma. cDNA was generated using the NuGEN Ovation RNA-Seq FFPE system. Total RNA was randomly primed and thermally sheared and the resulting cDNA fragment was amplified. The cDNA was then mechanically sheared using sonication to generate ~250 bp fragments. Sequencing libraries were generated using the NuGEN Ovation Ultralow System V2 library prep kit and sequenced on HiSeq2500 TruSeq v3.	Illumina HiSeq 2500	8
EGAD00001008700	Dataset comprises one vcf file containing variants from a list of genes (DNA repair and metabolism associated genes) subset from WES of an adult AML cohort. The cohort contains 145 patient samples. WES was performed using Illumina platform.		1
EGAD00001008701	Advanced Visium Spatial Gene Expression assay for FFPE tissues with human and SARS-CoV-2 whole transcriptome (WT) information at a 55 µm resolution. The dataset consists of 13 tissue sections from 5 patient lung tissue samples, 3 from COVID-19 patients and 2 from control patients.	Illumina NovaSeq 6000	13
EGAD00001008702	Dataset contains 5 exome BAM files from a child with neurofibromatosis and relapsed refractory acute lymphoblastic leukaemia. The samples are CD19 positive and CD19 negative bone marrow mononuclear cells at both diagnosis and relapse as well as mesenchymal stem cells as the germline control. Libraries were prepared using the SureSelect Clinical Research Exome v2 kit (Agilent Technologies, Santa Clara, CA, USA) and run on the Illumina NextSeq 500 platform.	NextSeq 500	5
EGAD00001008705	The dataset is composed of raw RNA sequencing (n=6), targeted DNA sequencing (n=18) and whole exome sequencing (n=17) data from 19 patients with IG-MYC rearrangements.	Illumina HiSeq 2500 Illumina NovaSeq 6000 NextSeq 500	1
EGAD00001008706	Primary central nervous system lymphoma (PCNSL) is a distinct extranodal lymphoma presenting with limited stage disease but variable response rates to treatment despite homogenous pathological presentation. The likely underlying molecular heterogeneity and its clinical impact are poorly understood. The present dataset contains paired-end whole-exome sequencing information (n=115; HyperExome Kapa hyperprep), paired-end RNA-seq information (n=123; KAPA mRNA HyperPrep Kits), and paired-end bisulfite sequencing (n=64; TruSeq Methyl Capture EPIC) from fresh-frozen tumor tissue of immunocompetent, treatment-naïve PCNSL patients. Additionally, the dataset includes single-end RNA-seq (n=93; QuantSeq 3’ mRNA-Seq Library Prep Kit) from formalin-fixed, paraffin-embedded tissue of immunocompetent, treatment-naïve PCNSL patients. All samples were sequenced on an Illumina NovaSeq 6000 instrument.	Illumina HiSeq 2000 Illumina NovaSeq 6000	1
EGAD00001008707	Total RNA sequencing of cultured OM cells derived from patients with Alzheimer's disease (AD), individuals with mild cognitive impairment (MCI) and cognitively healthy controls.		1
EGAD00001008708	Human (n=34) and mouse (n=68) melanoma tumor WXS dataset.	NextSeq 500	102
EGAD00001008709			1
EGAD00001008710	WES for 15 multiple meningioma samples from 6 individuals.	Illumina NovaSeq 6000	15
EGAD00001008711	We analyzed the cell free DNA methylomes using 67 plasma samples from patients with mCRPC prostate cancer in the VPC project. Methylation was profiled using the methylated DNA immunoprecipitation coupled to next generation sequencing (MeDIP) technology.	HiSeq X Ten	62
EGAD00001008712	We analyzed the cell free DNA methylomes using 14 plasma samples from patients with mCRPC in the Barrier cohort. Methylation was profiled using the methylated DNA immunoprecipitation coupled to next generation sequencing (MeDIP) technology.	HiSeq X Ten	14
EGAD00001008713	We analyzed the cell free DNA methylomes using 22 plasma samples from patients with mCRPC prostate cancer in the WCDT project. Methylation was profiled using the methylated DNA immunoprecipitation coupled to next generation sequencing (MeDIP) technology.	HiSeq X Ten	22
EGAD00001008714	Heatrich-BS was performed on 14 patients monitored across 5-8 timepoints each. The predicted tumor fraction trend was compared with CEA values and tumor measurements from CT scans.	Illumina MiSeq Illumina NovaSeq 6000	79
EGAD00001008715	Heatrich-BS was performed on 5 healthy volunteers and 15 CRC patient cell-free DNA. The Heatrich-BS predicted tumor fractions were compared with tumor burden values obtained by genomic methods such as targeted amplicon sequencing and low pass sequencing.	Illumina HiSeq 2000 Illumina MiSeq	20
EGAD00001008716	This dataset consists of shallow whole genome sequencing data and amplicon sequencing data for 26 ovarian cancer patients (21 high-grade serous ovarian cancer, 4 low-grade serous ovarian cancer and 1 clear cell ovarian cancer). The data are provided as single end FASTQ files for the shallow whole genome sequencing data (31 libraries) and paired end FASTQ files for the amplicon sequencing data (98 libraries).	Illumina HiSeq 2500 Illumina HiSeq 4000	26
EGAD00001008717	Dataset including 13 sequenced mtDNA genome samples.	Illumina MiSeq	13
EGAD00001008718	This dataset contains the raw data generated for WGBS of CD14 monocytes from 7 hospitalized COVID-19 patients sampled at various time points (Admission, Day 5 and Day 15); in total, 15 biospecimens are available. WGBS libraries were sequenced on Illumina NovaSeq 6000.	Illumina NovaSeq 6000	15
EGAD00001008721	BAM files from PET cases and healthy cases.	Sequel	20
EGAD00001008722	Engineered Human Primary T Cell transcriptome study	Illumina NovaSeq 6000	45
EGAD00001008723	CLL PBMC cells were isolated using a Ficoll gradient. They were treated with IBET762 or DMSO as solvent control, and ATAC-seq was performed on them.	NextSeq 500	8
EGAD00001008724	Mixture of 4 unrelated individuals sequenced by 10x as a scRNA-seq. The dataset was then processed by Cell Ranger and deconvoluted to yield each individual's genetic profile. The clustering of SNPs is submitted as the processed file. The sequencing FASTQs are submitted as unprocessed files.	Illumina NovaSeq 6000	1
EGAD00001008725	Deconvoluted files of the 5-9 individuals of in silico datasets (combination of biological mixture sequencing and publicly available data). The dataset includes the phenotypes used for clustering.		1
EGAD00001008726	Mixture of 2 unrelated individuals sequenced by 10x as a scRNA-seq. The dataset was then processed by Cell Ranger and deconvoluted to yield each individual's genetic profile. The clustering of SNPs is submitted as the processed file. The sequencing FASTQs are submitted as unprocessed files.	Illumina NovaSeq 6000	1
EGAD00001008727	Mixture of 2 unrelated individuals (of close mtDNA haplogroup) sequenced by 10x as a scRNA-seq. The dataset was then processed by Cell Ranger and deconvoluted to yield each individual's genetic profile. The clustering of SNPs is submitted as the processed file. The sequencing FASTQs are submitted as unprocessed files.	Illumina NovaSeq 6000	1
EGAD00001008728	Whole exome sequences serving as a reference for the individuals. Includes the FASTQ files of each individual (labelled S1-S5) and the called variants in VCF format, merged for all individuals.	Illumina NovaSeq 6000	5
EGAD00001008729	Mixture of 3 unrelated individuals sequenced by 10x as a scRNA-seq. The dataset was then processed by Cell Ranger and deconvoluted to yield each individual's genetic profile. The clustering of SNPs is submitted as the processed file. The sequencing FASTQs are submitted as unprocessed files.	Illumina NovaSeq 6000	1
EGAD00001008730	Sepsis is defined as life-threatening organ dysfunction caused by a dysregulated host response to infection. This cohort comprises a subset of patients enrolled in the Genomic Advances in Sepsis (GAinS) study, an established biobank of adult sepsis patients. Patients with sepsis due to community acquired pneumonia or faecal peritonitis were recruited from 34 hospitals across the UK from 2005-2018, with samples for functional genomics and detailed clinical information collected on the first, third and/or fifth day following ICU admission. RNA was extracted from leukocytes isolated at the bedside using LeukoLOCK kits. We have previously identified sepsis response signatures (SRSs), transcriptomic endotypes that are associated with differential early mortality (Davenport et al, Lancet Respir Med, 2016; Burnham et al, AJRCCM, 2017) and response to treatment in a clinical trial (Antcliffe et al, AJRCCM, 2018). We generated RNA sequencing data on 903 samples, including 134 samples repeated from our previously released microarray data. Libraries were prepared using NEB Ultra II Library Prep kits (Illumina) and sequenced on a NovaSeq 6000. Reads were aligned to the reference genome (GRCh38) using STAR and gene counts quantified using featureCounts (annotation Ensembl v99). Counts were TMM-normalised and log-transformed. Following QC, processed data were available on 864 samples from 667 unique patients.	Illumina NovaSeq 6000	909
EGAD00001008731	Although cross-species transcriptional analysis has been generated for DCs, transcriptomic conservation between mouse and human FRCs at single-cell resolution has been unclear. To test whether GREM1+ FRCs might also play a role in DC homeostasis in humans, we performed scRNA-seq of CD45–PDPN+ stromal cells, as well as CD45+CD11c+ immune cells from healthy human LNs of three human donors. Data was generated using the 10x platform.	Illumina HiSeq 4000	6
EGAD00001008732	Consists of 14 BAM files.	Sequel	14
EGAD00001008733	Whole exome sequencing from pre-treatment samples and matched blood normals from 22 patients. Of these individuals, on-treatment samples are available for a subset of 18 patients. RNA sequencing from pre-treatment samples from 21 patients. Of these individuals, on-treatment samples are available for a subset of 15 patients.	Illumina NovaSeq 6000	98
EGAD00001008734	Chromium 10x scRNA of 6 metastatic colorectal cancer organoids	Illumina NovaSeq 6000	6
EGAD00001008735	This dataset comprises the BAM files from targeted genome sequencing of CD138+ selected myeloma tumour samples from 21 individuals. In 5 cases there is only one tumour sample, but in the other 16 cases there are sequential samples, spanning treatment relapses. These are denoted Tumor A, B, C, etc. Therefore there are a total of 48 myeloma tumour samples. For each individual there is also a germline control sample, obtained either from peripheral blood or from CD138-selected bone marrow cells.	Illumina NovaSeq 6000	69
EGAD00001008736	This data set contains whole exome sequences from 8086 (mostly British Pakistani/British Bangladeshi, mostly self-reported parentally related) individuals from the following studies: 1. 5236 British Pakistani/British Bangladeshi adults from East London Genes & Health, now known as Genes & Health 2. 2624 British South Asian mothers from Born in Bradford (mostly Pakistani) 3. 1061 British South Asian adults from Birmingham (mostly Pakistani). This dataset contains all the exome sequence data available for this study on 2022-04-26.		1
EGAD00001008737	We analyzed the cell free DNA methylomes using 72 plasma samples from patients with mCRPC prostate cancer in the VPC project for validation. Methylation was profiled using the methylated DNA immunoprecipitation coupled to next generation sequencing (MeDIP) technology. Files from multiple lanes exist per sample.	HiSeq X Ten	72
EGAD00001008739	Single nuclei RNA sequencing (snRNA-seq) on tissue samples from 11 patients (SDHB and RET). Files are fastq files of 10x-5'scRNAseq libraries.	NextSeq 500	1
EGAD00001008740	- RNA-sequencing data: 5 normal pleurae and 40 malignant pleural mesotheliomas - Targeted DNA-sequencing of the 165 genes included in the “Solid and Haematological tumors” panel (BRIGHTCore, Brussels, Belgium): 6 MPM samples. 2 FASTQ files for each sample (paired).	Illumina NovaSeq 6000	45
EGAD00001008741	RNA-sequencing data for 12 MPM cell lines treated with 0.1% DMSO or 1 µM palbociclib for 9-10 days. The experiment was performed in duplicate for sensitive cells (MPM08, MPM21, MPM38, MPM57, MPM59, Meso11, Meso13, Meso34 and Meso56) and only once for resistant cells (MPM31, MPM34 and MPM36), except for Meso11, which was done in triplicate. For Meso11, Meso13, Meso34 and Meso56, a drug washout of 48 hours was also performed. 4 MPM cell lines (MPM31, MPM34, MPM59 and MPM66) were also analyzed in untreated condition. 2 FASTQ files for each sample (n=56) (paired).	Illumina NovaSeq 6000	56
EGAD00001008742	Paired RNA-seq of bisulfite treated VDH01 samples, control or depleted for NSUN3 (4 replicates each). The samples were prepped with NEBNext Ultra II DNA library prep kit and sequenced on MiSeq. Paired RNA-seq of fCAB treated VDH01 samples, control or depleted for NSUN3 (4 replicates each). The samples were prepped with NEBNext Ultra II DNA library prep kit and sequenced on MiSeq. Paired RNA-seq of fCAB treated VDH01 samples (3). The samples were prepped with NEBNext Ultra II DNA library prep kit and sequenced on MiSeq. Paired RNA-seq of bisulfite treated VDH01 samples (3). The samples were prepped with NEBNext Ultra II DNA library prep kit and sequenced on MiSeq.	Illumina MiSeq	22
EGAD00001008743	Single RNA-seq of fCAB treated tRNAs from VDH01 samples (5 replicates). tRNAs were extracted using the mirVana (Invitrogen) kit. The samples were prepped with Illumina TruSeq Small RNA and sequenced on Illumina NextSeq 550.	NextSeq 550	5
EGAD00001008744	This dataset contains raw .fastq files of a whole exome sequencing experiment on primary mediastinal large B-cell lymphoma and contains 8 tumor-normal pairs and 14 unpaired tumor samples.	Illumina NovaSeq 6000	30
EGAD00001008745	The cellular origin and differentiation status of glioblastoma by scRNA-seq	Illumina HiSeq 2000 Illumina NovaSeq 6000	20
EGAD00001008746	We generated a single-cell RNA-seq atlas capturing over 100,000 cells spanning all stages of the mouse cerebral development. By examining data from over 100 cerebral tumour samples, our study reveals that, despite the phenotypic/genotypic differences between the tumour types, they are all comprised of developmental sublineages that map most closely to embryonic or juvenile stages of development.	Illumina HiSeq 2500	4
EGAD00001008747	This dataset consists of 1 sample	Sequel	1
EGAD00001008748	This dataset consists of 18 samples	Sequel	18
EGAD00001008751	We have 5 breast cancer patients who received VEN therapy. We collected their peripheral blood samples before and after long-term Venetoclax treatment. We applied the CITE-seq protocol to these samples. This collection contains all the CITE-seq data for these patients.	Illumina NovaSeq 6000	10
EGAD00001008752	Osteochondral explants were obtained from knee joints (n=17 explants) from the RAAK study. Paired-end 2x150 bp RNA sequencing (Illumina TruSeq mRNA Library Prep Kit, Illumina HiSeq X) was performed.	HiSeq X Ten	17
EGAD00001008753	HGSOC patient-derived organoids and their tissue of origin	Illumina HiSeq 4000 Illumina NovaSeq 6000	13
EGAD00001008755	WES files (FASTQ) from 19 germline DNA, 22 tumor DNA, 5 patient-derived xenograft (PDX) DNA, and 9 plasma cfDNA samples from 28 CRC-BRAF-mutated patients collected at baseline to anti-BRAF/EGFR therapies.	Illumina NovaSeq 6000	55
EGAD00001008756	41 WGS DNA sequences from: Phase I trial of CX-5461, a first-in-class G-quadruplex stabilizer in patients with advanced solid tumors enriched for DNA-repair deficiencies (CCTG IND.231)	Illumina HiSeq X	56
EGAD00001008758	This dataset consists of functional genomic data from 20 Ankylosing Spondylitis patients and 35 Healthy Controls taken from CD4+ T cells, CD8+ T cells and CD14 Monocytes. It contains 364 paired end fastq files consisting of 104 total RNA-seq samples and 116 ATAC samples, for ChIP there are 46 H3K4me3 samples, 46 H3K27ac samples and 3 H3K4me1 samples, along with 49 paired input samples. The samples were sequenced on Illumina HiSeq4000, Illumina NextSeq500 and Illumina NovaSeq 6000 platforms.	Illumina HiSeq 4000 Illumina NovaSeq 6000 NextSeq 500	364
EGAD00001008759	DNA was extracted from archival tissue of 119 patients with various SGC subtypes and sequenced using a targeted NGS panel encompassing 523 cancer related genes (TruSight Oncology 500, TSO500). This dataset belongs to the publication entitled: 'Identification of fusion genes and targets for genetically matched therapies in a large cohort of salivary gland cancer patients'.		119
EGAD00001008760	Bulk tumor RNAseq FASTQ files from 124 samples from patients with hormone sensitive or castration resistant prostate cancer.	Illumina HiSeq 4000	124
EGAD00001008761	Clinical data from this cohort of patients, including hormone sensitive or castration resistant prostate cancer, overall survival, NMF subtypes, tumor TMB, prior treatment status, PD-L1 IC and TC scores from SP142 and SP263 as well as percentage of CD8 IHC at tumor center.		1
EGAD00001008762	Raw count matrix of the 124 bulk tumor RNAseq samples from patients with hormone sensitive or castration resistant prostate cancer.		1
EGAD00001008763	This dataset includes bam files (aligned to hg38) from the germline of pediatric cancer patients.	Illumina HiSeq 2500 NextSeq 550	1
EGAD00001008764	The single base substitution mutational signatures SBS2 and SBS13, likely caused by APOBEC cytosine deaminases, are common in many human cancer types. However, the stimulus activating APOBEC mutagenesis is unknown and understanding of when it occurs in the progression from normal to cancer cell is limited. Here, as part of a wider survey of human tissues, we whole genome sequenced 342 microdissected normal epithelial crypts from the small intestines of 39 individuals. SBS2/13 mutations were present in 17% normal small intestine crypts and were likely due to APOBEC3A activity. Localised clusters of SBS2/13 mutations (kataegis) were also commonly found. APOBEC mutation burdens were variable between individuals and between crypts from the same individual. Crypts with SBS2/13 often had immediate crypt neighbours without SBS2/13, suggesting that the underlying cause of SBS2/13 is cell-intrinsic rather than a widely distributed microenvironmental exposure, or needs to be permitted by cell-intrinsic conditions. APOBEC mutagenesis occurred throughout the human lifespan, including in young children, and was episodic with a small number of episodes occurring during the life history of a single cell. The results indicate that APOBEC mutagenesis is more common in the small intestine epithelium than in many other cell types, and is an episodic process in vivo initiated or permitted by cell intrinsic factors.	HiSeq X Ten Illumina NovaSeq 6000	31
EGAD00001008765	The dataset contains fecal WGS samples of 196 participants in the HELIUS study, as well as VLP filtrate sequencing for a subset of 48 participants.	Illumina NovaSeq 6000	244
EGAD00001008766	This dataset contains 56 fastq files of paired-end RNA sequencing of an Illumina® TruSeq Stranded mRNA library of 28 renal cell carcinoma PDX samples.	Illumina NovaSeq 6000	28
EGAD00001008767	This study included nasal gene expression data collected from nasal brushes of adolescents in PIAMA birth cohort, which is used in the project Nasal DNA methylation at three CpG sites predicts childhood allergic disease. 186 samples were included in this analysis, and phenotypes (age and sex) were provided together with a gene expression count table. Gene expression was measured by the Illumina HiSeq2500 sequencer.		1
EGAD00001008768	Dataset contains paired-end Whole Exome sequencing data from 5 tumor samples and 1 single normal blood sample from a single primary GBM patient.		6
EGAD00001008769	WGS data from healthy reference iPSC lines. The median coverage is 41-50x. >85% have a coverage of >30x. 97% of the variants are known.	Illumina NovaSeq 6000	3
EGAD00001008770	We assessed the transcriptome and chromatin states of patient and control samples at both bulk and single-cell resolutions with RNA-seq and ATAC-seq. Maternal-fetal interface samples were collected from 7 patients infected with SARS-CoV-2 during late pregnancy, and from 7 gestational age-matched control donors. Raw and processed files are provided in this dataset.	NextSeq 500	14
EGAD00001008771	The data set includes MBL2 genotypes and clinical phenotypes of a cohort of patients with critical Covid-19. The files included in the data set include a vcf file with single nucleotide variants, and a file with clinical phenotypes.		331
EGAD00001008772	This dataset contains 6 Fastq files from 3 samples (pre-culture (n=1), post culture in standard (n=1) or niche-like (n=1) conditions) from 1 AML patient. They correspond to single-cell RNA sequencing on an Illumina platform of 3 libraries prepared with 10X Genomics gene expression V3.1 chemistry.	Illumina NovaSeq 6000	3
EGAD00001008773	This dataset contains 18 Fastq files from 9 samples (pre-culture (n=3), post culture in standard (n=4) or niche-like (n=2) conditions) from 4 patients. They correspond to bulk RNA sequencing on an Illumina instrument of libraries prepared using SMARTer Universal Low Input RNA Kit.	Illumina NovaSeq 6000	9
EGAD00001008774	1075 members of the LBC1936 were sequenced using the Illumina HiSeq X platform. This dataset contains the paired fastq files.	HiSeq X Ten	1075
EGAD00001008775	297 members of the LBC1921 were sequenced using the Illumina HiSeq X platform. This dataset contains the fastq files.	HiSeq X Ten	1
EGAD00001008776	Cellular deconvolution algorithms virtually reconstruct tissue composition by analyzing the gene expression of complex tissues. Here, we present the decision tree machine learning algorithm, Kassandra, trained on a broad collection of > 9,400 tissue and blood sorted cell RNA profiles to accurately reconstruct the tumor microenvironment (TME). Bioinformatic correction for technical and biological variability, aberrant cancer cell expression inclusion, and the accurate quantification and normalization of transcript expression increased the stability and robustness of Kassandra. Performance was validated on over 4,000 H&E tissue slides and more than 1,000 normal and tumor tissues by comparison with cytometric, immunohistochemical or single-cell RNA-seq measurements. Kassandra accurately deconvolved stromal and immune elements of blood and tumors, revealing the role of the TME in tumor pathogenesis. Digital TME reconstruction revealed that the presence of PD1-positive CD8+ T cells strongly correlated with immunotherapy response and increased the predictive potential of established biomarkers, indicating that Kassandra could potentially be utilized in future clinical applications.	Illumina NovaSeq 6000 NextSeq 550	348
EGAD00001008777	This submission includes raw FASTQ files (for bulk RNA-seq and 10X joint snATAC+snRNA multiome profiling experiments), sample phenotype files, and genotypes for the data included in the manuscript.	Illumina NovaSeq 6000	72
EGAD00001008778	Whole-genome sequencing (WGS) data.	unspecified	5
EGAD00001008779	Single-cell DNA sequencing (scDNA-seq) data.	Illumina NovaSeq 6000	3
EGAD00001008780	Excess sugar consumption is common among youth and can have adverse health effects. However, the relationship between saliva microbiota and sugar consumption remains sparsely studied. We aimed to explore diversity, composition and functional capacities of saliva microbiota in 11–13-year-old Finnish children with low and high sweet treat consumption.	Illumina HiSeq 2500	453
EGAD00001008781	The dataset comprises transcriptomes of tissue sections derived from either the tumour-normal interface or the tumour core of clear cell renal cell carcinomas. 16 sections are sampled in total using 10x Genomics' Visium technology.	Illumina NovaSeq 6000	16
EGAD00001008782	Column 1 “rsid”: SNP identifier; Column 2 “chromosome”: name of chromosome on which the SNP is located; Column 3 “position”: base pair position on the chromosome; Column 4 “minor_test_allele”: the base that constitutes the minor allele; Column 5 “major_allele”: the base that constitutes the major allele; Column 6 “maf”: the frequency of the minor allele, indicated as a fraction of 1; Column 7 “allele_freq_cases”: the minor allele frequency in cases; Column 8 “allele_freq_controls”: the minor allele frequency in controls; Column 9 “regression_pvalue”: the p-value for the difference in allele frequency between cases and controls; Column 10 “odds_ratio”: the odds ratio, as calculated using logistic regression under an additive model with adjustment for the first ten principal components of ancestry.		1
EGAD00001008783	This dataset contains one vcf file with variants from whole exome sequencing of 24 paediatric AML samples at diagnosis.		1
EGAD00001008785	Clinical data for KATHERINE: Clinical data include Treatment Arm, Invasive Disease Free Survival (IDFS), Clinical Stage at Presentation, Hormone Receptor Status, Preoperative HER2 Directed Therapy, Pathological Node.		1059
EGAD00001008786	Biomarker data for KATHERINE: Biomarker data include RNA-seq time point, Percent of tumor content, PAM50 subtypes, normalized gene expression of ERBB2, CD8 and CD274, normalized immune signature expression.		1059
EGAD00001008788	Chronic obstructive pulmonary disease (COPD) is a major respiratory disease characterized by small airway inflammation, emphysema and severe breathing difficulties. Low-grade systemic inflammation is an established hallmark of severe disease, however, the molecular changes in peripheral immune cells remain far from understood. We combined multi-color flow cytometry with single-cell RNA sequencing and showed that blood neutrophil numbers are significantly increased in COPD and they are a heterogeneous population. A transcriptomic state that expressed interferon response genes correlated with alveolar damage and acute exacerbations. Furthermore, bronchoalveolar neutrophils expressed gene signatures corresponding to certain blood neutrophil states. Last, our data in a murine model of cigarette smoke exposure demonstrated that bone marrow neutrophil progenitors are expanded in smoke-treated animals and display signs of immune activation. Our study provides evidence that COPD systemic inflammation may derive from an activated haematopoietic precursor compartment.	NextSeq 500	25
EGAD00001008789	Results of comprehensive immune deconvolution analysis through the TIMER2 web portal with algorithms specified in the publication.		1
EGAD00001008790	Recurrently altered genes based on FoundationOne sequencing.		-
EGAD00001008791	TCR-beta sequences, frequencies, and VDJ usage.		-
EGAD00001008792	"Master" file of patient clinical characteristics and outcomes, samples, and the results of certain analyses, including immunohistochemistry.		1
EGAD00001008793	Log2 gene expression count data from RNA sequencing.		1
EGAD00001008794	TCR-beta specificity motifs based on GLIPH2.		-
EGAD00001008796	Individual FASTQ files from RNA sequencing.	Illumina HiSeq 2500	66
EGAD00001008797	Individual FASTQ files from TCR sequencing.	Illumina NovaSeq 6000	45
EGAD00001008798	Illumina platform whole genome sequencing data for matched tumour-normal DNA samples from 570 melanoma patients		1139
EGAD00001008799	This dataset includes fastq files for total RNAseq of 104 patient biopsies with metastatic castration resistant prostate cancer. The RNAseq libraries were rRNA depleted and sequenced at 150bp paired-end on Illumina Novaseq.	Illumina NovaSeq 6000	104
EGAD00001008800	We copy number profiled 688 tumor regions from 300 patients presenting with advanced prostate cancer and prospectively followed-up (median, 7 years) in the control group of the STAMPEDE trial. Patients were categorised into four metastatic states, namely high-risk non-metastatic (with or without local lymph node involvement) or metastatic (low or high volume).	Illumina NovaSeq 6000	603
EGAD00001008801	This dataset contains chromosomal conformation capture data from fourteen samples (eleven tumor samples and three tumor derived cell lines). Libraries were prepared using the Illumina TruSeq LT sequencing adaptors. Sequencing was performed on the HiSeq X or NovaSeq platforms resulting in 28 FASTQ files.	Illumina NovaSeq 6000	14
EGAD00001008802	Whole genome sequencing of six commonly used breast cancer cell lines and six patient derived xenograft models.	HiSeq X Ten	14
EGAD00001008805	This dataset contains Whole Genome Bisulfite sequencing data from seven samples (six tumor samples and one tumor derived cell line). Sequencing was performed on an Illumina HiSeq 2000 machine resulting in 14 FASTQ files.	Illumina HiSeq 2000	1
EGAD00001008806	This dataset contains CTCF ChIP-sequencing data from seven samples (six tumor samples and one tumor derived cell line). Following library amplification, DNA fragments were sequenced using Illumina HiSeq 2000 paired-end sequencing resulting in 14 FASTQ files.	Illumina HiSeq 2000	1
EGAD00001008807	n=4 Ctrl and n=4 HD fibroblast lines were treated with DMSO or 10nM Branaplam for 72h and RNA-seq was performed.	Illumina NovaSeq 6000	16
EGAD00001008808	n=3 Ctrl and n=3 HD iPSC lines differentiated into cortical neurons were treated with DMSO or 10nM Branaplam for 72h and RNA-seq was performed.	Illumina NovaSeq 6000	12
EGAD00001008809	5 human plasma cell-free DNA cases (BS-seq)	NextSeq 500	5
EGAD00001008810	36 mouse plasma cell-free DNA cases	NextSeq 500	36
EGAD00001008811	We profiled 87 primary-recurrent patient-matched paired GBM specimens with single-nucleus RNA and bulk-DNA sequencing and single-cell open-chromatin and spatial transcriptomics/proteomics assays. We found that recurrent GBMs are characterized by a shift to a mesenchymal phenotype in response to therapy.	Illumina NovaSeq 6000	71
EGAD00001008815	Meta-data/patient information for the bulk RNAseq data		1
EGAD00001008816	Bulk RNAseq of sigmoid colon biopsies from healthy volunteers and ulcerative colitis patients. Subjects were treated with Placebo or IL-22Fc at different doses, and biopsies were collected at day 0 and day 30 post treatment and prepared for RNA sequencing.	Illumina NovaSeq 6000	83
EGAD00001008817	Fecal WMS data from NCT02749630 ulcerative colitis patients. Stool samples were collected at screening as well as on day 64 and prepared for whole metagenomic sequencing.	Illumina HiSeq 4000	93
EGAD00001008818	Fecal 16S-V4 rRNA gene sequence data from NCT02749630 ulcerative colitis patients. Stool samples were collected at screening as well as on days 29, 43, 64, 85, and 134, and processed for 16SV4 rRNA gene sequencing.	Illumina MiSeq	192
EGAD00001008819	Metadata for fecal WMS data from NCT02749630 healthy volunteers.		1
EGAD00001008820	Metadata for fecal WMS data from NCT02749630 ulcerative colitis patients.		1
EGAD00001008821	Metadata for fecal 16S-V4 rRNA gene sequence data from NCT02749630 healthy volunteers.		1
EGAD00001008822	Metadata for fecal 16S-V4 rRNA gene sequence data from NCT02749630 ulcerative colitis patients.		1
EGAD00001008823	Metadata for 16S-V4 rRNA gene sequence data for intestinal biopsies from NCT02749630 trial participants.		1
EGAD00001008824	RNASeq files for Roussel-MPBRG paper titled "Combination of ribociclib and gemcitabine for the treatment of medulloblastoma"	Illumina HiSeq 2000	98
EGAD00001008825	Exome sequencing data from seven phenotypically abnormal human fetal samples. Analysis performed using Illumina NovaSeq 6000, Twist Bioscience - Human Comprehensive Exome. Paired end fastq files were aligned to hg38 reference genome using BWA-MEM v0.7.15, followed by sorting using SAMtools sort v1.3.1, and duplicate reads marked using Picard Tools MarkDuplicates v2.18.2.	Illumina NovaSeq 6000	11
EGAD00001008826	Mesothelioma of the peritoneum (n=21) and Pseudomyxoma peritonei/mucinous adenocarcinoma of the appendix (n=11)	Illumina HiSeq 4000	32
EGAD00001008827	Libraries of liCHi-C for different input cell numbers (50k, 100k, 250k, 500k and 1M cells) with 2 biological replicates each. Fastq file format.	unspecified	10
EGAD00001008828	Libraries of liCHi-C for 9 blood cell types (HSC, CMP, CLP, Ery, Mon, MK, nB, nCD4 and nCD8) with 2 biological replicates each. Fastq file format.	unspecified	18
EGAD00001008829	Libraries of liCHi-C for 2 B-ALL (B Acute Lymphocytic Leukaemia) from human patients. Fastq file format.	unspecified	4
EGAD00001008830	RNAseq data generated from paired tumor frozen tissues in which the tumor organoids were established for cell-cell or cell-matrix adhesion dependency assay.	Illumina HiSeq 1500	52
EGAD00001008831	This dataset includes whole genome sequences of 75 synchronous primary tumors, 15 metastases, and corresponding normal samples from 13 patients with multifocal ileal neuroendocrine tumors. The whole genomes were sequenced on Illumina HiSeq X Ten to generate 151-bp paired-end reads, which were aligned to GRCh38/hg38 reference assembly using BWA–MEM and duplicate-marked with Picard tools. GATK was utilized for base score recalibration and local indel re-alignments. The whole genomes are provided as CRAM files.	HiSeq X Ten	108
EGAD00001008832	This dataset contains RNA-seq data (Fastq files of paired-end data) of 18 patient tumors used for identification of neotranscripts in 18 different types of fusion-driven sarcomas and other cancers as described in Vibert et al., Mol Cell 2022 (PMID: 35550257)	Illumina HiSeq 2500	18
EGAD00001008834	Single cell RNA-seq from D0, D11, D16, D21 and D28 of dopaminergic differentiation from hESC lines H9 and HS980 using current protocols. Different time points along the differentiation for each cell line were multiplexed using BD™ Single-Cell Multiplexing Kit for use with the 10x Chromium™ Single Cell 3’ Reagent Kit v2.	Illumina NovaSeq 6000	8
EGAD00001008835	Single-cell RNA-seq cases (Tumor and adjacent tissue)	NextSeq 500	7
EGAD00001008836	Plasma RNA sequencing (consists of 70 cases)	NextSeq 500	70
EGAD00001008837	Illumina RNASeq sequencing of tumour samples from 230 cases of melanoma		230
EGAD00001008838	Consists of 76 mouse plasma cell-free DNA, 30 mouse Liver DNA, 10 human plasma cell-free DNA	NextSeq 500	116
EGAD00001008839	Homologous recombination deficiency (HRD) score in a large cohort of 55 triple-negative breast cancer PDX	Illumina NovaSeq 6000	55
EGAD00001008840	Fecal 16S-V4 rRNA gene sequence data from NCT02749630 healthy volunteers. Stool samples were collected at screening as well as on days 29, 43, 64, 85, and 134, and processed for 16SV4 rRNA gene sequencing.	Illumina MiSeq	206
EGAD00001008841	Fecal WMS data from NCT02749630 healthy volunteers. Stool samples were collected at screening as well as on days 29, 43, and 64 and prepared for whole metagenomic sequencing.	Illumina HiSeq 4000	53
EGAD00001008842	RNASeq files for Roussel paper titled "Combination of CDK4/6 with BET-bromodomain and PI3K/mTOR inhibitors in medulloblastoma in vitro and in vivo"	Illumina HiSeq 2000	39
EGAD00001008843	16S-V4 rRNA gene sequence data for intestinal biopsies from NCT02749630 trial participants. Biopsies from patients were collected at screening, day 30, and day 85 and prepared for 16SV4 rRNA gene sequencing.	Illumina MiSeq	132
EGAD00001008844	Rare cancer sequencing data of 23 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000	16
EGAD00001008845	Rare cancer sequencing data of 28 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000	22
EGAD00001008846	Rare cancer sequencing data of 45 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000	26
EGAD00001008847	Rare cancer sequencing data of 95 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000	64
EGAD00001008848	Rare cancer sequencing data of 55 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000	44
EGAD00001008849	Rare cancer sequencing data of 87 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000	58
EGAD00001008850	Rare cancer sequencing data of 59 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	38
EGAD00001008851	Rare cancer sequencing data of 75 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000	49
EGAD00001008852	Rare cancer sequencing data of 50 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	34
EGAD00001008853	Rare cancer sequencing data of 44 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	30
EGAD00001008854	Rare cancer sequencing data of 40 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	26
EGAD00001008855	Rare cancer sequencing data of 97 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	61
EGAD00001008856	Rare cancer sequencing data of 48 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	33
EGAD00001008857	Rare cancer sequencing data of 58 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	40
EGAD00001008858	Rare cancer sequencing data of 49 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	35
EGAD00001008859	Rare cancer sequencing data of 243 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	164
EGAD00001008860	Rare cancer sequencing data of 47 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	41
EGAD00001008861	Rare cancer sequencing data of 92 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000	62
EGAD00001008862	Rare cancer sequencing data of 145 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000	104
EGAD00001008863	Rare cancer sequencing data of 119 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000	76
EGAD00001008864	Thirty six samples were sequenced and analysed.	Illumina HiSeq 1500	36
EGAD00001008865	Rare cancer sequencing data of 87 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2500 Illumina HiSeq 4000	87
EGAD00001008866	Rare cancer sequencing data of 48 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	48
EGAD00001008867	Rare cancer sequencing data of 12 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	12
EGAD00001008868	Rare cancer sequencing data of 94 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	91
EGAD00001008869	Rare cancer sequencing data of 46 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	46
EGAD00001008870	Rare cancer sequencing data of 62 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	59
EGAD00001008871	Rare cancer sequencing data of 119 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	117
EGAD00001008872	Rare cancer sequencing data of 54 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	52
EGAD00001008873	Rare cancer sequencing data of 30 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2500 Illumina HiSeq 4000	30
EGAD00001008874	Rare cancer sequencing data of 18 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	18
EGAD00001008875	Rare cancer sequencing data of 64 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2500 Illumina HiSeq 4000	61
EGAD00001008876	Rare cancer sequencing data of 18 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	12
EGAD00001008877	Rare cancer sequencing data of 55 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	55
EGAD00001008878	Rare cancer sequencing data of 38 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	31
EGAD00001008879	Rare cancer sequencing data of 56 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	54
EGAD00001008880	Rare cancer sequencing data of 86 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2500 Illumina HiSeq 4000	83
EGAD00001008881	Rare cancer sequencing data of 49 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	44
EGAD00001008882	Rare cancer sequencing data of 91 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	78
EGAD00001008883	Rare cancer sequencing data of 162 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2500 Illumina HiSeq 4000	149
EGAD00001008884	Rare cancer sequencing data of 83 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	71
EGAD00001008885	Rare cancer sequencing data of 96 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	89
EGAD00001008886	Rare cancer sequencing data of 29 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	28
EGAD00001008887	Rare cancer sequencing data of 66 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	58
EGAD00001008888	Rare cancer sequencing data of 48 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	41
EGAD00001008889	Rare cancer sequencing data of 22 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	17
EGAD00001008890	Rare cancer sequencing data of 76 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	72
EGAD00001008891	Rare cancer sequencing data of 164 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	159
EGAD00001008892	Rare cancer sequencing data of 42 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	42
EGAD00001008893	Rare cancer sequencing data of 112 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	100
EGAD00001008894	Rare cancer sequencing data of 34 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	28
EGAD00001008895	Rare cancer sequencing data of 49 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	43
EGAD00001008896	Rare cancer sequencing data of 137 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	138
EGAD00001008897	Rare cancer sequencing data of 246 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	250
EGAD00001008898	Rare cancer sequencing data of 34 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2500 Illumina HiSeq 4000	34
EGAD00001008899	Rare cancer sequencing data of 6 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	6
EGAD00001008900	Rare cancer sequencing data of 142 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	134
EGAD00001008901	Rare cancer sequencing data of 28 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 4000	24
EGAD00001008902	Rare cancer sequencing data of 85 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	77
EGAD00001008903	Rare cancer sequencing data of 36 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	34
EGAD00001008904	Rare cancer sequencing data of 112 runs in tumor/control pairs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	106
EGAD00001008905	RNA-Seq, WES and WGS data of 5 rare tumor/control pairs which were submitted to other HIPO projects, not MASTER. The sequencing was always paired.	HiSeq X Ten Illumina HiSeq 2000	11
EGAD00001008906	Part of the published data from EGAS00001004662 resulted in the publication of this study EGAS00001004813	HiSeq X Ten	5
EGAD00001008907	The TransplantLines Gut Microbiome study includes raw data generated by shotgun metagenomic sequencing of fecal samples of solid organ transplant recipients and basic phenotypes (age and sex, BMI).	Illumina HiSeq 2000	1177
EGAD00001008908	Chronic obstructive pulmonary disease (COPD) is a major respiratory disease characterized by small airway inflammation, emphysema and severe breathing difficulties. Low-grade systemic inflammation is an established hallmark of severe disease, however, the molecular changes in peripheral immune cells remain far from understood. We combined multi-color flow cytometry with single-cell RNA sequencing and showed that blood neutrophil numbers are significantly increased in COPD and they are a heterogeneous population. A transcriptomic state that expressed interferon response genes correlated with alveolar damage and acute exacerbations. Furthermore, bronchoalveolar neutrophils expressed gene signatures corresponding to certain blood neutrophil states. Last, our data in a murine model of cigarette smoke exposure demonstrated that bone marrow neutrophil progenitors are expanded in smoke-treated animals and display signs of immune activation. Our study provides evidence that COPD systemic inflammation may derive from an activated haematopoietic precursor compartment.	NextSeq 500	4
EGAD00001008909	Genome and transcriptome sequence data from a colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008910	Genome and transcriptome sequence data from a neuroendocrine tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008911	Genome and transcriptome sequence data from a Ewing sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008912	Genome and transcriptome sequence data from a metastatic synovial sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008913	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008914	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008915	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008916	Genome and transcriptome sequence data from a carcinoma of unknown primary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008917	Genome and transcriptome sequence data from a nasopharynx carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008918	Genome and transcriptome sequence data from a pancreatic ductal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008919	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008920	Genome and transcriptome sequence data from a tongue squamous cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008921	Genome and transcriptome sequence data from an endometrial carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008922	Genome and transcriptome sequence data from an intrahepatic cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008923	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008924	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008925	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008926	Genome and transcriptome sequence data from a lung adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008927	Genome and transcriptome sequence data from a carcinoma of unknown primary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008928	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008929	Genome and transcriptome sequence data from a hepatocellular carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008930	Genome and transcriptome sequence data from an adrenocortical carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008931	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008932	Genome and transcriptome sequence data from a mucosal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008933	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008934	Genome and transcriptome sequence data from a pancreatic acinar cell adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008935	Genome and transcriptome sequence data from a paraganglioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008936	Genome and transcriptome sequence data from a thoracic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008937	Genome and transcriptome sequence data from a salivary gland carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008938	Genome and transcriptome sequence data from a lung adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008939	Genome and transcriptome sequence data from a neuroendocrine tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008940	Genome and transcriptome sequence data from a clear cell adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008941	Genome and transcriptome sequence data from a cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008942	Genome and transcriptome sequence data from a breast carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008943	Genome and transcriptome sequence data from a melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008944	Genome and transcriptome sequence data from a hematopoietic or lymphoreticular systems myeloma-multiple patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008945	Genome and transcriptome sequence data from a carcinoma of unknown primary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008946	Genome and transcriptome sequence data from a neuroendocrine carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008947	Genome and transcriptome sequence data from a multiple myeloma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008948	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001008949	Organoid cultures derived from colorectal adenomas. RNA and DNA were isolated from these cultures for genome-wide profiling.	Illumina HiSeq 2500	1
EGAD00001008950	WES sequencing of multiple regions per tumor from 8 lung cancer patients (LUSC, LCNEC and LUAD) and adjacent healthy lung tissue for each patient.	unspecified	111
EGAD00001008951	The dataset comprises bulk RNA‑seq libraries generated on the Illumina NextSeq 500 platform, including primary human hepatocytes (four donors), colon tissue (four masked replicates), and a comprehensive set of iPSC‑derived samples from the JHU106, ChiPSC18, ChiPSC22, and H9 lines. The collection covers multiple differentiation stages from iPSC to DE to HLC and cultivation protocols (CEL, HAY, SPH) with at least three replicates per condition. All sequencing data are provided as raw FASTQ files, with colon samples delivered as masked FASTQ in accordance with patient consent requirements.	Illumina HiSeq 2500	57
EGAD00001008952	We applied an integrative single-cell genomics strategy with single nucleus RNA sequencing (snRNA-seq) and single nucleus Assay for Transposase-Accessible Chromatin sequencing (snATAC-seq) together with spatial transcriptomics from the same tissue mapping human cardiac cells in homeostasis and after myocardial infarction (MI) at unprecedented spatial and molecular resolution. We profiled in total 31 samples from 23 patients including four non-transplanted donor hearts as controls and samples from tissues with necrotic tissue areas (ischemic zone, IZ), border zone (BZ), and the non-affected left ventricular myocardium (remote zone, RZ) of patients with acute MI.	Illumina NovaSeq 6000	27
EGAD00001008953	Whole-exome sequencing	Illumina HiSeq 2000	1
EGAD00001008954	Whole-genome sequencing data	unspecified	67
EGAD00001008955	Fastq files of single nucleus RNA sequencing data from 26 patients, including 26 lung adenocarcinoma and 12 matched healthy tissue samples, for 8 young female never smokers, 8 young female smokers, 7 elderly female never smokers and 3 male never smokers.	Illumina HiSeq 4000	38
EGAD00001008956	Aligned BAM files with removed duplicate reads of targeted sequencing data (exomes of a panel of 153 genes) from 12 skin and 5 oral epithelial bulk samples from 2 donors. Sequences generated by the BGI DNB-SEQ platform.	unspecified	6
EGAD00001008957	GWAS genotype data of 2,393 Japanese COVID-19 cases.		2393
EGAD00001008958	Consists of 88 cases	unspecified	88
EGAD00001008959	RNA-seq data	unspecified	-
EGAD00001008960	Single cell DNA-seq data (CLL gene panel - Tapestri single-cell DNA CLL panel, Mission Bio)	Illumina NovaSeq 6000	-
EGAD00001008961	Single cell RNA-sequencing of treatment naïve PDAC patient samples. We have 10 samples, sequenced using the 10X genomics chromium platform with 3 prime chemistry. We are submitting FASTQ files representing the index files (I1), Read1 (R1) and Read2 (R2).	Illumina HiSeq 4000	10
EGAD00001008962	ATAC-seq data. Dataset includes FASTQ files, BAM files, and analysis files with the ATAC-seq peaks determined using MACS2.	unspecified	-
EGAD00001008963	A method for multiplexed full-length single-molecule sequencing of the human mitochondrial genome - cell line data	GridION Illumina NovaSeq 6000	5
EGAD00001008964	ChIP-seq peaks of H3K27ac. The dataset includes FASTQ files, BAM files, and analyses of the ChIP-seq peaks of H3K27ac determined using MACS2	unspecified	-
EGAD00001008965	RNAseq dataset containing 20 control and 2 SLFN14 K219N patient samples, derived from platelets. Sequencing libraries were constructed using an Illumina TruSeq stranded Ribo Zero Gold kit and paired end sequenced at a read depth of 30 million reads on an Illumina NovaSeq 6000 platform.	Illumina NovaSeq 6000	22
EGAD00001008967	RNAseq of circulating monocytes of familial hypercholesterolaemia (FH) patients before and after treatment, and healthy controls. Please cite original paper: Monocyte and macrophage lipid accumulation results in downregulated type-I interferon responses. Willemsen et al. Frontiers in Cardiovascular Medicine (2022) Familial hypercholesterolemic patients (n=10) and healthy subjects (n=9): the study population, design, and further processing of these human study subjects and their samples have been extensively described (Stiekema et al., 2021). Briefly, untreated FH patients who were indicated to start lipid-lowering therapy (statin, PCSK9 antibody, and/or ezetimibe) according to their treating physician were included. The healthy controls were age, sex, and body mass index (BMI) matched with the FH patients. After inclusion, FH patients fasted for at least 9 hours before blood samples were drawn for lipid measurements and monocyte isolation. This was repeated after 12 weeks of lipid-lowering therapy. RNA-seq was performed on circulating monocytes. V1 = visit 1. V2 = visit 2.	Illumina NovaSeq 6000	1
EGAD00001008968	Arcagen is an EORTC/SPECTA pan-European project that aims to recruit 1000 rare cancer patients from different tumour domains of EURACAN. This study collected samples from advanced or metastatic rare cancer from patients older than 12, and analysed them using Foundation Medicine next-generation sequencing (NGS) panels (FoundationOne CDx for FFPE samples or FoundationOne Liquid CDx for blood samples). Here we are submitting two datasets that contain NGS files from gastrointestinal rare cancers (n=119): - Dataset 2 (87 patients): Intra-hepatic cholangiocarcinoma (n=47), Extra-hepatic cholangiocarcinoma (n=16), Not specified Cholangiocarcinoma (n=9), Small bowel adenocarcinoma (n=6) and other rare GI cancer (n=9)	Illumina HiSeq 4000	87
EGAD00001008969	Reads were processed with the RNA-seq workflow 1.3.0 developed by the DKFZ Omics IT and Data Management Core Facility (https://github.com/DKFZ-ODCF/RNAseqWorkflow). First, FASTQ reads were aligned via two-pass alignment using STAR 2.5.3a. The STAR index was generated from the 1000 Genomes assembly and GENCODE Version 19 gene models with a sjdbOverhang of 200. Duplicate marking of the resultant main alignment file was done with sambamba 0.6.5. Gene-specific read counting was performed using featureCounts (from Subread 1.5.1) over exon features based on GENCODE Version 19 gene models. Both reads of a paired fragment were used for counting, and the quality threshold was set to 255, indicating that STAR found a unique alignment. Strand-specific counting was also used. For RPKM and TPM calculations, all genes on chromosomes X and Y, the mitochondrial genome, as well as rRNA and tRNA genes were omitted as they are likely to introduce library size estimation biases.		10
EGAD00001008970	Nanopore RNA Sequencing was done for 10 tumor samples. Direct cDNA sequencing was performed using the SQK-DCS109 kit (Oxford Nanopore Technologies). For analysis of a single sample on a MinION flow cell (version R9.4.1), 5 μg RNA was used as input. For multiplexing on a MinION flow cell, 2.5 μg RNA per sample was used as input, and the native barcoding expansion kit EXP-NBD104 was employed in conjunction with SQK-DCS109. After reverse transcription with Maxima H Minus Reverse Transcriptase (Thermo Scientific), second-strand synthesis was performed using the 2x LongAmp Taq Master Mix (New England Biolabs). The resulting double-stranded cDNA was subjected to end-repair and dA-tailing using the NEBNext Ultra End Repair/dA-Tailing Module (New England Biolabs). For multiplexed libraries, this step was followed by barcode ligation and library pooling. Next, libraries were quantified with a Qubit Fluorometer 3.0 (Life Technologies). Finally, sequencing adapters were added to the library preparations and ligated with Blunt/TA Ligase Master Mix (New England Biolabs), followed by further quality control using a Qubit. Samples ACC1 and ACC2 were analyzed on individual MinION flow cells, while the remaining eight samples were sequenced as multiplexed libraries on two MinION flow cells by pooling four samples for each run. Five ACC samples were also analyzed individually on Flongle flow cells	MinION	10
EGAD00001008971	Fastq files of paired RNA-Seq of 10 different tumor samples, for which Nanopore and Illumina sequencing was compared. Illumina sequencing was carried out with HiSeq4000 or HiSeq X-Ten using the Illumina TruSeq stranded mRNA Kit.	HiSeq X Ten Illumina HiSeq 4000	10
EGAD00001008972	Ovarian cancer EV RNA-seq	Illumina NovaSeq 6000	24
EGAD00001008973	WGS	Illumina NovaSeq 6000	24
EGAD00001008974	This dataset contains DNA and RNA sequencing information for AML, Gliomas, brain tumors (medulloblastoma and ependymoma), DIPG, rhabdoid tumors and soft tissue sarcomas. In total 39 samples are present (14 matched normal, 25 tumor samples). Not all samples have matched normals and not all samples have RNA sequencing data.	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500	37
EGAD00001008975	234 BAM files containing capture data of MCL tumours and constitutive DNA	Illumina HiSeq 3000	235
EGAD00001008976	Datasets used in the article "The genetic and linguistic admixture histories of the islands of Cabo Verde" by Laurent R et al. eLife 2023 (DOI: https://doi.org/10.7554/eLife.79827 - URL: https://elifesciences.org/articles/79827) File name "eGAdeposit_233CaboVerde_SampleInfo_FINAL_01062022.txt" Column 1 corresponds to individual alphanumeric codes as in the "eGAdeposit_233CaboVerde_GenotypeFile_FINAL_01062022.vcf" genotype file Column 2 corresponds to individual's biological sex as per genetic inference Column 3 corresponds to individual's self-reported age in years Column 4 corresponds to individual's self-reported cumulated number of years spent in academic or professional education		1
EGAD00001008977	Datasets used in the article "The genetic and linguistic admixture histories of the islands of Cabo Verde" by Laurent R et al. eLife 2023 (DOI: https://doi.org/10.7554/eLife.79827 - URL: https://elifesciences.org/articles/79827) File name "eGAdeposit_233CaboVerde_GEOcoordFULL_FINAL_01062022.txt" Column 1 corresponds to individual alphanumeric codes as in the "eGAdeposit_233CaboVerde_GenotypeFile_FINAL_01062022.vcf" genotype file Column 2-3 corresponds to X-Y GPS coordinates of individual's interview location in Cabo Verde Column 4-5 corresponds to X-Y GPS coordinates of individual's self-reported residence location at the time of the interview Column 6-7 corresponds to X-Y GPS coordinates of individual's self-reported birth-place location Column 8-9 corresponds to X-Y GPS coordinates of individual's self-reported paternal birth-place location Column 10-11 corresponds to X-Y GPS coordinates of individual's self-reported maternal birth-place location		1
EGAD00001008978	Datasets used in the article "The genetic and linguistic admixture histories of the islands of Cabo Verde" by Laurent R et al. eLife 2023 (DOI: https://doi.org/10.7554/eLife.79827 - URL: https://elifesciences.org/articles/79827) File name "eGAdeposit_225CaboVerde_FreeSpeech_Utterance_counts_FINAL_01062022.txt" Column 1 corresponds to individual alphanumeric codes as in the "eGAdeposit_233CaboVerde_GenotypeFile_FINAL_01062022.vcf" genotype file. Note that only 225 unrelated Cabo Verdean-born individuals are considered here, out of the 233 individuals in the genotype file. Each of the other 4831 columns corresponds to the respective individual's utterance count in the free speech transcribed in ALUPEC and provided as column header. See Material and Methods in Laurent R et al. eLife 2023		1
EGAD00001008979	Datasets used in the article "The genetic and linguistic admixture histories of the islands of Cabo Verde" by Laurent R et al. eLife 2023 (DOI: https://doi.org/10.7554/eLife.79827 - URL: https://elifesciences.org/articles/79827) As per Materials and Methods herein, the genotype data corresponds to 2,118,722 autosomal SNPs genotyped from the IlluminaOmni 2.5 Million BeadChip for 233 Cabo Verdean volunteer participants, family unrelated at the 2nd degree based on population genetics analyses (see Material and Methods). SNP rsID, Chromosome position and genetic position in (bp) are in Build GRCh38. Cabo Verdean individuals are designated with an alphanumeric unique code		1
EGAD00001008980	31 pregnant women at different trimesters, 6 hepatitis B carriers, and 8 patients with hepatocellular carcinoma	PromethION	46
EGAD00001008981	Artificial mixtures of sonicated human and mouse DNA at different sizes were sequenced	PromethION	2
EGAD00001008982	Artificial mixtures of sonicated human and mouse DNA at different sizes were sequenced	Sequel	2
EGAD00001008983	Juntendo Muscle Study (JMS) dataset comprises 23 samples of paired-end RNA-Seq sequences in fastq format.	NextSeq 500	23
EGAD00001008984	Muscle SATellite cell study (MSAT) dataset comprises 39 samples of paired-end RNA-Seq sequences in fastq format.	NextSeq 500	39
EGAD00001008985	We profiled CD34+ enriched cells from GCSF mobilized bone marrow samples (n = 4) using single-cell RNA sequencing (10X) with targeted genotyping, and single-cell DNA methylation (RRBS) with single-cell RNA sequencing (Smart-Seq2) and targeted genotyping. A 5th CH bone marrow aspirate sample was obtained to validate observed results.	Illumina NovaSeq 6000	5
EGAD00001008987	The genomic VCF data of the Integrative proteogenomic characterization of early-stage DC project. This dataset contains 76 VCF files.		76
EGAD00001008988	This dataset was collected from viable bone marrow cells obtained at diagnosis from nine patients with high hyperdiploid ALL and one normal bone marrow sample. All samples were subjected to low pass single cell whole genome sequencing with a median sequencing coverage of 0.02x. Single nuclei in G0/G1 phase were isolated using a fluorescence-activated cell sorting (FACS) cytometer. DNA libraries were constructed and associated next-generation sequencing was carried out by European Research Institute for the Biology of Ageing (ERIBA), University of Groningen, University Medical Center Groningen, Groningen, The Netherlands. Further details regarding the DNA library construction are available in Bos et al., 2019 (https://link.springer.com/protocol/10.1007/978-1-4939-8931-7_15).	NextSeq 550	2842
EGAD00001008989	79 out of 336 GC samples	HiSeq X Ten	-
EGAD00001008991	Patients with progressive, metastatic castration-resistant prostate cancer (mCRPC) underwent metastatic tumor biopsy. 118 total samples including 14 paired samples (baseline and later progression). Various organ sites including soft tissue & bone were present. Published in DOI: 10.1200/JCO.2017.77.6880 Journal of Clinical Oncology 36, no. 24 (August 20, 2018) 2492-2503.	Illumina HiSeq 1500	118
EGAD00001008992	Raw, unfiltered, paired-end fastq files obtained through whole-genome and RNA-sequencing, respectively. RNA-seq of affected individuals in three twin pairs. WGS of blood in five twin pairs as well as uterine rudiment tissue of selected affected individuals.	Illumina NovaSeq 6000	14
EGAD00001008993	In this study, the DNA of 44 subjects with severe COVID-19 have been sequenced in order to explore rare genetic variants.	Illumina NovaSeq 6000	44
EGAD00001008994	WES	Illumina HiSeq 2500	6
EGAD00001008995	RNA-seq	Illumina HiSeq 2500	6
EGAD00001008996	single cell RNA-seq	unspecified	6
EGAD00001008997	single cell DNA sequencing	unspecified	6
EGAD00001008998	This dataset contains data from 11 uveal melanoma patients. Plasma samples were collected at baseline, 2-weeks, 3-, 6-, and 12-months post treatment (surgery/radiation). Samples underwent targeted panel, shallow whole genome, and cfMeDIP sequencing. A total of 46 plasma samples, 7 tumours, 11 buffy coats, and 10 healthy controls are included.	Illumina NovaSeq 6000	74
EGAD00001008999	This dataset contains plasma WGS data from patients with stage IV colorectal cancer (CRC, n = 16) and healthy individuals (n = 21) used in the Pointy manuscript. Patients with CRC provided written consent and samples were collected as described previously (ClinicalTrials.gov number NCT01876511; Georgiadis et al., 2019, Le et al., 2017). Plasma samples from 21 healthy control individuals were procured through BioIVT. Cell-free DNA (cfDNA) was extracted from plasma using the QIAamp Circulating Nucleic Acid Kit. Libraries were prepared with 5 to 250 ng of cfDNA using the NEBNext DNA Library Prep Kit. Libraries were sequenced on HiSeq2000/2500.	Illumina HiSeq 2000	37
EGAD00001009000	Genomics of drug sensitivity in acute lymphoblastic leukemia	Illumina NovaSeq 6000	65
EGAD00001009001	RRBS data for solid tumors and adjacent normal tissues	HiSeq X Ten	328
EGAD00001009002		NextSeq 500	2
EGAD00001009003	cfMethyl-Seq libraries were generated for 479 cfDNA samples and were sequenced with 150 bp paired-end reads.	HiSeq X Ten	479
EGAD00001009004	Whole genome sequencing on lymph node metastases and blood DNA from 25 cSCC patients with regional metastases of the head and neck. We designed a multifaceted computational analysis at the whole genome level to provide a more comprehensive perspective of the genomic landscape of metastatic cSCC. This study contains the majority of 15 samples which were previously submitted in EGAC00001001100.		25
EGAD00001009005	Single cell transcriptomes, generated using chromium 10X 3' sequencing, for two tumour types (AT/RT, and Ewing's sarcoma). For each individual, tumour and normal whole genome sequencing was also obtained using Illumina short read sequencing to an average depth of 30X. These data were used to validate the accuracy of a method for identifying cancer cell transcriptomes based on the allelic shift produced by copy number changes.	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000	1
EGAD00001009006		NextSeq 500	1
EGAD00001009007		NextSeq 500	1
EGAD00001009008		NextSeq 500	1
EGAD00001009009		NextSeq 500	1
EGAD00001009010		NextSeq 500	1
EGAD00001009011		NextSeq 500	1
EGAD00001009012		NextSeq 500	1
EGAD00001009013		NextSeq 500	1
EGAD00001009014		NextSeq 500	1
EGAD00001009015		NextSeq 500	1
EGAD00001009016		NextSeq 500	1
EGAD00001009017		NextSeq 500	1
EGAD00001009018		NextSeq 500	1
EGAD00001009019		NextSeq 500	1
EGAD00001009020		NextSeq 500	1
EGAD00001009021		NextSeq 500	1
EGAD00001009022		NextSeq 500	1
EGAD00001009023		NextSeq 500	1
EGAD00001009024		NextSeq 500	1
EGAD00001009025		NextSeq 500	1
EGAD00001009026		NextSeq 500	1
EGAD00001009027		NextSeq 500	1
EGAD00001009028		NextSeq 500	1
EGAD00001009029		NextSeq 500	1
EGAD00001009030		NextSeq 500	1
EGAD00001009031		NextSeq 500	1
EGAD00001009032		NextSeq 500	1
EGAD00001009033		NextSeq 500	1
EGAD00001009034		NextSeq 500	1
EGAD00001009035		NextSeq 500	1
EGAD00001009036		NextSeq 500	1
EGAD00001009037		NextSeq 500	1
EGAD00001009041	This dataset includes RNA-seq, DNA and Chip-Seq data of samples from our paper.	HiSeq X Ten Illumina HiSeq 4000	-
EGAD00001009042	This dataset includes RNA-seq data of samples from our paper.	Illumina HiSeq 4000	18
EGAD00001009043	RNA-Seq data for 9 JPA samples.	unspecified	11
EGAD00001009044	5 Hi-C datasets (4 JPA and 1 LGG). Hi-C data for the remaining 5 JPAs used in our paper as well as all controls have been uploaded to EGA under EGAS00001005476.	unspecified	5
EGAD00001009045	ChIP data from PFA (n = 10). Raw data provided as FASTQ. Data generated on Illumina NovaSeq 6000 PE50.	Illumina NovaSeq 6000	-
EGAD00001009046	PDX aCGH (txt), WES (fastq) and RNASeq (fastq) samples from mice treated with cisplatin. Primary samples and matched PDX samples from multiple passages. Models from the Marie Curie Institute (HBCx1, HBCx4B, HBCx8, HBCx10, HBCx12B, HBCx14, HBCx15, HBCx16, HBCx17, HBCx23, HBCx24, HBCx27, HBCx28, HBCx30, HBCx31, HBCx33, HBCx39, HBCx40, HBCx43, HBCx51, HBCx63, HBCx66, HBCx92, HBCx106) and the NKI (T127, T162, T241, T250, T283, T302, T336).	Illumina HiSeq 2000 unspecified	71
EGAD00001009047	Total RNA was isolated from 50 formalin-fixed paraffin-embedded nasopharyngeal cancer (NPC) specimens using the RecoverAll Total Nucleic Acid Isolation kit (Ambion). Tumor RNA libraries were prepared with 200ng RNA using the Illumina TruSeq Stranded Total RNA kit with Ribo-Zero Gold, and sequenced with >80 million 100 bp paired-end reads.	unspecified	50
EGAD00001009048	Somatic RNA for 40 samples matched to the WGS was extracted using the Qiagen Qiasymphony RNA protocol (cat no 931636). The tissue was initially homogenised using a Qiagen Bioruptor, followed by the manufacturer's recommended protocol (including DNase digestion). The resulting RNA then underwent quality control as follows: firstly, A260 and A280nm were measured on a Denovix DS-11 Fx to qualitatively illustrate A260/280nm and A260/230nm ratios as measures of RNA purity. A260/280 had to be 2.0 and A260/230 had to be 2.0-2.2. Then RNA was quantified using LifeTechnologies Qubit RNA BR kit (cat no Q10210). RNAseq was carried out by the Edinburgh Clinical Research Facility on an Illumina NextSeq 500. Total RNA samples were assessed on the Agilent Bioanalyser (Agilent Technologies, #G2939AA) with the RNA 6000 Nano Kit (#5067-1512) for quality and integrity of total RNA, and then quantified using the Qubit 2.0 Fluorometer (Thermo Fisher Scientific Inc, #Q32866) and the Qubit RNA HS assay kit (#Q32855). Libraries were prepared from total-RNA sample using the NEBNext Ultra 2 Directional RNA library prep kit for Illumina (#E7760S) with the NEBNext rRNA Depletion kit (#E6310) according to the provided protocol. 400ng of total RNA was then added to the ribosomal RNA (rRNA) depletion reaction using the NEBNext rRNA depletion kit (Human/mouse/rat) (#E6310). This step uses specific probes that bind to the rRNA in order to cleave it. rRNA-depleted RNA was then DNase treated and purified using Agencourt RNAClean XP beads (Beckman Coulter Inc, #66514). RNA was then fragmented using random primers before undergoing first strand and second strand synthesis to create cDNA. cDNA was end repaired before ligation of sequencing adapters, and libraries were enriched by PCR using the NEBNext Multiplex oligos for Illumina set 1 and 2 (#E7500). Final libraries had an average peak size of 271bp. Libraries were quantified by fluorometry using the Qubit dsDNA HS assay and assessed for quality and fragment size using the Agilent Bioanalyser with the DNA HS Kit (#5067-4626). Sequencing was performed using the NextSeq 500/550 High-Output v2 (150 cycle) Kit (#FC-404-2002) on the NextSeq 550 platform (Illumina Inc, #SY-415-1002). Libraries were combined in an equimolar pool based on the library quantification results and run across 5 High-Output Flow Cell v2.5.	NextSeq 550	-
EGAD00001009049	FASTQ reads for 81 matched tumour-normal WGS pairs for high grade serous ovarian cancer patients. Scottish HGSOC samples were collected via local Bioresource facilities at Edinburgh, Glasgow, Dundee and Aberdeen and stored in liquid Nitrogen until required. HGSOC patients were determined from pathology records and were included in the study where there was matched tumour and whole blood samples. Tumour samples were divided into two for DNA and RNA extraction and slivers of tissue were taken, fixed in formalin and embedded in paraffin wax (FFPE). Samples were only included if they were confirmed as HGSOC and there was greater than 40% tumour cellularity throughout the tumour, determined using H&E staining of the FFPE sections and pathology review. Somatic DNA was extracted from the tumour and germline DNA was extracted from whole blood. Somatic DNA was extracted using the Qiagen DNeasy Blood and tissue kit (cat no 69504). The tissue was initially homogenised using a Qiagen Bioruptor, followed by the manufacturers recommended protocol (including RNase digestion step). Germline DNA was extracted from 1-3ml whole blood using the Qiagen FlexiGene kit (cat no 51206) following the manufacturers recommended protocol. The resulting DNA underwent quality control as follows: firstly, A260 and A280nm were measured on a Denovix DS-11 Fx to qualitatively illustrate A260/280nm and A260/230nm ratios as surrogate measures of DNA purity. A260/280 had to be 1.8 or greater and A260/230 had to be 2.0 or greater. Then, DNA was quantified using LifeTechnologies Qubit dsDNA BR kit (cat no Q32850) and we required a minimum of 50ul at 25ng/ul for WGS. Thirdly, DNA was diluted to 25ng/ul and a representative sample was loaded onto a 0.8% TAE gel, ran at 100v for 60mins and then imaged using a BioRad ChemiDoc imaging system to visualise the DNA quality.	HiSeq X Ten Illumina NovaSeq 6000	-
EGAD00001009050	This dataset contains 2 human tumor-derived cell lines and 1 human tumor Hi-C samples. The raw fastq files and the processed .hic file are provided for each sample.	HiSeq X Ten Illumina HiSeq 2500	3
EGAD00001009051	The K562 cell line was treated with two different HSP90 inhibitors. After resistant clones emerged, they were genetically characterized using WES in comparison to the parental K562 cell line.	NextSeq 550	3
EGAD00001009052	Single cell technologies allow the interrogation of tumor heterogeneity, providing insights into tumor evolution and treatment resistance. To better understand whether circulating tumor cells (CTCs) could complement metastatic biopsies for tumor genomic profiling, we characterized 11 single CTCs and 10 pooled CTC samples at the mutational and copy number aberration (CNA) levels, and compared these results with matched synchronous tumor biopsies from 3 metastatic breast cancer patients with triple-negative (TNBC), HER2-positive and estrogen receptor-positive (ER+) tumors. Similar CNA profiles and the same patient-specific driver mutations were found in bulk tissue and CTCs for the HER2-positive and TNBC tumors, whereas different CNA profiles and driver mutations were identified for the ER+ tumor, which presented two distinct clones in CTCs defined by mutations in ESR1 Y537N and TP53, respectively. Furthermore, de novo mutational signatures derived from CTCs described patient-specific biological processes. These data suggest that tumor tissue and CTCs provide complementary clinically relevant information to map tumor heterogeneity and tumor evolution.	Illumina HiSeq 2000	30
EGAD00001009053	This dataset contains RNA sequencing samples from the Duesseldorf/Lisbon pilocytic astrocytoma cohort which was profiled using Proteogenomics (RNA Sequencing, proteomics and methylation). There are 48 samples which were sequenced using paired-end sequencing at the University Hospital Dusseldorf.	Illumina HiSeq 2500	48
EGAD00001009054	RNA sequencing data of 141 samples from 141 patients with HER2+ breast cancer treated with letrozole or tamoxifen (SOLTI-1114 PAMELA trial)	Illumina HiSeq 2500 Illumina NovaSeq 6000	142
EGAD00001009056	RNA-seq was performed to compare gene expression profiles between 11 patient adherent ALL samples and nonadherent ALL samples.	Illumina HiSeq 2500	-
EGAD00001009057	Medulloblastoma intra-tumoural genetic heterogeneity and clonal evolution, and their role in disease pathogenesis and clinical behaviour, are poorly understood. We used single-cell whole-genome sequencing (sc-WGS) to reconstruct the natural history and temporal evolution of 14 medulloblastomas, representing its major clinico-molecular sub-classes. We identified wholly-clonal tumours which displayed single-clone expansion (i.e. linear evolution); all were observed in favourable-outcome sub-classes (i.e. MBWNT and infant MBSHH). In contrast, remaining tumours harboured sub-clonal structures which displayed punctuated or gradual trajectories; highest-risk sub-classes, typically characterised by MYC-amplification (MBGroup3) or TP53-mutation (MBSHH), and linked to genomic instability and LCA pathology, were most clonally-diverse. Clinically-adopted biomarkers were typically early-clonal/initiating events, representing exploitable targets for early-disease detection; in analyses of spatially-distinct tumour regions, a single biopsy was sufficient to assess their status. sc-WGS revealed events not previously appreciated in bulk tumour analysis, which arose later and/or sub-clonally and more commonly displayed spatial diversity; their clinical significance and role in disease evolution post-diagnosis now require establishment. In summary, our findings reveal diverse modes of tumour initiation and clonal evolution in the major medulloblastoma sub-classes, highlighting their pathogenic relevance and clinical potential.	NextSeq 500	430
EGAD00001009058	We carried out WGS and RNAseq on a cohort of 48 children and young adults with induction failure in T-cell Acute Lymphoblastic Leukemia (T-ALL) to identify genomic drivers of treatment resistance. The study includes WGS for 33 tumour/normal pairs and 15 tumour-only samples. In addition, there is RNAseq data for 37 cases.	Illumina HiSeq 2500 Illumina NovaSeq 6000	118
EGAD00001009059	RNA-seq data for melanoma biopsies at baseline and after treatment	Illumina NovaSeq 6000	8
EGAD00001009060	RRBS data for melanoma biopsies at baseline and after treatment	Illumina NovaSeq 6000	8
EGAD00001009061	Clonal tracking of stem cells and their progeny by whole genome sequencing permits exploration of evolutionary genetics in human disease. In this study, we performed phylogenetic reconstruction of haematopoiesis using somatically acquired mutations in 323 single haematopoietic stem and progenitor cell-derived colonies from 10 individuals with an inherited disorder of ribosome assembly, Shwachman-Diamond syndrome. We observed numerous clonal expansions, with recurrent acquisition of mutually exclusive mutations (EIF6, TP53, RPL5, RPL22, PRPF8, chromosomes 7 and 15) in multiple different clones in utero or early childhood converging on the p53-dependent nucleolar surveillance pathway that monitors ribosome integrity. In contrast to clones carrying biallelic TP53 mutations, genomes derived from colonies carrying mono-allelic TP53 mutations displayed no increase in mutation burden or specific mutational signatures. Our study highlights striking loss of clonal diversity with convergent somatic evolution on the p53-dependent nucleolar surveillance pathway from early life to offset the deleterious effects of a germline mutation in a Mendelian haematopoietic disorder.	HiSeq X Ten	323
EGAD00001009062	This Dataset contains RNA-Seq, H3K27Ac ChIP-Seq, and ATAC-Seq data for 13 cystic fibrosis (CF) patients and 8 healthy volunteers (HV).	Illumina HiSeq 4000	222
EGAD00001009063	This dataset contains the sputum metagenome from 99 COPD patients and 36 healthy individuals in China.	Illumina NovaSeq 6000	135
EGAD00001009064	Profiling of 12 megabases of human non-coding DNA (including enhancers, promoters, and boundaries of topologically associating domains) in a longitudinal cohort of patients treated with endocrine therapies. For each patient, DNA from the primary and relapsed (metastatic) tumour, along with normal matched DNA, were profiled.	Illumina HiSeq 4000 Illumina NovaSeq 6000	300
EGAD00001009065	164 pairs of FASTQ files from metastatic Castration-Resistant Prostate Cancer (mCRPC) sequenced on HiSeq 4000 instruments. Patients were enrolled in the West Coast Dream Team study. Biopsies include various tissue sites including bone, soft tissue, and lymph node. 42 pairs prior to enzalutamide treatment and at progression from 21 patients are included.	Illumina HiSeq 1500 Illumina HiSeq 4000	164
EGAD00001009066	The cohort (n=53) consists of prostate cancer patients from Australia. For each patient, a pair of blood and tumour samples were collected. The sequencing data was mapped to hg38 reference. Blood BAMs are named with “-B” as suffix and tumour BAMs are named with “-T” as suffix.	HiSeq X Ten Illumina NovaSeq 6000	106
EGAD00001009067	The cohort (n=130) consists of prostate cancer patients from South Africa (n=123) and Brazil (n=7). For each patient, a pair of blood and tumour samples were collected. The sequencing data was mapped to hg38 reference. Blood BAMs are named with “-B” as suffix and tumour BAMs are named with “-T” as suffix.	HiSeq X Ten Illumina NovaSeq 6000	260
EGAD00001009068	RNAseq sequencing of 10 breast cancer bone metastasis PDX. Samples were obtained from 5 PDX that acquired palbociclib resistance (palboR) and 5 from the parental PDX (palboS).	Illumina NovaSeq 6000	10
EGAD00001009069	This dataset contains RNA-sequencing of blood samples from Healthy Controls (n=7), PSA patients (n=27) and RA patients (n=9). It also contains RNA-sequencing data of skin fibroblasts from healthy controls (n=3) and PSA patients (n=3).	NextSeq 500	69
EGAD00001009070	GWAS genotype data of the Japanese population (N=2,380).		1
EGAD00001009071	This dataset contains genome-wide array data from Tunisian and Moroccan individuals. Tunisian individuals were sampled in the city of Tunis (n = 64), and Moroccan individuals in different urban areas in the country (n = 45).		1
EGAD00001009072	Fastq files from RNAseq of breast cancer bone metastases PDX after treatment with IACS-010759. Five PDX are resistant to treatment and 6 are responders.	Illumina NovaSeq 6000	11
EGAD00001009074	This dataset contains RNA-seq data of giant cell tumour of bone (GCTB) cell lines (n=3). Cell lines consist of neoplastic "stromal" cells harboring a heterozygous H3F3A p.G34W mutation. RNA-seq was performed on the BGISEQ-500 platform (PE100) and uploaded data contains fastq files, vcf files, and gene expression values (TPM). Cell lines, data generation, and data analysis are described in the following publication: Venneker et al., Histone deacetylase inhibitors as a therapeutic strategy to eliminate neoplastic “stromal” cells from giant cell tumors of bone, 2022.	unspecified	3
EGAD00001009075	Time-dependent characterization of CNS response in COVID-19	Illumina NovaSeq 6000	21
EGAD00001009076	This dataset includes 3432 paired single cell sequencing fastq files derived from synovial B cells of 5 early Rheumatoid Arthritis patients. Libraries were prepared using the SmartSeq2 protocol. All wells contained ERCC spike-ins. Libraries were sequenced on an Illumina NovaSeq instrument with 2x100bp paired-end reads yielding a median of 2.5M reads/well. Sample aliases ending on 368 and 384 represent empty control wells. Sample aliases starting with P11417_4 and P13157_1 belong to ACPA- patient A7, sample aliases starting with P11417_5 and P11417_6 belong to ACPA+ patient A1, sample aliases starting with P13157_2 belong to ACPA- patient A3, sample aliases starting with P13157_3 and P13157_6 belong to ACPA+ patient A2, sample aliases starting with P13157_4 and P13157_5 belong to ACPA- patient A4.	Illumina NovaSeq 6000	3432
EGAD00001009077	Sequencing of Huntington's disease patient samples. Whole exome sequencing (n = 463) and MiSeq HTT amplicon sequencing (n = 584) BAM/BAI files. Two analyses (one linear and one logistic) with relevant files.	Illumina HiSeq 4000 Illumina MiSeq	363
EGAD00001009078	Bam files of whole-genome sequencing of 14 paired PMBCL samples.	Illumina NovaSeq 6000	28
EGAD00001009079	We purified peripheral blood mononuclear cells from individuals living in India (N=10) and the Netherlands (N=10) at baseline and 10-12 weeks after BCG vaccination. We compared chromatin accessibility between the two populations at baseline, as well as gene transcription profiles and cytokine production capacities upon viral stimulation with influenza and SARS-CoV-2	unspecified	157
EGAD00001009081	The dataset is based on 37 FFPE samples obtained from 12 patients diagnosed with breast or larynx cancer. For each patient 3 sample types were obtained: P - primary tumor, L - malignant lymph node and C - benign lymph node (control). For patient G46 two malignant lymph nodes were used. DNA isolated from all samples was subject to exome selection using Agilent SureSelect Human All Exon V7. The obtained material was sequenced using NovaSeq 6000 platform with 2x150 reads. The sequencing was conducted by Novogene company.	Illumina NovaSeq 6000	37
EGAD00001009082	We used chromatin-immunoprecipitation followed by sequencing (ChIP-Seq) with an antibody for the H3K27ac (a bona fide histone mark for regulatory element activation) in sorted CLL cells from 15 CLL, including cases from stereotyped subsets #1, #2, #4, and #8. The samples were sequenced by Illumina HiSeq 2500.	Illumina HiSeq 2500	15
EGAD00001009083	Samples: primary cutaneous melanoma (CM) non-associated or distal nevus (A); adjacent or CM-associated nevus (B); Primary-CM (C); and Lymph-Node Metastasis (LN-mts) (D). Whole-exome sequencing (WES) was performed in DNA extracted from the different samples (A-D) paired with the germline reference (G), processed with the Agilent SureSelect All Exon Human V5 Library in an Illumina Hiseq 4000 PE101 platform.	Illumina HiSeq 4000	5
EGAD00001009085	TCRab sequencing of viably frozen cells from 12 samples from four chronic-phase chronic myeloid leukemia patients. The raw data is available as fastq files.	Illumina HiSeq 2500	48
EGAD00001009086	Single-cell RNA sequencing of viably frozen cells from 12 samples from four chronic-phase chronic myeloid leukemia patients. The raw data is available as fastq files.	Illumina NovaSeq 6000	48
EGAD00001009087	RNA-sequencing (RNA-seq) efforts in acute lymphoblastic leukaemia (ALL) have identified numerous prognostically significant genomic alterations which can guide diagnostic risk stratification and treatment choices when detected early. However, a full RNA-seq Bioinformatics workflow is time-consuming and costly in a clinical setting where rapid detection and accurate reporting of clinically relevant alterations are essential. To accelerate the identification of ALL-specific alterations (including gene fusions, single nucleotide variants and focal gene deletions), we developed the rapid screening tool RaScALL, capable of identifying more than 100 prognostically significant lesions directly from raw sequencing reads. RaScALL uses the k-mer based targeted detection tool km and known ALL variant information to achieve a high degree of accuracy for reporting subtype defining genomic alterations compared to standard alignment-based pipelines. Gene fusions, including difficult to detect fusions involving EPOR and DUX4, were accurately identified in 98% (164 samples) of reported cases in a 180-patient Australian study cohort and 95% (n=63) of samples in a North American validation cohort. Pathogenic sequence variants were correctly identified in 75% of tested samples, including all cases involving subtype defining variants PAX5 p.P80R (n=12) and IKZF1 p.N159Y (n=4). Accurate detection of intragenic IKZF1 deletions resulting in aberrant transcript isoforms was also detectable with 98% accuracy. Importantly, the median analysis time for detection of all targeted alterations averaged 22 minutes per sample, significantly shorter than standard alignment-based approaches, ensuring accelerated risk-stratification and therapeutic triage.	NextSeq 500	180
EGAD00001009088	An increased incidence of endometrial cancer has been described for patients that have received tamoxifen to treat breast cancer. Using samples from endometrial tumors, isolated from surgical specimens of patients who previously received tamoxifen treatment for breast cancer, we aimed to identify whether there are specific somatic mutations enriched in this population, relative to endometrial tumors from the general population. For this, WES was performed on matched endometrial tumors and healthy tissue (n=21).	Illumina HiSeq 2000	42
EGAD00001009089	31 samples transcriptomics to simulate the knock-out of all targets of a drug on an objective function such as growth or energy balance.	NextSeq 500	31
EGAD00001009090	16 CRC patient WGS data	HiSeq X Ten	64
EGAD00001009091	16 CRC patients transcriptome data	HiSeq X Ten	96
EGAD00001009092	BARIA 100	HiSeq X Ten Illumina Genome Analyzer IIx	1
EGAD00001009099	Gynecologic carcinosarcomas (CS), including more generally uterine (endometrial) and less frequently ovarian localization, are histologically defined as biphasic neoplasms composed of carcinomatous (C) epithelial and sarcomatous (S) malignant components. We report a comprehensive analysis of 20 patients of macro-dissected samples of C and S components through RNA sequencing.	Illumina HiSeq 2500	40
EGAD00001009100	Brain-Derived Neurotrophic Factor (BDNF) is crucial for neuronal survival, differentiation, synaptic plasticity, memory formation, and neurocognitive health. Molecular mechanisms of BDNF promoting cellular survival and synaptic plasticity have been intensely studied, yet its role in genome regulation is obscure. Using human induced pluripotent stem cell (hiPSC)-derived neurons via lentiviral delivery of the neuronal transcription factor Ngn2, we performed a temporal profiling (1h, 6h and 10h) of chromatin accessibility upon BDNF treatment or depolarization (KCl) to identify BDNF-specific chromatin-to-gene expression programs.	NextSeq 500	12
EGAD00001009101	The data includes exome sequencing FASTQ files of 335 patients receiving immune checkpoint blockade therapy. The data only provides for WXS of tumor tissue.	Illumina HiSeq 2500	335
EGAD00001009102	Here we provide access to newly generated RNA-seq data for 101 human islet samples used to map genetic effects on gene expression and alternative splicing (eQTLs and sQTLs) in a total of 399 human islets. We also make publicly available genotyping array data for 128 human islets, including the fraction of 101 human islet samples with existing RNA-seq data.	Illumina HiSeq 2500	101
EGAD00001009103	16S rRNA gene sequencing data for sputum samples from 61 COPD patients, generated using PacBio sequencing technology.	Sequel	40
EGAD00001009104	MTM-HD - fibroblast RNAseq. 57 samples from controls, pre-HD and early-HD patients.	Illumina NovaSeq 6000	57
EGAD00001009105	MTM-HD - adipose tissue RNAseq. 60 samples from controls, pre-HD and early-HD patients	Illumina HiSeq 2500	60
EGAD00001009106	MTM-HD - skeletal muscle RNAseq. 57 samples from controls, pre-HD and early-HD patients.	Illumina HiSeq 2500	57
EGAD00001009108	RNA sequencing dataset for primary and recurrent ovarian granulosa cell tumors consists of 24 .bam files with aligned reads, including 8 primary and 16 recurrent tumors. Total RNA was extracted from cryopreserved tissue of adult-type granulosa cell tumor samples. Libraries were prepared from cDNA using the NuGEN Ovation Ultralow Library System V2 (San Carlos, CA). Paired-end bulk RNA sequencing was performed on the Illumina HiSeq 2000 platform. RNA sequencing reads were aligned to the hg19/GRCh37 reference human genome using the STAR software (version 2.6.0b) with default parameters. Samples description: GCT001 Adult-type Granulosa Cell Tumor: recurrent tumor GCT002 Adult-type Granulosa Cell Tumor: primary tumor GCT003 Adult-type Granulosa Cell Tumor: recurrent tumor GCT004 Adult-type Granulosa Cell Tumor: primary tumor GCT005 Adult-type Granulosa Cell Tumor: recurrent tumor GCT006 Adult-type Granulosa Cell Tumor: recurrent tumor GCT007 Adult-type Granulosa Cell Tumor: primary tumor GCT008 Adult-type Granulosa Cell Tumor: recurrent tumor GCT009 Adult-type Granulosa Cell Tumor: recurrent tumor GCT010 Adult-type Granulosa Cell Tumor: recurrent tumor GCT011 Adult-type Granulosa Cell Tumor: primary tumor GCT012 Adult-type Granulosa Cell Tumor: recurrent tumor GCT013 Adult-type Granulosa Cell Tumor: recurrent tumor GCT014 Adult-type Granulosa Cell Tumor: recurrent tumor GCT015 Adult-type Granulosa Cell Tumor: recurrent tumor GCT016 Adult-type Granulosa Cell Tumor: recurrent tumor GCT017 Adult-type Granulosa Cell Tumor: recurrent tumor GCT018 Adult-type Granulosa Cell Tumor: primary tumor GCT019 Adult-type Granulosa Cell Tumor: recurrent tumor GCT020 Adult-type Granulosa Cell Tumor: primary tumor GCT021 Adult-type Granulosa Cell Tumor: primary tumor GCT022 Adult-type Granulosa Cell Tumor: recurrent tumor GCT023 Adult-type Granulosa Cell Tumor: recurrent tumor GCT024 Adult-type Granulosa Cell Tumor: primary tumor	Illumina HiSeq 2000	24
EGAD00001009109	6 trios and 1 proband were whole genome sequenced with PacBio Sequel II to a depth of 30X, using the HiFi chemistry. For each trio the proband was affected with severe ID, and the parents were unaffected. Samples are grouped by trio.	Sequel	19
EGAD00001009110	Extracted regions from WGS of Ewing sarcoma spanning fusion breakpoints +/- 100kb for ctDNA tracking in plasma	Illumina NovaSeq 6000	1
EGAD00001009111	10x Genomics Single Cell Gene Expression for Telomerase immortalized breast epithelium cell line 184-hTERT-22 L9 112.109	Illumina HiSeq 2500	1
EGAD00001009112	10x Genomics Single Cell Gene Expression for Telomerase immortalized breast epithelium cell line 184-hTERT-22 L9 116.126	Illumina HiSeq 2500	1
EGAD00001009113	10x Genomics Single Cell Gene Expression for Telomerase immortalized breast epithelium cell line 184-hTERT-22 L9 83.86	Illumina HiSeq 2500	1
EGAD00001009114	10x Genomics Single Cell Gene Expression for Telomerase immortalized breast epithelium cell line 184-hTert L9 116.66	Illumina HiSeq 2500	1
EGAD00001009115	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA535 passage 4	BGISEQ-500	1
EGAD00001009116	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA535 passage 6	Illumina HiSeq 2500	1
EGAD00001009117	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA604 passage 6	Illumina HiSeq 2500	1
EGAD00001009118	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA604 passage 8	Illumina HiSeq 2500	1
EGAD00001009119	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA604 passage 7	NextSeq 500	1
EGAD00001009120	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA609 passage 6	Illumina HiSeq 2500	1
EGAD00001009121	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1049 passage 1	NextSeq 500	1
EGAD00001009122	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1052 passage 1	NextSeq 500	1
EGAD00001009123	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1053 passage 1	NextSeq 500	1
EGAD00001009124	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1051 passage 1	NextSeq 500	1
EGAD00001009125	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1050 passage 1	NextSeq 500	1
EGAD00001009126	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1050 passage 1	NextSeq 500	1
EGAD00001009127	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1052 passage 1	NextSeq 500	1
EGAD00001009128	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1053 passage 1	Illumina HiSeq 2500	1
EGAD00001009129	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1093 passage 1	NextSeq 500	1
EGAD00001009130	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1091 passage 1	NextSeq 500	1
EGAD00001009131	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1053 passage 1	NextSeq 500	1
EGAD00001009132	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1096 passage 1	NextSeq 500	1
EGAD00001009133	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1051 passage 1	HiSeq X Ten	1
EGAD00001009134	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1052 passage 1	HiSeq X Ten	1
EGAD00001009135	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1181 passage 1	Illumina HiSeq 2500	1
EGAD00001009136	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1096 passage 1	Illumina HiSeq 2500	1
EGAD00001009137	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1096 passage 1	Illumina HiSeq 2500	1
EGAD00001009138	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1162 passage 1	Illumina HiSeq 2500	1
EGAD00001009139	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient SA1096	NextSeq 500	1
EGAD00001009140	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1049 passage 1	NextSeq 500	1
EGAD00001009141	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1050 passage 1	Illumina HiSeq 2500	1
EGAD00001009142	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA605 passage 3	NextSeq 500	1
EGAD00001009143	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma cell line OV2295	HiSeq X Ten	1
EGAD00001009144	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma cell line OV2295(R2)	HiSeq X Ten	1
EGAD00001009145	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient SA1184	NextSeq 500	1
EGAD00001009146	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma patient-derived xenograft SA1047 passage 1	HiSeq X Ten	1
EGAD00001009147	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA1035 passage 4	Illumina HiSeq 2500	1
EGAD00001009148	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA1035 passage 8	NextSeq 500	1
EGAD00001009149	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA535 passage 6	BGISEQ-500	1
EGAD00001009150	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA1035 passage 6	Illumina HiSeq 2500	1
EGAD00001009151	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA1035 passage 7	HiSeq X Ten	1
EGAD00001009152	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA535 passage 4	BGISEQ-500	1
EGAD00001009153	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA1035 passage 5	Illumina HiSeq 2500	1
EGAD00001009154	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA1035 passage 6	Illumina HiSeq 2500	1
EGAD00001009155	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA535 passage 5	HiSeq X Ten	1
EGAD00001009156	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA535 passage 5	HiSeq X Ten	1
EGAD00001009157	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA535 passage 9	Illumina HiSeq 2500	1
EGAD00001009158	10x Genomics Single Cell Gene Expression for High grade serous ovarian carcinoma cell line TOV2295(R)	HiSeq X Ten	1
EGAD00001009159	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA604 passage 9	NextSeq 500	1
EGAD00001009160	10x Genomics Single Cell Gene Expression for Triple negative breast cancer patient-derived xenograft SA610 passage 3	HiSeq X Ten	1
EGAD00001009161	Children with ALL treated with anti-CD19 therapy occasionally develop a phenotypically distinct AML. However, the precise clonal origin of such class switch leukemias remains unresolved. Here, we reconstructed the evolution of leukemia in a child with primary ALL, two ALL relapses and AML after treatment with anti-CD19 CAR-T and blinatumomab through whole-genome sequencing. The phylogeny revealed that the AML was a monoclonal outgrowth descending from the initial ALL and harbored biallelic loss of CDKN2A, PAX5 and TP53. However, none of the ALL or AML relapses directly descended from one another, suggesting the presence of a reservoir of persistent clones. Our findings suggest anti-CD19 treatment selects pre-existing clones, with many key genetic alterations underpinning the lineage switch detectable prior to treatment.	Illumina NovaSeq 6000	8
EGAD00001009162	We conducted whole exome sequencing (using the SureSelect Human All Exon V5 + UTRs target enrichment kit) of 90 individuals from AP (23 from Saudi Arabia, 24 from Yemen, 24 from Oman and 19 from UAE).	Illumina HiSeq 3000	90
EGAD00001009163	WGS files for Roussel-ATRT-TM paper titled "Atypical teratoid/rhabdoid tumoroids reveal subgroup-specific drug vulnerabilities"	Illumina HiSeq 2000	8
EGAD00001009164	WXS files for Roussel-ATRT-TM paper titled "Atypical teratoid/rhabdoid tumoroids reveal subgroup-specific drug vulnerabilities"	Illumina HiSeq 2000	8
EGAD00001009165	This dataset contains the methylation sequencing data of 60 non-cancer and 70 colorectal cancer cfDNA samples. The methylation library was constructed using the NEBNext Enzymatic-seq Kit.	Illumina NovaSeq 6000	130
EGAD00001009166	The Roche Alzheimer’s disease dataset (Roche_AD) consists of 80 samples from 40 unique individuals (one sample from the temporal cortex and one from deep white matter for each individual, 12 cases, 25 controls, 3 dementia). A total of 12,000 estimated cells from each sample were loaded on the 10x Single Cell Next GEM G Chip. cDNA libraries were prepared using the Chromium Single Cell 3’ Library and Gel Bead v3 kit according to the manufacturer’s instructions. cDNA libraries were sequenced using the Illumina NovaSeq 6000 System and NovaSeq 6000 S2 Reagent Kit v1.5 (100 cycles), aiming at a sequencing depth of minimum 30K reads/nucleus.	Illumina NovaSeq 6000	80
EGAD00001009167	This dataset comprises genetic variation data (VCFs of somatic indels and SNVs) for 38 OPSCC tumors. WES was done using the NextSeq 500 System running in 150-cycle (2x 75bp paired-end) mode. Sequence information was converted to FASTQ format using bcl2fastq v2.20.0.422. VCFs were generated using the Strelka package.		1
EGAD00001009168	The Columbia Alzheimer’s dataset (white matter) consists of 24 white matter individuals (12 controls, 12 cases). A total of 12,000 estimated cells from each sample were loaded on the 10x Single Cell Next GEM G Chip. cDNA libraries were prepared using the Chromium Single Cell 3’ Library and Gel Bead v3 kit according to the manufacturer’s instructions. cDNA libraries were sequenced using the Illumina NovaSeq 6000 System and NovaSeq 6000 S2 Reagent Kit v1.5 (100 cycles), aiming at a sequencing depth of minimum 30K reads/nucleus.	Illumina NovaSeq 6000	24
EGAD00001009169	The Roche multiple sclerosis dataset (Roche_MS) consists of 166 cortical grey matter (GM) and white matter (WM) samples from 83 unique individuals (29 controls and 54 cases). A total of 12,000 estimated cells from each sample were loaded on the 10x Single Cell Next GEM G Chip. cDNA libraries were prepared using the Chromium Single Cell 3’ Library and Gel Bead v3 kit according to the manufacturer’s instructions. cDNA libraries were sequenced using the Illumina NovaSeq 6000 System and NovaSeq 6000 S2 Reagent Kit v1.5 (100 cycles), aiming at a sequencing depth of minimum 30K reads/nucleus.	Illumina NovaSeq 6000	166
EGAD00001009170	The whole exome was sequenced in two cancer-affected members (II:1 and III:1) of the family. The family subject of this study showed an autosomal dominant mode of CRC inheritance, fulfilling the Amsterdam I clinical criteria with three CRCs in two consecutive generations. The exome capture was performed using SureSelectXT Human All Exon V3 (51Mb, Agilent Technologies), and the library was sequenced on an Illumina HiSeq 2000 platform with paired-end reads of 101bp and a 50x average coverage depth.	Illumina HiSeq 2000	2
EGAD00001009171	Bank of metastatic colorectal cancer (mCRC) of Patient Derived Xenografts (PDXs)	unspecified	480
EGAD00001009172	PDX model of T-ALL under treatment of CB-103 and Vehicle was analyzed by single-cell transcriptomics using 10X Genomics technology.	NextSeq 500	4
EGAD00001009173	This dataset contains 10x Genomics v3 3’ single nuclei RNA sequencing (24 human schizophrenia and control samples) and 10x Genomics Visium spatial transcriptomics (14 human schizophrenia and control samples) datasets. Files are in .bam format, output of the cellranger v3.1.0 (https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/3.1/using/count) for snRNA-seq and spaceranger v1.1.0 (https://support.10xgenomics.com/spatial-gene-expression/software/pipelines/1.1/using/count) for Visium samples. Reads were mapped against Release 97 of human genome from Ensembl (http://ftp.ensembl.org/pub/release-97/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz http://ftp.ensembl.org/pub/release-97/gtf/homo_sapiens/Homo_sapiens.GRCh38.97.gtf.gz). More details on sample processing are available in the bioRxiv pre-print (https://doi.org/10.1101/2020.11.17.386458) and upcoming publication in Science Advances. BAM files can be converted to fastq files using bamtofastq tool (https://support.10xgenomics.com/docs/bamtofastq) with downstream remapping using tools and genomes of choice.	Illumina NovaSeq 6000 NextSeq 500	38
EGAD00001009174	To explore intratumor heterogeneity of CLL_24 using a single-cell multi-omics approach, we generated single-cell CITE-seq data for CLL_24, coupling scRNA-seq and protein surface marker measurements with oligo-tagged antibodies.	NextSeq 500	1
EGAD00001009175	Low Pass Whole Genome Sequencing of Cell Free DNA from Patients Receiving CD19 CAR T-Cell Therapy for Large B-Cell Lymphoma consisting of 123 samples with FASTQs with hg19 aligned BAM/BAI files.	Illumina NovaSeq 6000	94
EGAD00001009176	We sorted CD45-CD44+CD90+ stromal cells from multiple tumor types and performed bulk RNA-sequencing.	Illumina HiSeq 2500	171
EGAD00001009177	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA218	Illumina HiSeq 2500	1
EGAD00001009178	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA219	Illumina HiSeq 2500	1
EGAD00001009179	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA272	Illumina HiSeq 2500	1
EGAD00001009180	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA274	Illumina HiSeq 2500	1
EGAD00001009181	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA275	Illumina HiSeq 2500	1
EGAD00001009182	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA276	Illumina HiSeq 2500	1
EGAD00001009183	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA279	Illumina HiSeq 2500	1
EGAD00001009184	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA283	Illumina HiSeq 2500	1
EGAD00001009185	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA287	Illumina HiSeq 2500	1
EGAD00001009186	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA394	Illumina HiSeq 2500	1
EGAD00001009187	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA395	Illumina HiSeq 2500	1
EGAD00001009188	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA398	Illumina HiSeq 2500	1
EGAD00001009189	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA402	Illumina HiSeq 2500	1
EGAD00001009190	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA404	Illumina HiSeq 2500	1
EGAD00001009191	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA530	Illumina HiSeq 2000	1
EGAD00001009192	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA585	Illumina HiSeq 2500	1
EGAD00001009193	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA586	Illumina HiSeq 2500	1
EGAD00001009194	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA588	Illumina HiSeq 2500	1
EGAD00001009195	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA589	Illumina HiSeq 2500	1
EGAD00001009196	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA590	Illumina HiSeq 2500	1
EGAD00001009197	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA591	Illumina HiSeq 2500	1
EGAD00001009198	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA592	Illumina HiSeq 2500	1
EGAD00001009199	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA593	Illumina HiSeq 2500	1
EGAD00001009200	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA595	Illumina HiSeq 2500	1
EGAD00001009201	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA596	Illumina HiSeq 2500	1
EGAD00001009202	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA598	Illumina HiSeq 2500	1
EGAD00001009203	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA599	Illumina HiSeq 2500	1
EGAD00001009204	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA600	Illumina HiSeq 2500	1
EGAD00001009205	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA601	Illumina HiSeq 2500	1
EGAD00001009206	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA654	Illumina HiSeq 2500	1
EGAD00001009207	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA655	Illumina HiSeq 2500	1
EGAD00001009208	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA665	Illumina HiSeq 2500	1
EGAD00001009209	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA666	Illumina HiSeq 2500	1
EGAD00001009210	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA667	Illumina HiSeq 2500	1
EGAD00001009211	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA668	Illumina HiSeq 2500	1
EGAD00001009212	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA669	Illumina HiSeq 2500	1
EGAD00001009213	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA671	Illumina HiSeq 2500	1
EGAD00001009214	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA672	Illumina HiSeq 2500	1
EGAD00001009215	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA535	Illumina HiSeq 2000	1
EGAD00001009216	Whole genome sequencing of normal sample for triple negative breast cancer patient SA1035	Illumina HiSeq X	1
EGAD00001009217	Whole genome sequencing of normal sample for triple negative breast cancer patient SA604	Illumina HiSeq 2500	1
EGAD00001009218	Whole genome sequencing of normal sample for triple negative breast cancer patient SA605	Illumina HiSeq 2500	1
EGAD00001009219	Whole genome sequencing of normal sample for triple negative breast cancer patient SA609	Illumina HiSeq 2500	1
EGAD00001009220	Whole genome sequencing of normal sample for triple negative breast cancer patient SA218	Illumina HiSeq 2500	1
EGAD00001009221	Whole genome sequencing of normal sample for triple negative breast cancer patient SA219	Illumina HiSeq 2500	1
EGAD00001009222	Whole genome sequencing of normal sample for triple negative breast cancer patient SA272	Illumina HiSeq 2500	1
EGAD00001009223	Whole genome sequencing of normal sample for triple negative breast cancer patient SA274	Illumina HiSeq 2500	1
EGAD00001009224	Whole genome sequencing of normal sample for triple negative breast cancer patient SA275	Illumina HiSeq 2500	1
EGAD00001009225	Whole genome sequencing of normal sample for triple negative breast cancer patient SA276	Illumina HiSeq 2500	1
EGAD00001009226	Whole genome sequencing of normal sample for triple negative breast cancer patient SA279	Illumina HiSeq 2500	1
EGAD00001009227	Whole genome sequencing of normal sample for triple negative breast cancer patient SA283	Illumina HiSeq 2500	1
EGAD00001009228	Whole genome sequencing of normal sample for triple negative breast cancer patient SA287	Illumina HiSeq 2500	1
EGAD00001009229	Whole genome sequencing of normal sample for triple negative breast cancer patient SA394	Illumina HiSeq 2500	1
EGAD00001009230	Whole genome sequencing of normal sample for triple negative breast cancer patient SA395	Illumina HiSeq 2500	1
EGAD00001009231	Whole genome sequencing of normal sample for triple negative breast cancer patient SA398	Illumina HiSeq 2500	1
EGAD00001009232	Whole genome sequencing of normal sample for triple negative breast cancer patient SA402	Illumina HiSeq 2500	1
EGAD00001009233	Whole genome sequencing of normal sample for triple negative breast cancer patient SA404	Illumina HiSeq 2500	1
EGAD00001009234	Whole genome sequencing of normal sample for triple negative breast cancer patient SA530	Illumina HiSeq 2500	1
EGAD00001009235	Whole genome sequencing of normal sample for triple negative breast cancer patient SA535	Illumina HiSeq 2000	1
EGAD00001009236	Whole genome sequencing of normal sample for triple negative breast cancer patient SA585	Illumina HiSeq 2500	1
EGAD00001009237	Whole genome sequencing of normal sample for triple negative breast cancer patient SA586	Illumina HiSeq 2500	1
EGAD00001009238	Whole genome sequencing of normal sample for triple negative breast cancer patient SA588	Illumina HiSeq 2500	1
EGAD00001009239	Whole genome sequencing of normal sample for triple negative breast cancer patient SA589	Illumina HiSeq 2500	1
EGAD00001009240	Whole genome sequencing of normal sample for triple negative breast cancer patient SA590	Illumina HiSeq 2500	1
EGAD00001009241	Whole genome sequencing of normal sample for triple negative breast cancer patient SA591	Illumina HiSeq 2500	1
EGAD00001009242	Whole genome sequencing of normal sample for triple negative breast cancer patient SA592	Illumina HiSeq 2500	1
EGAD00001009243	Whole genome sequencing of normal sample for triple negative breast cancer patient SA593	Illumina HiSeq 2500	1
EGAD00001009244	Whole genome sequencing of normal sample for triple negative breast cancer patient SA595	Illumina HiSeq 2500	1
EGAD00001009245	Whole genome sequencing of normal sample for triple negative breast cancer patient SA596	Illumina HiSeq 2500	1
EGAD00001009246	Whole genome sequencing of normal sample for triple negative breast cancer patient SA598	Illumina HiSeq 2500	1
EGAD00001009247	Whole genome sequencing of normal sample for triple negative breast cancer patient SA599	Illumina HiSeq 2500	1
EGAD00001009248	Whole genome sequencing of normal sample for triple negative breast cancer patient SA600	Illumina HiSeq 2500	1
EGAD00001009249	Whole genome sequencing of normal sample for triple negative breast cancer patient SA601	Illumina HiSeq 2500	1
EGAD00001009250	Whole genome sequencing of normal sample for triple negative breast cancer patient SA654	Illumina HiSeq 2500	1
EGAD00001009251	Whole genome sequencing of normal sample for triple negative breast cancer patient SA655	Illumina HiSeq 2500	1
EGAD00001009252	Whole genome sequencing of normal sample for triple negative breast cancer patient SA665	Illumina HiSeq 2500	1
EGAD00001009253	Whole genome sequencing of normal sample for triple negative breast cancer patient SA667	Illumina HiSeq 2500	1
EGAD00001009254	Whole genome sequencing of normal sample for triple negative breast cancer patient SA668	Illumina HiSeq 2500	1
EGAD00001009255	Whole genome sequencing of normal sample for triple negative breast cancer patient SA669	Illumina HiSeq 2500	1
EGAD00001009256	Whole genome sequencing of normal sample for triple negative breast cancer patient SA671	Illumina HiSeq 2500	1
EGAD00001009257	Whole genome sequencing of normal sample for triple negative breast cancer patient SA672	Illumina HiSeq 2500	1
EGAD00001009258	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1035	Illumina HiSeq X	1
EGAD00001009259	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA994	Illumina HiSeq X	1
EGAD00001009260	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA604	Illumina HiSeq 2500	1
EGAD00001009261	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA605	Illumina HiSeq 2500	1
EGAD00001009262	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA609	Illumina HiSeq 2500	1
EGAD00001009264	Profiling of paired nuclear and cytoplasmic fractions of anterior prefrontal cortex, cerebellar cortex and putamen samples by bulk-tissue RNA-sequencing. Samples were derived from 4 post-mortem neuropathologically-confirmed control individuals (anterior prefrontal cortex – 4 individuals, cerebellar cortex – 4 individuals, putamen – 3 individuals). Paired-end FASTQ files for each of the human samples are provided. Fastp (v 0.20.0), a fast all-in-one FASTQ pre-processor, was used for adapter trimming, read filtering and base correction. Fastp default settings were used for quality filtering and base correction. Further details on parameters used are available here: https://github.com/RHReynolds/RNAseqProcessing.	Illumina HiSeq 4000	22
EGAD00001009265	Dataset comprising raw paired RNA-seq data in fastq.gz format for 7 samples of rosette forming brain tumors	NextSeq 500	7
EGAD00001009266	This dataset comprises results of mutect2 variant calling in vcf format on 9 samples of rosette forming brain tumors (5 with paired normal tissue and 4 without). Only variants specific to the tumor were kept to comply with patient consent.		14
EGAD00001009267	59 samples were sequenced: 18 are HCC and beta thalassemia cases, while the remaining samples are controls.	NextSeq 500	59
EGAD00001009268	Levels of 92 circulating proteins measured by Olink platform, CVDIII panel		1
EGAD00001009269	Fastq or bam files are deposited for 28 patient H3-K27M diffuse midline gliomas. UMPEDD65 was profiled by targeted exome-sequencing using the TSO500 Illumina assay, while all other samples were sequenced by whole-exome sequencing.	NextSeq 500	28
EGAD00001009270	10X Genomics scRNA- and TCR-sequencing (Chromium Next GEM Single Cell 5’ Reagent Kit v1.1) was performed on the plasma cell depleted mononuclear fraction of bone marrow aspirates from 6 patients with newly diagnosed multiple myeloma. Generated gene expression libraries were paired-end sequenced on the NovaSeq 6000 S2. Generated V(D)J libraries were paired-end sequenced on the NextSeq 550.	Illumina NovaSeq 6000 NextSeq 550	-
EGAD00001009271	ATAC-Seq on OCIAML-22 CD34+, CD34-, and Bulk Fractions; RNA-Seq on OCIAML-22 CD34+/CD38-, CD34+/CD38+, CD34-/CD38+, CD34-/CD38- Fractions; WGS on Donor Bulk, OCIAML-22 Bulk, and CD34+ and CD34- Fractions out of OCIAML-22 Xenografts	Illumina NovaSeq 6000	36
EGAD00001009272	WES/WGS sequencing data of 234 chromothriptic tumor and control runs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	233
EGAD00001009273	WES/WGS sequencing data of 86 chromothriptic tumor and control runs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	86
EGAD00001009274	WES/WGS sequencing data of 337 chromothriptic tumor and control runs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten Illumina HiSeq 4000	319
EGAD00001009275	WES/WGS sequencing data of 56 chromothriptic tumor and control runs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	56
EGAD00001009276	WES/WGS sequencing data of 75 chromothriptic tumor and control runs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	74
EGAD00001009277	WES/WGS sequencing data of 44 chromothriptic tumor and control runs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	40
EGAD00001009278	WES/WGS sequencing data of 242 chromothriptic tumor and control runs, which were uploaded to umbrella studies. The sequencing was always paired	Illumina HiSeq 2500 Illumina HiSeq 4000	242
EGAD00001009279	WES/WGS sequencing data of 239 chromothriptic tumor and control runs, which were uploaded to umbrella studies. The sequencing was always paired	HiSeq X Ten	218
EGAD00001009280	Whole-genome sequence (WGS) analysis of tumors from 22 TP53 mutation carriers. We observed somatic mutations affecting Wnt, PI3K/AKT signaling, epigenetic modifiers and homologous recombination genes as well as mutational signatures associated with prior chemotherapy. We identified near-ubiquitous early loss of heterozygosity of TP53, with gain of the mutant allele. This occurred earlier in these tumors compared to tumors with somatic TP53 mutations, suggesting the timing of this mark may distinguish germline from somatic TP53 mutations. Phylogenetic trees of tumor evolution, reconstructed from bulk and multi-region WGS, revealed that LFS tumors exhibit comparatively limited heterogeneity. Overall, our study delineates early copy number gains of mutant TP53 as a characteristic mutational process in LFS tumorigenesis, likely arising very early in life or in utero, years prior to tumor diagnosis.	HiSeq X Ten Illumina HiSeq 2500	65
EGAD00001009281	This dataset includes WXS sequencing for 1 tumor FFPE sample and adjacent normal tissue from one family member.	Illumina NovaSeq 6000	2
EGAD00001009282	The study includes WGS data for DNA extracted from blood, fibroblasts or buccal swabs from sixteen family members, who represent four sub-families, each including two parents and one to three children, comprising a total of eight offspring. In two sub-families POLD1 L474P was carried by the father; in one sub-family, by the mother; and in the other sub-family, both parents had wild-type POLD1.	Illumina NovaSeq 6000	16
EGAD00001009283	This dataset contains WGS for fibroblast colonies obtained from carriers and non-carriers of germline POLD1 L474P. Data for single-cell colonies obtained from immortalized fibroblasts are present for eight out of 16 family members (six carriers and two non-carriers). Sequences obtained for colonies after approximately 40 passages are also present for six out of these eight colonies (four carriers and two non-carriers). Samples marked with "_F2" and "_F3" represent sequences of single cell-derived colonies and colonies after ~40 passages, respectively.	Illumina NovaSeq 6000	14
EGAD00001009284	BAM files of 17 samples from 11 different patients. The scRNAseq data were obtained using the 10X 3' Gene Expression kit.	Illumina NovaSeq 6000	17
EGAD00001009285	Data used to validate RNAmp tool.	Illumina HiSeq 2500	27
EGAD00001009286	Dataset contains paired-end clinical cancer panel sequencing (UCSF500) data from 2 samples of an initial tumor and one sample of a recurrence from one GBM patient and one sample from a second GBM patient.		4
EGAD00001009287	The dataset consists of 258 BAM files from whole exome sequencing: 122 from IgAN-tGBM patients, 64 from IgAN patients and 72 from TBMN patients.	Illumina NovaSeq 6000	258
EGAD00001009288	Sample information: The 56 samples produced in this project come from the human iPSC line GM17602 (Coriell) where a tyrosine hydroxylase-T2A-mCherry reporter was inserted. A combination of epigenetic analysis (ATAC/ChIP) along with transcriptomics.	NextSeq 500	56
EGAD00001009289	Paired-end WGS data of 10 neuroblastoma patient samples (5 obtained at diagnosis and 5 matched blood samples as controls) used for analysis of telomeric content and sequence composition. Mean coverage is 11-65x per sample. The remaining patient samples of the dataset can be found under accession numbers EGAS00001001308 and EGAS00001005424, and mappings of the patient IDs are in the supplementary material of the publication.	Illumina HiSeq 2000 Illumina NovaSeq 6000	10
EGAD00001009291	Tumour biopsies were collected from twenty-three patients (46 samples) (of which 20 were matched) with locally advanced or metastatic melanoma (stage IIIB – stage IV). Library construction was done either with the Chromium Single Cell 3ʹ GEM, Library & Gel Bead Kit v3 (n = 16; 10x genomics, Cat#1000092) or the Chromium Single Cell A Chip Kit and 5’ Library & Gel Bead Kit (10x genomics, Cat#1000014). All libraries were sequenced on Illumina NextSeq, HiSeq4000 or NovaSeq6000 until sufficient saturation was reached (60% on average). The reference genome used in this study was GRCh38.	unspecified	46
EGAD00001009292	Using single-nucleus RNA sequencing, we characterized the transcriptome of 880,000 nuclei from 18 control and 61 failing, nonischemic human hearts with pathogenic variants in DCM and ACM genes or idiopathic disease.	Illumina HiSeq 4000	196
EGAD00001009293	Using a novel sorting strategy, we performed ultra low input RNAseq from FACS-sorted populations from diagnostic DNMT3Amut and NPM1mut AML patients. Primary samples were retrospectively collected based on their mutational profile. Samples were thawed, stained and FACS sorted using a combination of lineage markers, CD34, GPR56 and NKG2D ligands. RNA was extracted and libraries were prepared from 13 samples.	Illumina HiSeq 2000 NextSeq 550	7
EGAD00001009294	Archival de-identified formalin-fixed paraffin-embedded RCC tumor tissue blocks from nephrectomy or tumor biopsy were processed as per below and the same sections were used for both DNA and RNA extractions. For WES (ACE version 3; Illumina NovaSeq), samples were profiled using Personalis ACE Cancer Exome (Personalis, Inc, Menlo Park, CA). Whole-transcriptome profiles were generated by RNA-seq (Accuracy and Content Enhanced (ACE) version 3; Illumina NovaSeq) using Personalis ACE Cancer Transcriptome (Personalis, Inc, Menlo Park, CA). Of the 615 patients in the intent-to-treat population in S-TRAC trial, 193 individual specimens were available for molecular profiling, of which 171 (27.8%) (sunitinib, n = 91; placebo, n = 80) returned results for the WES analysis, and 133 (21.6%) (sunitinib, n = 72; placebo, n = 61) returned results for the GES analysis. Of the 138 WTS samples with data, replicates for two patients were summarized by median expression, and three samples were excluded from the final analysis due to low counts.	Illumina NovaSeq 6000	309
EGAD00001009296	Pulmonary atypical carcinoid. DNA WES and RNA-Seq on 42 tumour samples collected at autopsy, 10x Chromium linked read whole genome sequencing on four tumours plus one normal sample, and targeted DNA sequencing on two clinical biopsies and one blood plasma sample. Note – RNA-Seq dataset features complex batch effect attributable to tissue processing artefacts, as described in "Complex patterns of genomic heterogeneity identified in 42 tumor samples and ctDNA of a pulmonary atypical carcinoid patient" - Robb et al., 2022 and detailed in Supplementary Table S3.	HiSeq X Ten Ion Torrent S5 NextSeq 500 unspecified	47
EGAD00001009297	Whole genome sequencing of paired tumor-normal samples of pediatric Wilms tumors		1
EGAD00001009298	Bulk RNA-seq data of pediatric Wilms tumors		1
EGAD00001009299	This dataset consists of paired-end DNA sequencing (whole exome and targeted-capture) of tumours for the BEACCON study. There are 240 unique samples consisting of 92 matched tumour-germline pairs and 56 unmatched tumours totalling 148 patients. There are 33 paired and 6 unpaired tumours sequenced using the Agilent SureSelect All Human Exon v6 libraries, 1 paired and 8 unpaired tumours sequenced using Agilent SureSelect All Human Exon v7 libraries and 48 paired and 3 unpaired tumours sequenced using Twist Bioscience Comprehensive Human Exome v1 libraries totalling 176 whole exome samples. There are 3 paired and 58 unpaired tumours sequenced using a custom Agilent SureSelectXT library totalling 64 targeted capture samples.	Illumina NovaSeq 6000 NextSeq 550	230
EGAD00001009300	extended cohort of single cell RNAseq data of lung adenocarcinoma	Illumina NovaSeq 6000	107
EGAD00001009301	RNA sequencing data of a collection of 6 pediatric ependymoma cases		6
EGAD00001009302	RNASeq files for Roussel-ATRT-TM paper titled "Atypical teratoid/ rhabdoid tumoroids reveal subgroup-specific drug vulnerabilities"	Illumina HiSeq 2000	8
EGAD00001009303	Data generated through single nuclei RNA sequencing on 5 regions of the brain (frontal cortex, ganglionic eminence, hippocampus, thalamus and cerebellum) from 3 fetuses (two of 14 and one of 15 post-conception weeks, all female). Tissue was acquired from the MRC-Wellcome Trust Human Developmental Biology Resource (HDBR) with ethical approval. snRNA-seq libraries were prepared from ∼10,000 nuclei from each sample using Chromium Single Cell 3ʹ (v3) reagents (10X Genomics). Quality control of libraries was performed using the Agilent 5200 Fragment Analyzer before sequencing on an Illumina NovaSeq 6000 to a depth of at least 865 million (median = 1.01 billion) read pairs per library. Raw sequencing data were converted into FASTQ files. For a full description of data generation, please see Cameron et al, Biological Psychiatry 2022, https://doi.org/10.1016/j.biopsych.2022.06.033.	Illumina NovaSeq 6000	17
EGAD00001009304	Genomic profiling at diagnosis of B-cell precursor Acute Lymphoblastic Leukemia (BCP-ALL) in adults is used to guide disease classification, risk stratification and treatment decisions. Patients for whom diagnostic screening fails to identify disease defining or risk stratifying lesions are classified as B-other ALL. We screened a cohort of 652 BCP-ALL cases enrolled in UKALL14 to identify and perform whole genome sequencing (WGS) on paired tumor-normal samples. For 52 B-other patients we compared WGS findings to data from clinical and research cytogenetics. WGS identifies a cancer associated event in 51/52 cases; this includes an established subtype defining genetic alteration in 5/52 that were previously missed by standard-of-care genetics. Of the 47 true B-other ALL we identified a recurrent driver in 87% (41). Complex karyotype by cytogenetics emerges as a heterogeneous group, underlain by distinct genetic alterations associated with either favorable (DUX4-r) or poor outcomes (MEF2D-r, IGK::BCL2). For a subset of 31 cases, we integrate findings from RNA-sequencing (RNA-seq) analysis to include fusion gene detection, and classification by gene expression. Compared to RNA-seq, WGS was sufficient to detect and resolve recurrent genetic subtypes; however, RNA-seq can provide orthogonal validation of findings. In conclusion, we demonstrate that WGS can identify clinically relevant genetic abnormalities missed by standard-of-care testing and identify leukemia driver events in virtually all cases of B-other ALL.	HiSeq X Ten	115
EGAD00001009305	Genomic profiling at diagnosis of B-cell precursor Acute Lymphoblastic Leukemia (BCP-ALL) in adults is used to guide disease classification, risk stratification and treatment decisions. Patients for whom diagnostic screening fails to identify disease defining or risk stratifying lesions are classified as B-other ALL. We screened a cohort of 652 BCP-ALL cases enrolled in UKALL14 to identify and perform whole genome sequencing (WGS) on paired tumor-normal samples. For 52 B-other patients we compared WGS findings to data from clinical and research cytogenetics. WGS identifies a cancer associated event in 51/52 cases; this includes an established subtype defining genetic alteration in 5/52 that were previously missed by standard-of-care genetics. Of the 47 true B-other ALL we identified a recurrent driver in 87% (41). Complex karyotype by cytogenetics emerges as a heterogeneous group, underlain by distinct genetic alterations associated with either favorable (DUX4-r) or poor outcomes (MEF2D-r, IGK::BCL2). For a subset of 31 cases, we integrate findings from RNA-sequencing (RNA-seq) analysis to include fusion gene detection, and classification by gene expression. Compared to RNA-seq, WGS was sufficient to detect and resolve recurrent genetic subtypes; however, RNA-seq can provide orthogonal validation of findings. In conclusion, we demonstrate that WGS can identify clinically relevant genetic abnormalities missed by standard-of-care testing and identify leukemia driver events in virtually all cases of B-other ALL.	Illumina HiSeq 4000	33
EGAD00001009306	Fresh nephrectomy samples were collected from a total of 5 untreated ccRCC patients. Out of these 5 patients, 7 samples were obtained. Two samples consisted of fresh versus frozen single cells from the primary tumor site of one patient. Two other samples consisted of matched primary and distant thrombus sites (the vena cava) of a second patient. The three remaining samples came from the primary tumor sites of three distinct ccRCC patients. Single cells were captured into 10x barcoded gel beads and RNA-sequencing library preparation was done using Chromium Single Cell 3' v2 chemistry. Sequencing was performed on an Illumina HiSeq 4000 sequencer.	Illumina HiSeq 4000	7
EGAD00001009307	In this study, we aimed to identify somatic structural variation in skin fibroblasts at the single-cell level and investigate its direct consequence on nucleosome occupancy using the scNOVA approach. For this purpose, we performed strand-specific single-cell sequencing of a skin fibroblast sample from a male donor.	NextSeq 500	95
EGAD00001009308	This dataset contains data used in the paper titled "Significant and pervasive effects of RNA degradation on Nanopore direct RNA sequencing". The data consists of one post mortem sample that was sequenced with direct RNA sequencing from Oxford Nanopore Technologies on a PromethION flow cell.	PromethION	1
EGAD00001009309	This dataset contains: 1.) Whole-genome sequencing (WGS) data (~6x) of 259 cfDNA samples obtained from 50 colorectal cancer (CRC) patients and 61 healthy controls. Paired-end sequencing was performed with 2x101 bp reads on the NovaSeq 6000 system. Data is provided as mapped .bam files (aligned to GRCh38/hg38). 2.) WGS data (~1x) of 50 tumor biopsy and 45 saliva samples from CRC patients. Paired-end sequencing was performed with 2x101 bp reads on the NovaSeq 6000 system. Data is provided as mapped .bam files (aligned to GRCh38/hg38).	Illumina NovaSeq 6000	354
EGAD00001009311	The dataset contains a genomics characterization of 35 triple-negative Asian breast tumours from the Malaysian Breast Cancer cohort. This includes whole-exome sequencing of tumour tissue at 80X, whole-exome sequencing of matched normal (blood) tissue at 40X, and RNA-seq of tumour tissue at 40X coverage (>15 million reads). Whole-exome libraries were prepared using the Nextera Rapid Capture Exome Kit; exome capture was performed in pools of 3 and subjected to paired end 75 sequencing on a NovaSeq 6000 platform. RNA libraries were prepared using the TruSeq Stranded Total RNA HT kit with Ribo-Zero Gold as per manufacturer’s instructions and also subjected to paired end 75 sequencing on a NovaSeq 6000 platform. Uploaded bam files have been mapped to the hs37d5 human genome and processed using the standard GATK pipelines. Paired clinical, demographic, genotyping, and overall survival data for these patients are available from the associated publications or by request.	Illumina NovaSeq 6000	105
EGAD00001009315	The dataset encompasses 172 Runs from the WGSPD Project 3 - Genomic Strategies to Identify High-impact Psychiatric Risk Variants Project	Illumina Genome Analyzer IIx	172
EGAD00001009316	Single Cell Genome Sequence for Triple negative breast cancer patient-derived xenograft SA609 passage 3 on DLP+ library A95618B	NextSeq 550	15
EGAD00001009317	Spinocerebellar ataxia type 3 (SCA3) is the most common autosomal dominant inherited ataxia worldwide, caused by a CAG repeat expansion in the Ataxin-3 gene resulting in a polyglutamine (polyQ)-expansion in the corresponding protein. Here we have RNA-sequencing data from the cerebellum of individuals with SCA3 and matched controls.	Illumina HiSeq 2000	12
EGAD00001009318	Small variants in HAE of several Canary Islanders sequenced with Illumina WES.		1
EGAD00001009319	Single Cell Genome Sequence for triple negative breast cancer patient SA1162SB on DLP+ library A95628A	HiSeq X Ten	3
EGAD00001009320	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1051D passage 1 on DLP+ library A95629B	HiSeq X Ten	3
EGAD00001009321	Single Cell Genome Sequence for immortalized breast epithelium - BRCA1-/- Tp53-/- cell line 184-hTERT-22 L9 83.86 on DLP+ library A95632A	Illumina HiSeq 2500	7
EGAD00001009322	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA609 passage 6 on DLP+ library A95632C	NextSeq 550	8
EGAD00001009323	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1091A passage 1 on DLP+ library A95634A	HiSeq X Ten	3
EGAD00001009324	Single Cell Genome Sequence for immortalized breast epithelium - BRCA2-/-; Tp53-/- cell line 184-hTERT-22 L9 116.126 on DLP+ library A95635A	Illumina HiSeq 2500	4
EGAD00001009325	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1052J passage 1 on DLP+ library A95650A	HiSeq X Ten	3
EGAD00001009326	Single Cell Genome Sequence for immortalized breast epithelium BRCA2+/- Tp53-/- cell line 184-hTert L9 116.66 on DLP+ library A95652A	HiSeq X Ten	2
EGAD00001009327	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1049A passage 1 on DLP+ library A95652B	HiSeq X Ten	3
EGAD00001009328	This ADPKD project has 3 different experiments, 7 different single cell RNA-seq data, 5 different single nuclei RNA-seq data and 6 different bulk ATAC-seq data objects.	Illumina NovaSeq 6000 NextSeq 550	15
EGAD00001009330	Sequencing data for three HGSC patients with patient derived cell lines. WES data for two patient derived cell line samples and matched blood control samples. Fresh frozen tumor samples of two patients with WES or WGS sequencing data.	HiSeq X Ten Illumina HiSeq 2000	9
EGAD00001009331	Single-cell RNA-seq, single-cell ATAC-seq, and genotypes used in the analysis for the study "Altered and allele-specific open chromatin landscape reveal epigenetic and genetic regulators of innate immunity in COVID-19". The RNA-seq and ATAC-seq are raw data in FASTQ format while the genotypes are in the VCF format which was filtered and imputed (more details are available in the main text of the study).	Illumina NovaSeq 6000	32
EGAD00001009333	The phenotypic data for 6431 samples of the KDRN Study from Ghana and Nigeria.		6431
EGAD00001009335	The dataset for "Integrative modeling of tumor genomes and epigenomes for enhanced cancer diagnosis by cell-free DNA" includes 3784 whole genome sequencing BAM files generated on the MGI and Illumina platforms. The analyzed samples include plasma samples from normal individuals and patients with cancer.	unspecified	3784
EGAD00001009336	A living biobank of patient-derived ductal carcinoma in situ (DCIS) Mouse-INtraDuctal (MIND) xenografts, created to find factors explaining invasive growth. The dataset includes both primary and PDX samples. Invasive growth was scored in the PDX samples.	Illumina HiSeq 2500 Illumina NovaSeq 6000	227
EGAD00001009337	Low-pass whole-genome sequencing of pretherapy lymphoma cfDNA and targeted sequencing of cfDNA, tumor tissue and whole-blood samples of NLG-LBC-05 patient samples and cfDNA of nine subjects with no known cancer. Hybrid capture target enrichment; panel target and sequencing information provided in PMID:34932792. FASTQ files provided for targeted sequencing, separate sequencing runs per sample noted with prefix "run" if applicable. Sequences from DTX1 and KLHL6 targets are advised to be excluded from analyses due to PCR/plasmid contaminants of in-house origin.	Illumina HiSeq 2500 Illumina NovaSeq 6000	391
EGAD00001009339	High-resolution lung adenocarcinoma expression subtypes identify tumors with dependencies on MET, CDK4, CDK6, and PD-L1	Illumina HiSeq 2500	164
EGAD00001009340	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA673	Illumina HiSeq 2500	1
EGAD00001009341	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA674	Illumina HiSeq 2500	1
EGAD00001009342	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA675	Illumina HiSeq 2500	1
EGAD00001009343	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA676	Illumina HiSeq 2500	1
EGAD00001009344	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA677	Illumina HiSeq 2500	1
EGAD00001009345	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA678	Illumina HiSeq 2500	1
EGAD00001009346	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA679	Illumina HiSeq 2500	1
EGAD00001009347	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA680	Illumina HiSeq 2500	1
EGAD00001009348	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA681	Illumina HiSeq 2500	1
EGAD00001009349	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA682	Illumina HiSeq 2500	1
EGAD00001009350	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA683	Illumina HiSeq 2500	1
EGAD00001009351	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA221	Illumina HiSeq 2000	1
EGAD00001009352	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA238	Illumina HiSeq 2000	1
EGAD00001009353	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA239	Illumina HiSeq 2000	1
EGAD00001009354	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA300	Illumina HiSeq 2000	1
EGAD00001009355	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA423	Illumina HiSeq 2000	1
EGAD00001009356	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA425	Illumina HiSeq 2000	1
EGAD00001009357	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA495	Illumina HiSeq 2000	1
EGAD00001009358	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA286	Illumina HiSeq 2000	1
EGAD00001009359	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA289	Illumina HiSeq 2000	1
EGAD00001009360	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA291	Illumina HiSeq 2000	1
EGAD00001009361	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA280	Illumina HiSeq 2000	1
EGAD00001009362	Whole genome sequencing of normal sample for triple negative breast cancer patient SA221	Illumina HiSeq 2000	1
EGAD00001009363	Whole genome sequencing of normal sample for triple negative breast cancer patient SA238	Illumina HiSeq 2000	1
EGAD00001009364	Whole genome sequencing of normal sample for triple negative breast cancer patient SA239	Illumina HiSeq 2000	1
EGAD00001009365	Whole genome sequencing of normal sample for triple negative breast cancer patient SA280	Illumina HiSeq 2000	1
EGAD00001009366	Whole genome sequencing of normal sample for triple negative breast cancer patient SA286	Illumina HiSeq 2000	1
EGAD00001009367	Whole genome sequencing of normal sample for triple negative breast cancer patient SA289	Illumina HiSeq 2000	1
EGAD00001009368	Whole genome sequencing of normal sample for triple negative breast cancer patient SA291	Illumina HiSeq 2000	1
EGAD00001009369	Whole genome sequencing of normal sample for triple negative breast cancer patient SA300	Illumina HiSeq 2000	1
EGAD00001009370	Whole genome sequencing of normal sample for triple negative breast cancer patient SA423	Illumina HiSeq 2000	1
EGAD00001009371	Whole genome sequencing of normal sample for triple negative breast cancer patient SA425	Illumina HiSeq 2000	1
EGAD00001009372	Whole genome sequencing of normal sample for triple negative breast cancer patient SA495	Illumina HiSeq 2000	1
EGAD00001009373	Whole genome sequencing of normal sample for triple negative breast cancer patient SA673	Illumina HiSeq 2500	1
EGAD00001009374	Whole genome sequencing of normal sample for triple negative breast cancer patient SA674	Illumina HiSeq 2500	1
EGAD00001009375	Whole genome sequencing of normal sample for triple negative breast cancer patient SA675	Illumina HiSeq 2500	1
EGAD00001009376	Whole genome sequencing of normal sample for triple negative breast cancer patient SA676	Illumina HiSeq 2500	1
EGAD00001009377	Whole genome sequencing of normal sample for triple negative breast cancer patient SA677	Illumina HiSeq 2500	1
EGAD00001009378	Whole genome sequencing of normal sample for triple negative breast cancer patient SA678	Illumina HiSeq 2500	1
EGAD00001009379	Whole genome sequencing of normal sample for triple negative breast cancer patient SA679	Illumina HiSeq 2500	1
EGAD00001009380	Whole genome sequencing of normal sample for triple negative breast cancer patient SA680	Illumina HiSeq 2500	1
EGAD00001009381	Whole genome sequencing of normal sample for triple negative breast cancer patient SA681	Illumina HiSeq 2500	1
EGAD00001009382	Whole genome sequencing of normal sample for triple negative breast cancer patient SA682	Illumina HiSeq 2500	1
EGAD00001009383	Whole genome sequencing of normal sample for triple negative breast cancer patient SA683	Illumina HiSeq 2500	1
EGAD00001009384	Whole-genome sequencing data of 177 samples.	HiSeq X Ten	177
EGAD00001009385	Single-cell mRNA-sequencing to generate a transcriptomic atlas of soft tissue sarcoma tumors	NextSeq 500	13
EGAD00001009386	Comprehensive map of first- and second-trimester gonadal development in humans using a combination of single-cell and spatial transcriptomics, chromatin accessibility assays, and imaging. ArrayExpress Accession: E-MTAB-10551	Illumina NovaSeq 6000	28
EGAD00001009387	Comprehensive map of first- and second-trimester gonadal development in humans using a combination of single-cell and spatial transcriptomics, chromatin accessibility assays, and imaging. ArrayExpress Accession: E-MTAB-10570	Illumina NovaSeq 6000	8
EGAD00001009388	Comprehensive map of first- and second-trimester gonadal development in humans using a combination of single-cell and spatial transcriptomics, chromatin accessibility assays, and imaging. ArrayExpress Accession: E-MTAB-11708	Illumina NovaSeq 6000	4
EGAD00001009389	This experiment consists of RNAseq of liver harvested from CDAHFD mice treated for 8 weeks with either the MGAT2 inhibitor compound BMS-963272 (N = 10) or with vehicle (N = 10).	Illumina HiSeq 2500	20
EGAD00001009390	This experiment consists of RNAseq of jejunum (small intestine) harvested from CDAHFD mice treated for 8 weeks with either the MGAT2 inhibitor compound BMS-963272 (N = 14) or with vehicle (N = 14).	Illumina HiSeq 2500	28
EGAD00001009391	A subset of meningiomas progress in histopathological grade and drivers of progression are poorly understood. We aimed to identify somatic mutations and copy number alterations (CNAs) associated with grade progression in a unique matched tumour dataset. This dataset consists of DNA sequencing from 10 individuals with meningiomas, where the meningiomas have undergone grade progression. 50 meningiomas were sequenced from the 10 individuals using the hybrid capture-based TruSight Oncology 500 (TSO500) Library Preparation Kit, and 13 of those meningiomas were also sequenced using Agilent SureSelect Clinical Research Exome V2. BAM files for the sequencing data are included in the dataset.	Illumina NovaSeq 6000	63
EGAD00001009392	The dataset contains 12 lung cancer plasma cfDNA samples, 8 bladder cancer and 2 healthy control urine cfDNA samples collected in EDTA collection tubes. Shallow WGS was performed using both Oxford Nanopore Technologies' MinION platform with an R9.4.1 flow cell and the SQK-PBK004 kit (22 files) and Illumina Novaseq platform with an S4 flow cell in PE150bp configuration (2x22 files).	Illumina NovaSeq 6000 MinION	53
EGAD00001009393	Tumor-infiltrating macrophages and monocytes were sorted on Aria II (Becton Dickinson) into TRIzol LS and flash frozen. RNA was extracted with chloroform. Isopropanol and linear acrylamide were added, and the RNA was precipitated with 75% ethanol. Total RNA (0.649–1 ng) with RNA integrity numbers 6.8 to 10 underwent amplification using the SMART-Seq v4 Ultra Low Input RNA Kit (Clontech; cat. #63488). Amplified cDNA (15 ng) was used to prepare libraries with the KAPA Hyper Prep Kit (Kapa Biosystems, KK8504) using 8 cycles of PCR.	Illumina HiSeq 2500	9
EGAD00001009394	Additional RNASeq files for Roussel paper titled "Combination of CDK4/6 with BET-bromodomain and PI3K/mTOR inhibitors in medulloblastoma in vitro and in vivo"	Illumina HiSeq 2000	19
EGAD00001009395	Total RNA sequencing of cultured ONS cells derived from patients with Alzheimer's disease (AD), individuals with mild cognitive impairment (MCI) and cognitively healthy controls.		1
EGAD00001009396	We performed deep targeted DNA sequencing with a panel of 74 selected cancer-related genes previously identified to be recurrently mutated in EBV-associated DLBCL. Sequencing was performed on a HiSeq platform (Illumina) with 250 bp paired-end reads. There are 68 FFPE samples in this targeted DNA-seq dataset, with 46 unpaired tumors and 22 normals used as a panel of normals.	Illumina HiSeq 4000	68
EGAD00001009397	PacBio HiFi sequencing was performed on 68 barcoded patients' genomic DNA after a telobait-capture protocol to enrich for telomeric regions. The sequencing reads of each patient were de-multiplexed and presented as patient-specific PacBio CCS BAM files. There are 56 new samples and 12 repeated samples from run 1.	Sequel	68
EGAD00001009398	Whole-genome sequencing of high-grade serous ovarian cancer (HGSC) tumours and matched normals from long-term survivors performed as part of the Multidisciplinary Ovarian Cancer Outcomes Group (MOCOG) study. The dataset includes fastq files from 58 HGSC tumours (53 primary tumours and 5 recurrent tumours) and 53 matched normals from 53 long-term survivor patients. Sequence libraries were generated from tumour and matched normal genomic DNA using the KAPA HyperPrep PCR-free library preparation kit (Roche) according to manufacturer’s instructions. Sequencing was carried out by the Kinghorn Centre for Clinical Genomics Sequencing Laboratory (Sydney, Australia) on the HiSeq X Ten System (Illumina) to a minimum base coverage of 30-fold for normal DNA and 60-fold for tumour DNA samples.	HiSeq X Ten	111
EGAD00001009399	This dataset contains bulk transcriptomes from the inoperable cohorts of the LUD2015-005 study (NCT02735239, EudraCT 2015-005298-19). Transcriptomes were prepared from pre-treatment oesophageal tumour biopsies using a ribodepletion approach in order to assess both previously reported (e.g. PD-L1 expression) and novel predictive expression-based biomarkers for immunochemotherapy treatment in this setting. On-treatment and post-treatment biopsies were generated as well to characterize response to therapy, and samples were also prepared from paired normal GI tissues for a subset of patients. scRNA-seq based deconvolution was also applied to bulk transcriptomes in this study to estimate the cell composition of tumour biopsies and assess the link between the presence of specific cell types with immunochemotherapy outcomes.	Illumina HiSeq 4000	144
EGAD00001009400	This dataset contains whole genome sequencing (WGS) generated from the inoperable cohorts of the LUD2015-005 study (NCT02735239, EudraCT 2015-005298-19). WGS data were generated with the aim to assess previously reported (e.g. tumour mutational burden) and novel predictive genomic markers for immunochemotherapy treatment in this setting. Using these data, mutation and copy number profiles were generated for the LUD2015-005 study, which were assessed for correlation with patient outcomes from this trial.	HiSeq X Ten Illumina NovaSeq 6000	102
EGAD00001009401	This dataset contains single-cell RNA-seq generated from the LUD2015-005 study (NCT02735239, EudraCT 2015-005298-19) and additional donors with Barrett's oesophagus using the 5' Single Cell Gene Expression assay from 10x Genomics. Samples were generated from oesophageal tumours, Barrett's oesophagus, and normal oesophagus and gastric tissues, with the aim of generating a reference atlas for cell types found in normal and disease-associated tissue states in the upper GI tract. This reference atlas was used as the basis for a deconvolution workflow to estimate the cell composition of bulk transcriptomes from these tissues, and to assess cell type-specific expression patterns of potential predictive biomarkers for immunochemotherapy regimens in this setting.	NextSeq 500	59
EGAD00001009402	Somatic mosaicism (SM), referring to the presence of somatic mutations in sub-populations of cells within healthy individuals, is associated with an increased risk of a variety of diseases, including cancer. Blood is at particularly high risk of SM, given its rapid turnover and functionally- heterogeneous cell-type composition. While the roles of point mutations and large-scale rearrangements in blood SM have been scrutinised in recent years, the functional impact of mosaic structural variants (mSVs) remains poorly understood. Using haplotype-resolved single-cell multi-omics based on Strand-seq technology, we explored the mSV landscape of human hematopoietic stem and progenitor cells (HSPCs).	NextSeq 500	1133
EGAD00001009403	RNA-Seq was performed on 249 DS-ALL samples. Library preparation was carried out using TruSeq Stranded Total RNA Library Prep Kit. The libraries were sequenced on a NovaSeq platform with read length of 2×101.	Illumina NovaSeq 6000	249
EGAD00001009404	Cetuximab treatment in organoids	unspecified	62
EGAD00001009405	Primary lung fibroblasts were isolated from well-matched control donors (no COPD, n=3) and patients with COPD (GOLD stage I-IV, n=8). Total RNA of cultured human lung fibroblasts was isolated at passage 3 and rRNA was depleted. 75 bp single-end reads were generated from RNA libraries on Illumina NextSeq 500.	NextSeq 500	11
EGAD00001009406	Primary lung fibroblasts were isolated from well-matched control donors (no COPD, n=3) and patients with COPD (GOLD stage I-IV, n=8). Genomic DNA of cultured human lung fibroblasts was isolated at passage 3, fragmented by tagmentation, and subjected to bisulfite treatment. 100 bp paired-end reads were generated from DNA libraries on the Illumina HiSeq 2500 platform.	Illumina HiSeq 2500	11
EGAD00001009407	29 paired FASTQ files from a Hi-C assay performed on mCRPC tumors. Sequencing was performed using 150nt paired reads generated by a NovaSeq 6000 instrument.	Illumina NovaSeq 6000	28
EGAD00001009408	Paired FASTQ files from a Hi-C assay performed on mCRPC tumors. Sequencing was performed using 150nt paired reads generated by a NovaSeq 6000 instrument.	Illumina NovaSeq 6000	65
EGAD00001009409	BAM files from RNA-seq of PDAC samples used in the COMPASS hENT1 study		-
EGAD00001009410	Germline variant calls were defined using the sequenced reads derived from 230 patients with hepatocellular carcinoma. This dataset comprises one aggregated VCF file.		230
EGAD00001009411	ONT (PromethION) sequencing of chromothriptic medulloblastoma. Three samples: blood, primary tumor, and relapse tumor. Includes a fourth low-coverage run that multiplexes blood and primary tumor.	PromethION	3
EGAD00001009412	Longitudinal plasma samples (n = 79) of 21 ALK-positive NSCLC patients and 13 healthy donors were collected alongside 15 ALK-positive tumor tissue and 10 healthy lung tissue specimens. All plasma and tissue samples were analyzed by cell-free DNA methylation immunoprecipitation sequencing to generate genome-wide 5-mC profiles. Paired cfMeDIP on NextSeq 550 using KAPA Hyper Prep Kit was done.	NextSeq 550	104
EGAD00001009413	Total RNA sequencing (SMARTer Stranded Total RNA-Seq Kit v2) data of extracellular RNA (exRNA) from liquid biopsies of the validation PDX/CDX cohort	Illumina NovaSeq 6000	60
EGAD00001009414	Clinical data including the treatment arm, HER2 status Pre- and Post-NAT.		-
EGAD00001009415	Biomarker data including the time point of sample collection, tumor content and ERBB2 gene expression.		1
EGAD00001009416	Amplicon sequencing data for 90 patients hospitalized for COVID-19 on a general ward. Patients had a median age of 60.5 (52.0–69.3) years and were overweight (body mass index: 28.4 (24.4–32.6) kg/m²). 35.6% of the cohort were female. The following genes were sequenced on a NovaSeq 6000 instrument with an enrichment-based library preparation (IDT-xGEN) with a median coverage of 2000x: ABL1, ASXL1, ATRX, BCOR, BCORL1, BRAF, CALR, CBL, CBLB, CBLC, CDKN2A, CEBPA, CSF3R, CUX1, DNMT3A, ETV6, EZH2, FBXW7, FLT3, FLT3-ITD, GATA1, GATA2, GNAS, GNB1, HRAS, IDH1, IDH2, IKZF1, JAK2, JAK3, KDM6A, KIT, KMT2A, KRAS, MPL, MYD88, NOTCH1, NPM1, NRAS, PDGFRA, PHF6, PPM1D, PTEN, PTPN11, RAD21, RUNX1, SETBP1, SF3B1, SMC1A, SMC3, SRSF2, STAG2, TET2, TP53, U2AF1, WT1, ZRSR2	Illumina NovaSeq 6000	90
EGAD00001009417	13 paired FASTQ files from a Hi-C assay performed on mCRPC tumors. Sequencing was performed using 150nt paired reads generated by a NovaSeq 6000 instrument.	Illumina NovaSeq 6000	13
EGAD00001009418	RNAseq FASTQ files from 418 pre-treatment (Ven-Obi or Clb-Obi) CD19+ B cells.	unspecified	418
EGAD00001009419	RNAseq FASTQ files from 44 pre-treatment (Ven-Obi or Clb-Obi) and 44 paired, post-treatment relapsed CD19+ B cells.	unspecified	88
EGAD00001009420	Fastq transcriptomic sequencing files from Z138 mantle cell lymphoma (MCL) cell line upon MSI2 knockdown (KD) with two different shRNAs and after MSI2 inhibition with Ro 08-2750 small molecule. Dataset includes 4 samples of Z138 MSI2-KD, 4 of Z138 control, 3 of Z138 treated with Ro 08-2750 and 3 of Z138 treated with DMSO.	Illumina HiSeq 2500	14
EGAD00001009421	Fastq transcriptomic sequencing files from Z138 SOX11+ and JVM2 SOX11- mantle cell lymphoma (MCL) cell lines upon SOX11 knock out (KO) and ectopic overexpression, respectively. Dataset includes 3 samples of Z138-SOX11KO, 3 of Z138 control, 3 of JVM2 control and 3 of JVM2-SOX11 MCL cell lines.	Illumina HiSeq 2500	12
EGAD00001009422	Dataset includes fastq transcriptomic sequencing files from 8 conventional (SOX11+) and 4 non-nodal (SOX11-) mantle cell lymphoma (MCL) primary cases. RNA sequencing was performed on peripheral blood and lymph node diagnostic samples.	Illumina HiSeq 2500	12
EGAD00001009424	Illumina whole genome sequencing of Medulloblastoma Blood, Primary tumor, and Relapse tumor	HiSeq X Ten	1
EGAD00001009425	Illumina RNA-sequencing of Medulloblastoma primary and relapse tumor.	Illumina HiSeq 2000	1
EGAD00001009426	Genetic data of 14 rigorously selected CUP samples	Illumina NovaSeq 6000	15
EGAD00001009427	8 pregnant women at the 3rd trimesters, 4 hepatitis B carriers, and 4 patients with hepatocellular carcinoma	Sequel	16
EGAD00001009428	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1181A passage 1 on DLP library A108765A	HiSeq X Ten	3
EGAD00001009429	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1182E passage 1 on DLP+ library A108847B	HiSeq X Ten	4
EGAD00001009430	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA610 passage 3 on DLP+ library A110660A	HiSeq X Ten	4
EGAD00001009431	Single Cell Genome Sequence for Immortalized breast epithelium BRCA2+/- Tp53-/- cell line 184-hTert L9 116.66 cell line SA1188 on DLP+ library A118357B	HiSeq X Ten	3
EGAD00001009432	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA605 passage 3 on DLP+ library A118368B	HiSeq X Ten	4
EGAD00001009433	Single Cell Genome Sequence for Telomerase immortalized breast epithelium cell line 184-hTERT 85.14 p20 cell line AT135 on DLP+ library A118389B	NextSeq 2000	4
EGAD00001009434	Single Cell Genome Sequence for Immortalized breast epithelium BRCA2+/- Tp53-/- cell line 184-hTert L9 116.66 cell line SA1188 on DLP+ library A118425B	HiSeq X Ten	3
EGAD00001009435	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1050B passage 1, patient-derived xenograft SA1050E passage 1 on DLP+ library A118782A	HiSeq X Ten	6
EGAD00001009436	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1050B passage 1, patient-derived xenograft SA1050E passage 1 on DLP+ library A118784A	HiSeq X Ten	6
EGAD00001009437	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA610 passage 3, patient-derived xenograft SA1096C passage 1 on DLP+ library A118790A	HiSeq X Ten	6
EGAD00001009438	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1184D passage 1 on DLP+ library A118797B	HiSeq X Ten	4
EGAD00001009439	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1162B passage 1, patient-derived xenograft SA1096B passage 1 on DLP+ library A118804A	HiSeq X Ten	6
EGAD00001009440	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1096B passage 1 on DLP+ library A118808A	HiSeq X Ten	4
EGAD00001009441	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1096C passage 1 on DLP+ library A118808B	HiSeq X Ten	4
EGAD00001009442	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1180C passage 1 on DLP+ library A118812B	HiSeq X Ten	4
EGAD00001009443	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1162B passage 1 on DLP+ library A118814B	HiSeq X Ten	4
EGAD00001009444	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1180C passage 1 on DLP+ library A118816A	HiSeq X Ten	4
EGAD00001009445	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1184D passage 1 on DLP+ library A118857B	HiSeq X Ten	4
EGAD00001009446	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1053F passage 1 on DLP+ library A95663A	HiSeq X Ten	3
EGAD00001009447	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient SA1162SA on DLP+ library A95668A	HiSeq X Ten	3
EGAD00001009448	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 5 on DLP+ library A95670A	NextSeq 550	3
EGAD00001009449	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 6 on DLP+ library A95670B	NextSeq 550	3
EGAD00001009450	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1096A passage 1 on DLP+ library A95717A	HiSeq X Ten	3
EGAD00001009451	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA604 passage 6 on DLP+ library A95722A	HiSeq X Ten	7
EGAD00001009452	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 2 on DLP+ library A96109A	HiSeq X Ten	5
EGAD00001009453	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1049A passage on DLP+ library A96113A	NextSeq 550	6
EGAD00001009454	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA609 passage 8 on DLP+ library A96130A	HiSeq X Ten	3
EGAD00001009455	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient SA1162SA on DLP+ library A96142B	HiSeq X Ten	3
EGAD00001009456	Single Cell Genome Sequence for triple negative breast cancer patient SA1135, patient SA1162SA on DLP+ library A96154A	HiSeq X Ten	5
EGAD00001009457	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA604 passage 8 on DLP+ library A96161A	HiSeq X Ten	5
EGAD00001009458	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 2, patient-derived xenograft SA611 passage 3 on DLP+ library A96171A	HiSeq X Ten	7
EGAD00001009459	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 15 on DLP+ library A96173A	HiSeq X Ten	7
EGAD00001009460	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 11 on DLP+ library A96174A	HiSeq X Ten	3
EGAD00001009461	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA604 passage 7 on DLP+ library A96175A	HiSeq X Ten	3
EGAD00001009462	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA604 passage 6 on DLP+ library A96177C	HiSeq X Ten	3
EGAD00001009463	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA604 passage 6 on DLP+ library A96180A	HiSeq X Ten	3
EGAD00001009464	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 11, patient-derived xenograft SA609 passage 7 on DLP+ library A96187A	HiSeq X Ten	10
EGAD00001009465	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1049C passage 1 on DLP+ library A96189A	HiSeq X Ten	3
EGAD00001009466	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1051A passage 1 on DLP+ library A96190B	HiSeq X Ten	3
EGAD00001009467	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1093C passage 1, patient SA1147 on DLP+ library A96192A	HiSeq X Ten	5
EGAD00001009468	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1053B passage 1 on DLP+ library A96194A	HiSeq X Ten	3
EGAD00001009469	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1053E passage 1 on DLP+ library A96194B	HiSeq X Ten	3
EGAD00001009470	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1052D passage 1 on DLP+ library A96200B	HiSeq X Ten	3
EGAD00001009471	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1049A passage 1 on DLP+ library A96205B	HiSeq X Ten	-
EGAD00001009472	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1050F passage 1 on DLP+ library A96206B	HiSeq X Ten	3
EGAD00001009473	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1051A passage 1 on DLP+ library A96207A	HiSeq X Ten	3
EGAD00001009474	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1052B passage 1 on DLP+ library A96207B	HiSeq X Ten	3
EGAD00001009475	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient SA1047B on DLP+ library A96210B	HiSeq X Ten	3
EGAD00001009476	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA604 passage 7 on DLP+ library A96212B	HiSeq X Ten	5
EGAD00001009477	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 6, cell line SA1090 on DLP+ library A96213A	HiSeq X Ten	5
EGAD00001009478	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient SA1105, patient SA1103, patient SA1106, patient SA1104 on DLP+ library A96222A	NextSeq 550	11
EGAD00001009479	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA919 passage 7, patient-derived xenograft SA1050A passage 1 on DLP+ library A98181A	HiSeq X Ten	6
EGAD00001009480	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA530 passage 3 on DLP+ library A98240A	HiSeq X Ten	4
EGAD00001009481	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient SA1096A, patient-derived xenograft SA1052D passage 1 on DLP+ library A98243B	HiSeq X Ten	6
EGAD00001009482	Intratumoral heterogeneity (ITH) has been linked to decreased efficacy of clinical treatments. However, although genetic, transcriptomic and epigenetic alterations are hallmarks of esophageal squamous cell carcinoma (ESCC), the extent to which these are heterogeneous in ESCC has not been explored in a unified framework. Further, tumor-infiltrating T lymphocytes (TILs) are directed against cancer cells, but how the immune infiltration acts as a selective force to shape the clonal evolution of ESCC is unclear. In this study, we perform multi-omic sequencing on 186 samples from 36 primary ESCC patients. Through multi-omics analyses, we find that genomic, epigenomic, and transcriptomic ITH are underpinned by ongoing chromosomal instability. Based on the RNA-seq data, we observe diverse levels of immune infiltrate across different tumor sites from the same tumor. We reveal genetic mechanisms of neoantigen evasion under distinct selection pressure from the diverse immune microenvironment. Overall, our work offers an avenue for dissecting the complex contribution of each omics level to ITH in ESCC and thereby supports the development of clinical therapy.	HiSeq X Ten Illumina HiSeq 2000	129
EGAD00001009483	Circle-Seq experiment.	HiSeq X Ten	1
EGAD00001009484		GridION	1
EGAD00001009485	Contains 4 clonal organoid samples + 1 bulk healthy tissue sample	HiSeq X Ten	5
EGAD00001009486	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1050A passage 1, patient SA1234 on DLP+ library A98279A	HiSeq X Ten	-
EGAD00001009487	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA501 passage 2 on DLP+ library A95621B	Illumina HiSeq 2500	1
EGAD00001009488	Single Cell Genome Sequence for Immortalized breast epithelium - BRCA2-/-; Tp53-/- cell line 184-hTERT-22 L9 112.109 cell line SA1055 on DLP+ library A95621A	Illumina HiSeq 2500	3
EGAD00001009489	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient-derived xenograft SA1050D passage 1 on DLP+ library A95717B	HiSeq X Ten	3
EGAD00001009490	Oxford Nanopore Technologies (ONT) long-read sequencing of a paired diagnostic and post-therapy medulloblastoma (2 samples). One sample was sequenced on a GridION, the other on a P2 Solo. In both cases the SQK LSK-109 Kit was used for preparation.	GridION PromethION	2
EGAD00001009492	RNA-Seq data from 20 fresh-frozen postmortem samples from three patients who participated in the CASCADE rapid autopsy program and died of metastatic castration resistant prostate cancer.	Illumina NovaSeq 6000	20
EGAD00001009493	High-coverage whole genome sequencing data (median coverage: 23.5X, range: 14.14X-32.62X) from white blood cells isolated from the buffy coat of blood drawn postmortem from patients who participated in the CASCADE rapid autopsy program and died of metastatic castration resistant prostate cancer.	Illumina NovaSeq 6000	14
EGAD00001009494	Low-pass whole genome sequencing data (median coverage: 0.37X, range: 0.07X-5.8X) from diagnostic formalin-fixed paraffin-embedded tissue and fresh frozen postmortem tissues from ten organs and postmortem blood from patients who participated in the CASCADE rapid autopsy program and died of metastatic castration resistant prostate cancer. For samples CA27_11, CA34_10, CA35_5, CA35_6, CA36_3, CA36_11, CA63_13, CA63_34, CA76_4, CA76_11, CA79_4, CA83_14 and CA83_26, we subsampled (using `samtools view -s 0.01` after mapping) from respective high coverage data from "CASCADE tumour high-coverage whole genome sequencing data" dataset.	Illumina NovaSeq 6000	152
EGAD00001009495	The study includes methylC-capture sequencing (MCC-Seq) on 73 cord blood DNA samples from natural pregnancies (control) and from pregnancies achieved through assisted reproductive technologies in infertile couples (ART/infertile). Samples were collected as part of the Quebec-based 3D (Design, Develop, Discover) longitudinal pregnancy cohort study. All data were generated with 100bp paired-end reads using the Illumina NovaSeq 6000 system.	Illumina NovaSeq 6000	73
EGAD00001009496	Dataset contains paired-end Whole Exome sequencing data from 257 glioma samples from 28 patients. 26 normal blood samples are also included.		283
EGAD00001009497	Dataset contains paired-end RNA-seq sequencing data from 221 glioma samples.		221
EGAD00001009498	Nasal epithelial cells of PCD and non-PCD patients grown at air-liquid interface for RNAseq analysis. A total of 10 non-PCD patients (ALI day 14), 9 non-PCD patients (ALI day 21), 8 non-PCD patients (ALI day 28), 4 PCD patients (ALI day 14, day 21 and day 28), and 23 PCD patients (ALI day 21). Non-PCD patients and the 4 PCD patients on the three ALI days were sequenced at a depth of 100M reads, the remaining 23 PCD patients were sequenced at a depth of 70M. Overall sequencing design was rRNA depletion and 150bp paired-end.	Illumina HiSeq 2500	65
EGAD00001009499	This dataset includes WES, WGS, and RNAseq data generated from autopsy samples.	unspecified	347
EGAD00001009500	Count matrix from 44 pre-treatment (Ven-Obi or Clb-Obi) and 44 paired, post-treatment relapsed CD19+ B cells.		1
EGAD00001009501	Count matrix from 418 pre-treatment (Ven-Obi or Clb-Obi) CD19+ B cells		1
EGAD00001009502	Table of treatment arm information for the 418 RNAseq evaluable population.		1
EGAD00001009504	10x Genomics 5' library scRNA-seq data for 4 iAMP21 patients	Illumina NovaSeq 6000	4
EGAD00001009505	This dataset contains 29 paired FASTQ files from whole-genome bisulfite sequencing (WGBS) assay performed on mCRPC tumors. Sequencing was performed using 150nt paired reads generated by a NovaSeq 6000 instrument. It also contains whole-genome sequencing BAM files aligned to hg38 using BWA from 36 patients, with tumor and matched tumor-adjacent normal samples. Sequencing was generated using HiSeq X Ten.	HiSeq X Ten Illumina NovaSeq 6000	72
EGAD00001009506	Samples encompass primary colorectal tumors or metastases from 75 patients, collected by medical pathologists from surgically removed specimens. Tissues were embedded in optimal cutting temperature (OCT) medium, snap-frozen in liquid nitrogen within 40 minutes of collection and preserved at -80°C. Samples were collected between June 2010 and October 2017 as part of a prospective biobanking project.	Illumina NovaSeq 6000	75
EGAD00001009507	Fastq, Mutect (SNVs), Platypus (indels), and InfoGenomeR (SVs and CNAs) calls from whole genome sequencing data and fastq files of whole genome transcription data of five patients with pediatric medulloblastoma.	Illumina NovaSeq 6000	1
EGAD00001009508	We conducted whole-genome sequencing on blood and buccal specimens from a family with chimerism identified in the two monochorionic dizygotic twins. Blood DNA was obtained from all family members. In addition, we obtained buccal specimens from the chimeric twins. Whole-genome sequencing was conducted on Illumina NovaSeq.	Illumina NovaSeq 6000	8
EGAD00001009509	Targeted exome sequencing for a panel of 13 CLL driver genes	Illumina MiSeq	58
EGAD00001009510	Differential Presence of Exons in Cell-Free DNA Reveals Different Patterns in Colorectal Cancer Between Metastatic, Non-Metastatic Patients and Healthy Donors. 159 samples, Illumina sequencing technology.	Illumina NovaSeq 6000	159
EGAD00001009512	This dataset contains three sets of samples. The first sample set contains euploid fetus pregnancies reported by NIPTIFY screening test and postnatal evaluation. Dataset was processed similarly to previously published guidelines from KU Leuven, with modifications [1]. Briefly, peripheral blood samples were collected in cell-free DNA BCT tubes (Streck, USA), and plasma was separated with standard dual centrifugation. Cell-free DNA was extracted from 3 ml plasma using MagMAX Cell-Free DNA Isolation Kit (ThermoFisher Scientific). Whole-genome libraries were prepared using the FOCUS (Fragmented DNA Compact Sequencing Assay, Competence Centre on Health Technologies, Estonia) NIPT method protocol with 12 cycles for the final PCR enrichment step. In the following quantification, equal amounts of 36 samples were pooled, and the quality and quantity of the pool were assessed on Agilent 2200 TapeStation (Agilent Technologies, USA). Whole genome sequencing was performed on the NextSeq 550 instrument (Illumina Inc.) with an average coverage of 0.32× (minimum 0.08 and maximum 0.42) and producing 85 bp single-end reads. The second sample set contains a single NIPT sample postnatally diagnosed with Prader-Willi syndrome. The sample was sequenced with Illumina NextSeq 500 platform, producing 85 bp single-end reads with an average per-sample coverage of 0.32× at the University of Tartu, Institute of Genomics Core Facility, according to the manufacturer’s standard protocols, as described previously [2]. The third sample set contains samples SC005 (SeraCare Life Sciences Inc lot #10446565), SC0042 (#10571706), and SC016 (#10560229). These are SeraCare Life Sciences Inc circulating cell-free DNA (ccfDNA)-like mixtures of human genomic DNA consisting of matched maternal and fetal DNA. SC005 and SC0042 consist of matched maternal DNA and DNA of a fetus with DiGeorge Syndrome. SC016 is a custom-ordered DNA mix with fetal DNA having a pathogenic loss of the terminal region of 20p13 and a pathogenic 3q29 duplication. SC016 was processed in the same way as the first sample set, and SC0042 in the same way as the second sample set. Sample SC005 was processed twice, once as for sample set 1 and once as for sample set 2. This study was performed with the approval of the Research Ethics Committee of the University of Tartu (#352/M-12). 1. Bayindir B, Dehaspe L, Brison N, Brady P, Ardui S, Kammoun M, et al. Noninvasive prenatal testing using a novel analysis pipeline to screen for all autosomal fetal aneuploidies improves pregnancy management. Eur J Hum Genet. 2015;23: 1286–1293. doi:10.1038/ejhg.2014.282 2. Žilina O, Rekker K, Kaplinski L, Sauk M, Paluoja P, Teder H, et al. Creating basis for introducing noninvasive prenatal testing in the Estonian public health setting. Prenat Diagn. 2019;39: 1262–1268. doi:10.1002/pd.5578	NextSeq 550	377
EGAD00001009513	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA604 passage 8 on DLP+ library A96141A	HiSeq X Ten	212
EGAD00001009514	These are the raw subreads BAM files for the PacBio IsoSeq data	NextSeq 500 PacBio RS II Sequel	30
EGAD00001009516	This data set includes bam files (aligned to hg38) from the germline of parents whose children have CHEK2 germline mutations.	Illumina HiSeq 2500 NextSeq 550	48
EGAD00001009517	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA533	Illumina HiSeq 2000	1
EGAD00001009518	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA409	Illumina HiSeq 2500	1
EGAD00001009519	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA420	Illumina HiSeq 2000	1
EGAD00001009520	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA296	Illumina HiSeq 2500	1
EGAD00001009521	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA597	Illumina HiSeq 2500	1
EGAD00001009522	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA232	Illumina HiSeq 2000	1
EGAD00001009523	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA211	Illumina HiSeq X	1
EGAD00001009524	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA230	Illumina HiSeq X	1
EGAD00001009525	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA234	Illumina HiSeq X	1
EGAD00001009526	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA278	Illumina HiSeq X	1
EGAD00001009527	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA101	Illumina HiSeq X	1
EGAD00001009528	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA212	Illumina HiSeq X	1
EGAD00001009529	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA214	Illumina HiSeq X	1
EGAD00001009530	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA224	Illumina HiSeq X	1
EGAD00001009531	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA225	Illumina HiSeq X	1
EGAD00001009532	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA226	Illumina HiSeq X	1
EGAD00001009533	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA228	Illumina HiSeq X	1
EGAD00001009534	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA229	Illumina HiSeq X	1
EGAD00001009535	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA231	Illumina HiSeq X	1
EGAD00001009536	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA233	Illumina HiSeq X	1
EGAD00001009537	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA237	Illumina HiSeq X	1
EGAD00001009538	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA271	Illumina HiSeq X	1
EGAD00001009539	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA277	Illumina HiSeq X	1
EGAD00001009540	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA284	Illumina HiSeq X	1
EGAD00001009541	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA399	Illumina HiSeq X	1
EGAD00001009542	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA294	Illumina HiSeq X	1
EGAD00001009543	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA285	Illumina HiSeq 2000	1
EGAD00001009544	Whole genome sequencing of normal sample for triple negative breast cancer patient SA101	Illumina HiSeq X	1
EGAD00001009545	Whole genome sequencing of normal sample for triple negative breast cancer patient SA211	Illumina HiSeq X	1
EGAD00001009546	Whole genome sequencing of normal sample for triple negative breast cancer patient SA212	Illumina HiSeq X	1
EGAD00001009547	Whole genome sequencing of normal sample for triple negative breast cancer patient SA214	Illumina HiSeq X	1
EGAD00001009548	Whole genome sequencing of normal sample for triple negative breast cancer patient SA224	Illumina HiSeq X	1
EGAD00001009549	Whole genome sequencing of normal sample for triple negative breast cancer patient SA225	Illumina HiSeq X	1
EGAD00001009550	Whole genome sequencing of normal sample for triple negative breast cancer patient SA226	Illumina HiSeq X	1
EGAD00001009551	Whole genome sequencing of normal sample for triple negative breast cancer patient SA228	Illumina HiSeq X	1
EGAD00001009552	Whole genome sequencing of normal sample for triple negative breast cancer patient SA229	Illumina HiSeq X	1
EGAD00001009553	Whole genome sequencing of normal sample for triple negative breast cancer patient SA230	Illumina HiSeq X	1
EGAD00001009554	Whole genome sequencing of normal sample for triple negative breast cancer patient SA231	Illumina HiSeq X	1
EGAD00001009555	Whole genome sequencing of normal sample for triple negative breast cancer patient SA232	Illumina HiSeq 2000	1
EGAD00001009556	Whole genome sequencing of normal sample for triple negative breast cancer patient SA233	Illumina HiSeq X	1
EGAD00001009557	Whole genome sequencing of normal sample for triple negative breast cancer patient SA234	Illumina HiSeq X	1
EGAD00001009558	Whole genome sequencing of normal sample for triple negative breast cancer patient SA237	Illumina HiSeq X	1
EGAD00001009559	Whole genome sequencing of normal sample for triple negative breast cancer patient SA271	Illumina HiSeq X	1
EGAD00001009560	Whole genome sequencing of normal sample for triple negative breast cancer patient SA277	Illumina HiSeq X	1
EGAD00001009561	Whole genome sequencing of normal sample for triple negative breast cancer patient SA278	Illumina HiSeq X	1
EGAD00001009562	Whole genome sequencing of normal sample for triple negative breast cancer patient SA284	Illumina HiSeq X	1
EGAD00001009563	Whole genome sequencing of normal sample for triple negative breast cancer patient SA285	Illumina HiSeq 2500	1
EGAD00001009564	Whole genome sequencing of normal sample for triple negative breast cancer patient SA290	Illumina HiSeq 2500	1
EGAD00001009565	Whole genome sequencing of normal sample for triple negative breast cancer patient SA294	Illumina HiSeq X	1
EGAD00001009566	Whole genome sequencing of normal sample for triple negative breast cancer patient SA296	Illumina HiSeq 2500	1
EGAD00001009567	Whole genome sequencing of normal sample for triple negative breast cancer patient SA399	Illumina HiSeq X	1
EGAD00001009568	Whole genome sequencing of normal sample for triple negative breast cancer patient SA400	Illumina HiSeq X	1
EGAD00001009569	Whole genome sequencing of normal sample for triple negative breast cancer patient SA409	Illumina HiSeq 2500	1
EGAD00001009570	Whole genome sequencing of normal sample for triple negative breast cancer patient SA420	Illumina HiSeq 2500	1
EGAD00001009571	Whole genome sequencing of normal sample for triple negative breast cancer patient SA533	Illumina HiSeq 2000	1
EGAD00001009572	Whole genome sequencing of normal sample for triple negative breast cancer patient SA597	Illumina HiSeq 2500	1
EGAD00001009573	Whole genome sequencing of normal sample for triple negative breast cancer patient SA997	Illumina HiSeq X	1
EGAD00001009574	Whole genome sequencing of normal sample for triple negative breast cancer patient SA998	Illumina HiSeq X	1
EGAD00001009575	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA416	Illumina HiSeq X	1
EGAD00001009576	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA400	Illumina HiSeq X	1
EGAD00001009577	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA290	Illumina HiSeq 2000	1
EGAD00001009578	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA288	Illumina HiSeq 2500	1
EGAD00001009579	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA299	Illumina HiSeq 2000	1
EGAD00001009580	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA095	Illumina HiSeq 2000	1
EGAD00001009581	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA415	Illumina HiSeq X	1
EGAD00001009582	Whole genome sequencing of normal sample for triple negative breast cancer patient SA718	Illumina HiSeq 2500	1
EGAD00001009583	Whole genome sequencing of normal sample for triple negative breast cancer patient SA720	Illumina HiSeq 2500	1
EGAD00001009584	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA718	Illumina HiSeq 2500	1
EGAD00001009585	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA720	Illumina HiSeq 2500	1
EGAD00001009586	Whole genome sequencing of normal sample for triple negative breast cancer patient SA1017	Illumina HiSeq X	1
EGAD00001009587	Whole genome sequencing of normal sample for triple negative breast cancer patient SA1026	Illumina HiSeq X	1
EGAD00001009588	Whole genome sequencing of normal sample for triple negative breast cancer patient SA1027	Illumina HiSeq X	1
EGAD00001009589	Whole genome sequencing of normal sample for triple negative breast cancer patient SA1028	Illumina HiSeq X	1
EGAD00001009590	Whole genome sequencing of normal sample for triple negative breast cancer patient SA1040	Illumina HiSeq X	1
EGAD00001009591	Whole genome sequencing of normal sample for triple negative breast cancer patient SA1064	Illumina HiSeq X	1
EGAD00001009592	Whole genome sequencing of normal sample for triple negative breast cancer patient SA1065	Illumina HiSeq X	1
EGAD00001009593	Whole genome sequencing of normal sample for triple negative breast cancer patient SA1069	Illumina HiSeq X	1
EGAD00001009594	Whole genome sequencing of normal sample for triple negative breast cancer patient SA1073	Illumina HiSeq X	1
EGAD00001009595	Whole genome sequencing of normal sample for triple negative breast cancer patient SA1074	Illumina HiSeq X	1
EGAD00001009596	Whole genome sequencing of normal sample for triple negative breast cancer patient SA576	Illumina HiSeq 2000	1
EGAD00001009597	Whole genome sequencing of normal sample for triple negative breast cancer patient SA610	Illumina HiSeq 2500	1
EGAD00001009598	Whole genome sequencing of normal sample for triple negative breast cancer patient SA992	Illumina HiSeq X	1
EGAD00001009599	Whole genome sequencing of normal sample for triple negative breast cancer patient SA994	Illumina HiSeq X	1
EGAD00001009600	Whole genome sequencing of normal sample for triple negative breast cancer patient SA095	Illumina HiSeq 2000	1
EGAD00001009601	Whole genome sequencing of normal sample for triple negative breast cancer patient SA288	Illumina HiSeq 2500	1
EGAD00001009602	Whole genome sequencing of normal sample for triple negative breast cancer patient SA299	Illumina HiSeq 2000	1
EGAD00001009603	Whole genome sequencing of normal sample for triple negative breast cancer patient SA415	Illumina HiSeq X	1
EGAD00001009604	Whole genome sequencing of normal sample for triple negative breast cancer patient SA666	Illumina HiSeq 2500	1
EGAD00001009605	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1065	Illumina HiSeq X	1
EGAD00001009606	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1064	Illumina HiSeq X	1
EGAD00001009607	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1026	Illumina HiSeq X	1
EGAD00001009608	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1069	Illumina HiSeq X	1
EGAD00001009609	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1017	Illumina HiSeq X	1
EGAD00001009610	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1027	Illumina HiSeq X	1
EGAD00001009611	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1073	Illumina HiSeq X	1
EGAD00001009612	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1074	Illumina HiSeq X	1
EGAD00001009613	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA576	Illumina HiSeq 2500	1
EGAD00001009614	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1028	Illumina HiSeq X	1
EGAD00001009615	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1040	Illumina HiSeq X	1
EGAD00001009616	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA992	Illumina HiSeq X	1
EGAD00001009617	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA610	Illumina HiSeq 2500	1
EGAD00001009618	Whole genome sequencing of normal sample for triple negative breast cancer patient SA416	Illumina HiSeq X	1
EGAD00001009619	Whole genome sequencing of normal sample for triple negative breast cancer patient SA1070	Illumina HiSeq X	1
EGAD00001009620	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1070	Illumina HiSeq X	1
EGAD00001009621	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA997	Illumina HiSeq X	1
EGAD00001009623	The data set contains information from 9 individuals (5 ALS + 4 controls) using single cell RNA sequencing in combination with TCR V(D)J sequencing to study the immune profile of the central nervous compartment (CSF). Sequencing was done using the 10x Genomics platform (5’ scRNAseq & V(D)J Reagent Kits v1.1). 5P and TCR libraries were then pooled for sequencing on the NovaSeq sequencer. Provided files are in .fastq.gz format and per individual four files are available (Read 1&2 and Lane 1&2).	Illumina NovaSeq 6000	36
EGAD00001009624	High-coverage whole genome sequencing of 38 samples was done on a patterned flowcell v.2.5 (150 bp paired end, HiSeq X Ten) with coverage of about 60x for the tumor and whole blood control samples. All tumors had a tumor cell content of ≥60%. Sequencing libraries were prepared using the Truseq DNA Nano kit (Illumina) according to the manufacturers’ instructions and size selected using SPRI beads (Beckman Coulter Genomics).	HiSeq X Ten	18
EGAD00001009625	Small RNA sequencing data (TruSeq small RNA library preparation kit v2) from serum samples and tumor tissue of orthotopically injected mice (SH-SY5Y cell line) and unengrafted mice, treated with idasanutlin, temsirolimus and vehicle control.	NextSeq 500	128
EGAD00001009626	Whole genome sequencing of 14 cases of low-grade ovarian serous carcinoma with matched normal DNA	HiSeq X Ten	28
EGAD00001009627	For WGS DNA of tumor or control samples was prepared for paired sequencing using the Illumina TruSeq DNA Nano Kit and sequenced on NovaSeq 6000. For RNA-Seq the sequencing Kit Illumina TruSeq stranded mRNA was used with the same sequencer. There are 4 samples for WGS (18 runs) and 4 samples for RNA (4 runs) available.	Illumina NovaSeq 6000	10
EGAD00001009628	This dataset includes the FASTQ files from capture-based targeted high-throughput sequencing of bulk, monocytic and progenitor subfractions of the PHENOMUT11 sample.	Illumina NovaSeq 6000	1
EGAD00001009629	This dataset includes the FASTQ files from Mission Bio DNA+Protein single-cell multiomic sequencing of 11 NPM1-mutated AML diagnostic samples.	Illumina NovaSeq 6000	11
EGAD00001009630	Dataset for "Intratumoral Heterogeneity and Clonal Evolution Induced by HPV Integration" (Illumina)	Illumina NovaSeq 6000	3
EGAD00001009631	Dataset for "Intratumoral Heterogeneity and Clonal Evolution Induced by HPV Integration" (PacBio)	Sequel II	3
EGAD00001009632	Dataset for "Intratumoral Heterogeneity and Clonal Evolution Induced by HPV Integration" (ONT)	PromethION	3
EGAD00001009633	This dataset has the processed WGS data for the cancer models in CCMA.		148
EGAD00001009634	The dataset contains samples from 11 CRC patients (2 samples for each patient, tumor and normal adjacent tissue site, 22 samples in total). The dataset consists of paired-end FASTQ files from 10x single-cell RNA-Seq.	Illumina NovaSeq 6000	22
EGAD00001009635	The dataset contains samples from 30 CRC patients (3 samples for each patient, tumor and 2 normal adjacent tissue sites, 90 samples in total). The dataset consists of paired-end FASTQ files from bulk RNA-Seq.	Illumina NovaSeq 6000	90
EGAD00001009636	miRNA sequencing data, single-end, produced by an Illumina NextSeq 500 sequencer.	NextSeq 500	216
EGAD00001009637	398 runs and 184 samples of CLL paired RNA-Seq. Sequences were prepared with the Illumina TruSeq RNA Kit. Paired sequencing was done on Hiseq2000 or Hiseq2500.	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500	173
EGAD00001009638	Whole genome sequencing of chronic lymphocytic leukemia, 49 CLL/control pairs, sequenced on HiSeq X Ten with DNA prepared using Illumina TruSeq DNA Nano.	HiSeq X Ten	74
EGAD00001009639	WES dataset obtained using Illumina HiSeq 2500, Swift Bioscience library kit, paired reads.	Illumina HiSeq 2500	1
EGAD00001009641	Mesothelioma is an aggressive cancer associated with previous exposure to asbestos and dismal prognosis. Since a pemetrexed/cisplatin combination was introduced for treatment of mesothelioma, no new first- or second-line therapies have been discovered. Thus, to better understand what drives mesothelioma carcinogenesis and to identify potential targets for therapy, in this project we aim to perform WGS analysis of a panel of mesothelioma cell lines.	Illumina NovaSeq 6000	21
EGAD00001009642	Mesothelioma is an aggressive cancer associated with previous exposure to asbestos and dismal prognosis. Since a pemetrexed/cisplatin combination was introduced for treatment of mesothelioma, no new first- or second-line therapies have been discovered. Thus, to better understand what drives mesothelioma carcinogenesis and to identify potential targets for therapy, in this project we aim to perform RNAseq analysis of a panel of mesothelioma cell lines.	Illumina HiSeq 4000	21
EGAD00001009643	This study presents Whole Genome Sequencing results from the Anson Street African Burial Ground Project, which is a community-based initiative aimed at understanding the histories of 37 Ancestors in Charleston, South Carolina. Here we report fastq files for all 37 Ancestors. DNA was extracted at the University of Tennessee-Knoxville following Dabney et al. 2013, and dual index libraries prepared using a modified NEBNext Ultra II kit with partial USER enzyme digestion. Libraries were then enriched for human genomic DNA (MyBaits) and sequenced on Illumina Platforms.	Illumina HiSeq 4000 Illumina MiSeq Illumina NovaSeq 6000	31
EGAD00001009644	Single Cell Genome Sequence for high grade serous ovarian carcinoma patient SA1105, patient SA1106 on DLP+ library A96168B	HiSeq X Ten	5
EGAD00001009645	Single Cell Genome Sequence for Immortalized lymphoblastoid cell line GM18507 cell line SA928, Triple negative breast cancer patient-derived xenograft SA609 passage 2 patient-derived xenograft SA609 passage 2 on DLP+ library A96228A	HiSeq X Ten	6
EGAD00001009646	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA530 passage 3 on DLP+ library A98247A	HiSeq X Ten	-
EGAD00001009647	We performed targeted NGS in follicular lymphoma samples at diagnosis. We explored clinico-genetic correlations and assessed four clinical or clinicogenetic risk models (FLIPI, FLIPI-2, PRIMA-IP or m7-FLIPI-molecular score) in patients with symptomatic FL who received frontline immunochemotherapy. Out of 191 patients with FL grade 1-3a, 109 were successfully genotyped. Treatment consisted of rituximab (R) plus CVP/CHOP (72.5%) or R-bendamustine (R-B) (27.5%).	NextSeq 500	109
EGAD00001009648	bam files of sc-RNA and sc-BCR sequencing of multiple myeloma and precursors from 65 samples	Illumina NovaSeq 6000	65
EGAD00001009649	The cold pressor test (CPT) is a widely used pain provocation test to investigate both pain tolerance and cardiovascular responses. Twenty-two females were phenotypically assessed before and after a CPT, and blood samples were taken for RNA-sequencing. Files were processed and quantified with kallisto v0.42.5 using the human reference transcriptome (Gencode Release 28). Count data were rlog-transformed.		1
EGAD00001009650	This data set includes RNAseq from 38 follicular lymphoma tumours. All tumours were fresh frozen. Libraries were constructed by enriching for poly-A transcripts and sequenced as 75bp paired end reads on an Illumina HiSeq 2500 instrument.	Illumina HiSeq 2500	38
EGAD00001009651	This submission includes targeted and whole exome paired-end fastq files.	Illumina HiSeq 2500	866
EGAD00001009652	Whole Exome Sequencing Dataset (CRAM files) of 415 admixed Brazilians with Covid-19 extreme phenotypes from recovered nonagenarians and centenarians to deceased adults.	Illumina NovaSeq 6000	415
EGAD00001009653	Targeted sequencing of a biobank of PDOs PDXs and LMHs	Illumina NovaSeq 6000	302
EGAD00001009654	This dataset includes Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) data, in FASTQ format, from 70 metastatic castration-resistant prostate cancer tumor samples from the SU2C/PCF West Coast Dream Team (WCDT) project. The sequencing data is paired-end, 150 bp sequencing data from an Illumina NovaSeq 6000 machine. ATAC-seq libraries were prepared following the protocol described in Buenrostro et al. Nature Methods. 2013 (PMID: 24097267) and Corces et al. Nature Methods. 2017 (PMID: 28846090).	Illumina NovaSeq 6000	70
EGAD00001009655	scRNA-seq of monocultures and co-cultures of patient-derived PDAC organoids and matched CAFs. 3 sample sets per patient.	NextSeq 500	9
EGAD00001009657	2 patient-derived xenograft tumours, and associated normal blood samples. Duplicated samples for each gave 8 pairs of fastq files	Illumina NovaSeq 6000	8
EGAD00001009658	RNAseq from PDX tumours under treatment with dpbs or eribulin. Sarcomatous or mixed sarcoma/carcinoma. 6 PDX tumours each with 2 treatments gave 12 pairs of fastq files.	Illumina NovaSeq 6000	12
EGAD00001009659	Original patient tumours from which PDX models were derived. TruSight Oncology RNA panel for 2 samples, sequenced over 4 lanes each, gave 8 pairs of fastq files	Illumina HiSeq 4000	8
EGAD00001009660	5 samples as fastq file pairs. 1 solid tumour sample from patient #1105 with matched blood, and 2 solid tumour samples from patient #1177 with matched blood.	Illumina NovaSeq 6000	5
EGAD00001009661	Shallow sequencing of organoid/xenograft or human colorectal metastases	Illumina NovaSeq 6000	302
EGAD00001009662	Single cell RNA Seq of: 193 MCSP+ DCC isolated from SLNs of melanoma patients, 9 MCSP+ cells isolated from LNs of non-melanoma patients, 14 melanocytes from a healthy donor. Bulk RNA Seq of 10 samples from 4 DCC-PDX-derived cell lines. Sequencing on NovaSeq6000. Fastq files.	Illumina NovaSeq 6000	226
EGAD00001009663	Nanopore low-pass WGS of human brain tumors for evaluation of DNA methylation-based classification of cancer	MinION	16
EGAD00001009664	RNA-seq for fusion gene discovery of human astroblastomas	NextSeq 500	4
EGAD00001009666	The incidence of non-melanoma skin cancer is 17-fold lower in Singapore compared to the UK [1], despite Singapore receiving 2-3 times more year-round ultraviolet radiation (UV) [2,3]. The ageing epidermis of the skin comprises competing somatic mutant clones [4,5], from which such cancers develop. We question if differences in keratinocyte skin cancer incidence are reflected in the mutational landscape by comparing ageing facial epidermis from donors of Singapore and the UK. We find UK skin to be a highly competitive, densely mutated landscape with 4-fold greater mutation burden compared to Singaporean skin and differences in clonal selection by country. We disproportionately observe multiple features common to keratinocyte skin cancers [6,7,8] in UK skin, such as UV mutagenesis, copy number aberration and hotspot mutations (in particular TP53 R248W). We conclude that keratinocyte skin cancer incidence is reflected in the somatic clones of non-cancerous epidermis. Finally, we re-analyse squamous cell carcinoma exomes from Korea [9] to show, even in low incidence populations, carcinogenesis is driven by UV damage.	Illumina HiSeq 2500	191
EGAD00001009667	Human clonal intestinal organoids were treated with 1 µM MMF (Roche), 20 µM GCV (Hainan Poly Pharm Co Ltd), or in combinations (1 µM MMF + 20 µM GCV or 1 µM MMF + 40 µM GCV) continuously for 4-6 weeks. DNA was extracted after drug treatment. WGS was performed with 150 bp PE sequencing at 30X using an Illumina NovaSeq sequencer.	Illumina NovaSeq 6000	6
EGAD00001009668	This dataset contains raw exome sequencing data from nine sinonasal undifferentiated carcinoma FFPE samples and matched normal tissue that were assigned to a shared epigenetic class using DNA methylation-based classification. They were analyzed using the Twist Human Core Exome Plus Kit (Twist Bioscience) on a NovaSeq 6000 sequencer.	Illumina NovaSeq 6000	15
EGAD00001009669		HiSeq X Five Illumina HiSeq 4000 Illumina NovaSeq 6000	18
EGAD00001009670	Sequencing data of 20 tumor runs (different tumors), which were uploaded to EGAS00001004813 and used in the ImmuNeo publication. The sequencing was always paired and run on Illumina HiSeq sequencers.	HiSeq X Ten Illumina HiSeq 4000	1
EGAD00001009671	Sequencing data of 39 tumor and control runs (different tumors and blood controls), which were uploaded to EGAS00001004813 and reused in this ImmuNEO publication. The sequencing was always paired.	HiSeq X Ten Illumina HiSeq 4000	-
EGAD00001009672	Whole genome sequencing of normal sample for triple negative breast cancer patient SA1058	Illumina HiSeq X	1
EGAD00001009673	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA1058	Illumina HiSeq X	1
EGAD00001009674	Whole genome sequencing of tumour sample for triple negative breast cancer patient SA998	Illumina HiSeq X	1
EGAD00001009675	RNA-Seq transcriptome data is only for academic use.	NextSeq 500	2
EGAD00001009676	RNA-Seq data for both Academic and For-profit use	Illumina HiSeq 2500 Illumina NovaSeq 6000 NextSeq 500	221
EGAD00001009677	There are two datasets: 1. scRNA-seq of human cutaneous immune cells from psoriasis patients. These include pre- and post-Tildrakizumab treated patients and come in BAM file format. 19006FL-25-01 19006FL-38-01 19006FL-32-01-03 19006FL-33-01 19006FL-28-01-05 19006FL-35-01-01 2. RNA-seq of ZFP36L2 CRISPR-deleted human T cells, in FASTQ format. 19006XR-30-05 19006XR-30-04 19006XR-30-02 19006XR-30-01 19006XR-26-05 19006XR-26-04 19006XR-26-02 19006XR-26-01 19006R-22-04 19006R-22-08 19006R-22-05 19006R-22-01	Illumina HiSeq 4000 Illumina NovaSeq 6000	17
EGAD00001009678	This dataset contains germline variants (in .vcf format) from six pediatric cancer patients (sample IDs D1 - D6). WES data of the children and their parents was mapped to hg38. A consensus of four variant callers was used to obtain germline variants of the children.		6
EGAD00001009679	Subset of 11 samples (RNA-Seq and WGS) from study EGAS00001005973, which was published earlier; the samples are linked here to study EGAS00001006538	HiSeq X Ten Illumina HiSeq 4000	1
EGAD00001009680	Paired RNA sequencing of 30 samples RRMM using Illumina TruSeq stranded mRNA kit and either HiSeq2000 or HiSeq X Ten for sequencing.	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 4000	30
EGAD00001009681	Paired scRNA sequencing using 10xgenomics library preparation and Illumina HiSeq4000 for sequencing of 2 samples RRMM (relapsed refractory multiple myeloma)	Illumina HiSeq 4000	2
EGAD00001009682	Here is mostly paired WGS data of RRMM, 45 samples (tumors and controls) in 86 runs. This data was produced by using Illumina TruSeq Nano DNA and NovaSeq6000 or HiSeq X Ten for sequencing. One tumor/control pair is WES data using Agilent SureSelect V5+UTRs and NovaSeq6000 for sequencing.	HiSeq X Ten Illumina NovaSeq 6000	45
EGAD00001009683	scATAC sequencing was performed of 29 samples of RRMM tumors using 10xGenomics for the preparation and NovaSeq6000 for sequencing.	Illumina NovaSeq 6000	29
EGAD00001009684	A pan-cancer cohort of 1031 patients resistant to systemic therapies or with no approved therapeutic options. It includes whole-exome sequencing of 571 tumor and matched-normal samples, and transcriptome sequencing of 947 tumor samples. Biopsies were taken at entry into precision medicine trials, often after diagnosed resistance. Comprehensive clinical information is available for all patients and includes patient age at biopsy, tumor primary site and histological subtype, biopsy site, treatments received prior to biopsy, blood assessment results at biopsy, metastatic sites at biopsy, and survival time from the biopsy date.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000 NextSeq 500	2089
EGAD00001009685	Whole exome and RNASeq raw sequencing data for an individual with multiple pancreatic neuroendocrine tumours (panNETs). Age at diagnosis was 51. Tumour tissue and PBMCs were used for sequencing. This data was generated as part of a study funded by a Cancer Research UK Centres Network Accelerator Award Grant (A21998).	Illumina NovaSeq 6000	25
EGAD00001009686	This dataset contains both snRNA-seq and bulk RNA-seq data from 19 different patients, comprising 9 healthy controls (4 Spinal Cord, 5 Motor Cortex) as well as 5 C9ALS patients and 5 sporadic ALS patients, each with paired data from the spinal cord and motor cortex. For the snRNA-seq data, fastq files containing the raw reads are provided; many of these samples were pooled for sequencing and subsequently require SNP-based demultiplexing with a tool such as freemuxlet. The paired bulk RNA-seq data (raw fastq files) can be used to acquire the ground truth per patient - a list of pooled samples can be found in map.txt. snRNA-seq data was produced using the 10X Genomics 3' v3 kit and bulk RNA-seq was produced using the Illumina TruSeq v2 kit.	Illumina NovaSeq 6000	64
EGAD00001009687	scRNAseq dataset containing 5 healthy donors and 4 asthmatic donors.	Illumina NovaSeq 6000	8
EGAD00001009688	In this study single cell RNA-Seq data was used to train a deconvolution algorithm. The algorithm was validated on paired bulk RNA-Seq profiles.		-
EGAD00001009689	FAST5 original nanopore data from MinION sequencing of 10 tumor samples		10
EGAD00001009690	Reads were aligned to 1000 Genomes assembly reference (hs37d5) using minimap2 2.22. SAM-to-BAM conversion, BAM sorting and indexing were performed with SAMtools 1.13. Read summarization was performed with featureCounts (from Subread 2.0.3) over exon features based on GENCODE Version 19 gene models. Strand specific counting was used.		10
EGAD00001009691	6 organoids transcriptomic profiles	NextSeq 500	6
EGAD00001009692	Mutation calls from 68 MM patients collected with Mutect2. All tumor samples were CD138+ cells at diagnosis and all control samples were PBMCs.		68
EGAD00001009693	This dataset is composed of NGS data from 33 XP patients studied by WGS.	Illumina HiSeq 4000	65
EGAD00001009694	We profiled CD45- enriched, viable cells from GBM (n = 7) and IDH-MUT (n = 7) primary samples with multi-modality single-cell sequencing of scDNAme (by reduced representation bisulfite sequencing [RRBS]) and scRNAseq (Smart-seq2).	Illumina HiSeq 2500 Illumina NovaSeq 6000	2989
EGAD00001009695	Targeted sequencing data to look for the involvement of genes in the RAS-MAPK pathway, angiogenesis and brain vascular disorders among others, in brain AVMs	unspecified	30
EGAD00001009696	Whole genome sequencing data of brain AVM endothelial and non-endothelial cell fractions, as well as paired blood samples	unspecified	31
EGAD00001009697	Data includes whole exome sequenced bam files for matched tumor-normal pairs from the study.	Illumina HiSeq 2500	90
EGAD00001009698	Data from samples that are marked for academic use only	HiSeq X Ten Illumina NovaSeq 6000	3
EGAD00001009699	Data from samples that are marked for both academic and for-profit use.	HiSeq X Ten Illumina NovaSeq 6000	425
EGAD00001009700	WES for Patient 9 to 14 of NIBIT-M4 clinical trial	Illumina NovaSeq 6000	6
EGAD00001009701	WES for Patient 1 to 8 of NIBIT-M4 clinical trial	Illumina HiSeq 3000	8
EGAD00001009702	RNAseq for Patients of NIBIT-M4 clinical trial	Illumina HiSeq 3000	14
EGAD00001009703	RRBS for Patients of NIBIT-M4 clinical trial	Illumina HiSeq 3000	14
EGAD00001009704	Chronic obstructive pulmonary disease (COPD) is a major respiratory disease characterized by small airway inflammation, emphysema and severe breathing difficulties. Low-grade systemic inflammation is an established hallmark of severe disease, however, the molecular changes in peripheral immune cells remain far from understood. We combined multi-color flow cytometry with single-cell RNA sequencing and showed that blood neutrophil numbers are significantly increased in COPD and they are a heterogeneous population. A transcriptomic state that expressed interferon response genes correlated with alveolar damage and acute exacerbations. Furthermore, bronchoalveolar neutrophils expressed gene signatures corresponding to certain blood neutrophil states. Last, our data in a murine model of cigarette smoke exposure demonstrated that bone marrow neutrophil progenitors are expanded in smoke-treated animals and display signs of immune activation. Our study provides evidence that COPD systemic inflammation may derive from an activated haematopoietic precursor compartment.	NextSeq 500	1
EGAD00001009705	Paired Exome sequencing of 34 samples (tumors and controls) of different tumors. The samples were prepared using Agilent SureSelect V5+UTRs, the sequencing was done on Illumina HiSeq 4000.	Illumina HiSeq 4000	34
EGAD00001009706	Paired RNA sequencing data (21 runs/ 17 samples) of different tumors. The samples were prepared using the Illumina TruSeq stranded mRNA Kit. The sequencing was done on Illumina HiSeq 4000.	Illumina HiSeq 4000	17
EGAD00001009707	RRBS data from TRACERx non-small cell lung cancer (NSCLC) tumours and matched normal adjacent tissue. TRACERx (TRAcking Cancer Evolution through therapy (Rx)) is a prospective cohort study designed to investigate intratumor heterogeneity (ITH) in relation to clinical outcome, and to determine the clonal nature of driver events and evolutionary processes in early stage non-small cell lung cancer (NSCLC).	Illumina HiSeq 2500	155
EGAD00001009709	We profiled 111 patient medulloblastoma primary tumor samples by bulk RNA-seq (19 samples), 27ac (98 samples) / 27me3 (61 samples) ChIP-Seq, WGS (4 samples) and 27ac hichip (8 samples). Submitted data consists of data generated from previously unpublished tumors as well as complementary data for data sets already published for identical medulloblastoma tumors (ex: 27me3 ChIP-Seq and RNA-Seq data submitted for a tumor with publicly available WGS data). The raw fastqs and hg19 aligned RNA-Seq bams are provided.	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina NovaSeq 6000	111
EGAD00001009710	This data set contains the CRAM files for the samples in the CHILD cohort, sequenced on the Illumina HiSeq X platform.	HiSeq X Ten	604
EGAD00001009711	244 infected single-cell alveolar bam files, 48 empty well bam files, and 52 RNA sequencing of amplicons (4 SARS-CoV-2 variants with 12 batches and 4 viral variants pool samples). 244 alveolar single cells were captured over 12 experimental batches and the experimental condition is recorded in the metadata uploaded as "infected_cells_final_revision.csv" on GitHub (https://github.com/twkim-0510/SARS-CoV-2_viral_competition). Each bam file name corresponds to the sample_name column of the metadata.	Illumina NovaSeq 6000	344
EGAD00001009712	RNAseq of a bank of human primary and metastatic colorectal cancer samples	unspecified	119
EGAD00001009713	Dataset contains mRNA capture sequencing data from plasma of 266 different human donors. The first, pan-cancer, cohort covers 25 high-grade to metastatic cancer types (8 cancer patients per type) and a control group (8 healthy donors). The validation cohort comprises additional plasma samples from ovarian, prostate and uterine cancer patients (12 per type) as well as additional samples from controls (22 new and 8 repeated). Samples were sequenced on a NovaSeq 6000 and are provided in FASTQ format.	Illumina NovaSeq 6000	274
EGAD00001009714	2 paired WGS samples of peritumour regions of colorectal cancer (2 patients). The library was prepared using the Illumina TruSeq Nano FFPE kit and the sequencing was done on NovaSeq6000.	Illumina NovaSeq 6000	2
EGAD00001009715	Exome sequencing data from seven phenotypically abnormal human fetal samples. Analysis performed using Illumina NovaSeq 6000, Twist Bioscience - Human Comprehensive Exome. Paired end fastq files were aligned to hg38 reference genome using BWA-MEM v0.7.15, followed by sorting using SAMtools sort v1.3.1, and duplicate reads marked using Picard Tools MarkDuplicates v2.18.2	Illumina NovaSeq 6000	13
EGAD00001009718	This dataset consists of 39 noncancerous donor and 62 cancer patient plasma samples (including 29 patients with CRC across a total of 13 tumor types) that were analyzed with the PGDx elio plasma resolve assay. The PGDx elio plasma resolve assay is a hybrid capture approach targeting 33 genes with sequencing performed using the Illumina NextSeq with 150bp paired-end reads. The bam files provided have been adapter masked and contain duplicate reads.	NextSeq 500	101
EGAD00001009719	The dataset "PGDx elio™ plasma resolve assay: targeted sequencing analyses of plasma cfDNA" includes paired end FASTQ reads of 183 cfDNA samples from metastatic colorectal cancer patients. Sequencing was performed using a panel consisting of 33 genes, covering over 237,000 bp, targeting 25,000x depth across the targeted regions.	NextSeq 500	183
EGAD00001009720	The dataset "PGDx elio™ tissue complete assay: targeted sequencing analyses of tissue DNA" includes paired end FASTQ reads of 28 tissue samples from metastatic colorectal cancer patients. Sequencing was performed using a panel consisting of 505 genes, targeting 2,500x depth across the targeted regions.	NextSeq 500	28
EGAD00001009721	The dataset "PGDx elio™ plasma resolve assay: targeted sequencing analyses of WBC DNA" includes paired end FASTQ reads of 49 white blood cell (WBC) genomic DNA samples from metastatic colorectal cancer patients. Sequencing was performed using a panel consisting of 33 genes, covering over 237,000 bp, targeting 25,000x depth across the targeted regions.	NextSeq 500	49
EGAD00001009722	Paired end FASTQ files of 119 Iberian Roma whole exome sequence data (Illumina sequencing)	NextSeq 500	119
EGAD00001009723	Dataset containing the FASTQ files of RNA (scr) and TCR (vdj) sequencing of 17 bronchoalveolar lavage fluid samples collected from ICI pneumonitis (n=11) and control (n=6) patients. To comply with GDPR regulations, please note that individual sample identifiers used for this data deposit (alphabetical ID) are different from and cannot be traced back to the patient identifiers used throughout the manuscript (numerical ID).	Illumina NovaSeq 6000	17
EGAD00001009724	mRNA capture sequencing and small RNA sequencing data (FASTQ files) of the exRNAQC study phase 2 (interaction study)	Illumina NovaSeq 6000	180
EGAD00001009725	ctDNA data for IMpower150, including individual mutation calls (one mutation per sample per line), sample list including ctDNA status (one sample per line), and patient-level ctDNA summaries called ctDNA features (one patient per line).		-
EGAD00001009726	Clinical data for IMpower150 (one patient per line): anonymized_patient_id, train_test_split, ctDNA_status, ARM1, OS_months, OS_event, PFS_months, PFS_event, TTEOS_rebaseline_BL, TTEPFS_rebaseline_BL, TTEOS_rebaseline_C2D1, TTEPFS_rebaseline_C2D1, TTEOS_rebaseline_C3D1, TTEPFS_rebaseline_C3D1, TTEOS_rebaseline_C4D1, TTEPFS_rebaseline_C4D1, TTEOS_rebaseline_C8D1, TTEPFS_rebaseline_C8D1, pdl1_high, number_metastatic_sites, baseline_ECOG, age, sex_female, history_of_tobacco_use, sld_baseline, sld_wk6, sld_percent_change_bl_to_wk6, sld_difference_bl_to_wk6, AGEGRP, tumor_assessment_week_6, tumor_assessment_week_12, tumor_assessment_week_18, tumor_assessment_week_24, PFS_days, days_between_randomization_c3		-
EGAD00001009727	Clinical data from AVANT: Clinical data include race, age, sex, baseline ECOG, tumor stage, node status, treatment arm, KRAS and BRAF mutation status, tumor location, consensus molecular subtype, overall survival and disease free survival for 797 patients across AVANT.		1
EGAD00001009728	RNA-seq FASTQ files from 797 tumors from AVANT. Sequencing libraries were generated with the TruSeq Stranded Total RNA kit (Illumina) following ribosomal RNA (rRNA) depletion with the Ribo-Zero Gold kit (Illumina). The libraries were sequenced on the HiSeq4000 (Illumina) with a sequencing protocol of 75 bp paired-end sequencing. Note: 10 samples used in the original publication were excluded from this upload due to regulations from the Human Genetics Resources Administration of China (HGRAC).	Illumina HiSeq 4000	797
EGAD00001009729	RNA-sequencing data for 100 stereotyped CLL subset cases. Illumina-based short-read sequencing data from 100 samples from patients with CLL. Oxford Nanopore Technologies long-read sequencing data from 5 samples from patients with CLL.	Illumina HiSeq 2500 MinION	100
EGAD00001009730	Paired RNA-Seq of four patients with advanced Parathyroid carcinoma (PC). The library was prepared using the Illumina TruSeq stranded mRNA Kit, the sequencing was done either on an Illumina HiSeq 4000 or on Illumina NovaSeq 6000.	Illumina HiSeq 4000 Illumina NovaSeq 6000	4
EGAD00001009731	Paired WGS data of four patients with advanced Parathyroid carcinoma (PC). There are tumor/control pairs (buffy coat control). The library was prepared with Illumina TruSeq Nano DNA, the sequencing was done with HiSeq X Ten.	HiSeq X Ten	8
EGAD00001009732	The gut microbiota composition is unique to every individual but is shaped by common factors including diet, lifestyle, medication use, early-life determinants, living environment or genetics. Most of these factors may be influenced by ethnicity. This study explored variations in fecal microbiota composition in 6048 individuals with different ethnic backgrounds living in the same geographical area (Amsterdam, the Netherlands). The HELIUS data are owned by the Amsterdam University Medical Centers, location AMC in Amsterdam, The Netherlands. To allow sharing of microbiome data collected in HELIUS with (inter)national researchers, the 16S rRNA sequence data have been stored at the European Genome-phenome Archive (EGA; accession code EGAD00001004106). Access must be granted before use, in part because the HELIUS data are stored with relevant phenotypical variables. Access is granted to all researchers affiliated with an internationally recognized research institution who request to use the HELIUS data within the EGA context, after having signed the data transfer agreement. Any researcher can request the data by submitting a proposal to the HELIUS Executive Board as outlined at http://www.heliusstudy.nl/en/researchers/collaboration, by email: heliuscoordinator at amsterdamumc dot nl. The HELIUS Executive Board will check that proposals do not conflict with the ethical approvals and informed consent forms of the HELIUS study.	Illumina MiSeq	3885
EGAD00001009733	This data set contains KiCS cancer panel data for academic and for-profit use.	Illumina HiSeq 2500	3
EGAD00001009734	This data set contains KiCS cancer panel data for academic and for-profit use.	Illumina HiSeq 2500 NextSeq 500	521
EGAD00001009735	Files from Tapestri snDNA-seq of archival tissue samples from 16 pancreatic ductal adenocarcinoma (PDAC) patients. Matched bulk sequencing (whole-exome, whole-genome, MSK-IMPACT) data are attached for a subset of the patients.	Illumina HiSeq 4000 Illumina NovaSeq 6000	46
EGAD00001009736	This dataset contains RNA sequencing information for Chronic Myeloid Leukemia. In total 2 single-end RNA-seq tumor cell line samples are present.	Sequel	2
EGAD00001009737	The .cram files of the Trio or Quad sequencing data used for generation of the genomic autopsy study. This contains a mix of genome sequencing and exome sequencing data for probands and their parents. A subset of families (n=32) did not consent to publicly sharing their data.	Illumina HiSeq 2000 Illumina NovaSeq 6000 NextSeq 500	156
EGAD00001009738	Whole genome sequencing data of 5 High-grade serous carcinoma (HGSC) patients (6 samples) sequenced with BGI.	unspecified	6
EGAD00001009739	Phenotype data from pregnant mothers unexposed and exposed to the Rwandan genocide from 59 whole blood samples.		1
EGAD00001009741	Two primary tumor-derived PDAC organoids were subjected to SNP array, RNA-seq, and single-cell WGS.	Illumina NovaSeq 6000	4
EGAD00001009742	16 additional samples	HiSeq X Ten	16
EGAD00001009743		Illumina NovaSeq 6000	1
EGAD00001009744	This dataset contains RNA-sequencing of Bone marrow-derived CD34+ cells from Healthy Controls (n=2) and SLE patients (n=10). SLE patients are divided into two categories based on severity: patients with moderate/mild disease (n=4) and patients with severe disease (n=6). Libraries were generated using the Illumina TruSeq Sample Preparation kit v2. Single-end 75-bp mRNA sequencing was performed on Illumina NextSeq 500. The raw fastq files are uploaded.	NextSeq 500	2
EGAD00001009745		HiSeq X Five Illumina HiSeq 4000 Illumina NovaSeq 6000	1
EGAD00001009746	Whole genome sequencing of high-grade serous ovarian cancer (HGSC) tumours and matched normals from 15 patients with homologous recombination deficiencies. The dataset includes fastq files from 56 HGSC tumours (1 primary, 1 relapse, 54 end-stage) and 15 matched normals. Sequence libraries were generated from tumour and matched normal genomic DNA using the KAPA HyperPrep PCR-free library preparation kit (Roche), or the Illumina TruSeq DNA Nano kit according to manufacturer’s instructions. Sequencing was carried out by the Kinghorn Centre for Clinical Genomics Sequencing Laboratory (Sydney, Australia) on the HiSeq X Ten System (Illumina) or by the Australian Genome Research Facility (Melbourne, Australia) on an Illumina NovaSeq to a minimum base coverage of 30-fold for normal DNA and 60-fold for tumour DNA samples.	unspecified	66
EGAD00001009747	Targeted DNA sequencing of high-grade serous ovarian cancer (HGSC) tumour and normal samples from 15 patients with homologous recombination deficiencies. The dataset includes fastq files from 243 HGSC tumours (15 primary, 3 relapse, 225 end-stage) and 15 normals from 15 HGSC patients. Following target hybrid capture of 63 genes involved in DNA repair and response to treatment with an Agilent SureSelect XT panel, sequencing libraries were generated using the SureSelect XT Low Input Target Enrichment System (Agilent) as per the manufacturer's protocol. Libraries were sequenced on an Illumina NextSeq 500 at the Peter MacCallum Cancer Centre (Melbourne, Australia).	NextSeq 500	266
EGAD00001009748	This dataset comprises Clinical-Epidemiological (CE) data from an Erasmus MC cohort of 151 individuals who tested positive for COVID-19.		151
EGAD00001009749	This dataset includes 4 samples profiled by high-throughput Illumina sequencing, in bam format, aligned to GRCh37. Human patient T-ALL samples were serially propagated as xenografts in immunodeficient mice. The samples were collected after development of frank leukemia in recipient mice.	Illumina HiSeq 2500	4
EGAD00001009750	This dataset includes 90 samples profiled by high-throughput Illumina sequencing, in bam format, aligned to GRCh37. Normal human CD34+ cord blood (CB), bone marrow (BM), or post-natal thymus (PNT) cells were transduced with various combinations of T-ALL oncogenes and cultured in vitro.	Illumina HiSeq 2500	90
EGAD00001009751	This dataset includes 73 samples profiled by high-throughput Illumina sequencing, in bam format, aligned to GRCh37. Normal human CD34+ cord blood (CB), bone marrow (BM), or post-natal thymus (PNT) cells were transduced with various combinations of T-ALL oncogenes, cultured in vitro on OP9-DL1 feeders for up to 25 days, and then transplanted into immunodeficient NSG or NRG mice. The samples were collected after development of frank leukemia in recipient mice.	Illumina HiSeq 2500	73
EGAD00001009752	This dataset contains paired RNA sequencing data for end-stage kidney disease (ESKD) patients on dialysis. There are two cohorts. The first includes 179 samples from 51 COVID-19 patients recruited during the initial phase of the COVID-19 pandemic (April-May 2020) and 55 non-infected ESKD patients as controls. 17 patients initially recruited as controls as part of the Wave 1 cohort were later infected with COVID-19 in January-March 2021. We acquired a total of 90 samples during the acute infection and convalescent samples for 12 of the 17 patients following the acute COVID-19 episode. RNA-seq counts and full clinical metadata for these cohorts are available without restriction from Zenodo (https://doi.org/10.5281/zenodo.6497251).	Illumina HiSeq 4000	336
EGAD00001009754	Tumor-specific T cells are frequently exhausted by chronic antigenic stimulation. To explore new pathways for reinvigoration of anti-tumor immune functions, we developed a human ex vivo exhaustion model by repetitive antigenic stimulation of primary CD8 T cells. This results in T cells that resemble patient-derived T cells in tumors on a phenotypic and transcriptional level. CD8+ T cells from four healthy human donors were isolated, transduced with an NY-ESO-1 TCR lentivirus construct, and stimulated in four different conditions (Trested, Ttumor, Tex, Teff) with T2 tumor cells and specific peptides over 12 days. Cells were then sorted for TCR Vbeta 13.1+ (NY-ESO-1 TCR) CD8+ CD3+ CD56- CD4- DAPI- cells. RNA-seq TruSeq libraries were generated from polyA-enriched mRNA isolated from the samples, and sequenced in paired-end mode (2x51bp) on 2 lanes of an Illumina NovaSeq 6000 flow-cell. FASTQ sequence files were generated with the Illumina RTA version 3.4.4 and Base-calling Version bcl2fastq-2.20.0.422.	Illumina NovaSeq 6000	16
EGAD00001009755	scWGS-seq of flow sorted blast and normal cells from SJBALL021901 with 71 high quality cells sequenced (67 blast and 4 normal)	Illumina NovaSeq 6000	84
EGAD00001009756	Whole genome sequencing of three independent cohorts. The three cohorts derive from projects including samples from individuals of Danish origin. Data are divided into males and females and compiled.		1
EGAD00001009757	Bam files aligned using hg19. Sequencing data generated with Illumina MiSeq, HiSeq 2500, or HiSeq 4000 instruments.	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina MiSeq	62
EGAD00001009758	A novel in-house-made pediatric MEF2D-BCL9 fusion positive acute lymphoblastic leukemia cell line was characterized	Illumina NovaSeq 6000	3
EGAD00001009759	A novel in-house-made pediatric MEF2D-BCL9 fusion positive acute lymphoblastic leukemia cell line was characterized	Illumina HiSeq 2000	2
EGAD00001009760	Colorectal cancer samples will be submitted for Illumina sequencing using a custom capture of 116 genes implicated in colorectal tumourigenesis. Driver mutations will be detected and ultimately correlated with phenotypic data.	Illumina HiSeq 2000 Illumina HiSeq 2500	2229
EGAD00001009761	The dataset contains whole exome sequencing of a family, revealing a homozygous splice variant in the LGR4 gene responsible for salt wasting and altered adrenal zonation. The family members sequenced were the proband with hypoaldosteronism, her parents, and her two healthy brothers in a consanguineous family.	NextSeq 500	5
EGAD00001009763	This dataset includes the FASTQ files from sequencing data generated from diagnostic and remission bone marrow mononuclear cell samples using the Mission Bio Tapestri Platform with both DNA amplicon and antibody-derived tag (protein) sequencing libraries.	NextSeq 550	7
EGAD00001009764	Rmarkdown code, PDF, and Rdata file to recapitulate the paper's primary figures and machine learning model development.		-
EGAD00001009766	Profiling of childhood neuroblastoma by single-cell RNA sequencing	NextSeq 500	24
EGAD00001009767	Solve-RD data submitted to the ERN-GENTURIS cohort for re-analysis (Data freeze 1+2) v1		1
EGAD00001009768	Solve-RD data submitted to the ERN-EuroNMD cohort for re-analysis (Data freeze 1+2) v1		1
EGAD00001009769	Solve-RD data submitted to the ERN-RND cohort for re-analysis (Data freeze 1+2) v1		1
EGAD00001009770	Solve-RD data submitted to the ERN-ITHACA cohort for re-analysis (Data freeze 1+2) v1		1
EGAD00001009771	Manuscript Title: Co-targeting of BTK and MALT1 overcomes resistance to BTK inhibitors in mantle cell lymphoma Journal: Journal of Clinical Investigation Authors Vivian Changying Jiang1, Yang Liu1, Junwei Lian1, Shengjian Huang1, Alexa Jordan1, Qingsong Cai1, Fangfang Yan3, Joseph Mitchell McIntosh1, Yijing Li1, Yuxuan Che1, Zhihong Chen1, Jovanny Vargas1, Maria Badillo1, JohnNelson Bigcal1, Heng-Huan Lee1, Wei Wang1, Yixin Yao1, Lei Nie1, Christopher Flowers1, and Michael Wang1, 2* Abstract Bruton’s tyrosine kinase (BTK) is a proven target in mantle cell lymphoma (MCL), an aggressive subtype of non-Hodgkin lymphoma. However, resistance to BTK inhibitors is a major clinical challenge. We here report that MALT1 is one of the top overexpressed genes in ibrutinib-resistant MCL cells, while expression of CARD11, which is upstream of MALT1, is decreased. MALT1 genetic knockout or inhibition produced dramatic defects in MCL cell growth regardless of ibrutinib sensitivity. Conversely, CARD11 knockout cells showed anti-tumor effects only in ibrutinib-sensitive cells, suggesting that MALT1 overexpression could drive ibrutinib resistance via bypassing BTK-CARD11 signaling. Additionally, BTK knockdown and MALT1 knockout markedly impaired MCL tumor migration and dissemination, and MALT1 pharmacological inhibition decreased MCL cell viability, adhesion, and migration by suppressing NF-κB, PI3K-AKT-mTOR, and integrin signaling. Importantly, co-targeting MALT1 with safimaltib and BTK with pirtobrutinib induced potent anti-MCL activity in ibrutinib-resistant MCL cell lines and patient-derived xenografts. Therefore, we conclude that MALT1 overexpression associates with resistance to BTK inhibitors in MCL, targeting abnormal MALT1 activity could be a promising therapeutic strategy to overcome BTK inhibitor resistance, and co-targeting of MALT1 and BTK should improve MCL treatment efficacy and durability as well as patient outcomes. Dataset description: The bulk RNA-seq dataset was generated for the cell lines below and used for two major purposes: 1. DEG analysis and GSEA analysis comparing IBN-R and IBN-S cells 2. DEG analysis and GSEA analysis comparing MCL cells with/without MI-2 treatment. sample Cell MI-2 Ibrutinib (IBN) Venetoclax (VEN) Used for IBN-R vs IBN-S comparison Used for MI-2 vs untreated (DMSO) H9 Granta519 - R S yes H21 Granta519 - R S yes H33 Granta519 - R S yes H10 Granta519-VEN-R - R R yes H22 Granta519-VEN-R - R R yes H34 Granta519-VEN-R - R R yes H3 JeKo BTK KD_1 - R R yes yes H15 JeKo BTK KD_1 - R R yes yes H27 JeKo BTK KD_1 - R R yes yes H5 JeKo BTK KD_2 - R R yes yes H17 JeKo BTK KD_2 - R R yes yes H29 JeKo BTK KD_2 - R R yes yes H1 JeKo-1 - S R yes yes H13 JeKo-1 - S R yes yes H25 JeKo-1 - S R yes yes H7 Mino - S S yes H19 Mino - S S yes H31 Mino - S S yes H8 Mino-VEN-R - S R yes H20 Mino-VEN-R - S R yes H32 Mino-VEN-R - S R yes H11 Rec-1 - S S yes H23 Rec-1 - S S yes H12 Rec-VEN-R - S S yes H24 Rec-VEN-R - S R yes H36 Rec-VEN-R - S R yes H35 Rec-1 -- S R yes H4 JeKo BTK KD_1 + MI-2 + yes H16 JeKo BTK KD_1 + MI-2 + yes H28 JeKo BTK KD_1 + MI-2 + yes H6 JeKo BTK KD_2 + MI-2 + yes H18 JeKo BTK KD_2 + MI-2 + yes H30 JeKo BTK KD_2 + MI-2 + yes H2 JeKo-1 + MI-2 + yes H14 JeKo-1 + MI-2 + yes H26 JeKo-1 + MI-2 + yes	Illumina HiSeq 4000	35
EGAD00001009772	miRNA libraries of the AML-PMP project were sequenced on the Illumina HiSeq 2000 instrument, approximately 16 samples per HiSeq lane, to a median depth of 5.7 million reads per library.	Illumina HiSeq 2000	1
EGAD00001009773	116,958 single-cell transcriptomes from samples of peripheral blood mononuclear cells (PBMCs) from five CVID patients at three distinct stages of the SARS-CoV-2 infection: 1) baseline, before viral infection, 2) progression, during viral infection, and 3) convalescence, once the viral infection had been resolved and the patient was PCR negative. CVID patients were under regular immunoglobulin replacement therapy and displayed only mild symptoms during SARS-CoV-2 infection.	Illumina NovaSeq 6000	12
EGAD00001009774	Bone marrow aspirates were obtained from patients with relapsed/refractory large B cell lymphoma (rrLBCL), mononuclear cells isolated by ficoll density-gradient centrifugation, and loaded onto a 10X Chromium for single cell RNA-sequencing using 5’ chemistry without prior cryopreservation. Healthy donor bone marrow mononuclear cells were obtained from healthy allogeneic stem cell transplant donors and analyzed following viable cryopreservation.	Illumina NovaSeq 6000	22
EGAD00001009775	Summary statistics Korean PD (n=410) vs. Korean Healthy Control (n=200)		1
EGAD00001009777	Single Cell Genome Sequence for triple negative breast cancer patient-derived xenograft SA604 passage 8 on DLP+ library A96141A	HiSeq X Ten	1
EGAD00001009778	The goal of this study is to characterize immune cell populations by single cell RNA-sequencing (scRNA-seq) in tumor and uninvolved normal tissues from patients with resectable non-small cell lung cancer (NSCLC) who received neoadjuvant chemoimmunotherapy. scRNA-seq was performed on seven pairs of tumor and normal tissues as well as one lymph node (LN) sample. The dataset includes paired-end FASTQ files for single cell RNA sequencing of 7 neo-immuno patients (15 samples and 50 runs in total).	Illumina NovaSeq 6000	15
EGAD00001009780	Plasma from lung cancer patients from EDTA tubes was fractionated using size exclusion chromatography. Fractions 1-5, 7-11, 12-15, 16-20 were pooled, cfDNA was extracted from the fractions and paired unfractionated samples and PE150bp sequencing was performed on an Illumina Novaseq S4 flowcell. Samples are provided as raw reads without any prior processing.	Illumina NovaSeq 6000	64
EGAD00001009781	Multiple regions were cut out of each tumor extracted from breast cancer patients and two experiments were run: - Whole exome sequencing on each of the regions plus an adjacent normal tissue sample - Smart-Seq3 Single cell RNA sequencing on EPCAM+ CD45- sorted cells from different tumor regions	Illumina NovaSeq 6000	1518
EGAD00001009783	This dataset includes 2*76bp RNA-seq reads from 9 pigs sequenced using Illumina NextSeq500.	NextSeq 500	9
EGAD00001009784	This dataset contains: Ultra-deep sequencing data using the Duplex sequencing technology of: 1.) SeraSeq cfDNA reference materials with spike-in variants with allele frequencies from 0% to 5% 2.) One cfDNA sample from a CRC patient 3.) One cfDNA sample from a patient with asymmetric overgrowth Paired-end sequencing was performed with 2x151 bp reads on the NextSeq 500 system. Data is provided as mapped .bam files (aligned to GRCh38/hg38).	NextSeq 500	13
EGAD00001009785	Contains control and PXR KD human small intestinal organoids	NextSeq 500	18
EGAD00001009786	This dataset contains 68 BAM files from matched normal-tumor pairs of HCV positive lymphoma analyzed by exome sequencing on the Illumina platform.	Illumina NovaSeq 6000	1
EGAD00001009787	WGS files for CIC paper titled "Malignant progression of an ancestral bone marrow clone harboring a CIC-NUTM2A fusion in isolated myeloid sarcoma"	Illumina HiSeq 2000	3
EGAD00001009788	RNASeq files for CIC paper titled "Malignant progression of an ancestral bone marrow clone harboring a CIC-NUTM2A fusion in isolated myeloid sarcoma"	Illumina HiSeq 2000	2
EGAD00001009789	A DNA methylation atlas of normal human cell types. The atlas includes 205 whole-genome bisulfite sequencing (WGBS) samples, from 39 sorted cell types. The samples were paired-end sequenced with 30x coverage.	Illumina NovaSeq 6000	410
EGAD00001009790	The TEP dataset consists of 549 Fastq samples which are divided into two experiments: a training data cohort, used to train the classifier, and a validation data cohort, used to assess classifier performance	Illumina HiSeq 2500	548
EGAD00001009791	Whole genome sequencing data of 8 High-grade serous carcinoma (HGSC) patients (20 samples) sequenced with HiSeq X Ten.	HiSeq X Ten	20
EGAD00001009792	This dataset contains RNA-sequencing of whole blood samples from Healthy Controls (n=11) and AAV patients (n=30; GPA n=22, MPA n=8).	NextSeq 500	6
EGAD00001009793	We demonstrate that ATRT tumoroids retain subgroup-specific epigenetic and gene expression profiles	Illumina NovaSeq 6000	8
EGAD00001009794	We demonstrate that ATRT tumoroids retain subgroup-specific epigenetic and gene expression profiles	Illumina NovaSeq 6000	4
EGAD00001009796	12.5 ng of cfDNA was used as input for shallow whole-genome sequencing (sWGS), aiming for a coverage of 0.2-0.4-fold. Library preparation was performed using the TruSeq Nano DNA High Throughput Library Prep Kit (Illumina, San Diego, CA, USA) on an automated Hamilton STAR liquid handling system (Hamilton, Germany GmbH, Robotics, Gräfeling, Germany) with dual indexing, and sequencing was performed on the NextSeq500/550 platform (Illumina). The fraction of tumor-derived DNA in cell-free DNA was estimated using the R package ichorCNA.	NextSeq 500	1
EGAD00001009797	Clinical & biomarker data from IMagyn050: treatment arm, treatment approach, outcome of surgery, ECOG PS, PD-L1 status, race, age, disease stage, progression free survival (investigator assessed), overall survival, histology, tumor mutation burden and status, genomic loss of heterozygosity, microsatellite status, BRCA1/2 mutation status, tissue of origin. Mutation status based on FoundationOne NGS for the following genes is also being provided: TP53, BRCA1, CCNE1, MYC, NF1, PIK3CA, RAD21, TERC, PRKCI, KRAS, RB1, BRCA2, ARID1A, AKT2, PTEN, KDM5A, NOTCH3, FGF12, ERBB2, CDK12, EMSY, WHSC1L1, BCL2L1, CDKN2A, GNAS, ARFRP1, ZNF217, SOX2, CCND2, FGF6, FGF23, LYN, MUTYH, AURKA, FGFR1, MCL1, MLL2, MYCL1, ZNF703, BRAF, MAP2K4, CREBBP, TSC2		1
EGAD00001009798	Smart-seq3 scRNA-seq of cells from primary (OV2295) and metastatic (OV2295R2) high-grade serous ovarian cancer cell-line	Illumina NovaSeq 6000	768
EGAD00001009799	RNA-Seq data for manuscript titled: CBL0137 impairs homologous recombination repair and sensitizes high-grade serous ovarian carcinoma to PARP inhibitors Sequencing	MGISEQ-2000RS	6
EGAD00001009800	This dataset was used to compare gene expression profiles of ex vivo isolated classical CD14+ monocytes from patients with moderate COVID-19 to those of healthy individuals. Blood samples were taken from patients with moderate COVID-19 admitted to hospitals in London (Hammersmith Hospital, Charing Cross Hospital, Saint Mary’s Hospital) 3-14 days after disease onset and 0-2 days after hospitalization and positive PCR, and before study treatment initiation. Moderate patients displayed mild or moderate COVID-19 pneumonia, defined as grade 3 or 4 WHO severity. Samples were collected from March 2020 to February 2021. Healthy donors were Imperial College staff with no prior diagnosis of or recent symptoms consistent with COVID-19, and where possible, were matched in age and sex distribution with COVID-19 patients. None of the participants of this study were COVID-19 vaccinated. Peripheral blood mononuclear cells (PBMCs) were isolated by Ficoll Hypaque (GE Healthcare) gradient centrifugation <4 hours after blood collection. CD14+ monocytes were isolated using a positive selection magnetic sorting kit (StemCell Technologies, UK) from total PBMC and stimulated with vehicle, UV-inactivated SARS-CoV-2 (CoV-2). RNA was isolated using the RNeasy Micro Plus Kit (QIAGEN) following the manufacturer’s guidelines. RNA-sequencing was performed by the Oxford Genomics Centre. PolyA-enriched strand-specific libraries were prepared using NEBNext Ultra II Directional RNA Library Prep Kits (Illumina). All samples were pooled together and 150bp PE reads were sequenced on a Novaseq 6000 system.	Illumina NovaSeq 6000	42
EGAD00001009801	The primitive streak emerges during the 14th day of human development and establishes dorsoventral and antero-posterior (craniocaudal) body axes. Segmentation along the craniocaudal axis is governed by Hox genes; four clusters of transcription factors whose 3’ to 5’ expression correlates with position along the axis. The precise utilisation of Hox genes in different cell types remains incompletely characterised in humans. In this study, we applied single-cell and spatial transcriptomics to contiguous regions of the human fetal spine between the 5th and 13th post-conception weeks. We built a detailed developmental atlas to examine the segmental expression of Hox genes across different cell types, observing that the Hox code was displayed by all anatomically fixed cell types along the craniocaudal axis. By contrast, mature derivatives of neural crest cells retained the anatomical Hox code of their origin within the crest, a pattern reproduced across neural crest derivatives in other human fetal organs. These findings indicate that scars of Hox gene expression persist in crest cells which may serve as barcodes of neural crest migration.	Illumina NovaSeq 6000	1
EGAD00001009806	64 paired-end Illumina RNAseq whole transcriptome stranded libraries from 32 pairs of matched primary and recurrent GBM	Illumina HiSeq 2500	64
EGAD00001009807	Single-cell RNA sequencing of 18 peripheral blood samples from six melanoma patients. The raw data is available as fastq files.	Illumina NovaSeq 6000	72
EGAD00001009808	Targeted DNA sequencing on 37 Merkel Cell Carcinomas from New Zealand with known Merkel cell polyomavirus status	Ion Torrent Proton NextSeq 500	92
EGAD00001009809	WGS of MAPKi acquired resistant samples from patients and PDX models	Illumina NovaSeq 6000	104
EGAD00001009811	Single-cell RNA-seq dataset recovering the gene expression programs of around 128,000 single cells derived from 17 fetal testes and 16 fetal ovaries from 5 to 12 postconceptional weeks.	Illumina HiSeq 4000	33
EGAD00001009812	Cancers of adults typically arise through progressive rounds of clonal diversification and intratumoral selective sweeps which generate a long mutational trunk with shorter subclonal branches. Here, we investigated whether tumors of young children exhibit the same phylogenetic configuration. We studied three infants, including two newborns, with the childhood kidney cancer, Wilms tumour, through whole genome sequencing of bulk tissues, of single cell derived organoids, and of microdissections. All three cancers exhibited unusual driver events, with tumours of newborns harbouring FOXR2 rearrangements, delineating a distinct variant of Wilms tumour. Phylogenetic analyses suggest that tumors were seeded in an early, possibly confined window of development. Unusually, following seeding there was extensive polyclonal diversification with little evidence of clonal sweeps, leading to a distinct phylogenetic configuration more reminiscent of normal tissues rather than of adult cancers. These findings indicate that some childhood cancers may diversify via unorthodox phylogenetic pathways.	HiSeq X Ten Illumina NovaSeq 6000	1
EGAD00001009813	Cancers of adults typically arise through progressive rounds of clonal diversification and intratumoral selective sweeps which generate a long mutational trunk with shorter subclonal branches. Here, we investigated whether tumors of young children exhibit the same phylogenetic configuration. We studied three infants, including two newborns, with the childhood kidney cancer, Wilms tumour, through whole genome sequencing of bulk tissues, of single cell derived organoids, and of microdissections. All three cancers exhibited unusual driver events, with tumours of newborns harbouring FOXR2 rearrangements, delineating a distinct variant of Wilms tumour. Phylogenetic analyses suggest that tumors were seeded in an early, possibly confined window of development. Unusually, following seeding there was extensive polyclonal diversification with little evidence of clonal sweeps, leading to a distinct phylogenetic configuration more reminiscent of normal tissues rather than of adult cancers. These findings indicate that some childhood cancers may diversify via unorthodox phylogenetic pathways.	Illumina HiSeq 4000	41
EGAD00001009814	Concatenated long-read single-cell RNA sequencing samples prepared using 10X and the HIT-scIsoSeq protocol. The sequencing was performed on a PacBio Sequel II machine. 3 ovarian cancer patients, 5 omentum biopsy samples: 3 metastasis samples (one per patient), 2 healthy samples (one per patient except Patient2). 4 bam files per metastasis sample, 2 bam files per healthy sample.	Sequel	5
EGAD00001009815	Illumina Novaseq paired-end single-cell RNA sequencing samples prepared using 10X Genomics platform. 3 ovarian cancer patients, 5 omentum biopsy samples: 3 HGSOC metastasis samples (one per patient), 2 healthy samples (one per patient except Patient2). 4 paired fastq files per sample.	Illumina NovaSeq 6000	5
EGAD00001009816	This dataset contains the CRAM files of the samples used for the article "Neutrophil extracellular traps have auto-catabolic activity and produce mononucleosome-associated circulating DNA" published in Genome Medicine.	Illumina MiSeq	12
EGAD00001009817	This dataset contains raw ITS amplicon sequencing data for 719 sputum samples from individuals in Guangdong province, China.	Illumina NovaSeq 6000	625
EGAD00001009818	Whole genome sequencing of 17 paired tumor-normal Hodgkin Lymphoma whole genomes. Bulk tumor (-T) and flow-sorted Reed Sternberg cell (-HRS) samples. Approximately 30x wgs normal depth, 40-50x wgs depth tumor samples.	Illumina NovaSeq 6000	53
EGAD00001009819	This dataset contains the raw sequencing data (Runs) from the 10x Genomics single-cell Multiome Experiments which belong to the MCL group	Illumina NovaSeq 6000	-
EGAD00001009820	This dataset contains the raw sequencing data (Runs) from all of the 10x Genomics single-cell CITE-seq Experiments, as well as the demultiplexed cell-donor identities matrices which were generated with Vireo (Analysis).	Illumina NovaSeq 6000	-
EGAD00001009821	This dataset contains the raw sequencing data (Runs) from all of the 10x Genomics single-cell Visium Experiments, as well as the corresponding imaging data (Analyses).	Illumina NovaSeq 6000	-
EGAD00001009822	This dataset contains the raw sequencing data (Runs) for all 10x Genomics single-cell ATAC-seq Experiments.	Illumina NovaSeq 6000	2
EGAD00001009823	This dataset contains the raw sequencing data (Runs) for all 10x Genomics single-cell RNA-seq Experiments.	Illumina NovaSeq 6000	2
EGAD00001009824	This dataset contains the raw sequencing data (Runs) from all of the 10x Genomics single-cell Multiome Experiments	Illumina NovaSeq 6000	1
EGAD00001009825	TRACERx NSCLC - Whole exome multiregion sequencing data from the 421 cohort	Illumina HiSeq 2000	2193
EGAD00001009826	This is a test dataset derived from public data of the 1000 Genomes Project. Its purpose is not to allow for any inference about cohort data or results, but to aid bioinformaticians in the technical development and testing of tools, as well as data consumers in learning how to access information. This dataset consists of 3 pairs of light-weight (sliced) files: BAM + BAI, CRAM + CRAI and VCF + TBI. These files can be downloaded directly through the EGA-download-client PyEGA3 (https://github.com/EGA-archive/ega-download-client). For any further questions, please contact the DAC (Helpdesk - email: helpdesk [at] ega-archive [dot] org).	unspecified	1
EGAD00001009827	Seven clonal organoid lines and one bulk wild-type control sample were paired-end whole-genome sequenced using the Illumina Novaseq 6000 system. We sequenced four clonal intestinal organoid lines harbouring engineered TP53 and FBXW7 mutations as well as three lines targeted for oncogenic APC/TP53/PIK3CA/SMAD4 mutations. The reads were mapped to hg38 genome assembly and data is provided as BAM files.	Illumina NovaSeq 6000	8
EGAD00001009828	FASTQ and BAM files for 46 samples (consisting of 35 samples for patients diagnosed with CLL, 5 samples consisting of a dilution series of patient DNA, and 6 samples of cell lines and dilutions involving the cell lines) from targeted capture next-generation sequencing from the LySeq panel. Libraries were sequenced on either Illumina HiSeq 2500 (LySeq66 PoP, R1-3) or Illumina NovaSeq 6000 (LySeq66 Validation Round).	Illumina HiSeq 2500 Illumina NovaSeq 6000	46
EGAD00001009829	11 plasma cases and 4 urine cases (mouse)	NextSeq 500	15
EGAD00001009830	Sixteen patients with refractory solid cancers received up to three distinct neoTCR-transgenic cell products, each expressing a patient-specific neoTCR, in a cell dose-escalation, first-in-human phase 1 clinical trial (NCT03970382). Included are the tumor and normal WXS and tumor RNAseq for dosed patients.	Illumina HiSeq 2500	86
EGAD00001009831	Sorted single CD8+T cells expressing CD14 from human liver for SMARTSeq2. Livers processed: Kucykowicz et al STAR Prot 2022:pubmed.ncbi.nlm.nih.gov/35516846/ Published: Pallett et al Nature 2022 Tissue CD14+CD8+T-cells are reprogrammed by myeloid cells and modulated by LPS. A modified SMART-seq2 protocol was performed on the single flow cytometry-sorted cells as previously described. After cDNA generation, libraries were prepared (384 cells per library) using the Illumina Nextera XT kit (Illumina). Each library was sequenced to achieve a minimum depth of 1-2 million raw reads per cell using an Illumina HiSeq 4000 using v. 4 SBS chemistry to generate 75-bp paired-end reads.	Illumina NovaSeq 6000	378
EGAD00001009834	RNA-Seq data of 46 matched lobular breast cancer metastatic samples obtained from 21 unique patients from GELATO clinical trial assayed at three timepoints: at baseline (directly after patient randomization), pre-atezolizumab (after induction treatment with carboplatin for two weeks) and on atezolizumab (after two cycles of atezolizumab combined with carboplatin). The included raw transcriptome sequencing data in fastq format was generated using Illumina NovaSeq 6000 from fresh frozen material.	Illumina NovaSeq 6000	46
EGAD00001009835	RNA-Seq data of 10 lobular breast cancer primary tumors and 3 local recurrences obtained from 11 unique patients from GELATO clinical trial. The included raw transcriptome sequencing data in fastq format was generated using Illumina NovaSeq 6000 from archived FFPE material.	Illumina NovaSeq 6000	13
EGAD00001009836	Paired-end whole exome sequencing of 10 lobular breast cancer primary tumors, 3 local recurrences and matched normal samples obtained from 10 unique patients from the GELATO clinical trial. The included raw sequencing data in fastq format was generated using Illumina NovaSeq 6000 from archived FFPE material (tumor data) and fresh frozen material (matched normal data).	Illumina NovaSeq 6000	23
EGAD00001009837	Paired-end whole exome sequencing of 19 lobular breast cancer metastatic tumors and matched normal samples obtained from 19 unique patients from the GELATO clinical trial. The included raw sequencing data in fastq format was generated using Illumina NovaSeq 6000 from fresh frozen material.	Illumina NovaSeq 6000	38
EGAD00001009838	15 healthy controls, 25 colorectal cancer patients without liver metastasis and 24 colorectal cancer patients with liver metastasis (target capture)	NextSeq 500	64
EGAD00001009839	18 plasma samples and their paired 18 urinary cfDNA samples without cancer	NextSeq 500	36
EGAD00001009840	12 Nasopharyngeal carcinoma patients without treatment and 12 Nasopharyngeal carcinoma patients with treatment (WGS)	NextSeq 500	48
EGAD00001009844	RNAseq of 45 high-grade serous ovarian cancer tumour samples. Libraries were generated using the NEB Ultra II Directional RNA library Prep kit with polyA enrichment. Libraries were sequenced as paired-end 50 or 100bp on an Illumina NextSeq or NovaSeq.	Illumina NovaSeq 6000	1
EGAD00001009845	Clonal evolution drives cancer progression and therapeutic resistance. Recent studies revealed divergent longitudinal trajectories in gliomas, but early molecular traits steering post-treatment cancer evolution remain unclear. We analyzed sequencing data of 544 initial-recurrent adult diffuse glioma pairs to identify genomic and transcriptomic early predictors of tumor evolution in each molecular subtype.	Illumina NovaSeq 6000	366
EGAD00001009847	We provide whole exome DNA sequencing data in fastq format for 23 clinical samples of chronic myeloid leukemia stem cells (CML-SC) plus two buccal swab-derived normal samples. CML samples are comprised of 4 to 8 replicates from two patients, at diagnosis and after treatment. Single CML stem cells before treatment and single non-transformed hematopoietic stem cells (HSC) at remission were selected from bone marrow samples by FACS, according to newly identified genetic markers CD33+CD26+ at diagnosis and CD33+CD26-/CD33-CD26- at remission. WES libraries of colony forming assays derived CML-SC and HSC populations were prepared using Agilent SureSelect Human All Exon V6 kit and sequenced running 150 cycles (2x 75bp paired-end) on an Illumina NextSeq 500 platform.	NextSeq 500	25
EGAD00001009848	Pathogenic germline variants in the protection of telomeres 1 gene (POT1) have been associated with predisposition to a range of tumor types, including melanoma, glioma, leukemia and cardioangiosarcoma. We sequenced all coding exons of the POT1 gene in 2,929 European-descent melanoma cases and 3,298 controls, identifying 43 protein-changing genetic variants. We performed functional studies on each of these variants and explored their possible contribution to disease risk.	Illumina MiSeq	6226
EGAD00001009851	A collection of four induced pluripotent stem cell models (iPSC) derived from patients diagnosed with Spinocerebellar ataxia 15 (SCA15). Spinocerebellar ataxia 15 (SCA15) is a neurological condition characterised by progressive gait and limb ataxia as well as abnormalities in eye movement and difficulties with balance, speech and swallowing (Synofzik et al., 2011). Whole Genome Sequencing was performed to confirm the presence of a heterozygous deletion in the inositol 1,4,5-triphosphate receptor gene (ITPR1), characteristic of the disease. Cell model names: HPSI0216i-vieg_5 or Vieg_5 (WTSIi472-A), HPSI0216i-vieg_3 or Vieg_3 (WTSIi472-B), HPSI0216i-dacv_6 or Dacv_6 (WTSIi554-A), HPSI0216i-boho_3 or Boho_3 (WTSIi502-A). All iPSC models are available via ECACC-Culture Collections.	Illumina NovaSeq 6000	1
EGAD00001009852	Whole-exome sequence (WES) data of tumor-normal pairs from 40 ENKTCL patients and RNA sequence (RNA-seq) data of tumors from 20 ENKTCL patients.	HiSeq X Ten Illumina HiSeq 2000 Illumina NovaSeq 6000 PromethION unspecified	48
EGAD00001009853	198 exome sequencing samples	Illumina NovaSeq 6000	198
EGAD00001009854	This data set includes bam files (aligned to hg38) from the germline of children who have pathogenic mutations in cancer predisposing genes	Illumina HiSeq 2500 NextSeq 550	4
EGAD00001009855	Dataset containing 2068 WES tumor and control samples of central nervous system neoplasm patients. The data was sequenced on an Illumina NextSeq 500 using a NPHD2015A kit. The sequencing was always paired.	NextSeq 500	2068
EGAD00001009856	This dataset consists of WGS from six patients with a recent diagnosis of gastrointestinal adenocarcinoma. DNA-Seq libraries were prepared from platelet DNA (pDNA) and plasma cfDNA obtained from the same peripheral blood sample. pDNA was size selected into 2 groups (small, s-pDNA: less than 600 bp and long, b-pDNA: greater than 600 bp). The short fragments were further cleaned to remove <100 bp fragments and large platelet fragments were fragmented via sonication. Libraries were then prepared using the NEBNext Ultra II DNA Sample Preparation Kit for Illumina according to the manufacturer’s protocol and sequenced on an Illumina NextSeq 500 (300 cycle PE) at low-pass (0.1X) for all samples, and 10X for the cfDNA and s-pDNA, using four lanes in each sample.	NextSeq 500	20
EGAD00001009857	Fastq files from RNAseq of breast cancer bone metastases PDX of tumor HBC-124 treated by IACS-010759 (4 samples) or not (4 samples).	Illumina NovaSeq 6000	8
EGAD00001009858	WGS data of multi-region samples from PLANET 123 Patient cohort	HiSeq X Ten Illumina HiSeq 4000	12
EGAD00001009859	RNA-seq data of multi-region samples from PLANET 123 Patient cohort	Illumina HiSeq 4000	-
EGAD00001009860	The dataset for the study “Dynamics of sequence and structural cell-free DNA landscapes in small-cell lung cancer” includes 171 bam files from targeted next-generation sequencing (TEC-Seq) from plasma cell-free DNA and matched white blood cell DNA from 33 individuals with small cell lung cancer, alongside 10 bam files from whole exome sequencing of tumor and matched normal DNA for 5 individuals with small cell lung cancer.	Illumina HiSeq 2500	181
EGAD00001009861	scRNASeq analysis of human Lin neg lymphocytes from control liver, cirrhotic liver, tonsil, duodenum, and colon tissues.	Illumina NovaSeq 6000	15
EGAD00001009862	RNAseq data from the TRACERx 421 cohort	Illumina HiSeq 4000	1051
EGAD00001009863	The pediatric cancer cohort in this study included 70 PDX models from 65 different individuals. This cohort included a total of 16 different pediatric solid tumor subtypes, including fourteen Wilms tumors, thirteen hepatoblastomas, thirteen osteosarcomas, ten germ cell tumors, four neuroblastomas, three clear cell sarcomas, two adrenal cortical carcinomas, two leydig cell tumors, two medulloblastomas, one embryonal rhabdomyosarcoma (ERMS), one Ewing sarcoma, one pleomorphic sarcoma, one adenocarcinoma, one glioblastoma, one mesothelioma and one ovarian tumor. Notably, we have five samples with multiple PDX models from same patient, including two cases with duplicates (564 and 564-Dup, 1796 and 1796-Dup), one case with two different metastasis (560-SM, 560-LM), one case with two blocks from same tumor (1939 and 1939-Dup), and one case with different primary tumor from same patient (2264 and 1932). We have a total of 353 sequencing data, including 82 RNA sequencing data (RNA-seq), 138 whole-exome sequencing (WES) and 135 low-pass whole-genome sequencing (WGS). For RNA-seq data, we have 61 PDXs and 21 PTs; for WES, we have 67 PDXs, 30 PTs and 40 matched normal germlines; for WGS, we have 64 PDXs, 30 PTs and 40 matched normal germlines. Of which, 19 PT-PDX paired RNA-seq, 28 paired PT-PDX paired WES and WGS were included.	Illumina HiSeq 3000	352
EGAD00001009864	Data from NABUCCO cohort 2 (NCT03387761). This dataset includes Whole exome DNA sequencing on bladder tumor samples matched with blood samples for patients from NABUCCO Cohort 2 (Cohort 2A and Cohort 2B). The data is pre-treatment	Illumina NovaSeq 6000	59
EGAD00001009865	Single-cell RNA sequencing was performed on bone marrow mononuclear cells of a patient with acute myeloid leukemia with erythroid differentiation of the blasts and on peripheral blood mononuclear cells of a patient with acute myeloid leukemia with megakaryocytic differentiation of the blasts. The dataset contains raw fastq files of these two samples with single-cell RNA sequencing performed using the 10x Genomics platform.	Illumina NovaSeq 6000	10
EGAD00001009866	whole genome sequencing data	Illumina NovaSeq 6000	512
EGAD00001009867	methyl-seq data 64 cases	Illumina NovaSeq 6000	128
EGAD00001009868	ATAC-seq data 72 cases	Illumina HiSeq 2500	144
EGAD00001009869	Using meRIP-sequencing, we profiled N-6 methyladenosine (m6A) in a cohort of 148 primary prostate cancer samples as part of the Canadian Prostate Cancer Genome Network project (CPC-GENE). Paired-end sequencing of 150 bp reads were mapped to GRCh38.p13 using annotations from gencode.v34.chr_patch_hapl_scaff.annotation.gtf. Peaks were called using MeTPeak, joint peaks were identified and IP reads were quantitated, normalized and adjusted using Input reads to obtain estimates of m6A abundance.	HiSeq X Ten	148
EGAD00001009870	In Vivo Loss of Tumorigenicity in a Patient-Derived Orthotopic Xenograft Mouse Model of Ependymoma. Whitehouse et al. 2023 Frontiers in Oncology. We describe the establishment of a patient-derived orthotopic xenograft (PDOX) model of posterior fossa A (PFA) EPN, derived from a metastatic cranial lesion. Patient and PDOX tumors were analyzed using RNA sequencing. RNAseq data (paired end) provided here correspond to Primary tumour, two metastatic lesions (one spinal, one cranial), and a patient-derived xenograft derived from the patient.	unspecified	7
EGAD00001009871	van Hijfte snRNA glioblastoma dataset	Illumina NovaSeq 6000	1
EGAD00001009872	LP2100030-DNA_A02	HiSeq X Ten	1
EGAD00001009873	LP2100082-DNA_A01	HiSeq X Ten	1
EGAD00001009874	LP2100030-DNA_A08	HiSeq X Ten	1
EGAD00001009875	LP2100030-DNA_A01	HiSeq X Ten	1
EGAD00001009876	LP2100030-DNA_A07	HiSeq X Ten	1
EGAD00001009877	LP2100030-DNA_A06	HiSeq X Ten	1
EGAD00001009878	LP2100030-DNA_A03	HiSeq X Ten	1
EGAD00001009879	LP2100030-DNA_A09	HiSeq X Ten	1
EGAD00001009880	LP2100030-DNA_A05	HiSeq X Ten	1
EGAD00001009881	LP2100030-DNA_A04	HiSeq X Ten	1
EGAD00001009882	LP2100082-DNA_A02	HiSeq X Ten	1
EGAD00001009883	We have assessed the added value of long-read sequencing for PGx focusing on the clinically important and highly polymorphic CYP2C19 gene within 48 samples.	PacBio RS II	1
EGAD00001009884	WGS data normal and hypomethylation	Illumina NovaSeq 6000	6
EGAD00001009885	Nanopore pericentromere normal methylation	MinION	1
EGAD00001009886	Nanopore pericentromere hypomethylation	MinION	1
EGAD00001009887	RNA seq data normal and hypomethylation	Illumina HiSeq 2000	6
EGAD00001009888	Whole genome sequencing data of 9 high-grade serous carcinoma (HGSC) patients (55 samples) sequenced with HiSeq X Ten.	HiSeq X Ten	55
EGAD00001009890	This dataset contains extensive metadata from the cross-sectional flow of the Isala Citizen Science project. The metadata file contains ENA accession numbers for 16S microbiome data and cleaned responses to questionnaires.		3453
EGAD00001009891	LP2100082-DNA_G04	HiSeq X Ten	2
EGAD00001009892	LP2100082-DNA_A03	HiSeq X Ten	1
EGAD00001009893	LP2100082-DNA_A04	HiSeq X Ten	1
EGAD00001009894	LP2100082-DNA_A05	HiSeq X Ten	1
EGAD00001009895	LP2100082-DNA_A06	HiSeq X Ten	1
EGAD00001009896	LP2100082-DNA_A07	HiSeq X Ten	1
EGAD00001009897	LP2100082-DNA_B01	HiSeq X Ten	1
EGAD00001009898	LP2100082-DNA_B02	HiSeq X Ten	1
EGAD00001009899	LP2100082-DNA_B03	HiSeq X Ten	1
EGAD00001009900	LP2100082-DNA_B04	HiSeq X Ten	1
EGAD00001009901	LP2100082-DNA_B05	HiSeq X Ten	1
EGAD00001009902	LP2100082-DNA_B06	HiSeq X Ten	1
EGAD00001009903	LP2100082-DNA_B07	HiSeq X Ten	1
EGAD00001009904	LP2100082-DNA_C01	HiSeq X Ten	1
EGAD00001009905	LP2100082-DNA_C02	HiSeq X Ten	1
EGAD00001009906	LP2100082-DNA_C03	HiSeq X Ten	1
EGAD00001009907	LP2100082-DNA_C05	HiSeq X Ten	1
EGAD00001009908	LP2100082-DNA_D02	HiSeq X Ten	1
EGAD00001009909	LP2100082-DNA_D01	HiSeq X Ten	1
EGAD00001009910	LP2100082-DNA_D03	HiSeq X Ten	1
EGAD00001009911	LP2100082-DNA_D05	HiSeq X Ten	1
EGAD00001009912	LP2100082-DNA_D06	HiSeq X Ten	1
EGAD00001009913	LP2100082-DNA_E01	HiSeq X Ten	1
EGAD00001009914	LP2100082-DNA_E02	HiSeq X Ten	1
EGAD00001009915	LP2100082-DNA_E03	HiSeq X Ten	1
EGAD00001009916	LP2100082-DNA_E04	HiSeq X Ten	1
EGAD00001009917	LP2100082-DNA_E05	HiSeq X Ten	1
EGAD00001009918	LP2100082-DNA_E06	HiSeq X Ten	1
EGAD00001009919	LP2100082-DNA_F01	HiSeq X Ten	1
EGAD00001009920	LP2100082-DNA_F02	HiSeq X Ten	1
EGAD00001009921	LP2100082-DNA_F03	HiSeq X Ten	1
EGAD00001009922	LP2100082-DNA_F04	HiSeq X Ten	1
EGAD00001009923	LP2100082-DNA_F05	HiSeq X Ten	1
EGAD00001009924	LP2100082-DNA_F06	HiSeq X Ten	1
EGAD00001009925	LP2100082-DNA_G01	HiSeq X Ten	1
EGAD00001009926	LP2100082-DNA_G02	HiSeq X Ten	1
EGAD00001009927	LP2100082-DNA_G03	HiSeq X Ten	1
EGAD00001009928	LP2100082-DNA_H01	HiSeq X Ten	1
EGAD00001009929	LP2100082-DNA_H02	HiSeq X Ten	1
EGAD00001009930	LP2100082-DNA_H03	HiSeq X Ten	1
EGAD00001009931	LP2100082-DNA_H04	HiSeq X Ten	1
EGAD00001009932	LP2100082-DNA_H05	HiSeq X Ten	1
EGAD00001009933	LP2100098-DNA_A01	HiSeq X Ten	1
EGAD00001009934	LP2100098-DNA_A03	HiSeq X Ten	1
EGAD00001009935	LP2100098-DNA_A05	HiSeq X Ten	1
EGAD00001009936	LP2100098-DNA_A07	HiSeq X Ten	1
EGAD00001009937	LP2100098-DNA_A09	HiSeq X Ten	1
EGAD00001009938	LP2100098-DNA_B01	HiSeq X Ten	1
EGAD00001009939	LP2100098-DNA_B03	HiSeq X Ten	1
EGAD00001009940	LP2100098-DNA_B05	HiSeq X Ten	1
EGAD00001009941	LP2100098-DNA_B07	HiSeq X Ten	1
EGAD00001009942	LP2100098-DNA_B09	HiSeq X Ten	1
EGAD00001009943	LP2100098-DNA_C01	HiSeq X Ten	1
EGAD00001009944	LP2100098-DNA_C03	HiSeq X Ten	1
EGAD00001009945	LP2100098-DNA_C05	HiSeq X Ten	1
EGAD00001009946	LP2100098-DNA_C07	HiSeq X Ten	1
EGAD00001009947	LP2100098-DNA_C09	HiSeq X Ten	1
EGAD00001009948	LP2100098-DNA_D01	HiSeq X Ten	1
EGAD00001009949	LP2100098-DNA_D03	HiSeq X Ten	1
EGAD00001009950	LP2100098-DNA_D05	HiSeq X Ten	1
EGAD00001009951	LP2100098-DNA_D07	HiSeq X Ten	1
EGAD00001009952	LP2100098-DNA_D09	HiSeq X Ten	1
EGAD00001009953	LP2100098-DNA_E03	HiSeq X Ten	1
EGAD00001009954	LP2100098-DNA_E05	HiSeq X Ten	1
EGAD00001009955	LP2100098-DNA_E07	HiSeq X Ten	1
EGAD00001009956	LP2100098-DNA_E09	HiSeq X Ten	1
EGAD00001009957	LP2100098-DNA_F01	HiSeq X Ten	1
EGAD00001009958	LP2100098-DNA_F03	HiSeq X Ten	1
EGAD00001009959	LP2100098-DNA_F05	HiSeq X Ten	1
EGAD00001009960	LP2100098-DNA_F07	HiSeq X Ten	1
EGAD00001009961	LP2100098-DNA_G01	HiSeq X Ten	1
EGAD00001009962	LP2100098-DNA_G05	HiSeq X Ten	1
EGAD00001009963	LP2100098-DNA_G07	HiSeq X Ten	1
EGAD00001009964	Bolleboom-Gao peri-tumoral snRNA-seq glioblastoma dataset 2022/A	Illumina NovaSeq 6000	1
EGAD00001009965	Imputed HLA alleles and variation. Imputation was carried out using the Multi-Ethnic HLA reference panel (version 1.0 2021) available on the Michigan imputation server		-
EGAD00001009966	Phenotype and covariates		-
EGAD00001009967	LP2100098-DNA_H01	HiSeq X Ten	1
EGAD00001009968	LP2100098-DNA_H03	HiSeq X Ten	1
EGAD00001009969	LP2100098-DNA_H05	HiSeq X Ten	1
EGAD00001009970	LP2100098-DNA_H07	HiSeq X Ten	1
EGAD00001009971	Variant Calls for all 97 consenting participants in the study.		1
EGAD00001009972	Low-pass whole genome sequencing samples from pediatric solid tumor patients who are deceased	Illumina HiSeq 4000	26
EGAD00001009973	Autopsy-derived, later snap-frozen tissue fragments from a 5-year-old female with recurrent metastatic fusion-negative embryonal rhabdomyosarcoma (RMS) primary tumor of the left thigh were analyzed. No quality matched tissue was available. Prior panel analysis identified 2 prominent genetic changes: NRAS Q61K, PIK3CA H1047R, CDKN2A/B loss, and ERBB3 overexpression.	Illumina HiSeq 2000	2
EGAD00001009974	Dataset contains four samples taken from a neonate (patient CHI-0391) with congenital KMT2A-rearranged Acute Lymphoblastic Leukemia with rare IKZF1 gene fusions. Sequencing was carried out using mRNA-seq on an Illumina NextSeq 500 machine.	NextSeq 500	4
EGAD00001009975	LP2100100-DNA_A01	HiSeq X Ten	1
EGAD00001009976	LP2100100-DNA_A03	HiSeq X Ten	1
EGAD00001009977	LP2100100-DNA_C01	HiSeq X Ten	1
EGAD00001009978	LP2100100-DNA_C03	HiSeq X Ten	1
EGAD00001009979	LP2100100-DNA_E01	HiSeq X Ten	1
EGAD00001009980	LP2100100-DNA_E03	HiSeq X Ten	1
EGAD00001009981	LP2100100-DNA_G01	HiSeq X Ten	1
EGAD00001009982	LP2100100-DNA_G03	HiSeq X Ten	1
EGAD00001009983	LP2100100-DNA_H03	HiSeq X Ten	1
EGAD00001009984	Pre-diagnostic saliva microbiota samples of Finnish children (aged 11/12 years). This is a case-control study, where case refers to the children who developed Type 1 DM or IBD later in life and control refers to the children who were free from these diseases. The aim of the study was to find biomarkers in saliva microbiota that may help us predict DM or IBD before the onset of these diseases.	Illumina MiSeq	163
EGAD00001009985	The dataset contains proteomics data of seven healthy family members. The samples were taken from peripheral blood mononuclear cells.		1
EGAD00001009986	single cell RNAseq and TCR sequencing data of 5 individuals. 10x Genomics VDJ single cell sequencing (17 runs) and 10x Genomics sc RNA-Seq (19 runs). The VDJ sequencing was done on a Nextseq 550 using the Chromium Single Cell VDJ Reagent Kit. The RNA-Seq was done either on HiSeq4000 or NovaSeq 6000 using the Chromium Single Cell 5 Reagent Kit.	Illumina HiSeq 4000 Illumina NovaSeq 6000 NextSeq 550	10
EGAD00001009987	This dataset includes NGS profiling of 13 women with simultaneous bilateral breast cancer. Seven women have WES of untreated surgical resections and matched healthy tissue. The six other women have WES of healthy tissue, WES+RNAseq of pre-neoadjuvant tumor biopsies, and when residual disease was present (6 tumors in 4 patients), WES+RNAseq of residual disease from post-neoadjuvant therapy surgery. One patient from the WES+RNAseq cohort had multifocal bilateral disease at diagnosis so there are 2 pre-neoadjuvant biopsy samples from each breast.	Illumina HiSeq 2500 Illumina NovaSeq 6000	47
EGAD00001009988	The dataset includes cram files from WGS of 2 NEN tumors and matched PDTOs. For all tumors, cram files from WGS of matched normal tissue from the corresponding patients are included. Analysis VCF files are also included in the dataset. The sequencing was done with a NovaSeq 6000 instrument.	Illumina NovaSeq 6000	1
EGAD00001009989	The dataset includes cram files from WGS of 6 NEN tumors or metastases and matched PDTOs. For all tumors, cram files from WGS of either matched normal tissue or matched blood from the corresponding patients are included. Analysis VCF files are also included in the dataset. The sequencing was done with a NovaSeq 6000 instrument.	Illumina NovaSeq 6000	1
EGAD00001009990	The dataset includes cram files from WGS of 2 LCNEC tumors and matched PDTOs. For all tumors, cram files from WGS of matched normal tissue derived organoids from the corresponding patients is included. Analysis VCF files are also included in the dataset. The sequencing was done with a NovaSeq 6000 instrument.	Illumina NovaSeq 6000	1
EGAD00001009991	The dataset includes fastq files from 4 NEN tumors or metastases and matched PDTOs. The sequencing was done with either a Nextseq 2000 or a NovaSeq 6000 instrument.	Illumina NovaSeq 6000	1
EGAD00001009992	The dataset includes fastq files from 15 NEN tumors or metastases and matched PDTOs. The sequencing was done with either a Nextseq 2000 or a NovaSeq 6000 instrument.	Illumina NovaSeq 6000 NextSeq 500	1
EGAD00001009993	The dataset includes fastq files from 2 LCNEC tumors and matched PDTOs. The sequencing was done with either a Nextseq 2000 or a NovaSeq 6000 instrument.	Illumina NovaSeq 6000 NextSeq 500	1
EGAD00001009994	The dataset includes RNA-seq expression R data, RNA-seq gene counts matrix, and RNA-seq gene FPKM matrix from 21 NEN tumors or metastases and matched PDTOs. The sequencing was done with either a Nextseq 2000 or a NovaSeq 6000 instrument.		1
EGAD00001009995	Adipose-derived mesenchymal stromal cells from subcutaneous (n=4) and visceral (n=4) tissue, along with dermal fibroblasts (n=3) were analyzed by single-cell RNA sequencing.	NextSeq 550	11
EGAD00001009997	RNAseq in cryostat-microdissected metastatic and primary prostate cancer tissues and matched noncancerous tissues from the same study subjects.	HiSeq X Ten	130
EGAD00001009998	Long-read (PacBio) RNA sequencing dataset of in vitro stimulated PBMC cells. 5 samples consisting of 1 RPMI control and 4 stimulus conditions (lipopolysaccharide (LPS), polyI-polyC, S. aureus and C. albicans) all originating from one donor. Files are raw BAM format files generated by Sequel 2 machine.	Sequel	5
EGAD00001009999	Cell-free methylated DNA immunoprecipitation sequencing of plasma samples from healthy control patients.	unspecified	28
EGAD00001010000	Shallow whole genome sequencing of plasma samples from healthy control patients.	unspecified	30
EGAD00001010001	Targeted panel sequencing of hereditary cancer syndrome-associated genes (TP53, BRCA1, BRCA2, PALB2, MLH1, MSH2, MSH6, PMS2, EPCAM, and APC) in plasma and buffy coat samples from healthy control patients.	unspecified	23
EGAD00001010002	Shallow whole genome sequencing of plasma samples from patients with Li-Fraumeni syndrome.	unspecified	173
EGAD00001010003	RNA-seq data. 287 Japanese RCC cases.	Illumina HiSeq 2500	574
EGAD00001010004	CD8 T cells (5 samples treated, 5 samples control) from the same donors (5 donors, paired design) were subjected to RNA-seq (Illumina stranded mRNA) processing. Single-end fastq files are supplied.	NextSeq 550	10
EGAD00001010005	CD8 T cells were FACS sorted and processed with 10x Genomics Chromium Next GEM SingleCell V(D)J Reagents Kits v1.1 sequencing. In total 6 samples were processed. Fastq files are supplied.	NextSeq 550	6
EGAD00001010007	CD8 T cells were FACS sorted and processed with 10x Genomics Chromium Next GEM SingleCell V(D)J Reagents Kits v1.1 sequencing. In total 6 samples were processed. Fastq files are supplied.	NextSeq 550	6
EGAD00001010008	In Vivo Loss of Tumorigenicity in a Patient-Derived Orthotopic Xenograft Mouse Model of Ependymoma. Whitehouse et al. 2023 Frontiers in Oncology. We describe the establishment of a patient-derived orthotopic xenograft (PDOX) model of posterior fossa A (PFA) EPN, derived from a metastatic cranial lesion. Patient and PDOX tumors were analyzed using RNA sequencing. WGS data (paired end) provided here correspond to germline DNA, Surgical sample 4 (described in the above manuscript as a cranial metastasis of PFA ependymoma), and a patient-derived xenograft derived from the patient.	HiSeq X Ten unspecified	10
EGAD00001010009	Fastq files from 3 unaffected TET2 mutation carriers, 2 mutation carriers diagnosed with lymphoma and 3 family members without TET2 mutation. Time series data collected from 0, 6 and 12 months after daily dose of 1g vitamin C.	Illumina NovaSeq 6000	24
EGAD00001010010	Here, we explore the molecular signatures in RNA sequencing data from blood associated with disease severity as measured in Myotonic dystrophy type 1 (DM1) patients with less than 400 CTG-repeat length size in the DMPK gene in blood. These DM1 patients participated in the OPTIMISTIC study. This approach involved stratifying those within the OPTIMISTIC study into different patient groups with different degrees of disease severity (as measured by the muscle-impairment rating scale (MIRS)) and assessed at baseline. Patients were divided into groups with mild (MIRS 1–2) and severe (MIRS 3–5) neuromuscular symptoms with different DMPK repeat length characteristics. Therefore these .Bam files are baseline samples from this study.	Ion Torrent Proton	32
EGAD00001010011	Targeted panel sequencing of hereditary cancer syndrome-associated genes (TP53, BRCA1, BRCA2, PALB2, MLH1, MSH2, MSH6, PMS2, EPCAM, and APC) in plasma and buffy coat samples from patients with Li-Fraumeni syndrome.	unspecified	151
EGAD00001010012	Dataset containing 48 samples: 12 per timepoint (before or after treatment) and group (MMR vaccine or Placebo). Each sequencing run contains the sequencing data from 4 randomized samples. Genotype data is used to demultiplex sample ids inside of each pool. Phenotype data contains the information per pool.	Illumina NovaSeq 6000	23
EGAD00001010013	Cell-free methylated DNA immunoprecipitation sequencing of plasma samples from patients with Li-Fraumeni syndrome.	unspecified	174
EGAD00001010014	WGS Cram files from the Childhood Cerebral Palsy Integrated Neuroscience Discovery Network "CP-NET" - Clinical Database Platforms - Phase 3 project.	HiSeq X Ten	287
EGAD00001010015	Genotype data typed on the Human Origins array for 1510 individuals published in "Dense sampling of ethnic groups within African countries reveals fine-scale genetic structure and extensive historical admixture."		1
EGAD00001010016	BAM files containing paired-end mtDNA sequencing data from morphologically normal human liver. CCO-proficient hepatocytes acquired from human livers in which clonal CCO-deficient hepatocyte patches had been previously identified. Individual BAM files are named according to their patch, line and sample location, where "PT" denotes tissue near the portal triad, "CV" tissue near the central hepatic vein, and "Mid" tissue midway between these two structures. "Stroma" control samples were used for identifying germ-line variants. Sequenced on NextSeq 500 platform.	NextSeq 500	80
EGAD00001010017	This dataset is part of a study that aims to compare in vivo human trophoblast differentiation into EVTs to different in vitro trophoblast organoids using single-cell and single-nuclei RNA sequencing. This specific dataset includes scRNA-seq and snRNA-seq data from trophoblast stem cells (TSCs). Trophoblast stem cell (TSC) lines BTS5 and BTS11 derived by Okae and colleagues were grown as described previously (Okae et al. 2018) together with EVT differentiation media. This study shows that the main regulatory programs mediating EVT invasion in vivo are preserved in in vitro models of EVT differentiation from primary trophoblast organoids and trophoblast stem cells. Data for primary trophoblast organoids is available under E-MTAB-12650.	Illumina NovaSeq 6000	6
EGAD00001010018	Chimeric antigen receptor (CAR)-modified T-cells have become established as an effective treatment of haematological cancers. In the context of relapsed and refractory childhood pre-B cell acute lymphoblastic leukaemia (B ALL), CD19 targeting CAR T-cells often induce durable remissions. Previously, we generated a novel low-affinity CAR incorporating a CD19-specific single-chain variable fragment (scFV) called CAT, displaying a faster off-rate of interaction than the FMC63 CD19 binder used in prior clinical studies. Here, we systematically analysed CD19 CAR T-cells of ten children with relapsed or refractory B ALL enrolled in the CARPALL trial (NCT02443831). To characterize persisting CD19 CAR T-cells, we performed high throughput single-cell gene expression and T-cell receptor (TCR) sequencing of infusion products and serial blood and bone marrow samples up to five years post-infusion. We isolated CAR T-cells from peripheral blood or bone marrow by flow cytometry for CD3 and CAR expression, prior to single cell sequencing (Chromium 10X) platform.	Illumina HiSeq 4000 Illumina NovaSeq 6000	66
EGAD00001010019	Sequencing data of 10 high-grade serous carcinoma (HGSC) patients (58 samples including blood derived normal samples as germline controls, fresh frozen tissue samples as tissue controls and organoids) sequenced with HiSeq X Ten / BGISEQ-500 / MGISEQ-2000.	HiSeq X Ten unspecified	58
EGAD00001010020	Single-cell RNA-seq and spatial transcriptomics data for 12 patients with sarcoidosis. From each patient, we analyzed skin biopsies of both lesional and non-lesional skin and we performed spatial transcriptomics for lesional skin samples. The data are provided as aligned BAM files.	Illumina HiSeq 4000 Illumina NovaSeq 6000	41
EGAD00001010021	Air Pollution Study - DuplexSeq data	unspecified	81
EGAD00001010022	Bank of treated and control PDXs metastatic colorectal cancer sample RNAseq	unspecified	185
EGAD00001010023	Bulk B Cell Receptor high-throughput sequencing data across 27 metastatic breast tumours obtained from 8 donors with therapy-resistant lethal metastatic breast cancer at the time of a warm autopsy. The 35 samples were sequenced on an Illumina MiSeq instrument and their raw FastQ files deposited here.	Illumina MiSeq	27
EGAD00001010024	Bronchial brushing dataset from healthy never-smokers after exposure to diesel exhaust. Include 18 samples from 9 research participants who underwent bronchoscopy after controlled exposure to diesel exhaust. Main study design described in detail in Ryu et al 2022 AJRCCM (PMID: 35202552). This dataset was used in Hill et al Nature 2023.	Illumina NovaSeq 6000	18
EGAD00001010025	scRNA This dataset contains 50 scRNA-seq samples from bone marrow aspirates of 11 multiple myeloma patients experiencing long-term survival and 3 healthy donors. For each donor, total bone marrow and CD3+ T cells were sequenced. For multiple myeloma patients, paired samples were collected at initial diagnosis and between 7-17 years after first-line therapy. Bone marrow mononuclear cells were isolated by Ficoll density gradient centrifugation. For sorting of total bone marrow cells singlet, live cells were gated and sorted, for sorting of T cells CD45+, CD3+ cells were gated and sorted on either FACSAria Fusion or FACSAria II. Single-cell RNA sequencing were generated using 10x Genomics single-cell RNAseq technology (Chromium Single Cell 3’ Solution v2) according to the manufacturer’s protocol and sequenced on an Illumina HiSeq4000 (paired end, 26 and 74 bp). Bulk RNA Singlet, live CD3+CD4- CXCR3+CD8+ and CD3+CD4- CXCR3-CD8+ cells were sorted from 7 bone marrow and 3 peripheral blood samples of 7 multiple myeloma patients using a FACSAria Fusion machine. Bulk-RNA sequencing libraries were generated using the SMART Seq Stranded Total RNA-Seq kit (Takara) and sequenced using the Illumina NovaSeq 6000 platform (2 x 100 bp).	Illumina HiSeq 4000 Illumina NovaSeq 6000	82
EGAD00001010026	The dataset consists of: 51 paired tumor/normal WGS samples (26 tumors and 25 normals), and 13 normal targeted samples.	Illumina NovaSeq 6000	64
EGAD00001010028	Metagenomics shotgun sequencing was conducted on fecal samples from the Australian patients enrolled in the OpACIN-neo clinical trial (n = 38). Metagenomic shotgun sequencing was performed utilizing the same DNA from the same preparations as for the 16S rRNA gene analysis. Individual libraries were prepared using Nextera XT, and sequencing was performed on the Illumina NovaSeq 6000 S1 (2 x 150bp; Xp workflow).	Illumina NovaSeq 6000	38
EGAD00001010029	Whole genome sequencing data of 19 high-grade serous carcinoma (HGSC) patients (47 samples) sequenced with HiSeq X Ten.	HiSeq X Ten	78
EGAD00001010030	Dataset with whole-genome sequencing tumor and normal samples from 14 neuroblastoma patients.	HiSeq X Ten	28
EGAD00001010031	Whole-exome sequencing of tumour regions and deep targeted sequencing of plasma samples.	Illumina HiSeq 2000 unspecified	1106
EGAD00001010033	RNAseq data from Passman et al 2023. Clonal CCO-deficient hepatocyte patches and nearby CCO-proficient hepatocytes were identified in morphologically normal human livers and sampled at varying distances along the PT-CV axis. Samples are characterised according to their location within the liver lobule, with "PT" denoting samples abutting the portal triad, "CV" denoting samples abutting the central hepatic vein, and "Mid" denoting samples acquired midway between these structures. Analysis was performed on an Illumina NextSeq using a high output kit and 100 single-end cycles.	NextSeq 500	114
EGAD00001010034	This dataset has the raw RNA sequencing data for the cancer models in CCMA.	unspecified	184
EGAD00001010035	This dataset has the mapped bam files from WGS for the cancer models in CCMA.	unspecified	146
EGAD00001010036	This dataset includes trios of germline/constitutional, primary tumor (small bowel carcinoid), and metastatic tumor (liver) samples for 5 patients (i.e. 15 samples total). Constitutional, primary tumor, and metastatic tumor samples all underwent whole exome sequencing (WES or WXS). Primary and metastatic tumor samples underwent RNA sequencing (RNA-seq).	Illumina NovaSeq 6000	15
EGAD00001010037	This dataset is part of a study that aims to provide a spatially resolved single-cell multiomics map of human trophoblast differentiation in early pregnancy. This dataset contains snucRNAseq data from three human implantation sites (between 8 and 12 post-conceptional weeks, PCW) from medical hysterectomies.	Illumina NovaSeq 6000	11
EGAD00001010038	This dataset is part of a study that aims to provide a spatially resolved single-cell multiomics map of human trophoblast differentiation in early pregnancy. This dataset contains 10x multiome snRNA-seq/snATAC-seq from human implantation sites, decidual and placental samples from 8-9 PCW.	Illumina NovaSeq 6000	6
EGAD00001010039	Whole-exome sequencing was performed on DNA extracted from blood samples of 50 children diagnosed with cutaneous melanoma prior to 18 years of age. Patients were all diagnosed in Queensland, Australia, and self-reported as of European descent. Sequencing was done on the Illumina platform using SureSelect V7 Post capture kits.	Illumina NovaSeq 6000	6
EGAD00001010040	scRNA-seq analysis of 384 placental immune cells	Illumina NovaSeq 6000	1
EGAD00001010041	The cohort comprised 48 pediatric patients with 21 different relapsed or refractory solid neoplasms. This cohort was analysed by RNASeq. The corresponding dataset contains fastq files.	NextSeq 500	46
EGAD00001010042	Molecular analysis of cancer genomes in children with Lynch syndrome: exploring causal associations WGS		1
EGAD00001010043	Molecular analysis of cancer genomes in children with Lynch syndrome: exploring causal associations WXS		1
EGAD00001010044	The single cell data were generated based on four cartridges. The samples are multiplexed (columns: Sample, Run Id, Hash, Gender, Age): 1. FL-6, Run1_C1, 4, f, 71; 2. rLN-3, Run1_C1, 6, f, 72; 3. DLBCL-4, Run1_C1, 7, f, 68; 4. DLBCL-1, Run1_C1, 8, m, 66; 5. FL-4, Run1_C1, 9, m, 41; 6. DLBCL-3, Run1_C2, 10, f, 60; 7. rLN-4, Run1_C2, 11, f, 58; 8. rLN-5, Run1_C2, 12, m, 61; 9. FL-1, Run2_C1, 1, f, 81; 10. DLBCL-2, Run2_C1, 2, m, 80; 11. rLN-2, Run2_C1, 3, f, 42; 12. FL-5, Run2_C1, 4, m, 66; 13. DLBCL-8, Run2_C1, 5, m, 64; 14. DLBCL-6, Run2_C2, 6, m, 76; 15. rLN-1, Run2_C2, 7, m, 39; 16. FL-3, Run2_C2, 8, f, 76; 17. DLBCL-5, Run2_C2, 9, m, 66; 18. FL-2, Run2_C2, 10, f, 58; 19. DLBCL-7, Run2_C2, 11, f, 71.	NextSeq 500	4
EGAD00001010046	WES/WGS sequencing data of 37 germline runs, which were uploaded to umbrella studies. The sequencing was always paired. The WGS sequencing was on HiSeq X Ten using the Illumina TruSeq DNA Nano Kit. The WES Sequencing was on HiSeq4000 with Agilent Sureselect V5+UTR.	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000	1
EGAD00001010047	The control samples (mostly blood) of 351 samples (paired WGS and WES sequencing) are in this dataset. The WGS was in nearly all cases performed on an Illumina HiSeq X Ten with the Illumina TruSeq Nano DNA Kit. The WES was mostly performed on an Illumina HiSeq 4000 with the Agilent SureSelect V5 plus UTRs Kit.	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000	353
EGAD00001010049		Illumina NovaSeq 6000	1
EGAD00001010050	Dataset includes 1) RNAseq for Bob_Ngn2 cell differentiation from iPSC to iNeuron stage (time point 0, 24, 48 and 96 hours). 2) Bob_Ngn2 cell line chip-seq (H3K4me3, H3K4me1, H3K27me3, H3K9me3, H3K27ac, H3K36me3) for both iPSC and iNeuron stage with 3 replicates at each time point. 3) Single cell CRISPR activation experiment with 96 endogenous genes.	Illumina HiSeq 4000 Illumina NovaSeq 6000	38
EGAD00001010051	Dataset of CageKid Targeted Sequencing DNA samples	Illumina HiSeq 2500 Illumina NovaSeq 6000	1022
EGAD00001010052	Samples of blood, muscle and fat were collected from individuals with TS (n = 33) and KS (n = 22) and from male (n = 16) and female (n = 44) controls. The RNA-seq libraries were multiplexed, paired-end sequenced on an Illumina NovaSeq 6000 (100 bp) and subjected to initial quality control using FastQC (Babraham Bioinformatics). In addition to trimming of low-quality ends, adaptor removal was conducted using Trim Galore with default settings (Babraham Bioinformatics).	Illumina NovaSeq 6000	212
EGAD00001010055	We have paired 39 individuals in 4 conditions: T0_RPMI (baseline, before BCG and without LPS); T0_LPS (before BCG and with LPS); T3m_RPMI (3 months after BCG, without LPS); T3m_LPS (3 months after BCG, with LPS).	Illumina NovaSeq 6000	32
EGAD00001010056	This multi-centre, non-randomized, open-label, phase II trial (NCT03016338), assessed niraparib monotherapy (cohort 1, C1), or niraparib and dostarlimab (cohort 2, C2) in patients with recurrent serous or endometrioid endometrial carcinoma. The primary endpoint was clinical benefit rate (CBR). Secondary outcomes were safety and objective response rate (ORR). Translational research was an exploratory outcome. Potential biomarkers were evaluated in archival tissue by immunohistochemistry and next generation sequencing panel. Feasibility of liquid biopsy by ctDNA was assessed.	NextSeq 500	86
EGAD00001010057	We analyzed multiple myeloma samples from two patients included in the observational prospective cohort MYRACLE before talquetamab treatment and after relapse. Five other myelomas from the same cohort were included for comparison. Normal plasma cells were also retrieved. All samples were analyzed by whole genome sequencing and single-nucleus Multiome, except one that could only be analyzed by bulk RNA sequencing.	Illumina NovaSeq 6000	17
EGAD00001010059	Dataset including paired tumor-normal whole-genome deep-sequenced samples from 18 neuroblastoma patients (part 1 of a total of 36 patients).	unspecified	36
EGAD00001010063	Dataset including paired tumor-normal whole-genome deep-sequenced samples from 18 neuroblastoma patients (part 2 of a total of 36 patients).	unspecified	1
EGAD00001010064	38 STEMI patients at hospital admission, 24 hours (acute phase) and 6-8 weeks (chronic phase) after STEMI	unspecified	9
EGAD00001010065	Dataset for the paper: Non-muscle Invasive Bladder Cancer Molecular Subtypes Predict Differential Response to Intravesical Bacillus Calmette-Guérin	Illumina HiSeq 4000 Illumina NovaSeq 6000	327
EGAD00001010066	Whole genome sequencing data of 7 high-grade serous carcinoma (HGSC) patients (32 samples) sequenced with HiSeq X Ten.	HiSeq X Ten	-
EGAD00001010067	Bank of metastasis-derived organoids (LMO)	unspecified	220
EGAD00001010068	156 samples of shotgun gut metagenomics, corresponding to 51 patients with CRC, 54 patients with adenoma, and 51 healthy controls	Illumina HiSeq 4000	2
EGAD00001010069	Dataset contains all available exome sequencing paired-end fastq files from our study "A generalizable machine learning framework for classifying DNA repair defects using ctDNA exomes"	Illumina NovaSeq 6000	310
EGAD00001010070	Targeted gene panel sequencing for 206 genes with relevance in normal and leukemic lymphopoiesis was performed in bone marrow / peripheral blood samples from n=96 patients with first diagnosis of B cell precursor acute lymphoblastic leukemia.	Illumina HiSeq 1500	96
EGAD00001010071	This dataset includes all data produced in the study describing "scEC&T-seq", a method for parallel sequencing of extrachromosomal circular DNA and transcriptome in single cells. This dataset includes: - Illumina scEC&T-seq Circle-seq data (scCircle-seq) for a total of 626 single cells / nuclei - bam files - Illumina scEC&T-seq RNA-seq data (scRNA-seq-Illumina) for the same single cells / nuclei - bam files - Nanopore scCircle-seq data for 18 single cells - bam files - Nanopore bulk WGS for 2 cell lines and 2 primary tumor samples - bam files - Illumina bulk WGS for 2 cell lines - bam files - Illumina bulk Circle-seq data from 1 cell line - bam file - Illumina ChIP-seq H3K27me3 data from 1 cell line - fasta files + peaks bed file + coverage bw file	Illumina HiSeq 4000 Illumina MiniSeq Illumina NovaSeq 6000 MinION	1180
EGAD00001010073	This dataset includes FASTQ files for five experiments: Human scRNA-Seq data (6 samples), Human Visium spatial transcriptomic data (3 samples), Mouse scRNA-seq data (4 samples), Mouse scATAC-seq data (2 samples), Mouse ChIP-seq data (8 samples). For the mouse scATAC-seq data (2 samples), there were originally three FASTQ files (R1, R2 and R3). The R1 and R2 FASTQ files were merged into one larger file ("R1R2") per sample for submission as paired-end sequencing, following the EGA guidelines. They can be split into individual R1 and R2 files in order to be processed by Cell Ranger software. The files with IC1 suffix are ChIP-seq input controls for anti-KDM6B antibody pull-down experiments and the files with IC2 suffix are ChIP-seq input controls for anti-H3K27ME3 antibody pull-down experiments in the murine model.	NextSeq 500 unspecified	22
EGAD00001010074	Data supporting: “Single-cell RNA sequencing unifies developmental programs of Esophageal and Gastric Intestinal Metaplasia.” Nowicki-Osuch, Zhuang et al. scRNAseq (FASTQ files) 59 samples	unspecified	16
EGAD00001010075	WGS was performed for five Japanese subjects. DNA samples isolated from whole blood were sequenced at Macrogen Japan Corporation. All libraries were constructed using the TruSeq DNA PCR-Free Library Preparation Kit according to the manufacturer’s protocols. Libraries were sequenced on HiSeqX (Illumina, San Diego, CA, USA) or Novaseq6000 (Illumina, San Diego, CA, USA).	Illumina HiSeq 3000 Illumina NovaSeq 6000	5
EGAD00001010076	This dataset contains raw sequencing fastq data of 17 samples from 7 donors of single cell RNA using 10x Genomics technology of human postmenopausal fallopian tube and ovary tissues.	NextSeq 500	17
EGAD00001010077	This dataset contains raw sequencing fastq data of 14 samples from 5 donors of single cell ATAC using 10x Genomics technology of human postmenopausal fallopian tube and ovary tissues.	NextSeq 500	14
EGAD00001010078	RNA-sequencing profiling of leucocytes from peripheral blood samples from 9 KS patients, 9 control males and 13 female controls	Illumina NovaSeq 6000	18
EGAD00001010079	446 samples of covid19 patients. Raw Reads in fastq format.	Illumina NovaSeq 6000	446
EGAD00001010080	This depository contains data from two bulk RNA sequencing experiments: 1) Bulk RNA sequencing data of peripheral blood neutrophils from healthy donors cultured with a) human adipose-derived stromal cells (ADSC) as a model for mesenchymal stromal cells (MSC), and b) IL-1β stimulated ADSC as a model for inflammatory MSC as found in multiple myeloma (MM). 2) Bulk RNA sequencing data from ADSCs cultured a) without stimuli, b) with recombinant human IL-1β, c) with supernatant from iMSC-like cells, d) with neutrophils previously cultured with MSC, e) with neutrophils previously cultured with iMSC, f) with neutrophils previously cultured with iMSC in the presence of anti-human IL-1β or g) with neutrophils previously cultured with iMSC in the presence of an isotype control.	Illumina NovaSeq 6000	41
EGAD00001010081	28 patients. Cell-free DNA and leukocyte DNA, both from before any neoadjuvant treatment. Tumor FFPE tissue (plus metastatic tissue for some patients) from after any neoadjuvant treatment. Some cfDNA from follow-up time points (e.g. after neoadj. treatment or in the clinical course). IDT xGen Pan-Cancer panel for hybridisation capture. Illumina TruSeq library prep for cfDNA and leukocyte DNA, Illumina DNA Prep for FFPE DNA. n=150 libraries in total.	Illumina NovaSeq 6000	150
EGAD00001010082	Technical replicates	NextSeq 500	1
EGAD00001010083	Technical replicates	NextSeq 500	1
EGAD00001010084	Freezing replicates	NextSeq 500	1
EGAD00001010085	Freezing replicates	NextSeq 500	1
EGAD00001010086	FACS processing technical replicates	NextSeq 500	1
EGAD00001010087	FACS processing technical replicates	NextSeq 500	1
EGAD00001010088	time-course biological replicates; 10x lane replicate 1	NextSeq 500	1
EGAD00001010089	time-course biological replicates; 10x lane replicate 1	NextSeq 500	1
EGAD00001010090	time-course biological replicates; 10x lane replicate 2	NextSeq 500	1
EGAD00001010091	time-course biological replicates; 10x lane replicate 2	NextSeq 500	1
EGAD00001010092	time-course biological replicates; 10x lane replicate 3	NextSeq 500	1
EGAD00001010093	time-course biological replicates; 10x lane replicate 3	NextSeq 500	1
EGAD00001010094	Fastq files of single-cell RNA-sequencing data generated with 10X Genomics of twelve non-invasive cervical samples from pregnant women (7-12 weeks gestational age) and six placental biopsies from patients who had a recurrent miscarriage early during gestation (<12 weeks gestational age).	Illumina NovaSeq 6000	18
EGAD00001010095	Whole-genome sequencing data in the form of multi-sample, per-chromosome VCFs for n=449 individuals across 47 unique ethnolinguistic groups.		1
EGAD00001010096	This dataset contains single-cell RNA and DNA sequencing data (fastq, n=928) obtained after genome-and-transcriptome separation. The RNA-seq data was obtained by Smart-seq2 amplification and Nextera XT library preparation. The DNA-seq data was obtained either by Gtag library preparation and amplification or by PicoPlex whole-genome amplification followed by Nextera XT library preparation. The single cells originate from a human PDX melanoma model and from HCC38 and HCC38 BL cell lines. The sequencing libraries were sequenced with Illumina instruments.	Illumina HiSeq 2500 unspecified	2080
EGAD00001010097	In order to investigate possible mechanisms underlying the phenotype of cell fitness decline observed following decrease of VRK3 in pontine DMG-K27 altered cells, we examined differences in global gene expression. RNA-seq was performed 44h and 60h post-transduction with two distinct shRNAs targeting VRK3 to be able to evaluate the early impact of VRK3 knock-down (KD) in four independent in vitro models of DMG.	Illumina NovaSeq 6000	36
EGAD00001010098	Dataset contains 483 individuals of Irish origin with Covid19. For WGS, alignment was done using BWA-mem (Sentieon v. 201808.03) and BAM files were generated.	Illumina NovaSeq 6000	483
EGAD00001010099	The dataset contains RNASeq alignment files in BAM format for 446 Covid19 patients, for both genomic and transcriptomic alignment.	Illumina NovaSeq 6000	446
EGAD00001010100	These are aligned paired-end reads from Illumina NovaSeq 6000 whole-genome sequencing of 4 cfDNA samples extracted from blood plasma (plasma-Seq). Three samples from patients with breast cancer, prostate cancer, or colorectal cancer and one sample from a healthy individual were aligned to GRCh38 (GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set). The observed GC bias is different in each of these cfDNA samples which leads to different average GC content per sample. This bias is corrected using the GCparagon commandline tool.	Illumina NovaSeq 6000	2
EGAD00001010101	Bank of primary sites (PRs) colorectal cancer of Patient Derived Xenografts (PDXs)	unspecified	159
EGAD00001010102	Whole exome and RNA sequencing of 5 samples of patient-derived xenograft (PDX). Available files are raw sequencing fastq files.	Illumina NovaSeq 6000	5
EGAD00001010103	This dataset included 19 paired diagnostic and remission samples with high hyperdiploid acute lymphoblastic leukemia (ALL) that were collected from four different cohorts: the Division of Clinical Genetics, Lund University, Sweden. All samples were subjected to whole genome sequencing using the Illumina HiSeqX platform. Paired-end sequencing (2x150bp) was done to ~60x coverage for diagnostic samples and ~30x coverage for remission samples. The paired-end reads were aligned to the human reference genome GRCh37 (ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/vertebrate_mammalian/Homo_sapiens/all_assembly_versions/GCF_000001405.25_GRCh37.p13/GCF_000001405.25_GRCh37.p13_genomic.fna.gz) by the Burrows-Wheeler Aligner tool (version 0.7.17). Duplicate reads marking and local realignment were performed by GATK (version 4.0.11.0).	HiSeq X Ten	23
EGAD00001010104	LLD PhIP-Seq reads. Oligopeptides were designed in Eran Segal's group at the Weizmann Institute of Science. The dataset includes 1,783 plasma samples. In 340 participants, a second time point was taken after a 4-year follow-up.	NextSeq 500	1784
EGAD00001010105	This submission contains gzipped fastq files from paired-end targeted sequencing.	HiSeq X Ten	242
EGAD00001010106	This dataset contains scRNA-seq fastq files for the paper entitled "Single-cell profiles reveal distinctive immune response in atopic dermatitis in contrast to psoriasis". The details of the experimental setup are described in the paper.	NextSeq 500	12
EGAD00001010108	Whole genome, exome and RNA sequencing of TFCP2-rearranged rhabdomyosarcoma, 86 samples of paired fastq files, sequenced on: Illumina HiSeq 2500 using Agilent SureSelect WGS, Illumina HiSeq 2500 using Illumina TruSeq RNA, Illumina HiSeq 2500 using Agilent SureSelect v5 WES (+UTRs), Illumina HiSeq 4000 using Agilent SureSelect WGS, Illumina HiSeq 4000 using Illumina TruSeq stranded mRNA Kit, Illumina HiSeq 4000 using Agilent SureSelect v5 WES stranded mRNA Kit, Illumina NovaSeq 6000 using Illumina TruSeq Stranded mRNA, Illumina NovaSeq 6000 using Illumina TruSeq Nano DNA, Illumina HiSeq X Ten using TruSeq Nano DNA.	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000	50
EGAD00001010109	Sequencing of LCM-derived microbiopsies from 10 women who underwent reduction mammoplasty. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to women who had breast cancer and those who are BRCA 1/2 carriers. This dataset contains all the data available for this study on 2023-03-08.	HiSeq X Ten	48
EGAD00001010110	Sequencing of LCM-derived microbiopsies from 10 women who underwent reduction mammoplasty. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who are BRCA1/2 germline carriers and those with cancer. This dataset contains all the data available for this study on 2023-03-08.	HiSeq X Ten Illumina HiSeq 4000	92
EGAD00001010111	Sequencing of LCM-derived microbiopsies from an explanted lung from a COPD patient. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Deep sampling throughout multiple regions of the lung will determine whether there are differences in smoking-related mutation burden in different portions of the lung. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to other individuals with smoking-related diseases (COPD, pulmonary fibrosis, lung cancer), and normal, non-smoking lungs. This dataset contains all the data available for this study on 2023-03-08.	Illumina NovaSeq 6000	25
EGAD00001010112	Sequencing of LCM-derived microbiopsies from 20 women who underwent risk-reducing reduction mastectomies due to germline BRCA1/2. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with cancer. This dataset contains all the data available for this study on 2023-03-08.	Illumina NovaSeq 6000	67
EGAD00001010113	Sequencing of LCM-derived microbiopsies from 10 women who underwent reduction mammoplasty. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who are BRCA1/2 germline carriers and those with cancer. This dataset contains all the data available for this study on 2023-03-08.	Illumina NovaSeq 6000	48
EGAD00001010114	Sequencing of LCM-derived microbiopsies from 40 women who underwent mastectomies due to breast cancer. LCM and sequencing will be conducted on both normal, unaffected breast, and, where possible, tumour tissue. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue, and compare findings between the normal and associated cancer tissues. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those who are BRCA carriers. This dataset contains all the data available for this study on 2023-03-08.	HiSeq X Ten Illumina NovaSeq 6000	251
EGAD00001010115	Sequencing of LCM-derived microbiopsies from 20 women who underwent risk-reducing reduction mastectomies due to germline BRCA1/2. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those with cancer. This dataset contains all the data available for this study on 2023-03-08.	Illumina NovaSeq 6000	315
EGAD00001010116	Sequencing of LCM-derived microbiopsies from 10 women who underwent reduction mammoplasty. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue. Exome data will be used as a driver and clonality screen; highly clonal or driver-containing samples will subsequently be sent for whole-genome sequencing. Results from this portion of the study will be compared to women who are BRCA1/2 germline carriers and those with cancer. This dataset contains all the data available for this study on 2023-03-08.	Illumina NovaSeq 6000	199
EGAD00001010117	Sequencing of LCM-derived microbiopsies from 40 women who underwent mastectomies due to breast cancer. LCM and sequencing will be conducted on both normal, unaffected breast, and, where possible, tumour tissue. Goal to assess the mutational burden, spectrum, and clonal dynamics within the tissue, and compare findings between the normal and associated cancer tissues. Whole-genome sequencing will be conducted on samples identified as promising from the initial targeted data. Results from this portion of the study will be compared to women who had cosmetic breast reduction surgeries and those who are BRCA carriers. This dataset contains all the data available for this study on 2023-03-08.	Illumina HiSeq 4000 Illumina NovaSeq 6000	480
EGAD00001010118	PhIP-Seq experiment conducted in Eran Segal's group at WIS.	NextSeq 500	497
EGAD00001010119	Formalin-fixed, paraffin-embedded samples from 643 colorectal adenomas collected in different hospitals in Norway, from which DNA was extracted, were analysed for DNA copy number alterations. Some adenomas had more than one block (n=42), thus 643 individuals, 685 blocks. Low-coverage whole genome sequencing was run in all samples. For 529 individuals all the clinical information was available. A subset was matched for follow-up time, age and sex in a nested case-control approach (n=366; cases - individuals who developed later on CRC; controls - individuals who did not develop CRC within the same follow-up time).	Illumina HiSeq 2500	685
EGAD00001010120	Microhaplotype amplicon sequencing of cervical samples (n=10), parental DNA (n=20), cfDNA (n=10) and control experiments using HapMap DNA in different spike-in percentages.	Illumina NovaSeq 6000	81
EGAD00001010121	Early-stage Luminal B breast cancer is frequent and is a major cause of breast cancer death due to its poor prognosis. Our proposal aims to study the biology behind the sensitivity and resistance of Luminal B breast cancer to chemotherapy (CHT) or a non-CHT regimen composed of hormone therapy in combination with ribociclib, a CDK4/6 inhibitor. To accomplish this, we first completed the SOLTI-1402 CORALLEEN phase II trial, a study where 106 patients with early-stage Luminal B breast cancer were randomized to standard neoadjuvant CHT for 6 months, or neoadjuvant letrozole and ribociclib for 6 months. After treatment, patients underwent surgery. The primary results of the study, which showed that the response rate to letrozole+ribociclib was similar to CHT, were reported (Prat et al; Lancet Oncol). Tumor biopsies were available at baseline, week 3 and surgery. A total of 257 samples were analyzed using the Illumina TruSeq Stranded Total RNA w/Ribo Zero Gold with MiSeq in TGL (Sequencer NovaSeq S4/PE/100x).	Illumina HiSeq 2500 Illumina NovaSeq 6000	257
EGAD00001010122	In this study we aim to characterise the landscape of mutation and clonal selection in normal lung and premalignant lung disease. The study combines targeted sequencing and whole-genome sequencing of microbiopsies of lung and bronchial epithelium. The range of patients studied will include healthy individuals, both smokers and non-smokers, and patients with premalignant lung disease. This dataset contains all the data available for this study on 2023-03-09.	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina MiSeq	1
EGAD00001010123	Cancer is a genetic disease caused by an accumulation of mutations; however, many of these mutations have been identified in pathologically normal tissue. We aim to use laser-capture microdissection (LCM) to sample individual clones from breast tissue to identify whether cancer-associated mutations appear in this normal tissue, assess the mutational burden present, and identify the mutational processes causing these mutations. We will sample from a wide age range of individuals (<20 to >70 years old) to determine whether these processes differ in pre- and post-menopausal women. We will also be comparing the tissue from healthy individuals (samples from breast reduction surgery) to those at elevated risk of breast cancer (mastectomy from BRCA1/2 patients) and those who have breast cancer (adjacent normal, distal normal, and tumour tissue from mastectomy). This will allow us to determine how these processes are different between these groups of individuals, and gain insight into the earliest stages of tumour development. This dataset contains all the data available for this study on 2023-03-09.	HiSeq X Ten	1
EGAD00001010124	Whole genome sequencing to identify subclonal variants for subsequent mapping back to fixed tissue specimens. This dataset contains all the data available for this study on 2023-03-09.	HiSeq X Ten	1
EGAD00001010125	This project is correlating the molecular profiling of renal tumours with multiparametric and 13C-MRI including by 13C-MRSI. This dataset contains all the data available for this study on 2023-03-09.	Illumina HiSeq 4000	1
EGAD00001010126	Melanoma is the most aggressive type of skin cancer, causing about 75% of dermatological cancer deaths. Acral lentiginous melanoma (ALM) is the most common subtype of melanoma in admixed Latin American populations, but very few tumour genomes and exomes, all from European-descent individuals, have been analysed across several studies. Because of this, the genomic landscape of ALM is mostly unknown. Our aim in this project is to define this landscape and identify driver somatic alterations by whole-exome sequencing a collection of ALM germline/tumour paired FFPE samples from the National Cancer Institute of Mexico. This dataset contains all the data available for this study on 2023-03-09.	Illumina HiSeq 4000	1
EGAD00001010128	Whole genome sequencing data from paediatric (<18-y) ETV6-RUNX1 fusion positive acute lymphoblastic leukemias. Dataset includes fastq and BAM files from diagnostic and remission (control) samples of 33 patients. Dataset consists of two experiments depending on sequencing instrument; "Experiment 1" sequenced by using Illumina Hiseq X Ten instrument, and "Experiment 2" Illumina Novaseq 6000, respectively.	HiSeq X Ten Illumina NovaSeq 6000	66
EGAD00001010129	We perform targeted sequencing on 1217 pre-malignant gastric biopsies.	Illumina NovaSeq 6000	1900
EGAD00001010130	This dataset contains single cell RNA-seq data of stromal cells derived from two PDX models (N = 2 in total) and bulk RNA-seq data of two PDX models treated with gemcitabine and our novel antibody-drug conjugate, C6-EBET (N = 59 in total). Bulk RNA-seq experiments were performed with Agilent SureSelect Strand Specific RNA Library Prep Kit (Agilent). Single cell RNA-seq experiments were performed with Chromium Single Cell 3' Reagent Kits v2 Chemistry (10x Genomics).	HiSeq X Ten Illumina NovaSeq 6000	61
EGAD00001010131	Bulk RNAseq from 183 premalignant gastric biopsies	Illumina NovaSeq 6000	183
EGAD00001010132	This dataset includes RNA-seq fastq files of pre-treatment and on-treatment samples from 28 patients	Illumina HiSeq 4000	62
EGAD00001010133	This dataset includes WES fastq files of pre-treatment and on-treatment samples from 29 patients	Illumina HiSeq 4000	94
EGAD00001010134	patient-derived head and neck cancer organoids	Illumina NovaSeq 6000	64
EGAD00001010135	FASTQ reads for 34 matched tumour-normal WGS pairs for high grade serous ovarian cancer patients. Scottish HGSOC samples were collected via local Bioresource facilities at Edinburgh, Glasgow, Dundee and Aberdeen and stored in liquid nitrogen until required. HGSOC patients were determined from pathology records and were included in the study where there were matched tumour and whole blood samples. Tumour samples were divided into two for DNA and RNA extraction and slivers of tissue were taken, fixed in formalin and embedded in paraffin wax (FFPE). Samples were only included if they were confirmed as HGSOC and there was greater than 40% tumour cellularity throughout the tumour, determined using H&E staining of the FFPE sections and pathology review. Somatic DNA was extracted from the tumour and germline DNA was extracted from whole blood. Somatic DNA was extracted using the Qiagen DNeasy Blood and tissue kit (cat no 69504). The tissue was initially homogenised using a Qiagen Bioruptor, followed by the manufacturer's recommended protocol (including RNase digestion step). Germline DNA was extracted from 1-3ml whole blood using the Qiagen FlexiGene kit (cat no 51206) following the manufacturer's recommended protocol. The resulting DNA underwent quality control as follows: firstly, A260 and A280nm were measured on a Denovix DS-11 Fx to qualitatively illustrate A260/280nm and A260/230nm ratios as surrogate measures of DNA purity. A260/280 had to be 1.8 or greater and A260/230 had to be 2.0 or greater. Then, DNA was quantified using LifeTechnologies Qubit dsDNA BR kit (cat no Q32850) and we required a minimum of 50ul at 25ng/ul for WGS. Thirdly, DNA was diluted to 25ng/ul and a representative sample was loaded onto a 0.8% TAE gel, run at 100 V for 60 min and then imaged using a BioRad ChemiDoc imaging system to visualise the DNA quality.	HiSeq X Ten Illumina NovaSeq 6000	68
EGAD00001010136	DDD resource files (e.g. link between sample and individual ids)		1
EGAD00001010137	Candidate diagnostic variants reported into DECIPHER by 4 April 2022, annotated with clinical and automated pathogenicity assertions (see DOI: 10.1056/NEJMoa2209046). Genomic Diagnosis of Rare Pediatric Disease in the United Kingdom and Ireland, Wright et al, NEJM 2023.		1
EGAD00001010138	483 samples were collected for WGS and aligned with GRCh38 human genome. The variants were called using GATK (Sentieon v. 201808.03) in GVCF and VCF format.		-
EGAD00001010139	Somatic RNA for 37 samples was extracted using the Qiagen Qiasymphony RNA protocol (cat no 931636). The tissue was initially homogenised using a Qiagen Bioruptor, followed by the manufacturer's recommended protocol (including DNase digestion). The resulting RNA then underwent quality control as follows: firstly, A260 and A280nm were measured on a Denovix DS-11 Fx to qualitatively illustrate A260/280nm and A260/230nm ratios as measures of RNA purity. A260/280 had to be 2.0 and A260/230 had to be 2.0-2.2. Then RNA was quantified using LifeTechnologies Qubit RNA BR kit (cat no Q10210). RNAseq was carried out by the Edinburgh Clinical Research Facility on an Illumina NextSeq 500. Total RNA samples were assessed on the Agilent Bioanalyser (Agilent Technologies, #G2939AA) with the RNA 6000 Nano Kit (#5067-1512) for quality and integrity of total RNA, and then quantified using the Qubit 2.0 Fluorometer (Thermo Fisher Scientific Inc, #Q32866) and the Qubit RNA HS assay kit (#Q32855). Libraries were prepared from total RNA samples using the NEBNext Ultra 2 Directional RNA library prep kit for Illumina (#E7760S) with the NEBNext rRNA Depletion kit (#E6310) according to the provided protocol. 400ng of total RNA was then added to the ribosomal RNA (rRNA) depletion reaction using the NEBNext rRNA depletion kit (Human/mouse/rat) (#E6310). This step uses specific probes that bind to the rRNA in order to cleave it. rRNA-depleted RNA was then DNase treated and purified using Agencourt RNAClean XP beads (Beckman Coulter Inc, #66514). RNA was then fragmented using random primers before undergoing first strand and second strand synthesis to create cDNA. cDNA was end repaired before ligation of sequencing adapters, and libraries were enriched by PCR using the NEBNext Multiplex oligos for Illumina set 1 and 2 (#E7500). Final libraries had an average peak size of 271bp. Libraries were quantified by fluorometry using the Qubit dsDNA HS assay and assessed for quality and fragment size using the Agilent Bioanalyser with the DNA HS Kit (#5067-4626). Sequencing was performed using the NextSeq 500/550 High-Output v2 (150 cycle) Kit (# FC- 404-2002) on the NextSeq 550 platform (Illumina Inc, #SY-415-1002). Libraries were combined in an equimolar pool based on the library quantification results and run across 5 High-Output Flow Cell v2.5.	NextSeq 550	37
EGAD00001010140	This dataset contains 55 Whole Genome Sequencing of the study titled Spatial transcriptomics reveal topological immune landscapes of Asian head and neck angiosarcoma.	unspecified	59
EGAD00001010141	Shallow whole-genome sequencing (sWGS) data divided into three groups: sWGS data from Pap smears of patients with confirmed high grade serous ovarian cancer; sWGS data from matched tumor tissue (at diagnosis) from the same patients; and sWGS data from Pap test smears of healthy women.	NextSeq 550	186
EGAD00001010142	The PGAP dataset 2 includes 82 whole genome sequences for Papua New Guinean individuals sampled in Daru (N=1), Port Moresby (N=64) and Mount Wilhelm (N=17). DNA was extracted from saliva samples (Oragen kit). Sequencing libraries were prepared using the TruSeq DNA PCR-Free HT kit. 150 bp paired-end sequencing was performed on the Illumina HiSeq X5 sequencer. The PGAP dataset provides Fastq, mapped cram files (GRCh38) and phenotype measurements.	HiSeq X Five	82
EGAD00001010143	The PGAP dataset 1 includes 81 whole genome sequences for Papua New Guinean individuals sampled in Daru (N=38) and Mount Wilhelm (N=43). DNA was extracted from saliva samples (Oragen kit). Sequencing libraries were prepared using the TruSeq DNA PCR-Free HT kit. 150 bp paired-end sequencing was performed on the Illumina HiSeq X5 sequencer. The PGAP dataset provides Fastq, mapped cram files (GRCh38) and phenotype measurements.	HiSeq X Five	81
EGAD00001010144	Processed somatic variant calls		1
EGAD00001010145	Core phenotypic variables		1
EGAD00001010146	Whole-transcriptome sequencing (RNAseq) in patients with chronic rhinosinusitis with nasal polyps (CRSwNP) and controls. Differential expression patterns in genes involved in cilia, viral defense and NKT-cell specific pathways, suggesting a role of viral immunity in combination with cilia functionality in CRSwNP.	Illumina NovaSeq 6000	53
EGAD00001010147	This dataset includes an analyzed DMP file that provides information about differentially methylated positions based on the Illumina Infinium MethylationEPIC BeadChip. All samples (5 lung cancer cases vs. 5 benign lung disease controls) were obtained from bronchial washings at the site of the lesion during bronchoscopy. Of the five lung cancer cases, 3 are adenocarcinoma and 2 are squamous carcinoma.		1
EGAD00001010148	Contains IMCISION samples sequenced on Flongle flowcells	MinION	1
EGAD00001010149	Contains Synthetic samples sequenced on R9 flowcells	MinION	3
EGAD00001010150	Contains Healthy samples sequenced on Flongle flowcells	MinION	2
EGAD00001010151	Contains Healthy samples sequenced on R9 flowcells	MinION	7
EGAD00001010152	Contains Synthetic samples sequenced on Flongle flowcells	MinION	6
EGAD00001010153	Contains PREDICT samples sequenced on R10 flowcells	MinION	2
EGAD00001010154	Contains IMCISION samples sequenced on R9 flowcells	MinION	13
EGAD00001010155	Contains PREDICT samples sequenced on R9 flowcells	MinION	28
EGAD00001010156	We stratified 69 primary IDH-wt GBM patients into TMZ-resistant (n = 29) and sensitive (n = 40) groups, using TMZ screening of the corresponding patient-derived glioma stem-like cells (GSCs). Genomic and transcriptomic features were then examined to identify TMZ-associated molecular alterations. Subsequently, we developed a machine learning (ML) model to predict TMZ response from combined signatures. Moreover, TMZ response in multisector samples (52 tumor sectors from 18 cases) was evaluated to validate findings and investigate the impact of intra-tumoral heterogeneity on TMZ efficacy.	Illumina HiSeq 2500	113
EGAD00001010157	Whole genome sequencing of 5 IM samples	Illumina NovaSeq 6000	10
EGAD00001010158	NCCS-NSCLC-ITH2 dataset of 185 sectors from 48 patients with early-stage non-small cell lung cancer diagnosed in National Cancer Centre Singapore; these are paired-end, whole-exome and bulk RNA sequencing data, sequenced by Illumina HiSeq 4000/2000.	Illumina HiSeq 2000 Illumina HiSeq 4000	336
EGAD00001010159	This dataset contains in-solution target-enrichment bisulfite sequencing of placental tissue, buffy coat and plasma DNA from pregnant women. Blood samples were taken for cell-free DNA (cfDNA) extraction from 64 women at the time of early-onset preeclampsia (PE) diagnosis, or from 38 controls (uncomplicated pregnancies) at a similar gestational age that did not develop preeclampsia subsequently. Among these subjects, plasma samples from 7 PE patients and 6 controls were also subjected to oxidative bisulfite sequencing. Placental tissues from 11 PE and 26 control subjects after delivery, and buffy coat from 16 PE and 16 control subjects at the same time of cfDNA sampling were profiled. A discovery cohort for early PE assessment in the first trimester was collected. In this cohort, cfDNA from 75 pregnancies that went on to develop early-onset PE and from 124 matched controls was collected and methylome sequencing was carried out. An independent validation cohort to validate early PE assessment with methylome profiling was collected as well. This validation cohort includes cfDNA samples from 61 PE and 136 control pregnancies.	Illumina NovaSeq 6000	604
EGAD00001010161	77 samples collected from 35 multiple myeloma patients. Each patient provided one healthy sample and one primary tumor sample and, in some cases, also samples collected after the progression of the disease.	Illumina NovaSeq 6000	77
EGAD00001010162	tissue, organoid and normal bam files for whole exome samples	NextSeq 500	3
EGAD00001010163	Batch RNA sequencing of passages 5-7 of three patient-derived monolayer glioblastoma cultures in which TMZ and BMP4 synergize. 3 biological replicates, either untreated, only temozolomide, only BMP4 or temozolomide + BMP4	Illumina HiSeq 2500	36
EGAD00001010164	High-throughput transcriptome sequencing data from paediatric (<18-y) ETV6-RUNX1 fusion positive acute lymphoblastic leukemias. Dataset includes fastq and BAM files from diagnostic samples of 33 patients.	Illumina NovaSeq 6000	32
EGAD00001010165	Human small non-coding RNA sequencing of serum from sons of PCOS mothers (n=9) and sons of control mothers (n=9), see publication for details.	Illumina HiSeq 3000	18
EGAD00001010166	Single-cell RNA sequencing on 10 antrum and 4 body gastric biopsies	Illumina HiSeq 4000	14
EGAD00001010167	scRNAseq dataset of colonic epithelium from distal colon biopsies from 4 patients with ulcerative colitis and 4 healthy individuals. Includes 11 samples split into three conditions: healthy, healthy margin and ulcerated. Dataset includes raw Fastq files and processed csv count matrices. Fastq files are divided into 4 lanes and into index (I1) and read (R1, R2) files. Count matrices contain comma-separated values with cell barcodes as column names and gene names as row names. Cell Ranger (v3.0.1) software from 10x Genomics was used to process the output and align the reads. The refdata-cellranger-GRCh38-3.0.0 reference was downloaded from the 10x Genomics website. First, cellranger mkfastq function was used to demultiplex raw base call files into FASTQ files. Then, the FASTQ files were aligned and filtered with cellranger count function.	NextSeq 500	11
EGAD00001010168	scRNAseq dataset of colonic organoids derived from epithelium from biopsies taken from three healthy human individuals. The organoids have either been grown in standard conditions (control) or treated with IL22 (treated). Includes 6 samples in total, one control from each individual (ctrl1, ctrl2, ctrl3) and one treated from each (treat1, treat2, treat3). The samples have been multiplexed using the antibody hashing technique. The 6 samples have been pooled into the one organoids sample. In order to analyse the raw files, they have to be demultiplexed first. Information necessary for demultiplexing, as well as which files belong to which sample, can be found in the map_file.csv, attached to each sample. Dataset includes raw Fastq files and processed csv count matrices. Fastq files are divided into HTO (hashtag) and RNA (transcriptome) files. HTO has one index (I1) and two read (R1, R2) files and RNA has two index (I1, I2) and two read (R1, R2) files. The fastq files are for the pooled (organoids) sample and need to be demultiplexed. Count matrices contain comma-separated values with cell barcodes as column names and gene names as row names. Since count matrices have been created after the demultiplexing step, there’s one matrix for each of the 6 individual samples. scRNA-seq data from human colon organoids was analysed in the same manner as for the Colitis dataset, apart from the following changes. Data was generated with the Cell Hashing technique, which uses oligo-tagged antibodies against surface proteins to barcode single cells. This allows for samples to be multiplexed together and run in a single experiment. The data was demultiplexed using the HTODemux() function from Seurat (Hao et al., 2021).	unspecified	1
EGAD00001010169	This dataset includes WES and RNAseq from 11 patients with metastatic melanoma, lung, kidney, and stomach cancers, enrolled in phase I clinical trials of TIL ACT (NCT03475134 & NCT04643574). WES was performed on matched cancer and healthy samples, whereas RNAseq was performed on cancer tissues, using Illumina HiSeq 2500/4000 and Illumina NextSeq 550 systems. Paired-end reads are provided in fastq format.	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000 NextSeq 550	36
EGAD00001010170	This dataset contains the open chromatin profiles of 8 patient H3-K27M mutant DMGs utilizing the single-cell/nucleus assay for transposase-accessible chromatin using sequencing (snATAC-seq)	NextSeq 500	2
EGAD00001010171	Paired RNA-Seq was performed on 125 samples of low grade pediatric glioma. The sequencing was done with Illumina Novaseq 6000 and the Illumina TruSeq stranded mRNA kit.	Illumina NovaSeq 6000	124
EGAD00001010172	Paired RNA seq of wild type VDH15 cells (3 replicates) - a cell line of oral squamous cell carcinoma (OSCC). RNA was extracted and sequencing libraries were prepared using TruSeq Stranded mRNA Library Prep Kit following manufacturer's protocol. The sequencing was done using Illumina Novaseq 6000.	Illumina NovaSeq 6000	3
EGAD00001010173	Fastq files of WES data. Primary tumor and/or metastatic samples of three chRCC patients. DNA libraries were produced using the SureSelect XT HumanAllExon V5 kit (Agilent Technologies), and sequencing was performed on a HiSeq instrument (Illumina) using a 100-bp paired-end mode.	Illumina HiSeq 2500	6
EGAD00001010174	Nanostring DSP spatial profiles of 8 patients whose antral sections contained histologically normal, IM, GC, lymphoid aggregates, and stromal regions, representing 480 regions of interest (ROIs) and 76 CD45-segmented areas of illumination (AOIs).	unspecified	556
EGAD00001010175	organoid and tissue bam files from rna-seq experiment	NextSeq 500	2
EGAD00001010176	These data correspond to samples used in the following papers: 1. GWAS and meta-analysis identifies 49 genetic variants underlying critical Covid-19 2. Genetic determinants of monocyte splicing are enriched for disease susceptibility loci including for COVID-19. 3. Identification of Genetic Determinants of Transcriptional Response to Metformin in Primary Human Monocytes. They are the raw RNA sequencing files as used in all downstream analysis.	Illumina HiSeq 4000	611
EGAD00001010178	iTHER is a prospective national precision oncology program aiming to define tumor molecular profiles in children and adolescents with primary very high-risk, relapsed, or refractory pediatric tumors in order to identify relevant aberrations to inform treatment.		1
EGAD00001010179	iTHER is a prospective national precision oncology program aiming to define tumor molecular profiles in children and adolescents with primary very high-risk, relapsed, or refractory pediatric tumors in order to identify relevant aberrations to inform treatment.		1
EGAD00001010180	We collected longitudinal samples from 15 patients with MCL at various clinical time points before and after CAR-T therapy brexucabtagene autoleucel (BA) infusion. The patients were grouped into three categories based on their clinical responses after BA treatment: 1) responsive (n = 9), 2) relapsed (n = 5), and 3) refractory (n = 1). All patients in category #1 and #2 had initially attained a complete response (CR) after BA therapy. The responsive group maintained CR with no relapse at the time of last follow up, while the relapsed group achieved initial CR but eventually relapsed. The 10x Chromium™ Single-Cell 5′ Reagent Kit v2 (PN-1000190, 10x GENOMICS) and Chromium Single-Cell Human TCR amplification Kit (PN-1000252, 10x GENOMICS) were used to perform single-cell separation, cDNA amplification, and library construction for gene expression and TCR repertoire following the manufacturer’s guidelines. Thirty-nine samples passed quality control and underwent single-cell transcriptome profiling with simultaneous single-cell T-cell receptor (TCR) repertoire analysis.	Illumina HiSeq 4000	654
EGAD00001010181	27 ChIP-seq samples from human CD4+ T cells	NextSeq 500	27
EGAD00001010182	20 RNAseq samples of Human CD4+ cells	NextSeq 500	20
EGAD00001010183	Throughout an individual’s lifetime, genomic alterations accumulate in somatic cells. However, the mutational landscape induced by retrotransposition of long interspersed nuclear element-1 (L1), a widespread mobile element in the human genome, is poorly understood in normal cells. Here, we explored the whole-genome sequences of 406 normal colorectal clones, 12 MUTYH-associated adenomatous clones, and 19 matched colorectal cancer tissues. In addition, we analyzed promoter DNA methylation status of retrotransposition-competent L1 (in 139 clones) and read-through RNA expression profiles (in 116 clones) to investigate the epigenetic regulation of L1 activity.	Illumina NovaSeq 6000	654
EGAD00001010184	This dataset includes Illumina EPIC Capture Sequencing Data of 376 samples from 188 men with prostate cancer. Samples were taken from primary tissue obtained at prostatectomy, with matched pathologically assessed non-cancer control material. This DNA methylation data includes donors and samples included in previously published WGS datasets (from CRUK-ICGC batches 1 to 3 [EGAD00001001116] and batches 4 to 6 [EGAD00001003225]; including the majority of donors used in Wedge et al, Nature Genetics 2018 [PMID: 29662167]). The targeted DNA methylation sequencing data in this EGA dataset were generated using the Illumina TruSeq methyl capture method (EPICseq), covering over 3.3 million CpGs in the human genome, representing a total targeted hybridisation capture panel of 107Mbp. According to the EPICseq protocol, DNA samples extracted from prostatectomy tissue samples were enriched for target regions using hybridisation capture, prior to bisulfite conversion, amplification and sequencing in pools of 12 samples (150 single end reads over two Illumina HiSeq4000 lanes). This approach generated DNA methylation profiles from prostate cancer and control samples at base-pair resolution across millions of CpGs in the human genome.	Illumina HiSeq 4000	376
EGAD00001010185	The dataset contains the gene expression profile of each individual along with gene fusion events.		446
EGAD00001010186	This dataset contains 483 individuals of Irish origin with COVID-19. Paired-end whole genome sequencing was performed and the data were later processed using Sentieon v. 201808.03.	Illumina NovaSeq 6000	483
EGAD00001010187	Single-cell whole transcriptome and antibody expression for bone marrow samples from Cohorts A and B. The CITEseq protocol was followed. 37 and 77 surface markers were measured in each cohort, respectively (see Supplementary Table 1). For details on cell sorting prior to scRNAseq, see the methods section of the manuscript.	Illumina NovaSeq 6000	49
EGAD00001010188	Bulk ATAC libraries were generated for patient samples A.1, A.3, A.5, A.6, A.7, A.11, A.12, A.13 and A.15. For each patient a library of CD3- cells (myeloid cells, considered tumor cells) and CD3+ cells (T cells, considered healthy) were generated.	Illumina NovaSeq 6000	18
EGAD00001010189	MutaSeq was applied to CD34+ cells from patient samples A.10, A.11 and A.12 of the study. The protocol followed is described in https://doi.org/10.1038/s41467-021-21650-1 and in the methods of the manuscript	NextSeq 500	3
EGAD00001010190	Libraries were constructed using SureSelect HS XT Target Enrichment System v6 (Agilent). For each patient a library of CD3- cells (myeloid cells, considered tumor cells) and CD3+ cells (T cells, considered healthy) were generated.	Illumina NovaSeq 6000	30
EGAD00001010191	Optimized 10x library to increase the coverage of the mitochondrial genome from 3’ 10x gene expression data. See details of the experimental method in the methods section of the manuscript	Illumina NovaSeq 6000	24
EGAD00001010192	Optimized 10x library to increase the coverage of selected nuclear variants from 3’ 10x Genomics scRNAseq data. SNVs were selected based on exome data and criteria described in the Supplementary Information of the manuscript. See details of the experimental method in the methods section of the manuscript	NextSeq 500	21
EGAD00001010193	Targeted DNA sequencing was applied to colonies grown from single-cells of patient A.6. The protocol followed is described in https://doi.org/10.1038/s41467-021-21650-1 and the methods of the manuscript	NextSeq 500	1
EGAD00001010194	In mammals, X-chromosomal genes are expressed from a single copy since males (XY) possess a single X chromosome, while females (XX) undergo X inactivation. To compensate for this reduction in dosage compared to two active copies of autosomes, it has been proposed that genes from the active X chromosome exhibit dosage compensation. However, the existence and mechanism of X-to-autosome dosage compensation are still under debate. Here, we show that X-chromosomal transcripts are reduced in m6A modifications and more stable compared to their autosomal counterparts. Acute depletion of m6A selectively stabilises autosomal transcripts, resulting in perturbed dosage compensation in mouse embryonic stem cells. We propose that higher stability of X-chromosomal transcripts is directed by lower levels of m6A, indicating that mammalian dosage compensation is partly regulated by epitranscriptomic RNA modifications.	NextSeq 500	6
EGAD00001010195	94 human adipocyte samples isolated from whole adipose tissues using collagenase digestion of tissue and flotation of lipid-laden adipocytes, followed by RNA isolation and RNA sequencing (SMARTer Stranded Total RNA-Seq library preparation, HiSeq 4000 100-bp paired-end reads). Adipocyte samples comprise subcutaneous and visceral adipocytes isolated from obese and lean people (N=24 obese-subcutaneous, N=24 obese-visceral, N=22 control-subcutaneous, N=24 control-visceral). Human adipocyte RNA sequencing data are provided as BAM files.	Illumina HiSeq 4000	93
EGAD00001010196	This dataset contains 46 fastq files of paired-end RNA sequencing of an Illumina® TruSeq stranded mRNA library of 23 glioblastoma PDX samples.	Illumina NovaSeq 6000	23
EGAD00001010197	exome data from leiomyosarcoma, large cell neuroendocrine carcinoma, and clear cell sarcoma	Illumina NovaSeq 6000	7
EGAD00001010198	Transcriptome sequencing from the Rare Cancer Research Foundation: leiomyosarcoma, large cell neuroendocrine carcinoma, clear cell carcinoma	Illumina NovaSeq 6000	4
EGAD00001010199	Rare Cancer Research Foundation: leiomyosarcoma, large cell neuroendocrine carcinoma, clear cell carcinoma	Illumina NovaSeq 6000	2
EGAD00001010200	We performed whole genome sequencing on 84 LFS family members from 47 families: 22 with wildtype TP53 and 62 with variant TP53. The variant TP53 cohort consists of 49 individuals who developed cancer and 13 individuals who remain cancer-free; 34 were from 13 families with 2-4 individuals sequenced within a given family and the remaining 28 had no family members sequenced. The wildtype cohort consists of 14 individuals who developed cancer and 8 individuals who are cancer-free, from 6 families.	HiSeq X Ten	80
EGAD00001010201	294 formalin-fixed paraffin-embedded (FFPE) tissue samples were sent to the UNC Lineberger Comprehensive Cancer Center Translational Genomics Lab (TGL) for RNA isolation using the Maxwell 16 MDx Instrument (Promega AS3000) and the Maxwell 16 LEV RNA FFPE Kit (Promega AS1260) following the manufacturer’s protocol (Promega 9FB167). 279 total RNA sequencing libraries were prepared at TGL using a Bravo Automated Liquid-Handling Platform (Agilent G5562A) and the TruSeq Stranded Total RNA Library Prep Gold Kit (Illumina 20020599) following the manufacturer’s protocol (Illumina 1000000040499). RNAseq library quality and quantity were measured using a TapeStation 4200 (Agilent G2991AA) and Qubit 3.0 fluorometer (Life Technologies Q33216), pooled at equal molar ratios, and denatured following the manufacturer’s protocol (Illumina 1000000106351). 271 total RNA sequencing libraries were sequenced at TGL on NovaSeq 6000 (Illumina 20012850) S4 flow cells (Illumina 20028313) following the manufacturer’s protocol (Illumina 1000000019358) using a 2x50 bp paired-end configuration and pool sizes of 91 libraries to target a read depth of 110 million clusters per library on average.	Illumina NovaSeq 6000	271
EGAD00001010202	Whole genome sequencing data of 31 high-grade serous carcinoma (HGSC) patients (101 samples) sequenced with HiSeq X Ten and BGISEQ-500	HiSeq X Ten unspecified	90
EGAD00001010203	Spatial characterization by TCR sequencing in tumor core and matched stroma in seven cases of triple negative breast cancer.	Illumina MiSeq	14
EGAD00001010204	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0167_003 for Follicular lymphoma patient sample TFRI_Cont_2	Illumina HiSeq 2500	1
EGAD00001010205	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0168_003 for Follicular lymphoma patient sample TFRI_Cont_3	Illumina HiSeq 2500	1
EGAD00001010206	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0169_003 for Follicular lymphoma patient sample TFRI_Cont_4	Illumina HiSeq 2500	1
EGAD00001010207	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0170_003 for Follicular lymphoma patient sample TFRI_Cont_5	Illumina HiSeq 2500	1
EGAD00001010208	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0171_003A for Follicular lymphoma patient sample TFRI_Cont_6	Illumina HiSeq 2500	1
EGAD00001010209	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0178_003A for Follicular lymphoma patient sample TFRI_Cont_8	Illumina HiSeq 2500	1
EGAD00001010210	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0183_003A for Follicular lymphoma patient sample TFRI_Cont_9	Illumina HiSeq 2500	1
EGAD00001010211	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0187_003A for Follicular lymphoma patient sample TFRI_Cont_10	Illumina HiSeq 2500	1
EGAD00001010212	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0188_003A for Follicular lymphoma patient sample TFRI_Cont_11	Illumina HiSeq 2500	1
EGAD00001010213	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0197_003A for Follicular lymphoma patient sample TFRI_Cont_12	Illumina HiSeq 2500	1
EGAD00001010214	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0198_003A for Follicular lymphoma patient sample TFRI_Cont_13	Illumina HiSeq 2500	1
EGAD00001010215	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0167_001 for Follicular lymphoma patient sample TFRIPAIR2_FL	Illumina HiSeq 2500	1
EGAD00001010216	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0168_001 for Follicular lymphoma patient sample TFRIPAIR3_FL	Illumina HiSeq 2500	1
EGAD00001010217	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0169_001 for Follicular lymphoma patient sample TFRIPAIR4_FL	Illumina HiSeq 2500	1
EGAD00001010218	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0170_001A for Follicular lymphoma patient sample TFRIPAIR5_FL	Illumina HiSeq 2500	1
EGAD00001010219	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0171_001A for Follicular lymphoma patient sample TFRI_Pair_6_FL	Illumina HiSeq 2500	1
EGAD00001010220	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0178_001A for Follicular lymphoma patient sample TFRIPAIR8_FL	Illumina HiSeq 2500	1
EGAD00001010221	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0183_001A for Follicular lymphoma patient sample TFRIPAIR9_FL	Illumina HiSeq 2500	1
EGAD00001010222	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0187_001A for Follicular lymphoma patient sample TFRIPAIR10_FL	Illumina HiSeq 2500	1
EGAD00001010223	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0188_001A for Follicular lymphoma patient sample TFRIPAIR11_FL	Illumina HiSeq 2500	1
EGAD00001010224	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0197_001A for Follicular lymphoma patient sample TFRIPAIR12_FL	Illumina HiSeq 2500	1
EGAD00001010225	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0198_001A for Follicular lymphoma patient sample TFRI_Pair_13_FL	Illumina HiSeq 2500	1
EGAD00001010226	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0167_002 for Diffuse large B-cell lymphoma patient sample TFRIPAIR2_DLBCL	Illumina HiSeq 2500	1
EGAD00001010227	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0168_002 for Diffuse large B-cell lymphoma patient sample TFRIPAIR3_DLBCL	Illumina HiSeq 2500	1
EGAD00001010228	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0169_002 for Diffuse large B-cell lymphoma patient sample TFRIPAIR4_DLBCL	Illumina HiSeq 2500	1
EGAD00001010229	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0170_002A for Diffuse large B-cell lymphoma patient sample TFRIPAIR5_DLBCL	Illumina HiSeq 2500	1
EGAD00001010230	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0171_002A for Diffuse large B-cell lymphoma patient sample TFRI_Pair_6_DLBCL	Illumina HiSeq 2500	1
EGAD00001010231	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0178_002A for Diffuse large B-cell lymphoma patient sample TFRIPAIR8_DLBCL	Illumina HiSeq 2500	1
EGAD00001010232	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0183_002A for Diffuse large B-cell lymphoma patient sample TFRI_Pair_9_DLBCL	Illumina HiSeq 2500	1
EGAD00001010233	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0187_002A for Diffuse large B-cell lymphoma patient sample TFRIPAIR10_DLBCL	Illumina HiSeq 2500	1
EGAD00001010234	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0188_002A for Diffuse large B-cell lymphoma patient sample TFRIPAIR11_DLBCL	Illumina HiSeq 2500	1
EGAD00001010235	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0197_002A for Diffuse large B-cell lymphoma patient sample TFRIPAIR12_DLBCL_rel	Illumina HiSeq 2500	1
EGAD00001010236	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0198_002A for Diffuse large B-cell lymphoma patient sample TFRI_Pair_13_DLBCL	Illumina HiSeq 2500	1
EGAD00001010237	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0198_004A for Lymph node patient sample RLN_02_5Prime	Illumina HiSeq 2500	1
EGAD00001010238	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0198_005A for Lymph node patient sample RLN_03_5Prime	Illumina HiSeq 2500	1
EGAD00001010239	DLP+ Single Cell Genomic Library 98211 for Diffuse large B-cell lymphoma patient sample TFRIPAIR11_DLBCL	HiSeq X Ten	1
EGAD00001010240	DLP+ Single Cell Genomic Library A98180 for Diffuse large B-cell lymphoma patient sample TFRI_Pair_13_DLBCL	HiSeq X Ten	1
EGAD00001010241	DLP+ Single Cell Genomic Library A98193 for Follicular lymphoma patient sample TFRI_Pair_6_FL	HiSeq X Ten	1
EGAD00001010242	DLP+ Single Cell Genomic Library A98203 for Diffuse large B-cell lymphoma patient sample TFRIPAIR8_DLBCL	HiSeq X Ten	1
EGAD00001010243	DLP+ Single Cell Genomic Library A98205 for Follicular lymphoma patient sample TFRIPAIR9_FL	HiSeq X Ten	1
EGAD00001010244	DLP+ Single Cell Genomic Library A98208 for Diffuse large B-cell lymphoma patient sample TFRIPAIR10_DLBCL	HiSeq X Ten	1
EGAD00001010245	DLP+ Single Cell Genomic Library A98221A for Follicular lymphoma patient sample TFRIPAIR5_FL	HiSeq X Ten	1
EGAD00001010246	DLP+ Single Cell Genomic Library A98225 for Diffuse large B-cell lymphoma patient sample TFRIPAIR12_DLBCL_rel	HiSeq X Ten	1
EGAD00001010247	DLP+ Single Cell Genomic Library A98288 for Follicular lymphoma patient sample TFRIPAIR2_FL	HiSeq X Ten	1
EGAD00001010248	DLP+ Single Cell Genomic Library A98297 for Follicular lymphoma patient sample TFRIPAIR3_FL	HiSeq X Ten	1
EGAD00001010249	DLP+ Single Cell Genomic Library A98167 for Diffuse large B-cell lymphoma patient sample TFRIPAIR4_DLBCL	HiSeq X Ten	1
EGAD00001010250	Genomic data used in "ACT-Discover: identifying karyotype heterogeneity in pancreatic cancer evolution using ctDNA"	unspecified	111
EGAD00001010251	Single-cell RNA seq data of epithelial cells (EpCAM-enriched by FACS) isolated from cryopreserved human lung tissue (3 samples, 3 donors)	NextSeq 550	3
EGAD00001010252	Bulk RNA seq data from human lung fibroblasts isolated from fresh and cryopreserved lung tissue (18 samples, 3 donors)	NextSeq 500	3
EGAD00001010253	This dataset contains 210 fastq files (RNA sequencing was performed in two centers) from 105 individuals (106 files in subcutaneous tissue and 104 files in visceral tissue). Of the 210 fastq files, 129 files are in PEx100 mode (appeared in a single fastq file) and 81 files are in PEx49bp mode (appeared in two separate fastq files). Sequencing was done on the Illumina HiSeq2000 platform.	Illumina HiSeq 2000	204
EGAD00001010254	This dataset contains 28 fastq files (11 files for subcutaneous tissue and 17 files for visceral tissue) from nine individuals. All samples were initially sequenced by SEx50 mode (16 files) and some of them were also sequenced by PEx100 mode (12 files). Sequencing was done on the Illumina HiSeq2000 platform.	Illumina HiSeq 2000	18
EGAD00001010255	This dataset contains a vcf file for 99 GM individuals genotyped on the Illumina HumanOmni2.5 array. The vcf file was generated after imputation (IMPUTE2) and filtering for minor allele frequency MAF≥0.05, imputation confidence score INFO>0.4 and Hardy-Weinberg Equilibrium (HWE) p>1e-06, yielding ~6.3 million variants.		1
EGAD00001010256	Paired RNA-Seq of 199 samples of Sarcoma tumors. The sequencing was done mostly on HiSeq 4000 with Illumina TruSeq Stranded mRNA; it was also done on NovaSeq 6000 using the same kit. Very few samples were sequenced on HiSeq X Ten with Illumina TruSeq RNA or Illumina TruSeq Stranded mRNA.	HiSeq X Ten Illumina HiSeq 4000 Illumina NovaSeq 6000	198
EGAD00001010257	Sequencing data of 77 sarcoma tumor and control runs, which were uploaded to EGAS00001004813. The sequencing was always paired.	HiSeq X Ten	1
EGAD00001010258	Sequencing data of 144 sarcoma tumor and control runs, which were uploaded to EGAS00001004813. The sequencing was always paired.	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	1
EGAD00001010259	Dataset contains total RNA sequencing data from plasma of 65 different human donors: 30 diffuse large B-cell lymphoma (DLBCL) and 13 primary mediastinal large B-cell lymphoma (PMBCL) patients, and 22 cancer-free controls. Samples were sequenced on a NovaSeq 6000 and are provided in FASTQ format.	Illumina NovaSeq 6000	65
EGAD00001010260	FASTQ files from Illumina sequencing of 10x Visium experiments.	Illumina NovaSeq 6000	25
EGAD00001010264	Additional WGS files for Roussel-ATRT-TM paper titled "Atypical teratoid/rhabdoid tumoroids reveal subgroup-specific drug vulnerabilities"	Illumina HiSeq 2000	3
EGAD00001010265	Additional WXS files for Roussel-ATRT-TM paper titled "Atypical teratoid/rhabdoid tumoroids reveal subgroup-specific drug vulnerabilities"	Illumina HiSeq 2000	2
EGAD00001010268	16S bacterial amplicon sequencing data for Guangzhou cohort	Illumina NovaSeq 6000	1
EGAD00001010269	This is the raw TCRseq data for the manuscript T cell receptor repertoire sequencing reveals chemotherapy-driven clonal expansion in colorectal liver metastases.	NextSeq 500	8
EGAD00001010270	Additional WGS files for Genomic Landscape ALL paper titled "The genomic landscape of pediatric acute lymphoblastic leukemia"	Illumina HiSeq 2000	2
EGAD00001010272	Contains WES and RNA-seq for the 59 patients with first- and second-generation EGFR TKI-resistant metastatic EGFR-mutated NSCLC.	Illumina HiSeq 2000 Illumina HiSeq 4000	179
EGAD00001010273	This dataset contains raw fastq files from the RNA-Seq of 96 T-Acute Lymphoblastic Leukemia. Libraries were prepared using Agilent SureSelect XT-HS2 RNA Reagent Kit. As such, reads contain molecular barcodes that could be specifically handled using the AGeNT tool.	Illumina NovaSeq 6000	96
EGAD00001010274	There is a need for quantitative measurements of evolutionary metrics in controlled clinical trials with long term follow-up information. This is particularly true in advanced localised prostate cancer, which can recur more than a decade after diagnosis. Here we mapped genomic intra-tumour heterogeneity in 642 tumour samples from 114 patients who took part in the IMRT and DELINEATE clinical trials, for which full clinical information and 12y median follow-up was available. We concomitantly assessed phenotypic (morphological) heterogeneity using Deep Learning in 1,923 histological sections from 250 IMRT patients (fully overlapping with the genetic set). This study shows that combining genomics with AI-aided histopathology in clinical trials leads to novel clinical biomarkers. This EGA repository contains data produced from tumour samples using low coverage whole genome sequencing and a prostate cancer specific gene panel data following compression of unique molecular identifiers.	Illumina NovaSeq 6000	1272
EGAD00001010275	Targeted cancer therapy inevitably selects for tumor cells that harbor some form of therapy resistance. This phenomenon is a principal reason why advanced prostate cancer, which can be treated with agents targeting androgen signaling dependency and DNA repair failure, is a lethal condition. The influence of a patient's clinical history and disease evolution on how their disseminated tumors develop resistance has been difficult to study, because few autopsy studies have been performed in heavily treated patients with DNA-repair deficient metastatic castration-resistant prostate cancer (mCRPC). Here, we assessed how resistance to targeted cancer therapies evolved in an autopsy cohort of 54 mCRPC tumors from six men. This dataset includes targeted sequencing, exome sequencing, and RNA-seq conducted on these biopsies.	Illumina NovaSeq 6000	173
EGAD00001010276	Sequencing data of 410 sarcoma tumor runs, which were uploaded to EGAS00001004813. The sequencing was always paired.	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000	1
EGAD00001010277	Paired WGS samples (tumor and control) of one Sarcoma case. The paired sequencing was done on HiSeq X Ten with Illumina TruSeq Nano DNA.	HiSeq X Ten	2
EGAD00001010278	Paired Exome sequencing of Sarcoma tumor and control of 10 samples (5 tumor/control pairs). The sequencing was done on Illumina HiSeq 4000 with the Agilent SureSelect V5+UTRs kit.	Illumina HiSeq 4000	10
EGAD00001010279	120 cachectic and non-cachectic samples. 240 FASTQ files from Illumina metagenomic shotgun paired-end sequencing.	Illumina NovaSeq 6000	1
EGAD00001010280	This dataset contains the raw RNA-seq data (FASTQ files) of the samples used in the study titled "Evaluation of triple negative breast cancer with heterogeneous immune infiltration". The dataset is composed of 3 patients with 6 samples per patient (3 with high TILs and 3 with low TILs).	NextSeq 500	18
EGAD00001010281	Ither NB in Organoids WXS dataset - We aimed to launch an online repository integrating genomics and transcriptomics with high-throughput drug screening (HTS) of nineteen commonly used neuroblastoma cell lines and fourteen generated neuroblastoma patient-derived organoids (NBL-PDOs) to improve identification of molecularly matched therapies and support clinical uptake.	Illumina HiSeq 4000	17
EGAD00001010282	Ither NB in Organoids WGS dataset - We aimed to launch an online repository integrating genomics and transcriptomics with high-throughput drug screening (HTS) of nineteen commonly used neuroblastoma cell lines and fourteen generated neuroblastoma patient-derived organoids (NBL-PDOs) to improve identification of molecularly matched therapies and support clinical uptake.	Illumina HiSeq 4000	17
EGAD00001010283	Ither NB in Organoids RNA-Seq dataset - We aimed to launch an online repository integrating genomics and transcriptomics with high-throughput drug screening (HTS) of nineteen commonly used neuroblastoma cell lines and fourteen generated neuroblastoma patient-derived organoids (NBL-PDOs) to improve identification of molecularly matched therapies and support clinical uptake.	Illumina HiSeq 4000	9
EGAD00001010284	cfChIP-seq using an H3K4me3-specific antibody of 9 plasma samples collected after a marathon run. cfChIP-seq was performed as described in Sadeh et al. 2021. The dataset contains paired-end fastq files and BAM files of raw sequencing data.	NextSeq 500	9
EGAD00001010287	This dataset contains 9 bulk RNAseq of neuroblastoma patients' tumors used to compare with derived PDXs and/or single-cell data in Thirant C et al., Nature Communications, 2023. They were initially produced for Berlanga P., Cancer Discovery, 2022.	NextSeq 500	9
EGAD00001010288	scWGS-seq of flow sorted blast and normal cells from SJBALL030072 with 66 high quality cells sequenced (61 blast and 5 normal)	Illumina NovaSeq 6000	66
EGAD00001010289	Dataset contains 50 paired-end Whole Exome sequencing samples from 3 patients. 3 normal blood samples are also included.		50
EGAD00001010290	Dataset contains 46 paired-end RNA-seq samples from 3 patients.		46
EGAD00001010292	This dataset contains WGS data in the form of BAM files for NPC268 - "Tumor" derived from snap-frozen tissue while "Cell line" derived from late passage NPC268 cell line. Extracted DNA was sent for 100x and 60x WGS with Novogene via Apical Scientific Sdn Bhd.	Illumina NovaSeq 6000	2
EGAD00001010293	Paired end shallow whole genome sequencing (sWGS) data (FASTQ) for the identification of genomewide somatic copy number alterations (SCNA) and the estimation of tumor fractions.	NextSeq 550	30
EGAD00001010294	Aligned NGS data (BAM) of 77 frequently mutated genes in cancer using the AVENIO Expanded platform.	NextSeq 550	30
EGAD00001010295	There is a need for quantitative measurements of evolutionary metrics in controlled clinical trials with long term follow-up information. This is particularly true in advanced localised prostate cancer, which can recur more than a decade after diagnosis. Here we mapped genomic intra-tumour heterogeneity in 642 tumour samples from 114 patients who took part in the IMRT and DELINEATE clinical trials, for which full clinical information and 12y median follow-up was available. We concomitantly assessed phenotypic (morphological) heterogeneity using Deep Learning in 1,923 histological sections from 250 IMRT patients (fully overlapping with the genetic set). This study shows that combining genomics with AI-aided histopathology in clinical trials leads to novel clinical biomarkers. This EGA repository contains data produced from tumour samples using low coverage whole genome sequencing and a prostate cancer specific gene panel data following compression of unique molecular identifiers.	Illumina NovaSeq 6000	100
EGAD00001010296	CPC-GENE H3K27ac ChIP-seq data comprises 48 tumour samples and 6 benign samples sequenced with single end Illumina HiSeq 2000.	Illumina HiSeq 2000	108
EGAD00001010297	sWGS dataset of 18 matched PDO and ascites samples, and scDNA sequencing of three of these PDOs.	Illumina HiSeq 4000 Illumina NovaSeq 6000	39
EGAD00001010298	One of the most dangerous forms of DNA damage are interstrand crosslinks (ICLs), which covalently crosslink the two strands of the DNA double helix. The repair of these lesions is crucial for cellular survival due to their ability to block transcription and DNA replication. Initially, the major pathway that has been described in ICL repair involves a network of 22 genes that are mutated in a severe human genetic disease known as Fanconi Anemia (FA). Using synthetic lethality screens in the near-haploid human HAP1 cell line, we recently identified two potentially novel regulators of ICL repair, C1orf112 and THAP12. Loss of C1orf112 and THAP12 causes hypersensitivity to ICL-inducing DNA damaging agents, such as Mitomycin C (MMC). Additionally, C1orf112-depleted cells show elevated levels of micronuclei and accumulation of DNA damage in S-phase. To better understand how C1orf112 and THAP12 mediate the repair of ICLs, we want to perform mutational signature analysis, using the BotSeq method. Therefore, WT, C1orf112 and THAP12 knockout cells were cultured in vehicle or MMC treated conditions for 10 days and the genomic DNA was isolated. FANCA and FANCD2 knockout cells are taken along as controls in this experimental setting. This dataset contains all the data available for this study on 2023-04-20.	Illumina NovaSeq 6000	18
EGAD00001010299	Chromatin accessibility profiles of bulk Acute Myeloid Leukemia (AML) samples, 60 datasets. Part of the study EGAS00001004893 and EGAS00001004896.	unspecified	60
EGAD00001010300	The circulating tumor DNA (ctDNA) mutation-based approach shows limited performance in minimal residual disease (MRD) detection, especially for landmark MRD detection in early-stage cancer after surgery. Here, a total of 87 NSCLC patients who received curative surgical resections (23 patients relapsed during follow-up) were enrolled in this study. A total of 163 plasma samples, collected at 7 days and 6 months after surgery, were used for high-throughput sequencing.		1
EGAD00001010301	Smart-seq2 single-cell RNA-seq of human liver non-parenchymal cells from lean and obese individuals	Illumina HiSeq 3000	1351
EGAD00001010302	Bulk tumour, germline and DigiPico WGS BAM files from patients 11611, 11615, 11619	Illumina HiSeq 4000	8
EGAD00001010304	RNAseq and ATACseq data for the FMF patients and healthy control. The RNAseq data was sequenced on a BGI MGI G400 machine, with PE100 reads. ATAC-seq libraries were prepared with Illumina Nextera primers and sequenced on NovaSeq 6000 platform with 50bp paired-end sequencing, where each sample was sequenced to approximately 60 million reads.	Illumina NovaSeq 6000 unspecified	58
EGAD00001010305	Bam and indexed bam files after removal of duplicates and trimming of the unique molecular identifiers. Sequencing was performed on NovaSeq 6000 platform.		1
EGAD00001010306	This dataset contains bulk transcriptomes from the operable cohorts of the LUD2015-005 study (NCT02735239, EudraCT 2015-005298-19), used to verify the deconvolution method used for the scRNA-seq dataset.	NextSeq 2000	24
EGAD00001010307	This dataset is derived from whole-transcriptome sequencing (RNA-seq) of RNA from 57 BCR-ABL1 lymphoblastic leukemias (53 diagnostic, 4 relapse).		57
EGAD00001010308	Bulk RNA-seq data of 8 RNA samples were generated. Libraries were prepared using the Stranded mRNA Library Prep, Ligation Kit (Illumina) following manufacturer’s recommendations and sequenced on a NextSeq 2000 (2x50 bp, Illumina).	unspecified	7
EGAD00001010309	Purified DNA from PDX samples was subjected to WGS. Libraries were prepared using the TruSeq DNA PCR-Free kit (Illumina) starting with 1μg of input DNA, following the manufacturer's instructions. Libraries were sequenced on a NovaSeq 6000 (2x151 bp) instrument (Illumina). A mean coverage of 30.4x was obtained.	Illumina NovaSeq 6000	6
EGAD00001010310	Revision Experiments UMI	Illumina NovaSeq 6000	12
EGAD00001010311	Dataset contains 70 paired-end ATAC-seq samples from 8 patients.		70
EGAD00001010312	Dataset contains 21 paired-end Hi-C samples from 9 patients.		21
EGAD00001010313	Dataset contains 11 paired-end snATAC-seq samples from 4 patients.		11
EGAD00001010314	Here, seven patients with BPDCN were characterized using RNA-seq and WXS.	Illumina NovaSeq 6000	14
EGAD00001010315	Genome and transcriptome sequence data from a breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010316	Genome and transcriptome sequence data from a metastatic pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010317	Genome and transcriptome sequence data from a pancreatic ductal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010318	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010320	We performed single cell RNA sequencing (scRNA-seq) for 208,506 cells derived from 58 lung adenocarcinomas from 44 patients, which covers primary tumour, lymph node and brain metastases, and pleural effusion in addition to normal lung tissues and lymph nodes. The extensive single cell profiles depicted a complex cellular atlas and dynamics during lung adenocarcinoma progression which includes cancer, stromal, and immune cells in the surrounding tumor microenvironments.	Illumina HiSeq 2500	1
EGAD00001010321	Paired diagnostic and relapse medulloblastoma sequencing. Targeted panel sequencing (n=8 relapse, n=7 matched diagnostic). Whole exome sequencing (n=23 relapse, n=18 matched diagnostic). Capture using Agilent SureSelect.	Illumina HiSeq 2000	56
EGAD00001010322	180502: RNA-Sequencing data of cocultured matched CRC patient (P4) derived normal fibroblasts (NFs), cancer associated fibroblasts (CAFs) and tumor spheroids. 200503_coculture: RNA-Sequencing data of cocultured CRC patient derived normal fibroblasts (NFs) or cancer associated fibroblasts (CAFs) (P16, P19, P22, P32, P41, P42) and tumor spheroids (HT29). 200503_il1b: RNA-Sequencing data of IL-1β stimulated fibroblasts (NFs and CAFs). Cole: scRNA-sequencing of matched CRC tumour samples and normal tissue counterparts derived from 3 patients. 220501: RNA-Sequencing of FACS sorted IL1R1 high and IL1R1 low CT5.3 CAFs.	NextSeq 500 unspecified	96
EGAD00001010323	This dataset is derived from whole-genome sequencing (WGS) of DNA from 57 BCR-ABL1 lymphoblastic leukemias (53 diagnostic, 4 relapse) and 53 germline samples.		110
EGAD00001010324	We have generated and analyzed genomic data from a cohort of metastatic urothelial carcinoma patients treated with ICI such as anti-PD-(L)1 monoclonal antibodies. The dataset contains whole exome sequencing data of 27 whole blood samples and 27 FFPE tumor samples. Further, it includes RNA sequencing data from 21 tumor samples. Following the RECIST criteria, 10 patients were classified as non-responders to the treatment, and 17 were responders. The dataset also contains a merged vcf file containing somatic mutations called by Strelka2 and Mutect2 following the gatk best practice pipeline.	Illumina HiSeq 2500	74
EGAD00001010326	A set of tumor/normal paired sequencing experiments, performed in short read WGS and 10X linked read whole genomes. BAM files aligned to hg19 are provided. Sample Alias number and Subject ID reflects patient of origin. T/N distinction discriminates between tumor and normal (peripheral blood) tissue.	Illumina NovaSeq 6000	98
EGAD00001010327	Targeted capture ctDNA Library CRCQV34Run002-10 from patient PBC0002108, plasma baseline sample	Illumina MiSeq	1
EGAD00001010328	Targeted capture ctDNA Library CRCQV34Run002-11 from patient PBC0002429, plasma baseline sample	Illumina MiSeq	1
EGAD00001010329	Targeted capture ctDNA Library CRCQV34Run002-12 from patient PBC0002459, plasma baseline sample	Illumina MiSeq	1
EGAD00001010330	Targeted capture ctDNA Library CRCQV34Run002-13 from patient PBC0002595, plasma baseline sample	Illumina MiSeq	1
EGAD00001010331	Targeted capture ctDNA Library CRCQV34Run002-14 from patient PBC0001051, saliva sample	Illumina MiSeq	1
EGAD00001010332	Targeted capture ctDNA Library CRCQV34Run002-16 from patient PBC0001627, saliva sample	Illumina MiSeq	1
EGAD00001010333	Targeted capture ctDNA Library CRCQV34Run002-18 from patient PBC0002062, saliva sample	Illumina MiSeq	1
EGAD00001010334	Targeted capture ctDNA Library CRCQV34Run002-20 from patient PBC0002108, saliva sample	Illumina MiSeq	1
EGAD00001010335	Targeted capture ctDNA Library CRCQV34Run002-21 from patient PBC0002429, saliva sample	Illumina MiSeq	1
EGAD00001010336	Targeted capture ctDNA Library CRCQV34Run002-22 from patient PBC0002459, saliva sample	Illumina MiSeq	1
EGAD00001010337	Targeted capture ctDNA Library CRCQV34Run002-23 from patient PBC0002595, saliva sample	Illumina MiSeq	1
EGAD00001010338	Targeted capture ctDNA Library CRCQV34Run002-4 from patient PBC0001051, plasma baseline sample	Illumina MiSeq	1
EGAD00001010339	Targeted capture ctDNA Library CRCQV34Run002-6 from patient PBC0001627, plasma baseline sample	Illumina MiSeq	1
EGAD00001010340	Targeted capture ctDNA Library CRCQV34Run002-8 from patient PBC0002062, plasma baseline sample	Illumina MiSeq	1
EGAD00001010341	Targeted capture ctDNA Library CRCQV34Run005-10 from patient PBC0002459, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010342	Targeted capture ctDNA Library CRCQV34Run005-11 from patient PBC0002595, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010343	Targeted capture ctDNA Library CRCQV34Run005-12 from patient PBC0001224, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010344	Targeted capture ctDNA Library CRCQV34Run005-13 from patient PBC0002383, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010345	Targeted capture ctDNA Library CRCQV34Run005-14 from patient PBC0002816, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010346	Targeted capture ctDNA Library CRCQV34Run005-15 from patient PBC0002824, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010347	Targeted capture ctDNA Library CRCQV34Run005-18 from patient PBC0001845, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010348	Targeted capture ctDNA Library CRCQV34Run005-20 from patient PBC0002680, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010349	Targeted capture ctDNA Library CRCQV34Run005-4 from patient PBC0001627, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010350	Targeted capture ctDNA Library CRCQV34Run005-5 from patient PBC0002062, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010351	Targeted capture ctDNA Library CRCQV34Run005-8 from patient PBC0002108, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010352	Targeted capture ctDNA Library CRCQV34Run005-9 from patient PBC0002429, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010353	Targeted capture ctDNA Library CRCQV34Run007-10 from patient PBC0002680, plasma baseline sample	Illumina MiSeq	1
EGAD00001010354	Targeted capture ctDNA Library CRCQV34Run007-12 from patient PBC0002816, plasma baseline sample	Illumina MiSeq	1
EGAD00001010355	Targeted capture ctDNA Library CRCQV34Run007-13 from patient PBC0002824, plasma baseline sample	Illumina MiSeq	1
EGAD00001010356	Targeted capture ctDNA Library CRCQV34Run007-15 from patient PBC0001255, saliva sample	Illumina MiSeq	1
EGAD00001010357	Targeted capture ctDNA Library CRCQV34Run007-18 from patient PBC0001845, saliva sample	Illumina MiSeq	1
EGAD00001010358	Targeted capture ctDNA Library CRCQV34Run007-20 from patient PBC0002680, saliva sample	Illumina MiSeq	1
EGAD00001010359	Targeted capture ctDNA Library CRCQV34Run007-22 from patient PBC0002816, buffy coat sample	Illumina MiSeq	1
EGAD00001010360	Targeted capture ctDNA Library CRCQV34Run007-23 from patient PBC0002824, buffy coat sample	Illumina MiSeq	1
EGAD00001010361	Targeted capture ctDNA Library CRCQV34Run007-24 from patient PBC0001051, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010362	Targeted capture ctDNA Library CRCQV34Run007-5 from patient PBC0001255, plasma baseline sample	Illumina MiSeq	1
EGAD00001010363	Targeted capture ctDNA Library CRCQV34Run007-8 from patient PBC0001845, plasma baseline sample	Illumina MiSeq	1
EGAD00001010364	Targeted capture ctDNA Library CRCQV34Run011-10 from patient PBC0001673, plasma baseline sample	Illumina MiSeq	1
EGAD00001010365	Targeted capture ctDNA Library CRCQV34Run011-11 from patient PBC0002255, plasma baseline sample	Illumina MiSeq	1
EGAD00001010366	Targeted capture ctDNA Library CRCQV34Run011-12 from patient PBC0002294, plasma baseline sample	Illumina MiSeq	1
EGAD00001010367	Targeted capture ctDNA Library CRCQV34Run011-13 from patient PBC0001328, plasma baseline sample	Illumina MiSeq	1
EGAD00001010368	Targeted capture ctDNA Library CRCQV34Run011-14 from patient PBC0001224, saliva sample	Illumina MiSeq	1
EGAD00001010369	Targeted capture ctDNA Library CRCQV34Run011-15 from patient PBC0002383, buffy coat sample	Illumina MiSeq	1
EGAD00001010370	Targeted capture ctDNA Library CRCQV34Run011-18 from patient PBC0001432, saliva sample	Illumina MiSeq	1
EGAD00001010371	Targeted capture ctDNA Library CRCQV34Run011-20 from patient PBC0001673, saliva sample	Illumina MiSeq	1
EGAD00001010372	Targeted capture ctDNA Library CRCQV34Run011-21 from patient PBC0002255, saliva sample	Illumina MiSeq	1
EGAD00001010373	Targeted capture ctDNA Library CRCQV34Run011-22 from patient PBC0002294, saliva sample	Illumina MiSeq	1
EGAD00001010374	Targeted capture ctDNA Library CRCQV34Run011-23 from patient PBC0001328, saliva sample	Illumina MiSeq	1
EGAD00001010375	Targeted capture ctDNA Library CRCQV34Run011-4 from patient PBC0001224, plasma baseline sample	Illumina MiSeq	1
EGAD00001010376	Targeted capture ctDNA Library CRCQV34Run011-5 from patient PBC0002383, plasma baseline sample	Illumina MiSeq	1
EGAD00001010377	Targeted capture ctDNA Library CRCQV34Run011-8 from patient PBC0001432, plasma baseline sample	Illumina MiSeq	1
EGAD00001010378	Targeted capture ctDNA Library CRCQV34Run015-12 from patient PBC0002108, plasma 12 month sample	Illumina MiSeq	1
EGAD00001010379	Targeted capture ctDNA Library CRCQV34Run015-13 from patient PBC0002459, plasma 9 month sample	Illumina MiSeq	1
EGAD00001010380	Targeted capture ctDNA Library CRCQV34Run015-14 from patient PBC0002595, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010381	Targeted capture ctDNA Library CRCQV34Run015-17 from patient PBC0002824, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010382	Targeted capture ctDNA Library CRCQV34Run015-20 from patient PBC0001432, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010383	Targeted capture ctDNA Library CRCQV34Run015-22 from patient PBC0001673, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010384	Targeted capture ctDNA Library CRCQV34Run015-23 from patient PBC0001255, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010385	Targeted capture ctDNA Library CRCQV34Run015-24 from patient PBC0002255, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010386	Targeted capture ctDNA Library CRCQV34Run015-9 from patient PBC0001627, plasma 12 month sample	Illumina MiSeq	1
EGAD00001010387	Targeted capture ctDNA Library CRCQV40Run027-10 from patient PBC0004414, saliva sample	Illumina MiSeq	1
EGAD00001010388	Targeted capture ctDNA Library CRCQV40Run027-11 from patient PBC0004565, saliva sample	Illumina MiSeq	1
EGAD00001010389	Targeted capture ctDNA Library CRCQV40Run027-13 from patient PBC0005116, saliva sample	Illumina MiSeq	1
EGAD00001010390	Targeted capture ctDNA Library CRCQV40Run027-17 from patient PBC0004076, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010391	Targeted capture ctDNA Library CRCQV40Run027-18 from patient PBC0004350, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010392	Targeted capture ctDNA Library CRCQV40Run027-19 from patient PBC0004414, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010393	Targeted capture ctDNA Library CRCQV40Run027-20 from patient PBC0004565, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010394	Targeted capture ctDNA Library CRCQV40Run027-22 from patient PBC0005116, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010395	Targeted capture ctDNA Library CRCQV40Run027-6 from patient PBC0001335, buffy coat sample	Illumina MiSeq	1
EGAD00001010396	Targeted capture ctDNA Library CRCQV40Run027-7 from patient PBC0003364, saliva sample	Illumina MiSeq	1
EGAD00001010397	Targeted capture ctDNA Library CRCQV40Run027-8 from patient PBC0004076, saliva sample	Illumina MiSeq	1
EGAD00001010398	Targeted capture ctDNA Library CRCQV40Run027-9 from patient PBC0004350, saliva sample	Illumina MiSeq	1
EGAD00001010399	Targeted capture ctDNA Library CRCQV42Run016-11 from patient PBC0001467, plasma baseline sample	Illumina MiSeq	1
EGAD00001010400	Targeted capture ctDNA Library CRCQV42Run016-16 from patient PBC0001396, saliva sample	Illumina MiSeq	1
EGAD00001010401	Targeted capture ctDNA Library CRCQV42Run016-17 from patient PBC0001467, saliva sample	Illumina MiSeq	1
EGAD00001010402	Targeted capture ctDNA Library CRCQV42Run016-9 from patient PBC0001396, plasma baseline sample	Illumina MiSeq	1
EGAD00001010403	Targeted capture ctDNA Library CRCQV42Run017-10 from patient PBC0002255, plasma 12 month sample	Illumina MiSeq	1
EGAD00001010404	Targeted capture ctDNA Library CRCQV42Run017-11 from patient PBC0001845, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010405	Targeted capture ctDNA Library CRCQV42Run017-12 from patient PBC0002429, plasma 9 month sample	Illumina MiSeq	1
EGAD00001010406	Targeted capture ctDNA Library CRCQV42Run017-13 from patient PBC0002383, plasma 12 month sample	Illumina MiSeq	1
EGAD00001010407	Targeted capture ctDNA Library CRCQV42Run017-15 from patient PBC0001673, plasma 24 month sample	Illumina MiSeq	1
EGAD00001010408	Targeted capture ctDNA Library CRCQV42Run017-16 from patient PBC0002816, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010409	Targeted capture ctDNA Library CRCQV42Run017-17 from patient PBC0002062, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010410	Targeted capture ctDNA Library CRCQV42Run017-18 from patient PBC0002294, plasma 12 month sample	Illumina MiSeq	1
EGAD00001010411	Targeted capture ctDNA Library CRCQV42Run017-19 from patient PBC0001432, plasma 18 month sample	Illumina MiSeq	1
EGAD00001010412	Targeted capture ctDNA Library CRCQV42Run017-20 from patient PBC0001051, plasma 18 month sample	Illumina MiSeq	1
EGAD00001010413	Targeted capture ctDNA Library CRCQV42Run017-22 from patient PBC0001627, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010414	Targeted capture ctDNA Library CRCQV42Run017-24 from patient PBC0002294, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010415	Targeted capture ctDNA Library CRCQV42Run017-8 from patient PBC0001224, plasma 15 month sample	Illumina MiSeq	1
EGAD00001010416	Targeted capture ctDNA Library CRCQV42Run017-9 from patient PBC0001255, plasma 18 month sample	Illumina MiSeq	1
EGAD00001010417	Targeted capture ctDNA Library CRCQV42Run018-18 from patient PBC0001859, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010418	Targeted capture ctDNA Library CRCQV42Run018-20 from patient PBC0001859, saliva sample	Illumina MiSeq	1
EGAD00001010419	Targeted capture ctDNA Library CRCQV42Run019-12 from patient PBC0001295, plasma baseline sample	Illumina MiSeq	1
EGAD00001010420	Targeted capture ctDNA Library CRCQV42Run019-14 from patient PBC0001304, plasma baseline sample	Illumina MiSeq	1
EGAD00001010421	Targeted capture ctDNA Library CRCQV42Run019-18 from patient PBC0001295, saliva sample	Illumina MiSeq	1
EGAD00001010422	Targeted capture ctDNA Library CRCQV42Run019-21 from patient PBC0001310, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010423	Targeted capture ctDNA Library CRCQV42Run019-6 from patient PBC0001295, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010424	Targeted capture ctDNA Library CRCQV42Run019-8 from patient PBC0001304, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010425	Targeted capture ctDNA Library CRCQV42Run020-10 from patient PBC0001315, plasma baseline sample	Illumina MiSeq	1
EGAD00001010426	Targeted capture ctDNA Library CRCQV42Run020-14 from patient PBC0001312, saliva sample	Illumina MiSeq	1
EGAD00001010427	Targeted capture ctDNA Library CRCQV42Run020-15 from patient PBC0001315, saliva sample	Illumina MiSeq	1
EGAD00001010428	Targeted capture ctDNA Library CRCQV42Run020-23 from patient PBC0001310, saliva sample	Illumina MiSeq	1
EGAD00001010429	Targeted capture ctDNA Library CRCQV42Run020-24 from patient PBC0001310, plasma baseline sample	Illumina MiSeq	1
EGAD00001010430	Targeted capture ctDNA Library CRCQV42Run020-4 from patient PBC0001312, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010431	Targeted capture ctDNA Library CRCQV42Run020-5 from patient PBC0001315, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010432	Targeted capture ctDNA Library CRCQV42Run020-9 from patient PBC0001312, plasma baseline sample	Illumina MiSeq	1
EGAD00001010433	Targeted capture ctDNA Library CRCQV42Run021-10 from patient PBC0001470, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010434	Targeted capture ctDNA Library CRCQV42Run021-11 from patient PBC0001516, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010435	Targeted capture ctDNA Library CRCQV42Run021-12 from patient PBC0001653, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010436	Targeted capture ctDNA Library CRCQV42Run021-13 from patient PBC0001299, plasma baseline sample	Illumina MiSeq	1
EGAD00001010437	Targeted capture ctDNA Library CRCQV42Run021-15 from patient PBC0001323, plasma baseline sample	Illumina MiSeq	1
EGAD00001010438	Targeted capture ctDNA Library CRCQV42Run021-16 from patient PBC0001470, plasma 1 month sample	Illumina MiSeq	1
EGAD00001010439	Targeted capture ctDNA Library CRCQV42Run021-17 from patient PBC0001516, plasma baseline sample	Illumina MiSeq	1
EGAD00001010440	Targeted capture ctDNA Library CRCQV42Run021-18 from patient PBC0001653, plasma baseline sample	Illumina MiSeq	1
EGAD00001010441	Targeted capture ctDNA Library CRCQV42Run021-19 from patient PBC0001299, saliva sample	Illumina MiSeq	1
EGAD00001010442	Targeted capture ctDNA Library CRCQV42Run021-21 from patient PBC0001323, saliva sample	Illumina MiSeq	1
EGAD00001010443	Targeted capture ctDNA Library CRCQV42Run021-22 from patient PBC0001470, saliva sample	Illumina MiSeq	1
EGAD00001010444	Targeted capture ctDNA Library CRCQV42Run021-23 from patient PBC0001516, saliva sample	Illumina MiSeq	1
EGAD00001010445	Targeted capture ctDNA Library CRCQV42Run021-24 from patient PBC0001653, saliva sample	Illumina MiSeq	1
EGAD00001010446	Targeted capture ctDNA Library CRCQV42Run021-4 from patient PBC0001328, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010447	Targeted capture ctDNA Library CRCQV42Run021-6 from patient PBC0001304, saliva sample	Illumina MiSeq	1
EGAD00001010448	Targeted capture ctDNA Library CRCQV42Run021-7 from patient PBC0001299, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010449	Targeted capture ctDNA Library CRCQV42Run021-9 from patient PBC0001323, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010450	Targeted capture ctDNA Library CRCQV42Run022-13 from patient PBC0002826, plasma baseline sample	Illumina MiSeq	1
EGAD00001010451	Targeted capture ctDNA Library CRCQV42Run022-16 from patient PBC0002406, saliva sample	Illumina MiSeq	1
EGAD00001010452	Targeted capture ctDNA Library CRCQV42Run022-17 from patient PBC0002744, saliva sample	Illumina MiSeq	1
EGAD00001010453	Targeted capture ctDNA Library CRCQV42Run022-18 from patient PBC0002826, saliva sample	Illumina MiSeq	1
EGAD00001010454	Targeted capture ctDNA Library CRCQV42Run022-19 from patient PBC0002680, plasma 6 month repeat sample	Illumina MiSeq	1
EGAD00001010455	Targeted capture ctDNA Library CRCQV42Run022-20 from patient PBC0001845, plasma 12 month repeat sample	Illumina MiSeq	1
EGAD00001010456	Targeted capture ctDNA Library CRCQV42Run022-22 from patient PBC0002062, plasma 12 month repeat sample	Illumina MiSeq	1
EGAD00001010457	Targeted capture ctDNA Library CRCQV42Run022-6 from patient PBC0002406, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010458	Targeted capture ctDNA Library CRCQV42Run022-7 from patient PBC0002744, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010459	Targeted capture ctDNA Library CRCQV42Run022-8 from patient PBC0002826, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010460	Targeted capture ctDNA Library CRCQV42Run023-10 from patient PBC0001470, earliest baseline plasma sample	Illumina MiSeq	1
EGAD00001010461	Targeted capture ctDNA Library CRCQV42Run023-16 from patient PBC0002824, saliva sample	Illumina MiSeq	1
EGAD00001010462	Targeted capture ctDNA Library CRCQV42Run023-19 from patient PBC0002383, plasma baseline repeat sample	Illumina MiSeq	1
EGAD00001010463	Targeted capture ctDNA Library CRCQV42Run023-20 from patient PBC0001224, plasma baseline repeat sample	Illumina MiSeq	1
EGAD00001010464	Targeted capture ctDNA Library CRCQV42Run023-24 from patient PBC0002383, saliva sample	Illumina MiSeq	1
EGAD00001010465	Targeted capture ctDNA Library CRCQV42Run024-11 from patient PBC0001306, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010466	Targeted capture ctDNA Library CRCQV42Run024-13 from patient PBC0001306, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010467	Targeted capture ctDNA Library CRCQV42Run024-14 from patient PBC0002853, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010468	Targeted capture ctDNA Library CRCQV42Run024-16 from patient PBC0003641, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010469	Targeted capture ctDNA Library CRCQV42Run024-18 from patient PBC0001306, saliva sample	Illumina MiSeq	1
EGAD00001010470	Targeted capture ctDNA Library CRCQV42Run024-20 from patient PBC0001589, saliva sample	Illumina MiSeq	1
EGAD00001010471	Targeted capture ctDNA Library CRCQV42Run024-21 from patient PBC0002853, saliva sample	Illumina MiSeq	1
EGAD00001010472	Targeted capture ctDNA Library CRCQV42Run024-23 from patient PBC0003641, saliva sample	Illumina MiSeq	1
EGAD00001010473	Targeted capture ctDNA Library CRCQV42Run024-4 from patient PBC0001306, plasma baseline sample	Illumina MiSeq	1
EGAD00001010474	Targeted capture ctDNA Library CRCQV42Run024-6 from patient PBC0001589, plasma baseline sample	Illumina MiSeq	1
EGAD00001010475	Targeted capture ctDNA Library CRCQV42Run024-9 from patient PBC0003641, plasma baseline sample	Illumina MiSeq	1
EGAD00001010476	Targeted capture ctDNA Library CRCQV42Run025-10 from patient PBC0001335, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010477	Targeted capture ctDNA Library CRCQV42Run025-11 from patient PBC0003595, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010478	Targeted capture ctDNA Library CRCQV42Run025-12 from patient PBC0003014, saliva sample	Illumina MiSeq	1
EGAD00001010479	Targeted capture ctDNA Library CRCQV42Run025-13 from patient PBC0003334, saliva sample	Illumina MiSeq	1
EGAD00001010480	Targeted capture ctDNA Library CRCQV42Run025-14 from patient PBC0003385, saliva sample	Illumina MiSeq	1
EGAD00001010481	Targeted capture ctDNA Library CRCQV42Run025-17 from patient PBC0003595, saliva sample	Illumina MiSeq	1
EGAD00001010482	Targeted capture ctDNA Library CRCQV42Run025-19 from patient PBC0003364, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010483	Targeted capture ctDNA Library CRCQV42Run025-5 from patient PBC0001589, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010484	Targeted capture ctDNA Library CRCQV42Run025-6 from patient PBC0003014, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010485	Targeted capture ctDNA Library CRCQV42Run025-7 from patient PBC0003334, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010486	Targeted capture ctDNA Library CRCQV42Run025-8 from patient PBC0003385, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010487	Targeted capture ctDNA Library CRCQV42Run026-14 from patient PBC0003587, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010488	Targeted capture ctDNA Library CRCQV42Run026-15 from patient PBC0002872, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010489	Targeted capture ctDNA Library CRCQV42Run026-16 from patient PBC0005064, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010490	Targeted capture ctDNA Library CRCQV42Run026-17 from patient PBC0003643, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010491	Targeted capture ctDNA Library CRCQV42Run026-18 from patient PBC0001299, plasma baseline repeat sample	Illumina MiSeq	1
EGAD00001010492	Targeted capture ctDNA Library CRCQV42Run026-19 from patient PBC0002872, saliva sample	Illumina MiSeq	1
EGAD00001010493	Targeted capture ctDNA Library CRCQV42Run026-21 from patient PBC0003587, saliva sample	Illumina MiSeq	1
EGAD00001010494	Targeted capture ctDNA Library CRCQV42Run026-22 from patient PBC0003364, saliva sample	Illumina MiSeq	1
EGAD00001010495	Targeted capture ctDNA Library CRCQV42Run026-24 from patient PBC0005064, saliva sample	Illumina MiSeq	1
EGAD00001010496	Targeted capture ctDNA Library CRCQV42Run026-4 from patient PBC0002872, plasma baseline sample	Illumina MiSeq	1
EGAD00001010497	Targeted capture ctDNA Library CRCQV42Run026-5 from patient PBC0003014, plasma baseline sample	Illumina MiSeq	1
EGAD00001010498	Targeted capture ctDNA Library CRCQV42Run026-6 from patient PBC0003595, plasma baseline sample	Illumina MiSeq	1
EGAD00001010499	Targeted capture ctDNA Library CRCQV42Run026-9 from patient PBC0003385, plasma baseline sample	Illumina MiSeq	1
EGAD00001010500	Targeted capture ctDNA Library CRCQV42Run028-10 from patient PBC0003587, plasma baseline sample	Illumina MiSeq	1
EGAD00001010501	Targeted capture ctDNA Library CRCQV42Run028-11 from patient PBC0003643, plasma baseline sample	Illumina MiSeq	1
EGAD00001010502	Targeted capture ctDNA Library CRCQV42Run028-12 from patient PBC0004076, plasma baseline sample	Illumina MiSeq	1
EGAD00001010503	Targeted capture ctDNA Library CRCQV42Run028-13 from patient PBC0004414, plasma baseline sample	Illumina MiSeq	1
EGAD00001010504	Targeted capture ctDNA Library CRCQV42Run028-14 from patient PBC0004565, plasma baseline sample	Illumina MiSeq	1
EGAD00001010505	Targeted capture ctDNA Library CRCQV42Run028-16 from patient PBC0005064, plasma baseline sample	Illumina MiSeq	1
EGAD00001010506	Targeted capture ctDNA Library CRCQV42Run028-17 from patient PBC0005116, plasma baseline sample	Illumina MiSeq	1
EGAD00001010507	Targeted capture ctDNA Library CRCQV42Run028-20 from patient PBC0002533, plasma baseline sample	Illumina MiSeq	1
EGAD00001010508	Targeted capture ctDNA Library CRCQV42Run028-23 from patient PBC0002533, saliva sample	Illumina MiSeq	1
EGAD00001010509	Targeted capture ctDNA Library CRCQV42Run028-4 from patient PBC0001335, plasma baseline sample	Illumina MiSeq	1
EGAD00001010510	Targeted capture ctDNA Library CRCQV42Run028-9 from patient PBC0003364, plasma baseline sample	Illumina MiSeq	1
EGAD00001010511	Targeted capture ctDNA Library CRCQV42Run029-11 from patient PBC0001353, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010512	Targeted capture ctDNA Library CRCQV42Run029-12 from patient PBC0004173, plasma baseline sample	Illumina MiSeq	1
EGAD00001010513	Targeted capture ctDNA Library CRCQV42Run029-16 from patient PBC0001353, plasma baseline sample	Illumina MiSeq	1
EGAD00001010514	Targeted capture ctDNA Library CRCQV42Run029-17 from patient PBC0004173, saliva sample	Illumina MiSeq	1
EGAD00001010515	Targeted capture ctDNA Library CRCQV42Run029-21 from patient PBC0001353, saliva sample	Illumina MiSeq	1
EGAD00001010516	Targeted capture ctDNA Library CRCQV42Run029-23 from patient PBC0004350, plasma baseline sample	Illumina MiSeq	1
EGAD00001010517	Targeted capture ctDNA Library CRCQV42Run029-24 from patient PBC0003334, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010518	Targeted capture ctDNA Library CRCQV42Run029-5 from patient PBC0002533, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010519	Targeted capture ctDNA Library CRCQV42Run029-7 from patient PBC0004173, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010520	Targeted capture ctDNA Library CRCQV42Run030-14 from patient PBC0001315, buffy coat sample	Illumina MiSeq	1
EGAD00001010521	Targeted capture ctDNA Library CRCQV42Run030-15 from patient PBC0003014, buffy coat sample	Illumina MiSeq	1
EGAD00001010522	Targeted capture ctDNA Library CRCQV42Run030-19 from patient PBC0001323, buffy coat sample	Illumina MiSeq	1
EGAD00001010523	Targeted capture ctDNA Library CRCQV42Run030-20 from patient PBC0001673, buffy coat sample	Illumina MiSeq	1
EGAD00001010524	Targeted capture ctDNA Library CRCQV42Run030-21 from patient PBC0003641, buffy coat sample	Illumina MiSeq	1
EGAD00001010525	Targeted capture ctDNA Library CRCQV42Run030-23 from patient PBC0002853, buffy coat sample	Illumina MiSeq	1
EGAD00001010526	Targeted capture ctDNA Library CRCQV42Run030-5 from patient PBC0001051, buffy coat sample	Illumina MiSeq	1
EGAD00001010527	Targeted capture ctDNA Library CRCQV42Run031-10 from patient PBC0001306, buffy coat sample	Illumina MiSeq	1
EGAD00001010528	Targeted capture ctDNA Library CRCQV42Run031-14 from patient PBC0001310, buffy coat sample	Illumina MiSeq	1
EGAD00001010529	Targeted capture ctDNA Library CRCQV42Run031-16 from patient PBC0001328, buffy coat sample	Illumina MiSeq	1
EGAD00001010530	Targeted capture ctDNA Library CRCQV42Run031-18 from patient PBC0001375, buffy coat sample	Illumina MiSeq	1
EGAD00001010531	Targeted capture ctDNA Library CRCQV42Run031-19 from patient PBC0001404, buffy coat sample	Illumina MiSeq	1
EGAD00001010532	Targeted capture ctDNA Library CRCQV42Run031-21 from patient PBC0001413, buffy coat sample	Illumina MiSeq	1
EGAD00001010533	Targeted capture ctDNA Library CRCQV42Run031-23 from patient PBC0001467, buffy coat sample	Illumina MiSeq	1
EGAD00001010534	Targeted capture ctDNA Library CRCQV42Run031-24 from patient PBC0001470, buffy coat sample	Illumina MiSeq	1
EGAD00001010535	Targeted capture ctDNA Library CRCQV42Run031-7 from patient PBC0001295, buffy coat sample	Illumina MiSeq	1
EGAD00001010536	Targeted capture ctDNA Library CRCQV42Run031-8 from patient PBC0001299, buffy coat sample	Illumina MiSeq	1
EGAD00001010537	Targeted capture ctDNA Library CRCQV42Run031-9 from patient PBC0001304, buffy coat sample	Illumina MiSeq	1
EGAD00001010538	Targeted capture ctDNA Library CRCQV42Run032-12 from patient PBC0001224, buffy coat sample	Illumina MiSeq	1
EGAD00001010539	Targeted capture ctDNA Library CRCQV42Run032-13 from patient PBC0001255, buffy coat sample	Illumina MiSeq	1
EGAD00001010540	Targeted capture ctDNA Library CRCQV42Run032-16 from patient PBC0001353, buffy coat sample	Illumina MiSeq	1
EGAD00001010541	Targeted capture ctDNA Library CRCQV42Run032-17 from patient PBC0001396, buffy coat sample	Illumina MiSeq	1
EGAD00001010542	Targeted capture ctDNA Library CRCQV42Run032-19 from patient PBC0001432, buffy coat sample	Illumina MiSeq	1
EGAD00001010543	Targeted capture ctDNA Library CRCQV42Run032-20 from patient PBC0001516, buffy coat sample	Illumina MiSeq	1
EGAD00001010544	Targeted capture ctDNA Library CRCQV42Run032-21 from patient PBC0001627, buffy coat sample	Illumina MiSeq	1
EGAD00001010545	Targeted capture ctDNA Library CRCQV42Run032-22 from patient PBC0001653, buffy coat sample	Illumina MiSeq	1
EGAD00001010546	Targeted capture ctDNA Library CRCQV42Run032-5 from patient PBC0003334, buffy coat sample	Illumina MiSeq	1
EGAD00001010547	Targeted capture ctDNA Library CRCQV42Run032-8 from patient PBC0001589, buffy coat sample	Illumina MiSeq	1
EGAD00001010548	Targeted capture ctDNA Library CRCQV42Run032-9 from patient PBC0001312, buffy coat sample	Illumina MiSeq	1
EGAD00001010549	Targeted capture ctDNA Library CRCQV42Run033-10 from patient PBC0002255, buffy coat sample	Illumina MiSeq	1
EGAD00001010550	Targeted capture ctDNA Library CRCQV42Run033-11 from patient PBC0002294, buffy coat sample	Illumina MiSeq	1
EGAD00001010551	Targeted capture ctDNA Library CRCQV42Run033-12 from patient PBC0002406, buffy coat sample	Illumina MiSeq	1
EGAD00001010552	Targeted capture ctDNA Library CRCQV42Run033-13 from patient PBC0002429, buffy coat sample	Illumina MiSeq	1
EGAD00001010553	Targeted capture ctDNA Library CRCQV42Run033-14 from patient PBC0002459, buffy coat sample	Illumina MiSeq	1
EGAD00001010554	Targeted capture ctDNA Library CRCQV42Run033-15 from patient PBC0002533, buffy coat sample	Illumina MiSeq	1
EGAD00001010555	Targeted capture ctDNA Library CRCQV42Run033-16 from patient PBC0002595, buffy coat sample	Illumina MiSeq	1
EGAD00001010556	Targeted capture ctDNA Library CRCQV42Run033-18 from patient PBC0002680, buffy coat sample	Illumina MiSeq	1
EGAD00001010557	Targeted capture ctDNA Library CRCQV42Run033-19 from patient PBC0002744, buffy coat sample	Illumina MiSeq	1
EGAD00001010558	Targeted capture ctDNA Library CRCQV42Run033-20 from patient PBC0002826, buffy coat sample	Illumina MiSeq	1
EGAD00001010559	Targeted capture ctDNA Library CRCQV42Run033-22 from patient PBC0003385, buffy coat sample	Illumina MiSeq	1
EGAD00001010560	Targeted capture ctDNA Library CRCQV42Run033-23 from patient PBC0003595, buffy coat sample	Illumina MiSeq	1
EGAD00001010561	Targeted capture ctDNA Library CRCQV42Run033-5 from patient PBC0001845, buffy coat sample	Illumina MiSeq	1
EGAD00001010562	Targeted capture ctDNA Library CRCQV42Run033-6 from patient PBC0001859, buffy coat sample	Illumina MiSeq	1
EGAD00001010563	Targeted capture ctDNA Library CRCQV42Run033-7 from patient PBC0002062, buffy coat sample	Illumina MiSeq	1
EGAD00001010564	Targeted capture ctDNA Library CRCQV42Run033-9 from patient PBC0002108, buffy coat sample	Illumina MiSeq	1
EGAD00001010565	Targeted capture ctDNA Library CRCQV42Run034-10 from patient PBC0004076, buffy coat sample	Illumina MiSeq	1
EGAD00001010566	Targeted capture ctDNA Library CRCQV42Run034-12 from patient PBC0004173, buffy coat sample	Illumina MiSeq	1
EGAD00001010567	Targeted capture ctDNA Library CRCQV42Run034-13 from patient PBC0004350, buffy coat sample	Illumina MiSeq	1
EGAD00001010568	Targeted capture ctDNA Library CRCQV42Run034-14 from patient PBC0004414, buffy coat sample	Illumina MiSeq	1
EGAD00001010569	Targeted capture ctDNA Library CRCQV42Run034-15 from patient PBC0004565, buffy coat sample	Illumina MiSeq	1
EGAD00001010570	Targeted capture ctDNA Library CRCQV42Run034-18 from patient PBC0005064, buffy coat sample	Illumina MiSeq	1
EGAD00001010571	Targeted capture ctDNA Library CRCQV42Run034-20 from patient PBC0005116, buffy coat sample	Illumina MiSeq	1
EGAD00001010572	Targeted capture ctDNA Library CRCQV42Run034-24 from patient PBC0001396, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010573	Targeted capture ctDNA Library CRCQV42Run034-4 from patient PBC0002872, buffy coat sample	Illumina MiSeq	1
EGAD00001010574	Targeted capture ctDNA Library CRCQV42Run034-5 from patient PBC0003364, buffy coat sample	Illumina MiSeq	1
EGAD00001010575	Targeted capture ctDNA Library CRCQV42Run034-6 from patient PBC0003587, buffy coat sample	Illumina MiSeq	1
EGAD00001010576	Targeted capture ctDNA Library CRCQV42Run034-7 from patient PBC0003643, buffy coat sample	Illumina MiSeq	1
EGAD00001010577	Targeted capture ctDNA Library CRCQV42Run035-10 from patient PBC0002872, plasma 12 month sample	Illumina MiSeq	1
EGAD00001010578	Targeted capture ctDNA Library CRCQV42Run035-11 from patient PBC0001589, plasma 30 month sample	Illumina MiSeq	1
EGAD00001010579	Targeted capture ctDNA Library CRCQV42Run035-12 from patient PBC0004414, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010580	Targeted capture ctDNA Library CRCQV42Run035-13 from patient PBC0001859, plasma 12 month sample	Illumina MiSeq	1
EGAD00001010581	Targeted capture ctDNA Library CRCQV42Run035-14 from patient PBC0001859, plasma 18 month sample	Illumina MiSeq	1
EGAD00001010582	Targeted capture ctDNA Library CRCQV42Run035-15 from patient PBC0002744, plasma 15 month sample	Illumina MiSeq	1
EGAD00001010583	Targeted capture ctDNA Library CRCQV42Run035-18 from patient PBC0002853, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010584	Targeted capture ctDNA Library CRCQV42Run035-19 from patient PBC0002853, plasma 9 month sample	Illumina MiSeq	1
EGAD00001010585	Targeted capture ctDNA Library CRCQV42Run035-20 from patient PBC0003334, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010586	Targeted capture ctDNA Library CRCQV42Run035-21 from patient PBC0003595, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010587	Targeted capture ctDNA Library CRCQV42Run035-23 from patient PBC0004350, plasma 9 month sample	Illumina MiSeq	1
EGAD00001010588	Targeted capture ctDNA Library CRCQV42Run035-24 from patient PBC0004565, plasma 9 month sample	Illumina MiSeq	1
EGAD00001010589	Targeted capture ctDNA Library CRCQV42Run035-6 from patient PBC0002406, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010590	Targeted capture ctDNA Library CRCQV42Run035-7 from patient PBC0001353, plasma 36 month sample	Illumina MiSeq	1
EGAD00001010591	Targeted capture ctDNA Library CRCQV42Run035-8 from patient PBC0002533, plasma 9 month sample	Illumina MiSeq	1
EGAD00001010592	Targeted capture ctDNA Library CRCQV42Run036-10 from patient PBC0002062, plasma 21 month sample	Illumina MiSeq	1
EGAD00001010593	Targeted capture ctDNA Library CRCQV42Run036-11 from patient PBC0002826, plasma 15 month sample	Illumina MiSeq	1
EGAD00001010594	Targeted capture ctDNA Library CRCQV42Run036-12 from patient PBC0003587, plasma 12 month sample	Illumina MiSeq	1
EGAD00001010595	Targeted capture ctDNA Library CRCQV42Run036-13 from patient PBC0001323, plasma 36 month sample	Illumina MiSeq	1
EGAD00001010596	Targeted capture ctDNA Library CRCQV42Run036-14 from patient PBC0001653, plasma 24 month sample	Illumina MiSeq	1
EGAD00001010597	Targeted capture ctDNA Library CRCQV42Run036-17 from patient PBC0001299, plasma 36 month sample	Illumina MiSeq	1
EGAD00001010598	Targeted capture ctDNA Library CRCQV42Run036-19 from patient PBC0004076, plasma 9 month sample	Illumina MiSeq	1
EGAD00001010599	Targeted capture ctDNA Library CRCQV42Run036-20 from patient PBC0003364, plasma 15 month sample	Illumina MiSeq	1
EGAD00001010600	Targeted capture ctDNA Library CRCQV42Run036-21 from patient PBC0004173, plasma 12 month sample	Illumina MiSeq	1
EGAD00001010601	Targeted capture ctDNA Library CRCQV42Run036-23 from patient PBC0002406, plasma 21 month sample	Illumina MiSeq	1
EGAD00001010602	Targeted capture ctDNA Library CRCQV42Run036-24 from patient PBC0001328, plasma 36 month sample	Illumina MiSeq	1
EGAD00001010603	Targeted capture ctDNA Library CRCQV42Run036-5 from patient PBC0001470, plasma 30 month sample	Illumina MiSeq	1
EGAD00001010604	Targeted capture ctDNA Library CRCQV42Run036-6 from patient PBC0005064, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010605	Targeted capture ctDNA Library CRCQV42Run036-8 from patient PBC0003641, plasma 12 month sample	Illumina MiSeq	1
EGAD00001010606	Targeted capture ctDNA Library CRCQV42Run039-10 from patient PBC0001299, plasma 9 month sample	Illumina MiSeq	1
EGAD00001010607	Targeted capture ctDNA Library CRCQV42Run039-11 from patient PBC0001323, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010608	Targeted capture ctDNA Library CRCQV42Run039-12 from patient PBC0001328, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010609	Targeted capture ctDNA Library CRCQV42Run039-14 from patient PBC0001353, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010610	Targeted capture ctDNA Library CRCQV42Run039-15 from patient PBC0001432, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010611	Targeted capture ctDNA Library CRCQV42Run039-16 from patient PBC0001653, plasma 9 month sample	Illumina MiSeq	1
EGAD00001010612	Targeted capture ctDNA Library CRCQV42Run039-17 from patient PBC0001673, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010613	Targeted capture ctDNA Library CRCQV42Run039-18 from patient PBC0002108, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010614	Targeted capture ctDNA Library CRCQV42Run039-19 from patient PBC0002383, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010615	Targeted capture ctDNA Library CRCQV42Run039-20 from patient PBC0002429, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010616	Targeted capture ctDNA Library CRCQV42Run039-21 from patient PBC0003014, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010617	Targeted capture ctDNA Library CRCQV42Run039-22 from patient PBC0003364, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010618	Targeted capture ctDNA Library CRCQV42Run039-23 from patient PBC0003643, plasma 15 month sample	Illumina MiSeq	1
EGAD00001010619	Targeted capture ctDNA Library CRCQV42Run039-24 from patient PBC0001413, plasma baseline sample	Illumina MiSeq	1
EGAD00001010620	Targeted capture ctDNA Library CRCQV42Run039-4 from patient PBC0001859, plasma baseline sample	Illumina MiSeq	1
EGAD00001010621	Targeted capture ctDNA Library CRCQV42Run039-5 from patient PBC0002406, plasma baseline sample	Illumina MiSeq	1
EGAD00001010622	Targeted capture ctDNA Library CRCQV42Run039-6 from patient PBC0002853, plasma baseline sample	Illumina MiSeq	1
EGAD00001010623	Targeted capture ctDNA Library CRCQV42Run039-7 from patient PBC0002853, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010624	Targeted capture ctDNA Library CRCQV42Run039-8 from patient PBC0003334, plasma baseline sample	Illumina MiSeq	1
EGAD00001010625	Targeted capture ctDNA Library CRCQV42Run039-9 from patient PBC0001335, saliva, repeat sample	Illumina MiSeq	1
EGAD00001010626	Targeted capture ctDNA Library CRCQV42Run041-12 from patient PBC0001413, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010627	Targeted capture ctDNA Library CRCQV42Run041-13 from patient PBC0001413, saliva sample	Illumina MiSeq	1
EGAD00001010628	Targeted capture ctDNA Library CRCQV42Run041-17 from patient PBC0002406, plasma 9 month sample	Illumina MiSeq	1
EGAD00001010629	Targeted capture ctDNA Library CRCQV42Run041-20 from patient PBC0002744, plasma baseline sample	Illumina MiSeq	1
EGAD00001010630	Targeted capture ctDNA Library CRCQV42Run041-21 from patient PBC0001335, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010631	Targeted capture ctDNA Library CRCQV42Run041-22 from patient PBC0005064, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010632	Targeted capture ctDNA Library CRCQV42Run041-23 from patient PBC0002294, plasma 21 month sample	Illumina MiSeq	1
EGAD00001010633	Targeted capture ctDNA Library CRCQV42Run041-24 from patient PBC0005116, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010634	Targeted capture ctDNA Library CRCQV42Run041-4 from patient PBC0001375, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010635	Targeted capture ctDNA Library CRCQV42Run041-5 from patient PBC0001375, saliva sample	Illumina MiSeq	1
EGAD00001010636	Targeted capture ctDNA Library CRCQV42Run041-6 from patient PBC0001404, plasma baseline sample	Illumina MiSeq	1
EGAD00001010637	Targeted capture ctDNA Library CRCQV42Run041-7 from patient PBC0001404, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010638	Targeted capture ctDNA Library CRCQV42Run041-8 from patient PBC0001404, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010639	Targeted capture ctDNA Library CRCQV42Run041-9 from patient PBC0001404, saliva sample	Illumina MiSeq	1
EGAD00001010640	Targeted capture ctDNA Library CRCQV42Run043-10 from patient PBC0001051, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010641	Targeted capture ctDNA Library CRCQV42Run043-11 from patient PBC0001310, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010642	Targeted capture ctDNA Library CRCQV42Run043-12 from patient PBC0003385, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010643	Targeted capture ctDNA Library CRCQV42Run043-13 from patient PBC0001295, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010644	Targeted capture ctDNA Library CRCQV42Run043-15 from patient PBC0001470, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010645	Targeted capture ctDNA Library CRCQV42Run043-16 from patient PBC0002255, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010646	Targeted capture ctDNA Library CRCQV42Run043-17 from patient PBC0004350, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010647	Targeted capture ctDNA Library CRCQV42Run043-18 from patient PBC0002995, plasma baseline sample	Illumina MiSeq	1
EGAD00001010648	Targeted capture ctDNA Library CRCQV42Run043-19 from patient PBC0004537, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010649	Targeted capture ctDNA Library CRCQV42Run043-20 from patient PBC0003244, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010650	Targeted capture ctDNA Library CRCQV42Run043-21 from patient PBC0005013, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010651	Targeted capture ctDNA Library CRCQV42Run043-22 from patient PBC0003448, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010652	Targeted capture ctDNA Library CRCQV42Run043-23 from patient PBC0002480, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010653	Targeted capture ctDNA Library CRCQV42Run043-24 from patient PBC0002329, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010654	Targeted capture ctDNA Library CRCQV42Run043-4 from patient PBC0002294, plasma 15 month sample	Illumina MiSeq	1
EGAD00001010655	Targeted capture ctDNA Library CRCQV42Run043-5 from patient PBC0001516, plasma 15 month sample	Illumina MiSeq	1
EGAD00001010656	Targeted capture ctDNA Library CRCQV42Run043-6 from patient PBC0001255, plasma 15 month sample	Illumina MiSeq	1
EGAD00001010657	Targeted capture ctDNA Library CRCQV42Run043-7 from patient PBC0001589, plasma 12 month sample	Illumina MiSeq	1
EGAD00001010658	Targeted capture ctDNA Library CRCQV42Run043-8 from patient PBC0001304, plasma 9 month sample	Illumina MiSeq	1
EGAD00001010659	Targeted capture ctDNA Library CRCQV42Run043-9 from patient PBC0001306, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010660	Targeted capture ctDNA Library CRCQV42Run044-10 from patient PBC0001488, plasma baseline sample	Illumina MiSeq	1
EGAD00001010661	Targeted capture ctDNA Library CRCQV42Run044-11 from patient PBC0001515, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010662	Targeted capture ctDNA Library CRCQV42Run044-12 from patient PBC0001550, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010663	Targeted capture ctDNA Library CRCQV42Run044-13 from patient PBC0001665, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010664	Targeted capture ctDNA Library CRCQV42Run044-14 from patient PBC0001667, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010665	Targeted capture ctDNA Library CRCQV42Run044-15 from patient PBC0001818, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010666	Targeted capture ctDNA Library CRCQV42Run044-16 from patient PBC0005586, plasma 12 month sample	Illumina MiSeq	1
EGAD00001010667	Targeted capture ctDNA Library CRCQV42Run044-17 from patient PBC0005602, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010668	Targeted capture ctDNA Library CRCQV42Run044-18 from patient PBC0003376, plasma 12 month sample	Illumina MiSeq	1
EGAD00001010669	Targeted capture ctDNA Library CRCQV42Run044-19 from patient PBC0005963, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010670	Targeted capture ctDNA Library CRCQV42Run044-20 from patient PBC0005209, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010671	Targeted capture ctDNA Library CRCQV42Run044-21 from patient PBC0005373, plasma baseline sample	Illumina MiSeq	1
EGAD00001010672	Targeted capture ctDNA Library CRCQV42Run044-22 from patient PBC0006040, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010673	Targeted capture ctDNA Library CRCQV42Run044-23 from patient PBC0005498, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010674	Targeted capture ctDNA Library CRCQV42Run044-24 from patient PBC0005531, plasma 9 month sample	Illumina MiSeq	1
EGAD00001010675	Targeted capture ctDNA Library CRCQV42Run044-4 from patient PBC0001048, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010676	Targeted capture ctDNA Library CRCQV42Run044-5 from patient PBC0001099, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010677	Targeted capture ctDNA Library CRCQV42Run044-6 from patient PBC0001279, plasma 18 month sample	Illumina MiSeq	1
EGAD00001010678	Targeted capture ctDNA Library CRCQV42Run044-7 from patient PBC0001314, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010679	Targeted capture ctDNA Library CRCQV42Run044-8 from patient PBC0001376, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010680	Targeted capture ctDNA Library CRCQV42Run044-9 from patient PBC0001445, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010681	Targeted capture ctDNA Library CRCQV42Run045-10 from patient PBC0002989, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010682	Targeted capture ctDNA Library CRCQV42Run045-11 from patient PBC0004156, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010683	Targeted capture ctDNA Library CRCQV42Run045-12 from patient PBC0001068, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010684	Targeted capture ctDNA Library CRCQV42Run045-13 from patient PBC0001727, plasma baseline sample	Illumina MiSeq	1
EGAD00001010685	Targeted capture ctDNA Library CRCQV42Run045-14 from patient PBC0001468, plasma baseline sample	Illumina MiSeq	1
EGAD00001010686	Targeted capture ctDNA Library CRCQV42Run045-15 from patient PBC0001714, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010687	Targeted capture ctDNA Library CRCQV42Run045-16 from patient PBC0001782, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010688	Targeted capture ctDNA Library CRCQV42Run045-17 from patient PBC0001982, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010689	Targeted capture ctDNA Library CRCQV42Run045-18 from patient PBC0001810, plasma baseline sample	Illumina MiSeq	1
EGAD00001010690	Targeted capture ctDNA Library CRCQV42Run045-19 from patient PBC0001311, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010691	Targeted capture ctDNA Library CRCQV42Run045-20 from patient PBC0002317, plasma baseline sample	Illumina MiSeq	1
EGAD00001010692	Targeted capture ctDNA Library CRCQV42Run045-21 from patient PBC0002651, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010693	Targeted capture ctDNA Library CRCQV42Run045-22 from patient PBC0001456, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010694	Targeted capture ctDNA Library CRCQV42Run045-23 from patient PBC0002481, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010695	Targeted capture ctDNA Library CRCQV42Run045-24 from patient PBC0001409, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010696	Targeted capture ctDNA Library CRCQV42Run045-4 from patient PBC0002458, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010697	Targeted capture ctDNA Library CRCQV42Run045-5 from patient PBC0002851, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010698	Targeted capture ctDNA Library CRCQV42Run045-6 from patient PBC0004124, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010699	Targeted capture ctDNA Library CRCQV42Run045-7 from patient PBC0005444, plasma 18 month sample	Illumina MiSeq	1
EGAD00001010700	Targeted capture ctDNA Library CRCQV42Run045-8 from patient PBC0006360, plasma 12 month sample	Illumina MiSeq	1
EGAD00001010701	Targeted capture ctDNA Library CRCQV42Run046-10 from patient PBC0001552, plasma 12 month sample	Illumina MiSeq	1
EGAD00001010702	Targeted capture ctDNA Library CRCQV42Run046-11 from patient PBC0001284, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010703	Targeted capture ctDNA Library CRCQV42Run046-12 from patient PBC0001135, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010704	Targeted capture ctDNA Library CRCQV42Run046-13 from patient PBC0002625, plasma baseline sample	Illumina MiSeq	1
EGAD00001010705	Targeted capture ctDNA Library CRCQV42Run046-14 from patient PBC0001065, plasma baseline sample	Illumina MiSeq	1
EGAD00001010706	Targeted capture ctDNA Library CRCQV42Run046-15 from patient PBC0001183, plasma baseline sample	Illumina MiSeq	1
EGAD00001010707	Targeted capture ctDNA Library CRCQV42Run046-16 from patient PBC0001042, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010708	Targeted capture ctDNA Library CRCQV42Run046-17 from patient PBC0001329, plasma baseline sample	Illumina MiSeq	1
EGAD00001010709	Targeted capture ctDNA Library CRCQV42Run046-18 from patient PBC0001067, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010710	Targeted capture ctDNA Library CRCQV42Run046-19 from patient PBC0001527, plasma 9 month sample	Illumina MiSeq	1
EGAD00001010711	Targeted capture ctDNA Library CRCQV42Run046-20 from patient PBC0001176, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010712	Targeted capture ctDNA Library CRCQV42Run046-21 from patient PBC0001828, plasma 9 month sample	Illumina MiSeq	1
EGAD00001010713	Targeted capture ctDNA Library CRCQV42Run046-22 from patient PBC0001486, plasma baseline sample	Illumina MiSeq	1
EGAD00001010714	Targeted capture ctDNA Library CRCQV42Run046-23 from patient PBC0002622, plasma 9 month sample	Illumina MiSeq	1
EGAD00001010715	Targeted capture ctDNA Library CRCQV42Run046-24 from patient PBC0002494, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010716	Targeted capture ctDNA Library CRCQV42Run046-4 from patient PBC0001776, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010717	Targeted capture ctDNA Library CRCQV42Run046-5 from patient PBC0001242, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010718	Targeted capture ctDNA Library CRCQV42Run046-6 from patient PBC0001666, plasma 3 month sample	Illumina MiSeq	1
EGAD00001010719	Targeted capture ctDNA Library CRCQV42Run046-7 from patient PBC0001528, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010720	Targeted capture ctDNA Library CRCQV42Run046-8 from patient PBC0001134, plasma baseline sample	Illumina MiSeq	1
EGAD00001010721	Targeted capture ctDNA Library CRCQV42Run046-9 from patient PBC0002769, plasma 6 month sample	Illumina MiSeq	1
EGAD00001010722	Targeted capture ctDNA Library CRCQV42Run047-10 from patient PBC0001051, buffy coat sample	Illumina MiSeq	1
EGAD00001010723	Targeted capture ctDNA Library CRCQV42Run047-11 from patient PBC0001310, buffy coat sample	Illumina MiSeq	1
EGAD00001010724	Targeted capture ctDNA Library CRCQV42Run047-12 from patient PBC0003385, buffy coat sample	Illumina MiSeq	1
EGAD00001010725	Targeted capture ctDNA Library CRCQV42Run047-13 from patient PBC0001295, buffy coat sample	Illumina MiSeq	1
EGAD00001010726	Targeted capture ctDNA Library CRCQV42Run047-14 from patient PBC0001470, buffy coat sample	Illumina MiSeq	1
EGAD00001010727	Targeted capture ctDNA Library CRCQV42Run047-15 from patient PBC0002255, buffy coat sample	Illumina MiSeq	1
EGAD00001010728	Targeted capture ctDNA Library CRCQV42Run047-16 from patient PBC0004350, buffy coat sample	Illumina MiSeq	1
EGAD00001010729	Targeted capture ctDNA Library CRCQV42Run047-17 from patient PBC0002995, buffy coat sample	Illumina MiSeq	1
EGAD00001010730	Targeted capture ctDNA Library CRCQV42Run047-18 from patient PBC0002458, buffy coat sample	Illumina MiSeq	1
EGAD00001010731	Targeted capture ctDNA Library CRCQV42Run047-19 from patient PBC0002851, buffy coat sample	Illumina MiSeq	1
EGAD00001010732	Targeted capture ctDNA Library CRCQV42Run047-20 from patient PBC0004124, buffy coat sample	Illumina MiSeq	1
EGAD00001010733	Targeted capture ctDNA Library CRCQV42Run047-21 from patient PBC0005444, buffy coat sample	Illumina MiSeq	1
EGAD00001010734	Targeted capture ctDNA Library CRCQV42Run047-22 from patient PBC0006360, buffy coat sample	Illumina MiSeq	1
EGAD00001010735	Targeted capture ctDNA Library CRCQV42Run047-24 from patient PBC0002989, buffy coat sample	Illumina MiSeq	1
EGAD00001010736	Targeted capture ctDNA Library CRCQV42Run047-4 from patient PBC0002294, buffy coat sample	Illumina MiSeq	1
EGAD00001010737	Targeted capture ctDNA Library CRCQV42Run047-5 from patient PBC0001516, buffy coat sample	Illumina MiSeq	1
EGAD00001010738	Targeted capture ctDNA Library CRCQV42Run047-6 from patient PBC0001255, buffy coat sample	Illumina MiSeq	1
EGAD00001010739	Targeted capture ctDNA Library CRCQV42Run047-7 from patient PBC0001589, buffy coat sample	Illumina MiSeq	1
EGAD00001010740	Targeted capture ctDNA Library CRCQV42Run047-8 from patient PBC0001304, buffy coat sample	Illumina MiSeq	1
EGAD00001010741	Targeted capture ctDNA Library CRCQV42Run047-9 from patient PBC0001306, buffy coat sample	Illumina MiSeq	1
EGAD00001010742	Targeted capture ctDNA Library CRCQV42Run048-10 from patient PBC0001488, buffy coat sample	Illumina MiSeq	1
EGAD00001010743	Targeted capture ctDNA Library CRCQV42Run048-11 from patient PBC0001515, buffy coat sample	Illumina MiSeq	1
EGAD00001010744	Targeted capture ctDNA Library CRCQV42Run048-12 from patient PBC0001550, buffy coat sample	Illumina MiSeq	1
EGAD00001010745	Targeted capture ctDNA Library CRCQV42Run048-13 from patient PBC0001665, buffy coat sample	Illumina MiSeq	1
EGAD00001010746	Targeted capture ctDNA Library CRCQV42Run048-14 from patient PBC0001667, buffy coat sample	Illumina MiSeq	1
EGAD00001010747	Targeted capture ctDNA Library CRCQV42Run048-15 from patient PBC0001818, buffy coat sample	Illumina MiSeq	1
EGAD00001010748	Targeted capture ctDNA Library CRCQV42Run048-16 from patient PBC0005586, buffy coat sample	Illumina MiSeq	1
EGAD00001010749	Targeted capture ctDNA Library CRCQV42Run048-17 from patient PBC0005602, buffy coat sample	Illumina MiSeq	1
EGAD00001010750	Targeted capture ctDNA Library CRCQV42Run048-18 from patient PBC0003376, buffy coat sample	Illumina MiSeq	1
EGAD00001010751	Targeted capture ctDNA Library CRCQV42Run048-19 from patient PBC0005963, buffy coat sample	Illumina MiSeq	1
EGAD00001010752	Targeted capture ctDNA Library CRCQV42Run048-20 from patient PBC0005209, buffy coat sample	Illumina MiSeq	1
EGAD00001010753	Targeted capture ctDNA Library CRCQV42Run048-21 from patient PBC0005373, buffy coat sample	Illumina MiSeq	1
EGAD00001010754	Targeted capture ctDNA Library CRCQV42Run048-22 from patient PBC0006040, buffy coat sample	Illumina MiSeq	1
EGAD00001010755	Targeted capture ctDNA Library CRCQV42Run048-23 from patient PBC0005498, buffy coat sample	Illumina MiSeq	1
EGAD00001010756	Targeted capture ctDNA Library CRCQV42Run048-24 from patient PBC0005531, buffy coat sample	Illumina MiSeq	1
EGAD00001010757	Targeted capture ctDNA Library CRCQV42Run048-4 from patient PBC0001048, buffy coat sample	Illumina MiSeq	1
EGAD00001010758	Targeted capture ctDNA Library CRCQV42Run048-5 from patient PBC0001099, buffy coat sample	Illumina MiSeq	1
EGAD00001010759	Targeted capture ctDNA Library CRCQV42Run048-6 from patient PBC0001279, buffy coat sample	Illumina MiSeq	1
EGAD00001010760	Targeted capture ctDNA Library CRCQV42Run048-7 from patient PBC0001314, buffy coat sample	Illumina MiSeq	1
EGAD00001010761	Targeted capture ctDNA Library CRCQV42Run048-8 from patient PBC0001376, buffy coat sample	Illumina MiSeq	1
EGAD00001010762	Targeted capture ctDNA Library CRCQV42Run048-9 from patient PBC0001445, buffy coat sample	Illumina MiSeq	1
EGAD00001010763	Targeted capture ctDNA Library CRCQV42Run049-10 from patient PBC0004156, buffy coat sample	Illumina MiSeq	1
EGAD00001010764	Targeted capture ctDNA Library CRCQV42Run049-11 from patient PBC0001068, buffy coat sample	Illumina MiSeq	1
EGAD00001010765	Targeted capture ctDNA Library CRCQV42Run049-12 from patient PBC0001727, buffy coat sample	Illumina MiSeq	1
EGAD00001010766	Targeted capture ctDNA Library CRCQV42Run049-13 from patient PBC0001468, buffy coat sample	Illumina MiSeq	1
EGAD00001010767	Targeted capture ctDNA Library CRCQV42Run049-14 from patient PBC0001714, buffy coat sample	Illumina MiSeq	1
EGAD00001010768	Targeted capture ctDNA Library CRCQV42Run049-15 from patient PBC0001782, buffy coat sample	Illumina MiSeq	1
EGAD00001010769	Targeted capture ctDNA Library CRCQV42Run049-16 from patient PBC0001982, buffy coat sample	Illumina MiSeq	1
EGAD00001010770	Targeted capture ctDNA Library CRCQV42Run049-17 from patient PBC0001810, buffy coat sample	Illumina MiSeq	1
EGAD00001010771	Targeted capture ctDNA Library CRCQV42Run049-18 from patient PBC0001311, buffy coat sample	Illumina MiSeq	1
EGAD00001010772	Targeted capture ctDNA Library CRCQV42Run049-19 from patient PBC0002317, buffy coat sample	Illumina MiSeq	1
EGAD00001010773	Targeted capture ctDNA Library CRCQV42Run049-20 from patient PBC0002651, buffy coat sample	Illumina MiSeq	1
EGAD00001010774	Targeted capture ctDNA Library CRCQV42Run049-21 from patient PBC0001456, buffy coat sample	Illumina MiSeq	1
EGAD00001010775	Targeted capture ctDNA Library CRCQV42Run049-22 from patient PBC0002481, buffy coat sample	Illumina MiSeq	1
EGAD00001010776	Targeted capture ctDNA Library CRCQV42Run049-23 from patient PBC0001409, buffy coat sample	Illumina MiSeq	1
EGAD00001010777	Targeted capture ctDNA Library CRCQV42Run049-4 from patient PBC0004537, buffy coat sample	Illumina MiSeq	1
EGAD00001010778	Targeted capture ctDNA Library CRCQV42Run049-5 from patient PBC0003244, buffy coat sample	Illumina MiSeq	1
EGAD00001010779	Targeted capture ctDNA Library CRCQV42Run049-6 from patient PBC0005013, buffy coat sample	Illumina MiSeq	1
EGAD00001010780	Targeted capture ctDNA Library CRCQV42Run049-7 from patient PBC0003448, buffy coat sample	Illumina MiSeq	1
EGAD00001010781	Targeted capture ctDNA Library CRCQV42Run049-8 from patient PBC0002480, buffy coat sample	Illumina MiSeq	1
EGAD00001010782	Targeted capture ctDNA Library CRCQV42Run049-9 from patient PBC0002329, buffy coat sample	Illumina MiSeq	1
EGAD00001010783	Targeted capture ctDNA Library CRCQV42Run050-10 from patient PBC0001552, buffy coat sample	Illumina MiSeq	1
EGAD00001010784	Targeted capture ctDNA Library CRCQV42Run050-11 from patient PBC0001284, buffy coat sample	Illumina MiSeq	1
EGAD00001010785	Targeted capture ctDNA Library CRCQV42Run050-12 from patient PBC0001135, buffy coat sample	Illumina MiSeq	1
EGAD00001010786	Targeted capture ctDNA Library CRCQV42Run050-13 from patient PBC0002625, buffy coat sample	Illumina MiSeq	1
EGAD00001010787	Targeted capture ctDNA Library CRCQV42Run050-14 from patient PBC0001065, buffy coat sample	Illumina MiSeq	1
EGAD00001010788	Targeted capture ctDNA Library CRCQV42Run050-15 from patient PBC0001183, buffy coat sample	Illumina MiSeq	1
EGAD00001010789	Targeted capture ctDNA Library CRCQV42Run050-16 from patient PBC0001042, buffy coat sample	Illumina MiSeq	1
EGAD00001010790	Targeted capture ctDNA Library CRCQV42Run050-17 from patient PBC0001329, buffy coat sample	Illumina MiSeq	1
EGAD00001010791	Targeted capture ctDNA Library CRCQV42Run050-18 from patient PBC0001067, buffy coat sample	Illumina MiSeq	1
EGAD00001010792	Targeted capture ctDNA Library CRCQV42Run050-19 from patient PBC0001527, buffy coat sample	Illumina MiSeq	1
EGAD00001010793	Targeted capture ctDNA Library CRCQV42Run050-20 from patient PBC0001176, buffy coat sample	Illumina MiSeq	1
EGAD00001010794	Targeted capture ctDNA Library CRCQV42Run050-21 from patient PBC0001828, buffy coat sample	Illumina MiSeq	1
EGAD00001010795	Targeted capture ctDNA Library CRCQV42Run050-22 from patient PBC0001486, buffy coat sample	Illumina MiSeq	1
EGAD00001010796	Targeted capture ctDNA Library CRCQV42Run050-23 from patient PBC0002622, buffy coat sample	Illumina MiSeq	1
EGAD00001010797	Targeted capture ctDNA Library CRCQV42Run050-24 from patient PBC0002494, buffy coat sample	Illumina MiSeq	1
EGAD00001010798	Targeted capture ctDNA Library CRCQV42Run050-4 from patient PBC0001776, buffy coat sample	Illumina MiSeq	1
EGAD00001010799	Targeted capture ctDNA Library CRCQV42Run050-5 from patient PBC0001242, buffy coat sample	Illumina MiSeq	1
EGAD00001010800	Targeted capture ctDNA Library CRCQV42Run050-6 from patient PBC0001666, buffy coat sample	Illumina MiSeq	1
EGAD00001010801	Targeted capture ctDNA Library CRCQV42Run050-7 from patient PBC0001528, buffy coat sample	Illumina MiSeq	1
EGAD00001010802	Targeted capture ctDNA Library CRCQV42Run050-8 from patient PBC0001134, buffy coat sample	Illumina MiSeq	1
EGAD00001010803	Targeted capture ctDNA Library CRCQV42Run050-9 from patient PBC0002769, buffy coat sample	Illumina MiSeq	1
EGAD00001010804	Targeted capture ctDNA Library CRCQV42Run051-10 from patient PBC0001665, plasma sample	Illumina MiSeq	1
EGAD00001010805	Targeted capture ctDNA Library CRCQV42Run051-11 from patient PBC0002989, plasma sample	Illumina MiSeq	1
EGAD00001010806	Targeted capture ctDNA Library CRCQV42Run051-12 from patient PBC0001516, plasma sample	Illumina MiSeq	1
EGAD00001010807	Targeted capture ctDNA Library CRCQV42Run051-13 from patient PBC0001319, plasma sample	Illumina MiSeq	1
EGAD00001010808	Targeted capture ctDNA Library CRCQV42Run051-14 from patient PBC0002989, plasma sample	Illumina MiSeq	1
EGAD00001010809	Targeted capture ctDNA Library CRCQV42Run051-15 from patient PBC0001224, buffy coat sample	Illumina MiSeq	1
EGAD00001010810	Targeted capture ctDNA Library CRCQV42Run051-16 from patient PBC0002533, buffy coat sample	Illumina MiSeq	1
EGAD00001010811	Targeted capture ctDNA Library CRCQV42Run051-17 from patient PBC0002995, buffy coat sample	Illumina MiSeq	1
EGAD00001010812	Targeted capture ctDNA Library CRCQV42Run051-19 from patient PBC0002294, buffy coat sample	Illumina MiSeq	1
EGAD00001010813	Targeted capture ctDNA Library CRCQV42Run051-20 from patient PBC0003587, buffy coat sample	Illumina MiSeq	1
EGAD00001010814	Targeted capture ctDNA Library CRCQV42Run051-21 from patient PBC0001665, buffy coat sample	Illumina MiSeq	1
EGAD00001010815	Targeted capture ctDNA Library CRCQV42Run051-22 from patient PBC0002989, buffy coat sample	Illumina MiSeq	1
EGAD00001010816	Targeted capture ctDNA Library CRCQV42Run051-23 from patient PBC0001516, buffy coat sample	Illumina MiSeq	1
EGAD00001010817	Targeted capture ctDNA Library CRCQV42Run051-24 from patient PBC0001319, buffy coat sample	Illumina MiSeq	1
EGAD00001010818	Targeted capture ctDNA Library CRCQV42Run051-4 from patient PBC0001224, plasma sample	Illumina MiSeq	1
EGAD00001010819	Targeted capture ctDNA Library CRCQV42Run051-5 from patient PBC0002533, plasma sample	Illumina MiSeq	1
EGAD00001010820	Targeted capture ctDNA Library CRCQV42Run051-6 from patient PBC0002995, plasma sample	Illumina MiSeq	1
EGAD00001010821	Targeted capture ctDNA Library CRCQV42Run051-8 from patient PBC0002294, plasma sample	Illumina MiSeq	1
EGAD00001010822	Targeted capture ctDNA Library CRCQV42Run051-9 from patient PBC0003587, plasma sample	Illumina MiSeq	1
EGAD00001010823	Targeted capture ctDNA Library CRCQV42Run42-10 from patient PBC0003643, saliva sample	Illumina MiSeq	1
EGAD00001010824	Targeted capture ctDNA Library CRCQV42Run42-12 from patient PBC0001375, plasma baseline sample	Illumina MiSeq	1
EGAD00001010825	Targeted capture ctDNA Library CRCQV42Run42-13 from patient PBC0001467, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010826	Targeted capture ctDNA Library CRCQV42Run42-14 from patient PBC0001375, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010827	Targeted capture ctDNA Library CRCQV42Run42-16 from patient PBC0001413, TNBC FFPE sample	Illumina MiSeq	1
EGAD00001010828	Targeted capture ctDNA Library CRCQV42Run42-17 from patient PBC0001375, saliva sample	Illumina MiSeq	1
EGAD00001010829	Targeted capture ctDNA Library CRCQV42Run42-4 from patient PBC0001375, plasma baseline sample	Illumina MiSeq	1
EGAD00001010830	Targeted capture ctDNA Library CRCQV42Run42-6 from patient PBC0001224, plasma 9 month sample	Illumina MiSeq	1
EGAD00001010831	Targeted capture ctDNA Library CRCQV42Run42-7 from patient PBC0001299, plasma 15 month sample	Illumina MiSeq	1
EGAD00001010832	Targeted capture ctDNA Library CRCQV42Run42-8 from patient PBC0003385, plasma 15 month sample	Illumina MiSeq	1
EGAD00001010833	Paired WGA (whole-genome amplified) samples from 65 single cell-derived organoids and six fresh single cell-derived organoid samples both of which were established from normal mammary epithelial cells, 84 FFPE LCM (laser-capture microdissection) samples of breast cancer and related clones, 79 fresh-frozen LCM samples of non-cancer lobules of breast cancer patients, and 36 matched germline controls were subjected to whole genome sequencing using NovaSeq 6000 system (Illumina) or DNBSEQ-G400RS (MGI Tech).	unspecified	335
EGAD00001010834	RNAseq of pancreatic cancer organoids.	unspecified	74
EGAD00001010835	Whole-genome sequencing of pancreatic cancer organoids and matched germline controls.	Illumina NovaSeq 6000 unspecified	100
EGAD00001010836	Pilot study for genome-wide sequencing of cell-free DNA from nipple aspirate fluid and plasma of breast cancer patient	Illumina NovaSeq 6000 PromethION	8
EGAD00001010837	Total mononuclear cells (MNC) were isolated from peripheral blood (PB) or bone marrow (BM) samples of 105 Ph-/-/- (B-cell acute lymphoblastic leukemia [B-ALL] triple negative) and 31 Ph+ B-ALL adult patients using Lymphosep (Biowest, Nuaillé, France). A total of 15 samples from healthy subjects were processed, including hematopoietic stem-progenitor cells (CD34+) from bone marrow specimens (n = 3), bone marrow mononuclear cell samples (n = 3) from STEMCELL Technologies (Vancouver, Canada), PB MNC samples (n = 5) and cord blood samples (n = 4). CD34+ cells were enriched from cord blood samples by immunomagnetic separation (CD34 MicroBead Kit, Miltenyi Biotec, Bergisch Gladbach, Germany). Libraries were prepared using the TruSight RNA Pan-Cancer Panel Kit (Illumina, San Diego, California, USA), following the manufacturer’s protocol. Sequencing was performed using the Illumina MiSeq instrument.	Illumina MiSeq	1
EGAD00001010838	DEFLeCT metadata: 1) A table containing the mutational profile for all human genes across the cohort of the 81 PROMOLE lung cancer samples. 2) A table containing the expression levels for all human genes across the cohort of the 81 PROMOLE lung cancer samples. 3) A table containing the clinical, histology data and follow-ups of the cohort of 81 PROMOLE lung cancer patients.		1
EGAD00001010839	100 bp paired-end fastq RNAseq files for 25 sarcoma samples. RNAseq data from exon capture library prep.	Illumina NovaSeq 6000	25
EGAD00001010840	NGS profiling of the entire miRNome conducted on the plasma of 24 samples obtained from well-differentiated, advanced, metastatic and inoperable G1, G2 and G3 GEP-NET patients. Sequencing was performed on the NextSeq platform.	NextSeq 500	24
EGAD00001010841	Stranded RNA-seq libraries were prepared from 150 ng of mRNA using the TruSeq library kit (Illumina, San Diego, CA, USA). Libraries were sequenced on a NextSeq 2000 (Illumina) with 2x50 bp reads.	unspecified	12
EGAD00001010842	The mutational status of 121 genes recurrently altered in B-cell lymphoma was examined in 55 of 56 diagnostic and 10 of 12 relapse samples using a custom targeted NGS panel. Libraries were generated from 150 ng of DNA using molecular-barcoded library adapters (ThruPLEX Tag-seq kit; Takara) coupled with a custom hybridization capture-based method (SureSelectXT Target Enrichment System Capture strategy, Agilent Technologies). The quality of the libraries was determined using the Bioanalyzer high sensitivity DNA kit (Agilent) and quantified by PCR using the KAPA library quantification kit (KAPA Biosystems). Finally, the libraries were pooled and sequenced on the MiSeq instrument (Illumina).	Illumina MiSeq	65
EGAD00001010843	Single-cell RNA-sequencing of 8 patients, from primary and relapse tumour (total of 16 samples). Patients were treated with nivolumab prior to relapse surgery.	Illumina NovaSeq 6000	16
EGAD00001010844	Bulk RNA-sequencing of 30 relapse tumour samples. 20 patients were treated with nivolumab prior to relapse surgery, while 10 patients are control patients.	Illumina NovaSeq 6000	30
EGAD00001010845	Whole genome sequencing data of 36 high-grade serous carcinoma (HGSC) patients (89 samples) sequenced with HiSeq X Ten.	unspecified	77
EGAD00001010846	Dataset with 150 whole-exome sequences from Algerian Amazigh (Chaoui and Mozabite) and non-Amazigh samples.	unspecified	124
EGAD00001010847	WES sequencing of 23 samples from PCNSL tumor and blood control samples. Sequencing was performed on a NovaSeq 6000 and NextSeq 500 using Agilent SureSelectXT HS Human All Exon V7 and V8. Sequencing was always paired.	Illumina NovaSeq 6000 NextSeq 500	25
EGAD00001010848	The dataset contains 23 ovarian cancer and 2 healthy control urine cfDNA samples. Shallow WGS was performed on an Illumina Novaseq S4 PE150bp. Samples are provided as raw reads without any prior processing.	Illumina NovaSeq 6000	25
EGAD00001010849	Linked-read data from 25 medulloblastomas and their matching control. Dataset consists of 25 group 4 medulloblastomas (G4) as well as 2 sonic hedgehog medulloblastomas (SHH-MB) samples and 2 group 3 medulloblastomas. The data consists of BAM files generated by the LongRanger pipeline developed by 10x Genomics	HiSeq X Ten Illumina NovaSeq 6000	50
EGAD00001010850	RNA-Seq data from 12 medulloblastoma samples, all group 4 medulloblastomas. The data consists of BAM files aligned using STAR	unspecified	12
EGAD00001010851	Nanopore data from 3 medulloblastoma samples, of which 2 are tumor-normal pairs sequenced with the MinION and one is tumor-only data sequenced on the PromethION. The data consists of BAM files aligned using minimap2.	MinION PromethION	5
EGAD00001010852	PacBio data from 5 medulloblastoma tumor-normal pairs. The data consists of BAM files aligned using NGMLR	unspecified	10
EGAD00001010856	Sequence of breast cancer bone metastases PDX obtained by 2 targeted panels	Illumina HiSeq 2500	9
EGAD00001010860	Sequence of breast cancer bone metastases PDX obtained by 2 targeted panels	Illumina HiSeq 2500	9
EGAD00001010867	The Sys4MS cohort clinical data - 419 patients		419
EGAD00001010871	Genomic and epigenomic sequencing of 5 oesophageal adenocarcinomas with evidence of chromothripsis. Genomic sequencing includes: PacBio circular consensus sequencing, PacBio continuous long read sequencing, 10X linked read and Illumina HiSeq X Ten sequencing. Epigenomic sequencing includes: Hi-C chromosome capture, ATAC-seq, ChIP-seq (for H3K27ac, H3K4me3, H3K27me3 and CTCF) and long read RNA sequencing. All data types have the BAM files which have not undergone haplotype resolution (demarcated as unresolved) and some data types also have haplotype resolved reads (demarcated as resolved).		-
EGAD00001010872	Allogeneic haematopoietic cell transplantation (HCT) replaces the stem cells responsible for blood production with those harvested from a donor, and is received by 40,000 patients worldwide each year. To quantify dynamics of long-term stem cell engraftment, we sequenced whole genomes of 2,824 single-cell-derived haematopoietic colonies from blood samples of 10 donor-recipient pairs taken 9-31 years after HLA-matched sibling HCT. With younger donors, 10,000-50,000 stem cells had engrafted and were still contributing to haematopoiesis at time of sampling, but estimates were 10-fold lower with older donors. Engrafted stem cells made multilineage contributions to myeloid, B-lymphoid and T-lymphoid populations, although individual clones often showed biases towards one or other mature cell type. Recipients had lower clonal diversity than matched donors, equivalent to ~10-15 years of additional ageing, arising from up to 25-fold greater expansion of stem cell clones. An HCT-related population bottleneck alone could not explain these differences: instead, phylogenetic trees evinced two distinct modes of HCT-specific selection. In 'pruning selection', cell divisions underpinning recipient-enriched clonal expansions had occurred in the donor, preceding transplant - their selective advantage derived from preferential mobilisation, harvest, survival ex vivo or initial homing. In 'growth selection', cell divisions underpinning clonal expansion occurred through proliferative advantage in the recipient's marrow after homing - clones with multiple driver mutations especially demonstrated this pattern. Uprooting stem cells from their native environment and transplanting them to foreign soil exaggerates selective pressures, distorting and accelerating the loss of clonal diversity compared to the unperturbed haematopoiesis of donors.	Illumina HiSeq 2500 Illumina NovaSeq 6000	1
EGAD00001010874	Allogeneic haematopoietic cell transplantation (HCT) replaces the stem cells responsible for blood production with those harvested from a donor, and is received by 40,000 patients worldwide each year. To quantify dynamics of long-term stem cell engraftment, we sequenced whole genomes of 2,824 single-cell-derived haematopoietic colonies from blood samples of 10 donor-recipient pairs taken 9-31 years after HLA-matched sibling HCT. With younger donors, 10,000-50,000 stem cells had engrafted and were still contributing to haematopoiesis at time of sampling, but estimates were 10-fold lower with older donors. Engrafted stem cells made multilineage contributions to myeloid, B-lymphoid and T-lymphoid populations, although individual clones often showed biases towards one or other mature cell type. Recipients had lower clonal diversity than matched donors, equivalent to ~10-15 years of additional ageing, arising from up to 25-fold greater expansion of stem cell clones. An HCT-related population bottleneck alone could not explain these differences: instead, phylogenetic trees evinced two distinct modes of HCT-specific selection. In 'pruning selection', cell divisions underpinning recipient-enriched clonal expansions had occurred in the donor, preceding transplant - their selective advantage derived from preferential mobilisation, harvest, survival ex vivo or initial homing. In 'growth selection', cell divisions underpinning clonal expansion occurred through proliferative advantage in the recipient's marrow after homing - clones with multiple driver mutations especially demonstrated this pattern. Uprooting stem cells from their native environment and transplanting them to foreign soil exaggerates selective pressures, distorting and accelerating the loss of clonal diversity compared to the unperturbed haematopoiesis of donors.	Illumina NovaSeq 6000	-
EGAD00001010875	Whole exome sequencing of neoplastic colorectal lesions, matched normal mucosa and peripheral blood leucocytes from 7 individuals. Data is contained within FASTQ files.	Illumina HiSeq 2000	21
EGAD00001010876	Exome sequencing was performed on n=28 treatment-naïve esophageal adenocarcinomas (EACs). Three to four biopsies sampling different areas of each tumor were pooled before nucleic acid extractions to mitigate the elevated heterogeneity described for EAC. WES was performed on EAC biopsies at 120X average coverage, with autologous PBMCs used as germline controls at 80X average coverage. Libraries were prepared from 30 ng of input DNA using the SureSelect QXT Human All Exon V7 kit (Agilent Technologies) and sequenced on the NextSeq 550 (Illumina), 2x150 bp. BCL files were demultiplexed to FastQ files using bcl2fastq2 software (Illumina). Three paired-end sequencing batches were analyzed independently (Batch1: samples 8, 10, 11, 12, 15, 17, 18; Batch2: samples 20, 24, 25, 26, 27, 29, 30, 31, 33, 34; Batch3: samples 35, 37, 39, 40, 41, 43, 45, 48, 54, 55, 57). RNA sequencing was performed on n=26 treatment-naïve esophageal adenocarcinomas (EACs). Three to four biopsies sampling different areas of each tumor were pooled before nucleic acid extractions to mitigate the elevated heterogeneity described for EAC. RNAseq libraries were prepared from 50 ng of total RNA (with RNA integrity index RIN >=7) with the TruSeq Stranded mRNA library preparation kit (Illumina) in accordance with the low-throughput protocol. After PCR enrichment (15 cycles) and purification of adapter-ligated fragments, the concentration and length of DNA fragments were measured using the D1000 Screen Tape System (Agilent), obtaining a median insert size of 311 nucleotides. Then, RNAseq libraries were sequenced using the Illumina NovaSeq platform, 1x100 bp, obtaining on average 100 million single reads per sample.	Illumina NovaSeq 6000 NextSeq 550	166
EGAD00001010877	Dataset for the initial melanoma PEACE paper in Cancer Discovery, March 2023.	Illumina HiSeq 4000 NextSeq 500 NextSeq 550	894
EGAD00001010878	In this study a next-generation sequencing based method was applied to comprehensively screen for recurrent, disease-relevant copy number aberrations in a cohort of Hungarian patients. Diagnostic bone marrow samples from 260 children with B-cell acute lymphoblastic leukemia as well as 72 control samples were investigated by digital multiplex ligation-dependent probe amplification using the disease-specific D007 probemix. Whole chromosome gains and losses, as well as subchromosomal copy number aberrations were simultaneously profiled.	Illumina MiSeq	332
EGAD00001010879	MS Risk Gene RNAseq datasets of immune cell subsets (CD4, CD8, B cell, monocyte) from healthy controls and untreated MS cases.	Illumina HiSeq 2500 Illumina NovaSeq 6000	578
EGAD00001010880	The RRBS libraries of the genomic DNA from the 521 tissue samples were constructed following the standard RRBS protocol. 100-200 ng of intact genomic DNA in the volume of 21.5 µl was used as input material. Restriction digestion was done with 2.5 µl 10xCutSmart buffer and 1 µl MspI (NEB) for 18 h at 37 °C and 20 min at 65 °C. 0.5 µl 10xCutSmart buffer, 0.3 µl dACGTP mixture (100 mM dATP, 10 mM dCTP, 10 mM dGTP), 1 µl Klenow (exo-, 5U/µl, NEB) and 2.6 µl RT-PCR water, 0.6 µl 50 mM DTT (ThermoFisher) was added to the mixture for end repair and A-overhang addition with the program 30 °C for 20 min, 37 °C for 1 h and 75 °C for 20 min. Adapter ligation was then performed with 1 µl 10xThermoFisher HC T4 ligase buffer, 0.4 µl 100 mM ATP (ThermoFisher), 0.2 µl 50 mM DTT, 1 µl ThermoFisher HC T4 DNA ligase (30 Weiss Unit/µl), 30 ng home-made duplex UMI adapter with all the cytosines methylated (protocol adopted from Kennedy et al.) at 16 °C for 20 h and 65 °C for 20 min. Bisulfite conversion of the adapter-ligated product was carried out with QIAGEN EpiTect plus DNA bisulfite kit following their protocol for two rounds of conversion. The converted product was purified with Qiagen MinElute spin column and eluted with 20 µl RT-PCR water. PCR amplification was done using the NEBNext Multiplex Oligos for Illumina (2.5 µl of universal and index primer each) and 25 µl KAPA HiFi HotStart Uracil+ ReadyMix (Roche) with the following cycling conditions: 98 °C for 45 s, 9 cycles of 98 °C for 15 s, 60 °C for 30 s and 72 °C for 30 s, followed by a final extension at 72 °C for 5 min. The PCR product was purified with 1x AmpureXP beads and eluted with 30 µl EB buffer. DNA concentration was measured by Qubit 1xdsDNA HS assay. 5% TBE-UREA PAGE and bioanalyzer assay was performed as quality control on each library before sequencing.	Illumina NovaSeq X	521
EGAD00001010881	The cfMethyl-Seq libraries of the serial plasma cfDNA samples from the four NSCLC patients were constructed following the standard protocol. 10 ng of cfDNA in the volume of 25 µl was used as input material. 5’-end dephosphorylation was done with 3 µl 10xCutSmart buffer and 2 µl quick CIP from NEB (Ipswich, MA) at 37 °C for 30 min then heat-inactivated at 80 °C for 5 min. The 3’-end blocking was done with 0.5 µl 10xCutSmart buffer, 3 µl 2.5 mM CoCl2, 1 µl terminal transferase (all from NEB), and 0.5 µl 1 mM ddGTP at 37 °C for 2 h followed by 75 °C for 20 min. The mixture was then purified with 2x AmpureXP beads (Beckman Coulter, Indianapolis, IN) and eluted in 21.5 µl RT-PCR grade water (Thermo-Fisher, Waltham, MA). Restriction digestion was done with 2.5 µl 10xCutSmart buffer and 1 µl MspI (NEB) for 18 h at 37 °C and 20 min at 65 °C. 0.5 µl 10xCutSmart buffer, 0.3 µl dACGTP mixture (100 mM dATP, 10 mM dCTP, 10 mM dGTP), 1 µl Klenow (exo-, 5U/µl, NEB) and 2.6 µl RT-PCR water, 0.6 µl 50 mM DTT (ThermoFisher) was added to the mixture for end repair and A-overhang addition with the program 30 °C for 20 min, 37 °C for 1 h and 75 °C for 20 min. Adapter ligation was then performed with 1 µl 10xThermoFisher HC T4 ligase buffer, 0.4 µl 100 mM ATP (ThermoFisher), 0.2 µl 50 mM DTT, 1 µl ThermoFisher HC T4 DNA ligase (30 Weiss Unit/µl), 5 ng home-made duplex UMI adapter with all the cytosines methylated (protocol adopted from Kennedy et al.) at 16 °C for 20 h and 65 °C for 20 min. Bisulfite conversion of the adapter-ligated product was carried out with QIAGEN EpiTect plus DNA bisulfite kit following their protocol for two rounds of conversion. The converted product was purified with Qiagen MinElute spin column and eluted with 20 µl RT-PCR water. PCR amplification was done using the NEBNext Multiplex Oligos for Illumina (2.5 µl of universal and index primer each) and 25 µl KAPA HiFi HotStart Uracil+ ReadyMix (Roche) with the following cycling conditions: 98 °C for 45 s, 15 cycles of 98 °C for 15 s, 60 °C for 30 s and 72 °C for 30 s, followed by a final extension at 72 °C for 5 min. The PCR product was purified with 1x AmpureXP beads and eluted with 30 µl EB buffer. DNA concentration was measured by Qubit 1xdsDNA HS assay. 5% TBE-UREA PAGE and bioanalyzer assay was performed as quality control on each library before sequencing.	Illumina HiSeq X	12
EGAD00001010883	130 runs/ 65 samples of paired RNA-Seq data of chemo-naïve and post-chemotherapy pancreatic ductal adenocarcinoma (PDAC). The sequencing was done on Novaseq 6000 with Illumina TruSeq stranded mRNA Kit.	Illumina NovaSeq 6000	65
EGAD00001010884	Paired RNA-Seq of 32 samples of chemo-naïve and post-chemotherapy PDAC tumors (HIPO_015) to define the molecular and cellular impact of neoadjuvant chemotherapy. Transcriptome analysis combined with high resolution mapping of whole tissue sections identified GATA6 (Classical), KRT17 (Basal-like) and Cytochrome P450 3A (CYP3A) co-expressing cells that were preferentially enriched in post-CTX resected samples. The sequencing was done on HiSeq2000/HiSeq2500 using the Takara_SMARTer_Ultra_Low_Input_RNA_and_NEBNext_ChIP-Seq Kit.	Illumina HiSeq 2000 Illumina HiSeq 2500	32
EGAD00001010887	Reninomas are exceedingly rare renin-secreting kidney tumours that derive from juxtaglomerular cells, specialised smooth muscle cells that reside at the vascular inlet of glomeruli. They are the central component of the juxtaglomerular apparatus which controls systemic blood pressure through the secretion of renin. We assessed somatic changes in reninoma and found structural variants that generate canonical activating rearrangements of NOTCH1, whilst removing its negative regulator, NRARP. Accordingly, in single reninoma nuclei we observed excessive renin and NOTCH1 signalling mRNAs, with a concomitant non-excess of NRARP expression. Re-analysis of previously published reninoma bulk transcriptomes further corroborates our observation of dysregulated Notch pathway signalling in reninoma. Our findings reveal NOTCH1 rearrangements in reninoma, therapeutically targetable through existing NOTCH1 inhibitors, and indicate that unscheduled Notch signalling may be a disease-defining feature of reninoma.	Illumina NovaSeq 6000	8
EGAD00001010888	Reninomas are exceedingly rare renin-secreting kidney tumours that derive from juxtaglomerular cells, specialised smooth muscle cells that reside at the vascular inlet of glomeruli. They are the central component of the juxtaglomerular apparatus which controls systemic blood pressure through the secretion of renin. We assessed somatic changes in reninoma and found structural variants that generate canonical activating rearrangements of NOTCH1, whilst removing its negative regulator, NRARP. Accordingly, in single reninoma nuclei we observed excessive renin and NOTCH1 signalling mRNAs, with a concomitant non-excess of NRARP expression. Re-analysis of previously published reninoma bulk transcriptomes further corroborates our observation of dysregulated Notch pathway signalling in reninoma. Our findings reveal NOTCH1 rearrangements in reninoma, therapeutically targetable through existing NOTCH1 inhibitors, and indicate that unscheduled Notch signalling may be a disease-defining feature of reninoma.	Illumina NovaSeq 6000	3
EGAD00001010889	Reninomas are exceedingly rare renin-secreting kidney tumours that derive from juxtaglomerular cells, specialised smooth muscle cells that reside at the vascular inlet of glomeruli. They are the central component of the juxtaglomerular apparatus which controls systemic blood pressure through the secretion of renin. We assessed somatic changes in reninoma and found structural variants that generate canonical activating rearrangements of NOTCH1, whilst removing its negative regulator, NRARP. Accordingly, in single reninoma nuclei we observed excessive renin and NOTCH1 signalling mRNAs, with a concomitant non-excess of NRARP expression. Re-analysis of previously published reninoma bulk transcriptomes further corroborates our observation of dysregulated Notch pathway signalling in reninoma. Our findings reveal NOTCH1 rearrangements in reninoma, therapeutically targetable through existing NOTCH1 inhibitors, and indicate that unscheduled Notch signalling may be a disease-defining feature of reninoma.	Illumina HiSeq 4000	6
EGAD00001010890	The introduction of bowel cancer screening has led to a significant increase in the proportion of patients being diagnosed with asymptomatic, early-stage colorectal cancer (CRC). Although the majority of these patients are successfully treated with surgery alone, a small proportion of patients have 'born-to-be-bad' aggressive lesions with early dissemination leading to distant metastases. Current standard of care histological assessment is unable to distinguish between these aggressive versus non-aggressive early lesions which is essential to provide appropriate clinical management decisions. This study aims to carry out molecular and histological profiling of approximately 300 T1 CRCs in order to develop a molecular stratifier based on the risk of relapse in early-invasive CRC. This novel T1 cohort will represent the world's largest molecularly characterised T1 cohort of samples, with digital pathology assessment alongside whole exome sequencing, copy number variation analysis and 3' RNA-seq. This data will be used to generate a robust panel of molecular and/or histological markers applicable to formalin-fixed paraffin embedded (FFPE) archival tissue which discriminates between T1 lesions based on risk of relapse, which will ultimately be used to inform clinical management of CRC at the earliest stages of the disease.	Illumina NovaSeq 6000	255
EGAD00001010891	The dataset for Single molecule genome-wide mutation profiles of cell-free DNA for non-invasive detection of cancer includes 57 BAM files from whole genome next-generation sequencing on the Illumina HiSeq2500. The samples analyzed include plasma samples from individuals with and without cancer.	Illumina HiSeq 2500	57
EGAD00001010892	Hybrid capture sequencing was performed on 3 purified Hodgkin and Reed-Sternberg (HRS) cell samples. In brief, probes for 177 genes were designed and synthesized by Twist Bioscience. Hybridization capture of DNA libraries was performed using Twist Hybridization and Wash Kit (Twist Bioscience). The captured library was measured using Agilent Bioanalyzer High Sensitivity chip and Qubit dsDNA HS Assay Kit and run on Illumina Nextseq550. The BAM files were generated from the raw sequencing data using Cell Ranger (v6.0.2) mkfastq and count commands.	NextSeq 550	4
EGAD00001010893	Fastq.gz files for mRNA sequenced from Mtb infected and uninfected neutrophils after 1 and 6 hrs. Samples were sequenced in 2 batches as indicated per experiment. Batch 1 was SE unstranded, 100bp on an Illumina HiSeq4000 sequencer and batch 2 unstranded, 150bp paired-end on an Illumina NovaSeq6000 sequencer. Phenotypic data for the samples is also included.	Illumina HiSeq 4000 Illumina NovaSeq 6000	1
EGAD00001010894	This data set includes RNAseq from 77 follicular lymphoma tumours. All tumours were fresh frozen. Libraries were constructed by enriching for poly-A transcripts and sequenced as 75bp paired end reads on an Illumina HiSeq 2500 instrument.	Illumina HiSeq 2500	-
EGAD00001010895	Transcriptome analysis of NK cells sorted from PBMCs at baseline and after addition of a CD20 (B cell)-targeted T cell dependent bispecific antibody (TDB)	Illumina NovaSeq 6000	1
EGAD00001010896	ChIP-seq has been performed on 4 healthy and 5 tumor fresh-frozen endometrial tissues from post-menopausal patients. Immunoprecipitation has been performed for H3K27ac and ERa. Raw single-end fastq data have been aligned with bwa-mem using the Hg19 genome assembly as reference. Alignment bam files are provided.	Illumina HiSeq 2500	21
EGAD00001010897	4C-seq performed on 10 slices (30um thick) of fresh frozen endometrial tissues. These tissues include 2 healthy tissues and 4 tumor tissues (post-menopausal patients) in replicate. The library has been prepared using DpnII (primary) and NlaIII (secondary) restriction enzymes. Raw single-end fastq.gz files are provided.	Illumina MiSeq	11
EGAD00001010898	Hi-C libraries have been prepared by enzymatic digestion with MboI restriction enzyme and sonication. Illumina single-indexing primers have been used to amplify ligated fragments. Hi-C experiments have been performed on 10 slices (30um thick) of fresh-frozen tissues derived from 3 healthy and 3 tumor endometrial tissues of post-menopausal patients. Raw paired-end fastq.gz files are provided.	NextSeq 500	6
EGAD00001010899	nNGM analysis of treatment-naive MIBC (N=49) and NMIBC (N=16).	Illumina MiSeq	68
EGAD00001010900	This dataset contains genome-wide array data from Amazigh (Chaoui and Mozabite) and non-Amazigh Algerian individuals. Chaoui were sampled in Oum El Bouaghi (n=47), Batna (n=46), and Khenchela (n=37). Mozabite were sampled in Ghardaïa (n=14). Non-Imazighen were sampled in Algiers (n=34).		1
EGAD00001010901	We profiled 7 patient neuroblastoma-FOXR2 tumor samples by bulk RNA-seq. The raw fastqs are provided.	Illumina HiSeq 4000 unspecified	7
EGAD00001010902	We profiled 1 patient neuroblastoma-FOXR2 tumor sample by single-cell multiome RNA + ATAC. The raw fastqs are provided.	Illumina NovaSeq 6000	1
EGAD00001010903	We profiled 4 patient neuroblastoma-FOXR2 tumor samples by single-nuclei RNA-seq using 10X Chromium 3'. The raw fastqs are provided.	Illumina NovaSeq 6000	4
EGAD00001010904	Short read whole genome sequencing analysis of the off-target effects after prime editing in iPSC line KCNQ2 R201C. Comparison of parental KCNQ2 R201C with two corrected clonal lines (3 samples in total). Dataset contains CRAM files and VCF files for the respective samples.	unspecified	3
EGAD00001010905	Total RNA sequencing (SMARTer Stranded Total RNA-Seq Kit v2) data of extracellular RNA (exRNA) from liquid biopsies of neuroblastoma xenograft models.	Illumina NovaSeq 6000 NextSeq 500	67
EGAD00001010906	Human data for transcriptome (bulk RNA-Seq) in eight B-cell precursors: HSC, CLP, pro-B, pre-B, Immature B, Transitional B, Naive B CD5-, Naive B CD5+ cells	NextSeq 500	79
EGAD00001010907	Human data for transcriptome (scRNA-Seq) in CD34+ B cell precursors.	NextSeq 500	1
EGAD00001010908	Human data for chromatin accessibility (ATAC-Seq) in eight B-cell precursors: HSC, CLP, pro-B, pre-B, Immature B, Transitional B, Naive B CD5-, Naive B CD5+ cells	NextSeq 500	78
EGAD00001010909	Human data for chromatin accessibility (ATAC-Seq) in eight B-cell precursors: HSC, CLP, pro-B, pre-B, Immature B, Transitional B, Naive B CD5-, Naive B CD5+ cells	Illumina HiSeq 2500	78
EGAD00001010910	Human data for chromatin accessibility (scATAC-Seq) in CD34+ B cell precursors.	NextSeq 500	1
EGAD00001010911	10 samples sequenced in Target-sequencing of a panel of 571 genes (Illumina NovaSeq 6000) - Raw FASTQ data - Annotated VCF	Illumina NovaSeq 6000	10
EGAD00001010912	15 samples sequenced in RNA-seq. This dataset contains their raw FASTQ data and the raw count table and the TPM count table.	Illumina NovaSeq 6000	15
EGAD00001010913	While gene therapy (GT) provides a potentially curative treatment option for patients with sickle cell disease (SCD), the occurrence of myeloid malignancies in clinical trials has prompted concern. To interrogate potential mechanisms underlying increased cancer risk, we used hematopoietic stem cell (HSC) clonal tracking by whole genome sequencing (WGS) to map the somatic mutation and clonal landscape of 2,592 gene modified as well as unmodified single stem and progenitor cells from six SCD patients undergoing gene therapy (7-26 years old, average 12.7× depth). Pre-GT phylogenetic trees in SCD were highly polyclonal and mutation burdens per cell were elevated in some, but not all, patients. Post-GT, no clonal expansions were identified. However, an increased frequency of driver mutations associated with myeloid neoplasms or clonal hematopoiesis (DNMT3A- and EZH2-mutated clones in particular) was seen in both genetically modified and unmodified cells, suggesting positive selection of mutant clones during gene therapy. This work sheds light on the mutation landscape and HSC clonal dynamics in gene therapy for SCD and highlights enhanced fitness of some HSCs harboring pre-existing driver mutations following gene therapy. Future studies should define the long-term fate of mutant clones including any contribution to expansions associated with myeloid neoplasms.	Illumina NovaSeq 6000	3394
EGAD00001010914	While gene therapy (GT) provides a potentially curative treatment option for patients with sickle cell disease (SCD), the occurrence of myeloid malignancies in clinical trials has prompted concern. To interrogate potential mechanisms underlying increased cancer risk, we used hematopoietic stem cell (HSC) clonal tracking by whole genome sequencing (WGS) to map the somatic mutation and clonal landscape of 2,592 gene modified as well as unmodified single stem and progenitor cells from six SCD patients undergoing gene therapy (7-26 years old, average 12.7× depth). Pre-GT phylogenetic trees in SCD were highly polyclonal and mutation burdens per cell were elevated in some, but not all, patients. Post-GT, no clonal expansions were identified. However, an increased frequency of driver mutations associated with myeloid neoplasms or clonal hematopoiesis (DNMT3A- and EZH2-mutated clones in particular) was seen in both genetically modified and unmodified cells, suggesting positive selection of mutant clones during gene therapy. This work sheds light on the mutation landscape and HSC clonal dynamics in gene therapy for SCD and highlights enhanced fitness of some HSCs harboring pre-existing driver mutations following gene therapy. Future studies should define the long-term fate of mutant clones including any contribution to expansions associated with myeloid neoplasms.		24
EGAD00001010915	Transcriptome sequencing of three normal skeletal muscle and 15 pleomorphic rhabdomyosarcoma patient tumors.	NextSeq 500	20
EGAD00001010917	To define a transcriptomic reference of human B lymphopoiesis, bone marrow aspirates were obtained from n=4 healthy donors (study registration DRKS00023583). After immunodensity cell separation, samples were FACS-sorted into 7 established lymphopoietic differentiation stages. RNA was extracted from 5,000-320,000 cells per differentiation stage and subjected to ultra-low-input RNA sequencing after generation of stranded sequencing libraries.	Illumina NovaSeq 6000	28
EGAD00001010918	BAM files from capture-sequencing dataset described in Veilleux et al.	NextSeq 500	158
EGAD00001010919	Using RNAseq, we compare the 15% of poorest responders (PRs, n=177) as measured by proportional Ki67 changes after 2 weeks of neoadjuvant aromatase inhibitors to good responders (GRs, n=190) selected from the top 50% responders in the POETIC trial and matched for baseline Ki67 categories. In the POETIC trial, 4,480 postmenopausal women with primary ER+ BC were randomised 2:1 to receive either treatment with a non-steroidal AI (letrozole or anastrozole) for 2 weeks before and 2 weeks after surgery or to no perisurgical treatment. Only AI-treated patients with HER2- tumors, paired baseline and surgery Ki67 available, and baseline Ki67 immunohistochemistry (IHC) >10% (to minimise imprecision in proportional Ki67 falls) were included for selection. Data is baseline RNAseq and targeted exome DNA sequencing analysis of POETIC Good/Poor Responders to aromatase inhibitors based on change in Ki67.	Illumina NovaSeq 6000	365
EGAD00001010920	Transcriptome profiling of 121 high-risk paediatric cancer samples for identifying T-cell infiltration signatures using poly-A capture by Truseq and sequenced on NextSeq 500	Illumina NovaSeq 6000	240
EGAD00001010921	This dataset contains the Visium Spatial Transcriptomics data from treatment naive melanoma lymph node metastatic samples.	unspecified	6
EGAD00001010922	Phenotype data for Lassa Fever cases and population controls from Nigeria and Sierra Leone associated with genotype data generated using Illumina Omni 2.5M and 5M.		2667
EGAD00001010923	Phenotype data for Lassa Fever cases and population controls from Nigeria and Sierra Leone associated with genotype data generated using Illumina H3Africa array version 1.		1345
EGAD00001010924	Mesothelioma is an aggressive cancer associated with previous exposure to asbestos and dismal prognosis. Since a pemetrexed/cisplatin combination was introduced for treatment of mesothelioma, no new first- or second-line therapies have been discovered. Thus, to better understand what drives mesothelioma carcinogenesis and to identify potential targets for therapy, in this project we aim at performing RNAseq analysis of a panel of mesothelioma cells lines.	Illumina HiSeq 4000	5
EGAD00001010925	This dataset includes metagenomic sequencing of faecal samples from 7,190 Israeli individuals. Single-end sequencing was performed using a NovaSeq sequencing platform (Illumina).	Illumina NovaSeq 6000	7190
EGAD00001010926	Illumina NovaSeq 6000 30x WGS of 26 samples, each with up to 5 matched timepoints. Timepoints A, B, C, D, and E correspond to Pretreatment, Week 3, Week 6, Week 9, and Week 12 after treatment, respectively. Additional sample metadata (sample recurrence, treatment course, age, sex, comorbidities, etc.) are present in sample description.	Illumina HiSeq 4000 Illumina NovaSeq 6000	282
EGAD00001010927	This dataset contains raw sequencing files associated with the paper "Neutrophils and emergency granulopoiesis drive immune suppression and an extreme response endotype during sepsis" (https://doi.org/10.1038/s41590-023-01490-5). It is composed of two experiments, which are as follows: 1. Single-cell profiling of whole blood leukocytes: This experiment comprises sequencing files generated using the BD Rhapsody single-cell multi-omics profiling (RNA + protein) platform. This platform was used to profile the whole blood leukocyte population in a cohort of sepsis patients, cardiac surgery controls and healthy controls. There are 48 samples and four files per sample: two paired-end FASTQ files (R1 and R2) corresponding to the RNA-seq library, and two paired-end FASTQ files (R1 and R2) corresponding to the protein profiling (ADT-based AbSeq) library. 2. Single-cell profiling of circulating HSPCs: This experiment comprises sequencing files generated using the 10X single-cell multi-omics (RNA-seq + ATAC-seq) profiling platform. This platform was used to profile circulating HSPCs in a cohort of sepsis patients and healthy controls. There are 5 samples (or plexes), each of which consists of a pool of individuals for whom cells were multiplexed and sequenced as a single sample. There are four files per sample: two paired-end FASTQ files (R1 and R2) corresponding to the RNA-seq library, and two paired-end FASTQ files (R1 and R2) corresponding to the ATAC-seq library. Because these samples consist of multiplexed pools, index files are also provided. These can be used for sample deconvolution using the Cell Ranger pipeline. NOTE: Due to EGA's constraints on the number of files permitted per sample, index files (I1 and I2) for the HSPC data set are provided as a separate experiment named "FASTQ index files (I1 and I2) for deconvolution of 10X single-cell multiomics libraries".	Illumina NovaSeq 6000	106
EGAD00001010928	Sequencing libraries were constructed according to standard procedures from 600 ng of tumor and paired constitutional DNA. WES was captured using Agilent SureSelect V5 (50 Mb), Clinical Research Exome (54 Mb) kit, SureSelect XT human All exon CRE version 1 or 2, or Twist Human Core Exome Enrichment System. Sequencing of subsequent libraries was performed using Illumina sequencers (Next-Seq 500 or Hiseq 2000/2500/4000) in 75 bp paired-end mode, aiming for a mean depth of coverage of 100x.	Illumina HiSeq 4000	1376
EGAD00001010931	The dataset for Detecting Liver Cancer Using Cell-Free DNA Fragmentomes includes 444 BAM files from whole genome next-generation sequencing on the Illumina NovaSeq 6000. The samples analyzed include plasma samples from individuals with and without cancer.	Illumina NovaSeq 6000	444
EGAD00001010932	Total RNA paired-end sequencing was performed on whole blood samples from 74 Lupus nephritis (LN) patients and 20 healthy controls using Illumina NovaSeq 6000.	Illumina NovaSeq 6000	84
EGAD00001010934	The dataset consists of targeted sequencing data obtained from an independent cohort of 11 patients to validate discriminative DMRs (differentially methylated regions) for Acute coronary syndrome (ACS) subtypes. The cohort includes 2 healthy subjects, 4 STEMI (ST-segment elevation myocardial infarction), 3 NSTEMI (non-ST-segment elevation myocardial infarction), and 2 UA (unstable angina) patients. The sequencing panel targeted 18,831 CpGs for analysis, reaching at least 5 reads coverage for 75% of the targeted CpGs in each sample. The sequencing was performed using the NEBNext Enzymatic Methyl-seq Module and the Nonacus Cell3™ Target: Library Preparation kit, followed by probe hybridization, capture enrichment, and sequencing on the Novaseq 6000 platform, generating 400 million reads. The dataset is in raw fastq format.	Illumina NovaSeq 6000	2
EGAD00001010935	The dataset contains valuable genomic data from a discovery cohort consisting of 29 individuals. This cohort is comprised of 8 healthy individuals (control), 8 patients with ST-segment elevation myocardial infarction (STEMI), 7 patients with non-ST-segment elevation myocardial infarction (NSTEMI), and 6 patients with unstable angina (UA). The genomic data was obtained by isolating cell-free circulating DNA (ccfDNA) and subjecting it to bisulfite conversion using a low-input BS-seq (PBAT) protocol. Sequencing was done on the Novaseq 6000 platform.	Illumina NovaSeq 6000	16
EGAD00001010936	The objective of the colonoscopy study is to carry out 16S sequencing of colon biopsies and faecal samples provided in the ExHiBITT study to compare potential fluctuations in the microbiota of different sites.	Illumina MiSeq	1021
EGAD00001010937	Low-coverage whole genome methylation sequencing of cell-free DNA (cfDNA) from healthy volunteers (n=2) and allograft transplant recipients (n=11). The cfDNA was extracted from urine and plasma and sequenced using both a single- and double-strand library preparation method (n=15 and n=18).	Illumina NovaSeq 6000	33
EGAD00001010938	Genome and transcriptome sequence data from a myoepithelioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010939	Genome and transcriptome sequence data from a leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010940	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010941	Genome and transcriptome sequence data from a squamous cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010942	Genome and transcriptome sequence data from a non-small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010943	Genome and transcriptome sequence data from a melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010944	Genome and transcriptome sequence data from a carcinoma of unknown primary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010945	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010946	Genome and transcriptome sequence data from a breast ductal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010947	Genome and transcriptome sequence data from a cervical cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010948	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010949	Genome and transcriptome sequence data from an endometrial adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010950	Genome and transcriptome sequence data from a breast ductal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010951	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010952	Genome and transcriptome sequence data from a chromophobe renal cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010953	Genome and transcriptome sequence data from an unknown primary adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010954	Genome and transcriptome sequence data from a breast ductal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010955	Genome and transcriptome sequence data from a Merkel cell carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010956	Genome and transcriptome sequence data from a uveal melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010957	Genome and transcriptome sequence data from a leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010958	Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010959	Genome and transcriptome sequence data from a metastatic cancer to the breast patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010960	Genome and transcriptome sequence data from a metastatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010961	Genome and transcriptome sequence data from an ovarian serous carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010962	Genome and transcriptome sequence data from a carcinoma of unknown primary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010963	Genome and transcriptome sequence data from a pancreas adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010964	Genome and transcriptome sequence data from a carcinoma of unknown primary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010965	Genome and transcriptome sequence data from a leiomyosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010966	Genome and transcriptome sequence data from a metastatic chondrosarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010967	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010968	Genome and transcriptome sequence data from a primary unknown patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010969	Genome and transcriptome sequence data from a metastatic pleomorphic sarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010970	Genome and transcriptome sequence data from a metastatic adenocarcinoma of pancreas patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010971	Genome and transcriptome sequence data from an invasive high-grade serous carcinoma involving tubal mucosa and ovary with serosa patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010972	Genome sequence data from a metastatic rectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010973	Genome and transcriptome sequence data from a metastatic adenocarcinoma of pancreas patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010974	Genome and transcriptome sequence data from a metastatic adenocarcinoma of pancreas patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010975	Genome and transcriptome sequence data from a metastatic ovarian sex cord stromal tumour patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010976	Genome and transcriptome sequence data from a metastatic adenocarcinoma of breast (ductal) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010977	Genome and transcriptome sequence data from a metastatic colorectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010978	Genome and transcriptome sequence data from a metastatic follicular lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010979	Genome and transcriptome sequence data from a metastatic large cell neuroendocrine carcinoma likely of lung origin patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010980	Genome and transcriptome sequence data from a metastatic triple negative breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010981	Genome and transcriptome sequence data from a rectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010982	Genome and transcriptome sequence data from a cerebellar glioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010983	Genome and transcriptome sequence data from a metastatic uterus cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010984	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010985	Genome sequence data from a metastatic paraganglioma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010986	Genome and transcriptome sequence data from a metastatic papillary renal cell ca patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010987	Genome and transcriptome sequence data from a lung adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010988	Genome and transcriptome sequence data from a prostate adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010989	Genome and transcriptome sequence data from an unknown - likely pancreatobiliary / intrahepatic cholangiocarcinoma in liver patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010990	Genome and transcriptome sequence data from a metastatic thymic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010991	Genome and transcriptome sequence data from a colorectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010992	Genome and transcriptome sequence data from a metastatic adenocarcinoma of the salivary gland patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010993	Genome and transcriptome sequence data from a metastatic lymphoepithelial carcinoma of the parotid gland patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010994	Genome and transcriptome sequence data from a metastatic lung adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010995	Genome and transcriptome sequence data from a primary unknown patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010996	Genome and transcriptome sequence data from a metastatic duodenal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010997	Genome and transcriptome sequence data from a hepatic cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010998	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001010999	Genome and transcriptome sequence data from a breast invasive ductal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011000	Genome and transcriptome sequence data from a melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011001	Genome and transcriptome sequence data from a metastatic rectal cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011002	Genome and transcriptome sequence data from a metastatic breast carcinoma (er positive) patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011003	Genome and transcriptome sequence data from a paget disease of the vulva patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011004	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011005	Genome and transcriptome sequence data from a metastatic ovarian cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011006	Genome and transcriptome sequence data from a poorly differentiated adenocarcinoma of the pancreas patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011007	Genome and transcriptome sequence data from an inflammatory myofibroblastic lung tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011008	Genome and transcriptome sequence data from a colorectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011009	Genome and transcriptome sequence data from a metastatic pancreatic neuroendocrine tumour patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011010	Genome and transcriptome sequence data from a metastatic adenocarcinoma of lung patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011011	Genome and transcriptome sequence data from a metastatic pancreatic neuroendocrine adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011012	Genome and transcriptome sequence data from a metastatic cholangiocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011013	Genome and transcriptome sequence data from a metastatic colorectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011014	Genome and transcriptome sequence data from a metastatic esophageal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011015	Genome and transcriptome sequence data from a metastatic adenocarcinoma of colon patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011016	Genome and transcriptome sequence data from a metastatic thymic carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011017	Genome and transcriptome sequence data from a metastatic nasopharyngeal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011018	Genome and transcriptome sequence data from a metastatic melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011019	Genome and transcriptome sequence data from a metastatic pancreatic neuroendocrine tumour patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011020	Genome and transcriptome sequence data from a metastatic carcinoma of unknown primary patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011021	Genome and transcriptome sequence data from a metastatic breast ductal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011022	Genome and transcriptome sequence data from a melanoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011023	Genome and transcriptome sequence data from a follicular lymphoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011024	Genome and transcriptome sequence data from a neuroendocrine tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011025	Genome and transcriptome sequence data from a thymic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011026	Genome and transcriptome sequence data from a small cell lung carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011027	Genome and transcriptome sequence data from a pancreatic cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011028	Genome and transcriptome sequence data from a nasopharyngeal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011029	Genome and transcriptome sequence data from a colorectal carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011030	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011031	Genome and transcriptome sequence data from a urachal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011032	Genome and transcriptome sequence data from a liposarcoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011033	Genome and transcriptome sequence data from a gastrointestinal stromal tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011034	Genome and transcriptome sequence data from a granulosa cell ovarian tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011035	Genome and transcriptome sequence data from a pancreatic adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011036	Genome and transcriptome sequence data from a metastatic breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011037	Genome and transcriptome sequence data from a metastatic triple negative breast cancer patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011038	Genome and transcriptome sequence data from a papillary thyroid carcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011039	Genome and transcriptome sequence data from a thymoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011040	Genome and transcriptome sequence data from a rectal adenocarcinoma patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001011041	PC-9 cells were barcoded following protocol from Chang et al. Nature Biotech. 2021, which does not affect downstream analysis. PC-9 cells were trypsinized into single-cell suspensions and processed using Chromium Single Cell Gene Expression 3’ Library and Gel Bead Kit V2.0 following the manufacturer’s instructions (10X Genomics). Cells were counted and checked for viability using Vi-CELL XR cell counter (Beckman Coulter), and then injected into microfluidic chips to form Gel Beads-in-Emulsion (GEMs) in the 10X Chromium instrument. Reverse transcription was performed on the GEMs, and RT products were purified and amplified. Expression libraries were made from the cDNA and profiled using the Bioanalyzer High Sensitivity DNA kit (Agilent Technologies) and quantified with Kapa Library Quantification Kit (Kapa Biosystems). Illumina HiSeq2500 (Illumina) was used to sequence the libraries.	Illumina HiSeq 4000	1
EGAD00001011042	This dataset contains bulk RNA sequencing at different timepoints post-BNT162b2 mRNA COVID-19 vaccination. Stimulation experiments were performed at each of the timepoints (RPMI, Influenza, R848 and Influenza stimuli), resulting in 242 libraries distributed over 4 stimulations and 4 timepoints. Libraries were sequenced on the DNBSEQ platform.	unspecified	242
EGAD00001011043	This dataset contains whole genome sequences from Illumina NovaSeq Devices sequenced at the WGGC Bonn to study effects of prolonged paternal exposure to ionizing radiation. Here we provide the reads, mapped to the hg19 reference genome of all samples.	Illumina NovaSeq 6000	103
EGAD00001011044	Single-cell RNA sequencing of 4 healthy controls and 2 patients with NFATc1 deficiency	Illumina NovaSeq 6000	6
EGAD00001011045	Single-cell RNA sequencing was performed on viable frozen tumor dissociated cells from three RCC patients. The raw data is available as fastq files.	Illumina NovaSeq 6000	20
EGAD00001011046	TCRab sequencing was performed on viable frozen tumor dissociated cells from three RCC patients. The raw data is available as fastq files.	Illumina HiSeq 2500	20
EGAD00001011047	Short read whole genome and long read Oxford Nanopore sequencing of matched tumor/normal material from 10 Melanoma and 1 case of TNBC.	Illumina NovaSeq 6000 PromethION	44
EGAD00001011048	Pediatric patients with recurrent and refractory cancers are in most need for new treatments. This study developed patient-derived-xenograft (PDX) models within the European MAPPYACTS cancer precision medicine trial (NCT02613962). To date, 131 PDX models were established following heterotopical and/or orthotopical implantation in immunocompromised mice: 76 sarcomas, 25 other solid tumors, 12 central nervous system tumors, 15 acute leukemias, and 3 lymphomas. PDX establishment rate was 43%. Histology, whole exome and RNA sequencing revealed a high concordance with the primary patient’s tumor profile, human leukocyte-antigen characteristics and specific metabolic pathway signatures. A detailed patient molecular characterization, including specific mutations prioritized in the clinical molecular tumor boards are provided. Ninety models were shared with the IMI2 ITCC Paediatric Preclinical Proof-of-concept Platform (IMI2 ITCC-P4) for further exploitation. This new PDX biobank of unique recurrent childhood cancers provides an essential support for basic and translational research and new treatments development in advanced pediatric malignancies.	Illumina HiSeq 4000 Illumina NovaSeq 6000 NextSeq 500	166
EGAD00001011049	This dataset contains a collection of tumour samples from high grade serous ovarian carcinoma patients with recurrent disease collected near the point of diagnosis as well as tumour samples collected after patient relapse upon or some time after study entry. The majority of diagnosis samples were preserved in neutral buffered formalin whereas the majority of post-relapse samples were preserved in universal molecular fixative (UMFIX, Sakura Finetek USA, Inc). DNA was extracted and whole genome sequencing libraries were prepared either using the TruSeq DNA Nano kit (Illumina) or the ThruPLEX DNA-Seq Kit (Takara Bio), which were respectively sequenced at low depth (~0.1-0.5X) on Illumina HiSeq 2500 and HiSeq 4000 sequencing platforms. Sequenced reads were aligned to the GRCh37 reference genome (release hs37d5).	Illumina HiSeq 2500 Illumina HiSeq 4000	679
EGAD00001011050	ATAC-seq of 79 primary samples obtained from human acute leukemias, namely AML, T-ALL and mixed myeloid/lymphoid leukemias with CpG Island Methylator Phenotype (CIMP). ATAC-seq of CD34+ HSPCs from 3 healthy donors is also included.	Illumina NovaSeq 6000	4
EGAD00001011051	Hi-C of 17 primary samples obtained from human acute leukemias, namely AML, T-ALL and mixed myeloid/lymphoid leukemias with CpG Island Methylator Phenotype (CIMP). Moreover, Hi-C of CD34+ HSPCs from 3 healthy donors is included.	Illumina NovaSeq 6000	3
EGAD00001011052	MCIP-seq of 77 primary samples obtained from human acute leukemias, namely AML, T-ALL and mixed myeloid/lymphoid leukemias with CpG Island Methylator Phenotype (CIMP). Moreover, MCIP-seq of CD34+ HSPCs from 3 healthy donors is included.	Illumina HiSeq 2500	36
EGAD00001011053	Whole exome sequencing of tumor material derived from 14 mixed myeloid/lymphoid leukemias with a CpG Island Methylator Phenotype (CIMP). For 4 of these patients, normal material was also sequenced and used as control (files with the same identifiers prepended by an “h”).	Illumina NovaSeq 6000	1
EGAD00001011054	Total RNA-seq of blasts derived from 131 adult T-ALL cases, 7 AML cases and 1 mixed myeloid/lymphoid leukemia with CpG Island Methylator Phenotype (CIMP). The other RNA-seq data used in this study has been previously published and is available at EGAD00001007581 (AML) and EGAD00001007646 (CD34+ cells).	Illumina NovaSeq 6000	81
EGAD00001011055	Four different types of transcriptomic single-cell sequencing of blood from childhood B acute lymphoblastic leukemia.	Illumina NovaSeq 6000	6
EGAD00001011056	Bulk RNAseq was conducted on highly purified CD45- CD71- CD235a- CD31- CD271+ BMSCs isolated from a cohort of 62 newly diagnosed AML patients, uniformly treated within an intensive chemotherapy clinical trial and selected to represent the mutational landscape of AML	Illumina NovaSeq 6000	70
EGAD00001011057	To generate a cellular taxonomy of the human NBM and AML, representing both rare hematopoietic stem/progenitor cells (HSPCs) and stromal niche populations, allowing assessment of their cellular diversity and predicted intercellular signaling, we performed single cell RNA sequencing (scRNAseq) on viably frozen bone marrow (BM) aspirates from four healthy donors and 6 NPM1+ AML patients at diagnosis	Illumina NovaSeq 6000	20
EGAD00001011058	This dataset contains a collection of tumour samples from high grade serous ovarian carcinoma patients with recurrent disease collected near the point of diagnosis as well as tumour samples collected after patient relapse upon or some time after study entry. Whole blood samples were also collected from patients at study entry for the purpose of germline variant detection. The majority of diagnosis samples were preserved in neutral buffered formalin whereas the majority of post-relapse samples were preserved in universal molecular fixative (UMFIX, Sakura Finetek USA, Inc). DNA was extracted and the tagged-amplicon deep sequencing assay was applied (TAm-Seq, Forshew et al. 2012, Sci Transl Med) with the aid of Fluidigm Access Array technology. Targeted loci were sequenced at very high depths (typically >100X) for the detection of both somatic and germline variants. Sequenced reads were aligned to the GRCh37 reference genome (release hs37d5).	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina MiSeq	933
EGAD00001011059	CTCF ChIP-seq of 39 primary samples derived from human acute leukemias, namely AML, T-ALL and mixed myeloid/lymphoid leukemias with CpG Island Methylator Phenotype (CIMP).	Illumina NovaSeq 6000	-
EGAD00001011060	H3K27ac ChIP-seq of 79 primary samples derived from human acute leukemias, namely AML, T-ALL and mixed myeloid/lymphoid leukemias with CpG Island Methylator Phenotype (CIMP). In addition, 4 samples derived from CD34+ cord blood cells of healthy donors were included.	Illumina HiSeq 2500	3
EGAD00001011061	The dataset includes spatially-resolved and single-cell antigen receptor, as well as gene expression, data from two different HER2+ breast cancer patients. The tumor piece obtained during surgery from each patient was divided into several regions and tissue sections were used for spatial transcriptomics (Visium, 10x genomics). As indicated, some tissue sections were analyzed by a new method (Spatial VDJ) to spatially resolve antigen receptor sequences (target capture), which was developed in our publication. In parallel, tissue pieces from the same tumor were dissociated for single-cell gene expression analysis (10x genomics GEX, VDJ, and feature barcoding/Hash Tag Oligonucleotide). The deposited data is in the form of fastq files. All processed data, metadata, micrographs of the tissue sections (of those used for spatial transcriptomics), and scripts used for the analysis are publicly available at Zenodo (DOI: 10.5281/zenodo.7961605). Final libraries were sequenced on NextSeq2000 (Illumina) or NovaSeq6000 (Illumina) and analyzed with Cell Ranger, Seurat, Space Ranger, and STutility pipelines.	Illumina NovaSeq 6000 NextSeq 550 Sequel	43
EGAD00001011062	The dataset includes spatially-resolved gene expression and antigen receptor data from two Tonsil samples (1 and 2). Tissue sections from the tonsil samples were used for spatial transcriptomics (Visium, 10x genomics). Tonsil 2 tissue sections were analyzed by a new method (Spatial VDJ) to spatially resolve antigen receptor sequences (target capture), which was developed in our publication. Nearby or adjacent tissue sections (from Tonsil2) were also analyzed by a bulk antigen receptor sequencing approach (amplicon sequencing), by a method also newly developed by us in the same publication (Bulk SS3 VDJ). For Visium, the data were anonymized (all SNPs removed) using Bamboozle (Ziegenhain and Sandberg, Nature Communications 2021). The deposited data is in the form of fastq files. All remaining data, metadata, micrographs of the tissue sections (of those used for spatial transcriptomics), and scripts used for the analysis are available at Zenodo (DOI: 10.5281/zenodo.7961605). Final libraries were sequenced on NextSeq2000 (Illumina) or NovaSeq6000 (Illumina) and analyzed with Seurat, Space Ranger, and STutility pipelines.	Illumina NovaSeq 6000 NextSeq 550 Sequel	86
EGAD00001011063	We performed a global mutational landscape analysis using tumor samples from the 47 urothelial cancer patients included in MATCH-R or MOSCATO studies, with advanced metastatic disease and WES available (RNAseq is included for 38 patient samples).	Illumina HiSeq 2000 Illumina HiSeq 4000 Illumina NovaSeq 6000 NextSeq 500	133
EGAD00001011064	This dataset contains CLCNKA/CLCNKB locus alignment data from 27 patients with Bartter syndrome and structural variants encompassing the CLCNKB gene. Due to data protection regulations and in accordance with the patient consent, only relevant alignments from the following regions are shared: hg19: chr1:16,300,000-16,400,000; hg38 (linked read dataset only): chr1:16,000,000-16,100,000. Methods used to generate libraries were: long-range amplicon PCR (24 samples), targeted long-fragment enrichment (Samplix/Xdrop technology, 4 samples), long-read whole genome (PacBio Sequel II HiFi reads, 3 samples), 10X linked read short read whole genome (1 sample).	Illumina NovaSeq 6000 Sequel	27
EGAD00001011065	Data supporting: "SMAD4 and KCNQ3 Alterations are Associated with Lymph Node Metastases in Oesophageal Adenocarcinoma" RNAseq (FASTQ files) 6 samples	unspecified	6
EGAD00001011066	Live CD4 T cells were sorted from inflamed and non-inflamed tissue samples of IBD patients or from healthy and IBD blood samples. ATAC-Seq libraries were generated from live CD4 T cells sorted from i) inflamed and non-inflamed tissue samples, ii) healthy and IBD blood samples, or from iii) CD4 T cell subsets polarised from healthy blood samples. After isolating crude nuclei, live CD4+ T cells were treated with Tagment DNA buffer and Tagment DNA Enzyme (Nextera DNA Library Prep Kit, Illumina), and then the DNA was purified by MinElute PCR Purification Kit (Qiagen). Transposed DNA fragments were amplified using specific adapters followed by purification with MinElute PCR Purification Kit (Qiagen). Fragments from 240-360 bp were selected in the PippinHT system (Sage Science). The quality of the library and its DNA concentration were assessed by Bioanalyzer instruments (Agilent Technologies) and ultimately submitted for sequencing using Illumina HiSeq 2500 sequencer, V4 chemistry. On the other hand, single cell RNA-Seq libraries were generated exclusively from inflamed and non-inflamed tissue samples of Crohn’s disease patients. Briefly, live CD4 T cells were captured and encapsulated before cDNA amplification using the 10X Genomics Chromium Platform. Samples were prepared as outlined by 10x Genomics Single Cell 3’ Reagent Kits v2 user guide. Samples were sequenced on a HiSeq 2500 with the following run parameters: Read 1 – 26 cycles, read 2 – 98 cycles, index 1 – 8 cycles.	Illumina HiSeq 2500	80
EGAD00001011067	69 OAMZL patients were sequenced on the Illumina platform. The data files are available in BAM format.	HiSeq X Ten	69
EGAD00001011068	Ovarian Carcinosarcoma DNA and RNA sequencing of patient samples in the UK cohort (n=18).	Illumina HiSeq 4000	65
EGAD00001011069	Whole blood samples collected at baseline and week 12 in PAXgene Blood RNA tubes. RNA sequencing on the Illumina NovaSeq 6000 System generated 150 base pair length paired end reads.	Illumina NovaSeq 6000	164
EGAD00001011074	This dataset contains whole-exome and RNA sequencing of biopsies obtained from patients enrolled in a phase I clinical trial investigating the UV1 vaccine in combination with pembrolizumab in patients with advanced melanoma.	Illumina NovaSeq 6000	86
EGAD00001011075	Phenotype data for 3421 samples from Nigeria and Ghana, sequenced with the Illumina NextSeq 500.		3421
EGAD00001011076	Data supporting: "The transcriptional landscape of endogenous retroelements delineates esophageal adenocarcinoma subtypes" RNAseq for 279 samples	Illumina HiSeq 2000 unspecified	1
EGAD00001011077	FASTQ files describing paired-end RNA-sequencing of isogenic TIRM+ and TIRM- muscle biopsies from 24 FSHD patients (48 samples) and vastus lateralis muscle biopsies from 11 matched control individuals. FASTQ files are also provided describing RNA-sequencing of 15 FSHD peripheral blood mononuclear samples and 14 matched controls. For muscle biopsies sequencing was at 21.7-35.5 million reads/sample. RNA was extracted from PBMCs followed by globin depletion with sequencing at 19.7-46.5 million reads/sample.	Illumina NovaSeq 6000	88
EGAD00001011078	Single nuclei transcriptomic data from the AMBITION study.	unspecified	10
EGAD00001011079	In this study, we used RNA-sequencing gene expression profiling in order to characterize specific phenotypic traits in DIPG-derived glioma stem cell (GSC) models. Twenty-two primary GSC models derived from biopsies collected at diagnosis were cultured in an ECM-mimicking compound before RNA extraction and subsequent rRNA depletion and sequencing.	Illumina NovaSeq 6000	22
EGAD00001011080	In this study, we used RNA-sequencing gene expression profiling in order to characterize specific phenotypic traits in DIPG. Seventeen primary tumor samples were stereotactically biopsied at diagnosis by neurosurgeons in the Necker-Enfants Malades hospital (Paris, France) and snap-frozen for RNA extraction and subsequent poly-A mRNA purification.	Illumina NovaSeq 6000	17
EGAD00001011081	Whole-genome sequencing data of 168 paired samples, of which 79 samples are archived here	HiSeq X Ten	79
EGAD00001011082	For human samples, total cellular RNA was isolated from post-sorted Mito+ and Mito CD8+ cells using the RNeasy Mini Kit (Qiagen). Stranded RNA libraries were created using SMARTSeq Stranded.	NextSeq 550	9
EGAD00001011083	Targeted DNA sequencing was performed on 195 bone marrow samples to identify cases of clonal haematopoiesis, and on 99 paired peripheral blood samples. The SeqCap EZ HyperCap protocol was followed, and targeted capture performed against a panel of 97 genes recurrently mutated in myeloid malignancies and clonal hematopoiesis. One BAM file (mapped to the hg38 reference genome) is provided per sample.	NextSeq 500	294
EGAD00001011086	Whole genome sequencing data of 21 high-grade serous carcinoma (HGSC) patients (59 samples) sequenced with MGISEQ-2000.	unspecified	59
EGAD00001011087	ZPM WES Pilot consisting of 30 samples paired tumor/normal analyzed with WES at four different laboratories in Germany.	Illumina NovaSeq 6000 unspecified	30
EGAD00001011088	Our objective was to establish a liquid biopsy-based monitoring strategy for pediatric high-risk neuroblastomas that are harboring genomic TERT rearrangements at diagnosis. TERT rearrangement breakpoints are detected by a hybrid capture-based neuroblastoma DNA panel sequencing (published in PMID: 34442335) in tumor material and are reflected in cell-free tumor DNA and can serve as robust biomarkers for disease activity. Within the dataset, 5 tumors of 4 pediatric patients with a neuroblastoma were DNA sequenced. Provided are FASTQ data files, bam and bambai files, as well as breakpoint spanning and encompassing read (enspan) bam and bambai files. Sequencing bam data files are aligned to GRCh37.p13 reference genome (processed).	Illumina MiSeq	5
EGAD00001011089	Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. This dataset contains all the data available for this study on 2023-06-22.	Illumina HiSeq 4000	41
EGAD00001011090	Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. This dataset contains all the data available for this study on 2023-06-22.	HiSeq X Ten Illumina NovaSeq 6000	29
EGAD00001011091	Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. This dataset contains all the data available for this study on 2023-06-22.	Illumina HiSeq 4000	17
EGAD00001011092	Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. This dataset contains all the data available for this study on 2023-06-22.	Illumina HiSeq 4000	81
EGAD00001011093	Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. This dataset contains all the data available for this study on 2023-06-22.	Illumina NovaSeq 6000	140
EGAD00001011094	Sequencing of tissue samples and their derived organoids from oesophageal, pancreatic and colorectal cancer patients. This dataset contains all the data available for this study on 2023-06-22.	Illumina HiSeq 4000	91
EGAD00001011095	Data supporting: "The transcriptional landscape of endogenous retroelements delineates esophageal adenocarcinoma subtypes" WGS for 452 samples	HiSeq X Five Illumina HiSeq 2000	1
EGAD00001011097	Additional datasets linked to EGAS00001006692	Illumina NovaSeq 6000	1
EGAD00001011098	This dataset contains paired whole exome sequence data of 5 patients (control/tumor pairs) and paired whole genome sequencing data of 1 patient (control/tumor pair) with Lynch Syndrome from the INFORM registry. Paired sequencing was done mostly on Illumina HiSeq 4000, few on HiSeq2500 and NovaSeq 6000. The library preparation was either with Agilent SureSelect Human_All_Exon V5 (hg19) or with Agilent SureSelectXT HS Human_All_Exon V7 (hg19). The WGS samples were prepared with Agilent SureSelect WGS.	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000	13
EGAD00001011099	To further decipher the cell type composition within single spheres we performed RNA-Seq of 25 selected spheres of 4 different categories (including 8 signature-/balanced, 9 signature+/balanced, 1 signature-/aberrant, 7 signature+/aberrant). Samples were sequenced using Illumina NextSeq 2000. FASTQ reads were processed using an in-house RNA-Seq workflow.	unspecified	25
EGAD00001011100	This dataset contains samples from 8 patients with sclerosing epithelioid fibrosarcoma. All 8 samples have whole exome tumor data and tumor RNAseq data. 3 samples also have matched normal DNA sequence data.	Illumina HiSeq 2000	11
EGAD00001011102	This dataset consists of RNA-seq data of 126 whole-blood samples derived from patients after abdominal surgery. Total RNA collected preoperatively and at 3 time points postoperatively (2-6, 24 and 48 hrs) was analysed using RNA-sequencing. RNA was collected in PAXGene (Qiagen) tubes and extracted using the PAXGene RNA extraction kit, with a DNase step to remove contaminating DNA. Globin and ribosomal RNA were removed via a Ribo-zero kit (Illumina) and library preparation was carried out using the TruSeq Stranded Total RNA Library Prep Kit (Illumina). Next generation sequencing was performed on the NovaSeq sequencing platform (Illumina) at the Wellcome Centre for Human Genetics (WCHG) in Oxford.	Illumina NovaSeq 6000	252
EGAD00001011103	Single-cell RNA-seq analysis of cutaneous immune cells isolated from skin biopsies of psoriasis patients undergoing IL-23 blockade	Illumina HiSeq 4000	42
EGAD00001011104	Optimizing single-cell transcriptomic discrimination of atopic dermatitis versus psoriasis vulgaris	Illumina HiSeq 4000	2
EGAD00001011105	Bulk RNA-seq on normal human CD19+ cells	Illumina HiSeq 2500	25
EGAD00001011106	A large cohort of 600 cases with familial breast cancer as classified by the Spanish Society of Medical Oncology (SEOM) Clinical Guidelines Update that were recruited during 6 years at Hospital Universitario Morales Meseguer (Murcia, Spain) were retrospectively evaluated to select 16 cases with no positive finding in NGS analysis of 20 genes implicated in this disease. These 16 cases were selected for further investigation using nanopore sequencing. This method involved the use of adaptive sampling enrichment, targeting a panel of 18 human genome regions, which contained the 20 genes (PTEN, ATM, BRCA2, PALB2, CDH1, TP53, NF1, RAD51D, BRCA1, RAD51C, BRIP1, STK11, CHEK2, EPCAM, MSH2, MSH6, BARD1, MLH1, PMS2, NBN). In 5 samples (P1, P2, P4, P15 and P16) no selection of long reads was performed. Additionally, in 3 samples (P7, P9 and P10), both procedures were performed in two independent runs, and for the second run of P7, the DNA was previously fragmented using g-TUBE Covaris® (ref 520079) according to the protocol for 6 kb fragments.	MinION	19
EGAD00001011108	Sample sheet linking anonymized sample IDs and anonymized patient IDs.		1
EGAD00001011109	This study includes 542 of multi-region sampled whole exome sequencing data from 83 pancreatic cancer patients in 6 individual experiments: Kras wildtype, local recurrence, treatment-naive (MetomeV2), radiotherapy plus chemotherapy (Radome), chemotherapy-only group (Treatome) and Proj_B-100-478.	Illumina HiSeq 2500	706
EGAD00001011110	195 exome sequencing samples	Illumina NovaSeq 6000	195
EGAD00001011111	No. of samples: 80 (28 ULP-WGS, 26 WES, 26 RNA-SEQ) File types: FASTQ (28 ULP-WGS, 19 WES, 18 RNA) and BAM (7 WES, 8 RNA) Technology used: Sequencing - Illumina NovaSeq 6000; Map/Align - Illumina DRAGEN v3.7.5; Genome assembly - GRCh38.p13 Filename nomenclature: - SampleName_Passage_SampleType_TissueType_SequencingType - Passage of: PX = unknown; PZ = from patient; P0 = first passage from patient on plastic; P1 = first passage from plastic/PDX/organoid - SampleType: STN = normal; STT = tumor - SampleType STN: 00 = tissue unknown; 01 = adjacent normal; 02 = fibroblast; 03 = germline blood; 21 = cell line from patient tissue; 22 = cell line from PDX; 23 = cell line from patient fibroblast - SampleType STT: 00 = tissue unknown; 01 = primary tumor; 21 = cell line from patient; 22 = cell line from PDX - TissueType: WT = Wilm's tumor; 00 = kidney unknown; 01 = kidney left; 02 = kidney right - SequencingType: 00 = unknown; 02 = ultra-low pass whole-genome sequencing; 20 = whole-exome; 61 = bulk RNA-sequencing	Illumina NovaSeq 6000	29
EGAD00001011112	Sample count: 950 Experimentation: Illumina MiSeq amplicon sequencing	Illumina MiSeq	950
EGAD00001011113	This dataset consists of functional genomic data from 6 healthy donors taken from CD14+ monocytes upon different immune stimulations. It contains 150 paired end fastq files consisting of 30 total RNA-seq samples across 4 runs and 30 ATAC-seq samples. The samples were sequenced on Illumina HiSeq4000 platform.	Illumina HiSeq 4000	150
EGAD00001011114	Matched tumor-normal data for 6 JPAs. Generated using 10X Genomics Linked-reads	HiSeq X Ten	12
EGAD00001011115	RNA-seq of FACS sorted AT2 cells from ex-smokers with (n=6) and without (n=3) COPD at different disease stages. Fastq files are provided.	NextSeq 500	11
EGAD00001011116	WGBS of FACS sorted AT2 cells from ex-smokers with (n=6) and without (n=3) COPD at different disease stages. Fastq files are provided.	Illumina HiSeq 2500	11
EGAD00001011117	Whole Exome Sequencing Data for 10 patients for treatment with the ICI Nivolumab	Illumina HiSeq 2500	26
EGAD00001011118	Lymph nodes and buccal swabs from the affected twins were sequenced with WGS. A buccal swab from the unaffected sibling was also sequenced with WGS as a control.	Illumina NovaSeq 6000	5
EGAD00001011119	RNA sequencing of 19 samples from PCNSL tumors. Sequencing was performed on a HiSeq X Ten using Illumina TruSeq Stranded mRNA Kit. Sequencing was always paired.	HiSeq X Ten	19
EGAD00001011122	We provide clinical data sets of Array CGH, targeted RNA-seq, total RNA-seq, whole genome bisulfite (WGBS) and whole genome DNA sequencing (WGS) obtained from bone marrow (BM) or peripheral blood (PB) mononuclear cells of 57 pediatric patients with dicentric chromosome dic(9;20) positive Acute lymphocytic leukemia (ALL), from which in 6 cases DNMT3B gene rearrangement was identified. This data is complemented by total RNA-seq and WGBS of samples from 4 additional ALL patients with a t(12;21) translocation and ETV6-RUNX1 gene fusion. DNA was isolated from BM or PB B-lymphocytes using Qiagen QIAamp DNA Blood Midi Kit to perform i) Array CGH of 58 dic(9;20) positive samples by hybridizing 500ng DNA using an Agilent 400K SurePrint G3 Custom CGH Human Genome Microarray (e-Array design 84704) ii) WGBS of 6 DNMT3B rearrangement positive samples and 4 ETV6-RUNX1 positive samples using Tecan TrueMethyl oxBS-Seq module for library preparation and Illumina NovaSeq 6000 platform to run 2x151 cycles iii) WGS of DNMT3B rearrangement positive samples using Illumina Lotus DNA Library Prep Kit followed by sequencing running 2x160 cycles on an Illumina NovaSeq 6000 platform. RNA was isolated from PB B-lymphocytes using the PerkinElmer Chemagic 360 instrument, followed by i) targeted RNA-seq of 56 dic(9;20) positive samples prepared using Illumina TruSight RNA Pan-Cancer Panel and sequenced on an Illumina MiSeq platform running 2x75 cycles ii) total RNA-seq of 6 DNMT3B rearrangement positive samples and 4 ETV6-RUNX1 positive samples utilizing TruSeq Stranded Total RNA Library Prep Gold kit and running 2x100 cycles on an Illumina NovaSeq 6000 platform.	Illumina MiSeq Illumina NovaSeq 6000	60
EGAD00001011123	Evaluation of somatic mutations in cervicovaginal samples as a non-invasive method for the detection and molecular classification of endometrial cancer	Illumina NovaSeq 6000	72
EGAD00001011124	Limbal stem cells were obtained during penetrating keratoplasty from patients with congenital aniridia (Lagali Stage 4). Limbal tissue was digested in collagenase A solution (4 mg/ml) in keratinocyte serum-free medium (KSFM) (Thermo Fisher Scientific; Waltham, MA) for 20 h at 37 °C. Cell suspensions were filtered through a Flowmi® micro strainer (SP Bel-Art; Wayne, NJ). LSC clusters were dissociated with trypsin-EDTA (0.05%) solution and cultivated in KSFM. Medium was refreshed every other day. Subconfluent (80–90%) limbal epithelial cells were harvested at passage 2.	NextSeq 500	2
EGAD00001011125	MNC were isolated from bone of 95 AML patients at initial diagnosis via Ficoll gradient. WES has been performed on all 95 samples. Paired exome sequencing was done on a NovaSeq 6000 sequencer with Twist human core exome plus kit.	Illumina NovaSeq 6000	95
EGAD00001011126	Whole genome sequencing data of 57 single (PTA), clonally expanded and bulk human cells. Cells were obtained from bone marrow samples of patients with Fanconi Anemia (PMCFANCNN) or pediatric AML (PBNNNNN), from a clonal intestinal organoid line (STE0072/D-ORGWTNISL), from human cord blood (PMCCB15) and from a human lymphoblastoid cell line (PMCAHH1). WGS libraries were sequenced to ~15-30x genomic coverage (paired-end) on an Illumina Novaseq.	Illumina NovaSeq 6000	58
EGAD00001011127	Targeted DNA based panel of multiple ctDNA samples from 10 patients throughout clinical care to assess treatment response. The panel is custom-designed and property of UGS, IC.	Illumina NovaSeq 6000	107
EGAD00001011128	The dataset contains 16 xenograft plasma cfDNA samples from mice grafted with a human colorectal cell line. Shallow WGS was performed on an Illumina Novaseq S4 PE150bp. Samples are provided as raw reads without any prior processing.	Illumina NovaSeq 6000	16
EGAD00001011129	Whole genome sequencing data for germline BRCA pancreatic cancer		1
EGAD00001011130	Sequencing data from a phase II study of nivolumab and ipilimumab in recurrent or refractory cancer of unknown primary (CheCUP trial). Panel sequencing data from baseline FFPE biopsies were used to perform a comprehensive genomic profiling of CUP metastases. Combined targeted next-generation sequencing of patient-specific hotspot mutations and shallow whole genome sequencing of baseline and follow-up liquid biopsy samples were used to analyze ctDNA and to evaluate response to immune checkpoint inhibitor treatment. In some cases, whole exome sequencing of peripheral blood mononuclear cells was performed to screen for potential CHIP and germline mutations.	Illumina NovaSeq 6000 unspecified	53
EGAD00001011131	This study utilized blood samples collected in Lalitpur, Nepal as part of the Strategic Typhoid Alliance across Africa and Asia (STRATAA) study. The dataset comprises whole blood RNAseq of 376 febrile individuals from Nepal. Blood was collected in PAXgene tubes and sent to Monash University (Melbourne, Australia) where RNA was extracted using the PAXgene Blood RNA kit before being sent to the Wellcome Sanger Institute (Hinxton, UK) for sequencing. Library prep used NEBNext Ultra II RNA custom kits on an Agilent Bravo WS automation platform with poly(A) pulldown. After PCR, plates were purified using Agencourt AMPure XP SPRI beads and libraries were quantified using Biotium Accuclear Ultra high sensitivity dsDNA Quantitative kits. Pooled libraries were normalised to 2.8 nM. Samples were globin depleted using KAPA RNA HyperPrep with RiboErase. Libraries were then subjected to 2x100bp paired-end sequencing on Illumina NovaSeq. Each library was sequenced to an average of 80 million reads. The STRATAA study was approved by the Nepal Health Research Council (NHRC, ref 283 306/2015) and OxTREC (Oxford Tropical Research Ethics Committee, ref 39-15). All participants provided informed consent for human genetic tests.	Illumina NovaSeq 6000	376
EGAD00001011132	In this study we performed whole genome sequencing on matched tumor-normal CD138+ bone marrow mononuclear cells from 60 patients with newly diagnosed multiple myeloma treated with daratumumab, carfilzomib, lenalidomide, and dexamethasone (NCT03290950-MANHATTAN trial). In addition, we performed 5’-single-cell RNA-sequencing (10X Genomics) coupled with V(D)J sequencing and capture of the surface protein markers (TotalSeq-C, Biolegend) of the CD138- bone marrow mononuclear cells to interrogate the composition of the immune microenvironment at baseline and after eight cycles of induction therapy in 22 patients with newly diagnosed multiple myeloma. Samples were multiplexed using hashtag oligo.	Illumina HiSeq 4000 Illumina NovaSeq 6000	131
EGAD00001011134	RNA sequencing data from children with febrile illness and multisystem inflammatory syndrome in children (MIS-C). Samples used were Whole Blood. Febrile illness controls include children with bacterial and viral infections and healthy controls. This dataset contains samples from patients recruited into the DIAMONDS study.	Illumina NovaSeq 6000	46
EGAD00001011135	This dataset contains ATAC-seq data performed in MM.1S cell line in ETOH (control) or Dexamethasone condition (Treatment)	Illumina HiSeq 2500	2
EGAD00001011136	This dataset gathers ChIP-seq data produced in our own laboratory by immunoprecipitating CTCF in the MM.1S cell line under EtOH and Dex conditions. It also gathers ChIP-seq data produced by an external laboratory (Active Motif) for the H3K27ac mark and the GR transcription factor in the same cell line and conditions (MM.1S EtOH/Dex).	Illumina MiSeq unspecified	2
EGAD00001011137	This dataset gathers all RNA-sequencing data in the MM.1S cell line under control and Dex conditions, both in 3 biological replicates.	Illumina HiSeq 2500	2
EGAD00001011138	This dataset gathers HiChIP data for the H3K27ac mark in the MM.1S cell line under control and Dex conditions, both in two biological replicates.	Illumina NovaSeq 6000	2
EGAD00001011139	This dataset gathers scRNAseq data for the MM.1S cell line under control and Dex conditions at 4h and 24h.	Illumina NovaSeq 6000	2
EGAD00001011140	This dataset gathers scMultiomic data, including RNA-seq and ATAC-seq, in MM.1S under control and Dex conditions at 1h and 4h.	Illumina NovaSeq 6000	2
EGAD00001011141	BAM files from total RNA sequencing of samples from breast cancer patients in the TNT trial. Data includes 186 primary tumour samples and 13 matched recurrence samples.	Illumina HiSeq 2000	199
EGAD00001011142	TPM matrices of counts from RNA sequencing (RNAseq) from baseline CD138(–) BM fractions.		1
EGAD00001011143	Matrix of normal values from 39 marker CyTOF assay performed on longitudinal BMMC. FCS files from the CyTOF assay were normalized and concatenated using Fluidigm's CyTOF software and then de-multiplexed using Astrolabe Diagnostics, Inc., a commercial, cloud-based platform for single-cell analysis.		1
EGAD00001011144	TPM matrices of counts from RNA sequencing (RNAseq) from longitudinal CD138+ enriched BM fractions.		1
EGAD00001011145	TPM matrices of counts from RNA sequencing (RNAseq) from longitudinal CD138(–) BM fractions.		1
EGAD00001011146	Matrix of normal values from 39 marker CyTOF assay performed on baseline BMMC. FCS files from the CyTOF assay were normalized and concatenated using Fluidigm's CyTOF software and then de-multiplexed using Astrolabe Diagnostics, Inc., a commercial, cloud-based platform for single-cell analysis.		1
EGAD00001011147	Matrix of normalized values from Olink assay performed on baseline BM Plasma. The Olink Immuno-Oncology multiplex proteomic Panel included 92 proteins associated with human inflammatory conditions. Data is analyzed using real-time PCR analysis software via the Ct method and Normalized Protein Expression (NPX) manager. Data were normalized using internal controls in every single sample, inter-plate controls, negative controls and a correction factor and expressed as Log2 scale, which was proportional to the protein concentration. One NPX difference equals the doubling of the protein concentration.		1
EGAD00001011148	TPM matrices of counts from RNA sequencing (RNAseq) from baseline CD138+ enriched BM fractions.		1
EGAD00001011149	Exome sequencing data from two small cell prostate cancer patients - 4 cancer samples (FFPE) from Patient 1 collected at 3 different time points and 2 cancer samples (FFPE) from Patient 2 collected at 1 time point. Exonic DNA was enriched using the TruSeq Exome Kit (Illumina) and sequenced on the Illumina NextSeq 500 as 75bp paired end reads (total read length 150bp).	NextSeq 500	8
EGAD00001011150	Single-cell genotyping data for bone marrow samples from 9 cases with clonal hematopoiesis and 1 control sample. The TARGET-seq+ protocol was used to generate plate-based 3' transcriptome data. For details on cell sorting and the TARGET-seq+ protocol see the methods section of the manuscript. One FASTQ file is provided per cell. Cells are named with their plate and well IDs and the subject ID. Empty wells (no-cell controls) are named "blank". Corresponding transcriptome files use the same naming with the "_transcriptome" suffix.	NextSeq 500	11712
EGAD00001011151	Capture-based NGS obtained using the "all-CLL" panel (also known as SOPHiA DDM (TM) Community CLL Clonality Solution). Library preparation was performed following SOPHiA GENETICS recommendations using 200 ng genomic DNA. Libraries were sequenced on a MiSeq instrument (2x300 bp, Illumina) aiming at a mean coverage of 1,000x.	Illumina MiSeq	118
EGAD00001011153	The sample cohort (n=48) consists of healthy individuals and atrophic gastritis and gastric cancer patients. Samples from some of the gastric cancer patients were collected at separate time points: -1 before the operation; -2 after the operation; -3 during the control visit. For hybridisation capture, a unique panel of 15 gastric cancer-related genes was developed, and very deep sequencing was performed using TruSight Oncology Unique Molecular Identifier (UMI) Reagents (Illumina).	Illumina NovaSeq 6000	48
EGAD00001011154	RNASeq data from one small cell prostate cancer patient - 4 cancer samples (FFPE) from Patient 1 collected at 3 different time points. mRNA was selected using the Magnetic mRNA Isolation Module (NEB) and sequenced on the Illumina NextSeq 500 as 75bp paired end reads (total read length 150bp).	NextSeq 500	12
EGAD00001011155	This dataset contains raw FASTQ files from single cell RNA sequencing of SarBC-01 cells treated with Dexamethasone vs DMSO, with or without Matrigel, processed with MULTI-seq. Cells from 4 different culture conditions (+Mat/Dex, +Mat/DMSO, -Mat/Dex, -Mat/DMSO) were harvested, processed for multiplexing using the MULTI-seq protocol and loaded in a Chromium Single Cell 3ʹ GEM Library and Gel Bead Kit v3 (10x Genomics). Gene expression (cDNA) and MULTI-seq libraries were prepared according to the manufacturers’ protocol of Chromium Next GEM Single Cell 3’ reagents Kits v3.1 (Dual Index). Finally, cDNA and MULTI-seq libraries were analyzed using an Agilent Bioanalyzer (DNA High Sensitivity kit) and sequenced on a NovaSeq6000 (S2 flow cell) platform. MULTISEQ BARCODES: TTAGCCAG => Matrigel/DMSO CCACAATG => Matrigel/Dexamethasone GCACACGC => noMatrigel/DMSO AGAGAGAG => noMatrigel/Dexamethasone	Illumina NovaSeq 6000	2
EGAD00001011156	This dataset contains raw FASTQ files from bulk RNA sequencing of SarBC-01 organoids at different passages (Passage 6, Passage 19, Passage 59), UroBC-01 organoids (Passage 70), UroBC-16 organoids (Passage 19) and UroBC-22 organoids (Passage 8). RNA was isolated using the Quick-DNA/RNA Miniprep kit (Zymo Research, Irvine, CA, USA, D7001) and subjected to bulk RNA sequencing. TruSeq Stranded mRNA kit (Illumina, 20020594) was used for the library preparation according to manufacturer’s guidelines. Sequencing was performed on Illumina NovaSeq 6000 using paired-end 100-bp reads.	Illumina NovaSeq 6000	6
EGAD00001011157	This dataset contains raw FASTQ files from DNA of SarBC-01-related samples (SarBC-01 patient Germline, SarBC-01 patient Urothelial tumor, SarBC-01 patient Sarcomatoid tumor, SarBC-01 Passage 6 organoids, SarBC-01 Passage 20 organoids) and UroBC-01 related samples (UroBC-01 patient Germline, UroBC-01 patient Urothelial tumor, UroBC-01 Passage 19 organoids, UroBC-01 Passage 6 organoids). DNA was extracted from FFPE tissue, using RecoverAll RNA/DNA extraction kit (Invitrogen, Carlsbad, CA, USA, AM1975) followed by incubation with Uracil-DNA Glycosylase, and fresh or flash frozen tissue, using Quick-DNA/RNA Miniprep kit (Zymo Research, Irvine, CA, USA, D7001). Twist Human Core Exome + RefSeq + Mito-Panel kit (Twist Bioscience, 102031) was used for the whole exome capturing. Sequencing was performed on Illumina NovaSeq 6000 using paired-end 100-bp reads.	Illumina NovaSeq 6000	9
EGAD00001011158	The paired-end reads were aligned to the human reference genome (hg19) with decoy sequences (as used in the 1000 Genomes Project) using BWA (v0.7.5a) software. The duplicate reads were marked and subsequently removed using Picard (v1.140) (https://broadinstitute.github.io/picard/) and SAMtools (v0.1.19), respectively. Local realignment around insertions/deletions (indels) and base quality recalibration were performed using GATK (v3.6).	Illumina HiSeq 3000	54
EGAD00001011160	The dataset contains panel sequencing data of 170 genes from 380 patients of the EORTC-26101 trial. The corresponding methylation data is available via the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/) repository with the accession number GSE237103.		1
EGAD00001011161	Explore 1536 dataset from serum of patients enrolled in the COVACTA trial. This dataset includes limit of detection values provided by Olink.		1
EGAD00001011162	Sample metadata for the RNA-seq dataset. This dataset includes subject-level data and longitudinal visit day information for the corresponding samples.		1
EGAD00001011163	This dataset contains data for CBC counts and absolute protein abundance measurements from ELISA experiments.		1
EGAD00001011164	RNA-seq data from PAXgene extracted whole blood of patients enrolled in the COVACTA trial. This dataset includes read counts per gene. Read counts were generated for the Gencode v27 annotation using the summarizeOverlaps method from bioC in mode “IntersectionStrict”.		1
EGAD00001011165	Sample metadata for the olink dataset. This dataset includes subject-level data and longitudinal visit day information for the corresponding samples.		1
EGAD00001011166	Linking anonymized sample IDs and anonymized patient IDs.		1
EGAD00001011167	Olink Explore 1536 dataset from serum of patients enrolled in the COVACTA trial. This dataset includes QC warning flag provided by Olink.		1
EGAD00001011168	RNA-seq data from PAXgene extracted whole blood of patients enrolled in the COVACTA trial. This dataset includes raw FASTQ files.	unspecified	1646
EGAD00001011169	Olink Explore 1536 dataset from serum of patients enrolled in the COVACTA trial. This dataset includes the NPX values provided by Olink.		1
EGAD00001011171	In this study, we aimed to assess RNA expression and surface antigen expression of acute myeloid leukemia with complex karyotype (CK-AML) at the single-cell level. For this purpose, we performed cellular indexing of transcriptomes and epitopes (CITE-seq) of primary leukemia samples from four CK-AML patients.	Illumina NovaSeq 6000	4
EGAD00001011172	In this study, we aimed to identify somatic structural variation of acute myeloid leukemia with complex karyotype (CK-AML) at the single-cell level and to investigate its direct consequence on the nucleosome occupancy using scNOVA approach. For this purpose, we performed strand-specific single-cell sequencing of primary leukemia samples from four CK-AML patients. We also performed strand-specific single-cell sequencing of two patient-derived xenografts (PDXs).	NextSeq 500	364
EGAD00001011173	In colorectal cancers (CRC) the tumor microenvironment (TME) plays a key role for prognosis and therapy efficacy. Patient-derived tumor organoids (PDTOs) show enormous potential for preclinical testing; however, in purely epithelial cultures, features including the ‘consensus molecular subtypes’ (CMS) are largely eradicated. To better reflect the cell type heterogeneity, we established the CRC organoid-stroma biobank of matched PDTOs and cancer-associated fibroblasts (CAFs) from 30 patients. Whole exome sequencing and transcriptome analysis in various in vitro and in vivo contexts were performed to study the influence of the TME on the CRC phenotype.	Illumina HiSeq 2000 Illumina HiSeq 4000	258
EGAD00001011174	We performed whole genome sequencing on 42 prostate cancer samples from the prostate, seminal vesicles and regional lymph nodes of five treatment-naive patients with locally advanced disease who underwent radical prostatectomy. Whole genome sequencing was performed as 150bp paired end reads on the Illumina NovaSeq 6000 platform.	Illumina NovaSeq 6000	48
EGAD00001011175	Single-cell whole transcriptome sequencing data for bone marrow samples from 9 cases with clonal hematopoiesis and 4 control samples. The TARGET-seq+ protocol was used to generate plate-based 3' transcriptome data. For details on cell sorting and the TARGET-seq+ protocol see the methods section of the manuscript. One FASTQ file is provided per cell. Cells are named with their plate and well IDs and the subject ID. Empty wells (no-cell controls) are named "blank". Corresponding genotyping files use the same naming without the "_transcriptome" suffix.	Illumina NovaSeq 6000	14073
EGAD00001011176	This dataset contains RNA-seq performed on: untreated small intestinal organoids; small intestinal organoids treated with the chemotherapeutic busulfan; untreated small intestinal organoids co-cultured with mesenchymal stromal/stem cells (MSCs); and busulfan-treated small intestinal organoids co-cultured with MSCs. The same set of samples was generated for 3 different primary bone marrow MSC donors.	NextSeq 500	12
EGAD00001011178	This study contains methyl-binding domain sequencing and shallow whole genome sequencing from circulating cell-free DNA (cfDNA) for 143 patients with metastatic cancer of known type, 41 patients with Cancer of Unknown Primary (CUP) and 27 non-cancer controls.	Illumina NovaSeq 6000	211
EGAD00001011180	The dataset contains the methylome EM-sequencing raw data (fastq) of different spermatogenic cells from 5 human males (three controls and two cryptozoospermic). The datasets correspond to the following cell types: undifferentiated spermatogonia, differentiating spermatogonia, 4C spermatocytes, and 1C spermatids (this cell type only for the control individuals).	Illumina NovaSeq 6000	21
EGAD00001011186	Set of 19 patients afflicted with colorectal cancer with matching preoperative and postoperative blood plasma, PBMC, and tumor biopsy sequencing data. Originally referenced by Genome-wide cell-free DNA mutational integration enables ultra-sensitive cancer monitoring. Nat Med. 2020, Zviran et al.	Illumina HiSeq 4000	72
EGAD00001011187	Data supporting: "Understanding the malignant potential of gastric metaplasia of the oesophagus and its relevance to Barrett’s Oesophagus surveillance: individual-level data analysis" Black et al (WGS OACs/BOs/normals)	HiSeq X Five Illumina HiSeq 2000 Illumina NovaSeq 6000 unspecified	63
EGAD00001011188	Data supporting: "Understanding the malignant potential of gastric metaplasia of the oesophagus and its relevance to Barrett’s Oesophagus surveillance: individual-level data analysis" Black et al (WES OACs/BOs/normals)	unspecified	170
EGAD00001011189	Data supporting: "TBC" Ganguli et al (sWGS for 75 samples)	Illumina HiSeq 2500 Illumina HiSeq 4000	1
EGAD00001011190	Data supporting: "TBC" Ganguli et al (RNA for 394 samples)	Illumina HiSeq 2000 unspecified	49
EGAD00001011191	Data supporting: "TBC" Ganguli et al (WGS for 1298 samples)	HiSeq X Five Illumina HiSeq 2000 Illumina NovaSeq 6000 unspecified	42
EGAD00001011192	This study aims to investigate the dysregulation of RNA translation and identify functional non-canonical open reading frames (ORFs) as potential targets for medulloblastoma treatment. The study involves ribosome profiling and RNAseq of medulloblastoma tissues and cell lines to observe the translation of non-canonical ORFs. Multiple CRISPR-Cas9 screens will be used to identify functional non-canonical ORFs implicated in medulloblastoma cell survival.		4
EGAD00001011193	This study aims to investigate the dysregulation of RNA translation and identify functional non-canonical open reading frames (ORFs) as potential targets for medulloblastoma treatment. The study involves ribosome profiling and RNAseq of medulloblastoma tissues and cell lines to observe the translation of non-canonical ORFs. Multiple CRISPR-Cas9 screens will be used to identify functional non-canonical ORFs implicated in medulloblastoma cell survival.	NextSeq 2000	4
EGAD00001011194	Single-cell ATAC-seq of pediatric AML tumours from patients enrolled in the clinical trial AAML1031. Obtained using the 10X Chromium NextGEM Single Cell ATAC Reagent Kit, v1.1. A total of 64 samples were obtained from biopsies at diagnosis, remission and relapse from 25 patients.	DNBSEQ-G400	64
EGAD00001011195	Single-cell RNA-seq of pediatric AML tumours from patients enrolled in the clinical trial AAML1031. Obtained using the 10X Chromium Single Cell 3’ Reagent Kit, v3.0. A total of 75 samples were obtained from biopsies at diagnosis, remission and relapse from 28 patients.	DNBSEQ-G400	62
EGAD00001011196	Data supporting: "Mutational signature dynamics shaping the evolution of oesophageal adenocarcinoma" Abbas et al (WGS for 1397 samples)	HiSeq X Five Illumina HiSeq 2000 Illumina NovaSeq 6000 unspecified	6
EGAD00001011197	Transcriptomic data generated by RNA-sequencing for adult human AMLs with STAG2 or RAD21 mutations or no cohesin mutations (CTRL-AMLs).	Illumina NovaSeq 6000	1
EGAD00001011198	High-throughput chromosome conformation capture (Hi-C) data generated for cohesin-mutated (STAG2 or RAD21) and cohesin-wildtype AMLs.	Illumina NovaSeq 6000	1
EGAD00001011199	ChIP-Seq targeting the major cohesin core subunit RAD21 to represent cohesin occupancy and binding sites in cohesin-mutated (STAG2 or RAD21 mutations) and wildtype adult AMLs.	NextSeq 500	1
EGAD00001011200	ChIP-Seq targeting CTCF in cohesin-mutated (STAG2 or RAD21 mutations) and wildtype adult AMLs (CTRL-AMLs).	NextSeq 500	1
EGAD00001011201	This data includes scRNA-seq, scTCR-seq and scBCR-seq of 21 individuals post COVID-19 vaccination. Individuals range in age from 52 to 75. Samples were genotype multiplexed in an overlapping mixture design, pooled, and sequenced on 16 lanes (10X, 5' GEM).	Illumina NovaSeq 6000	16
EGAD00001011202	This dataset contains paired-end (151x2) RNA sequencing data (fastq files) from 25 human blood samples, some with sex chromosome aneuploidies. Data were generated using Illumina technology (Novaseq 6000) and a median of approximately 174 million pairs of reads per sample were obtained using two sequencing batches. Samples come from: six 46,XX individuals, six 46,XY individuals, nine 47,XXY individuals, and four 47,XYY individuals.	Illumina NovaSeq 6000	25
EGAD00001011204	ChIP-seq targeting the H3K27ac histone modification in cohesin-mutated (STAG2 or RAD21 mutation) and cohesin wildtype (CTRL-AMLs) AMLs.	NextSeq 500	9
EGAD00001011205	ChIP-Seq targeting the cohesin subunit STAG2 in STAG2-mutant or cohesin wildtype adult AMLs.	NextSeq 500	5
EGAD00001011206	ChIPseq targeting the cohesin subunit STAG1 in STAG2-mutated AMLs or cohesin wildtype AMLs	NextSeq 500	1
EGAD00001011207	Low-pass whole-genome DNA sequencing of cohesin-mutated (STAG2 or RAD21 mutations) and wildtype (CTRL-AML) adult AMLs generated from ultrasound-fragmented genomic DNA. Samples were sequenced only shallowly (20-40 million reads) and used for digital karyotyping and ChIP-seq background/copy-number normalization/correction.	NextSeq 500	9
EGAD00001011208	The dataset contains methylation values of all SNP-filtered CpG sites for all samples from the air pollution study (total n=60). Nasal lavage samples were collected from n=29 moderately exposed (residing in Stuttgart) and n=31 lowly exposed (residing in Simmerath) individuals. For methods and study details, please see PMID 37343754.		1
EGAD00001011209	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0063_000 for Triple negative breast cancer patient-derived xenograft SA609X3XB01584	NextSeq 500	1
EGAD00001011210	10x Single Cell Gene Expression library TENX068 for Triple negative breast cancer patient-derived xenograft SA609X4XB03080	NextSeq 500	1
EGAD00001011211	10x Single Cell Gene Expression library TENX069 for Triple negative breast cancer patient-derived xenograft SA609X4XB03083	NextSeq 500	1
EGAD00001011212	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0150_001 for Triple negative breast cancer sample SA609X5XB03230	Illumina HiSeq X	1
EGAD00001011213	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0150_002 for Triple negative breast cancer sample SA609X5XB03231	Illumina HiSeq 2500 Illumina HiSeq X	1
EGAD00001011214	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0146_002 for Triple negative breast cancer patient-derived xenograft SA609X5XB03223	Illumina HiSeq 2500	1
EGAD00001011215	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0152_001 for Triple negative breast cancer patient-derived xenograft SA609X6XB03401	Illumina HiSeq 2500	1
EGAD00001011216	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0152_002 for Triple negative breast cancer sample SA609X6XB03404	Illumina HiSeq 2500	1
EGAD00001011217	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0172_001 for Triple negative breast cancer sample SA609X6XB03447	Illumina HiSeq 2500	1
EGAD00001011218	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0163_002 for Triple negative breast cancer sample SA609X7XB03510	Illumina HiSeq 2500	1
EGAD00001011219	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0163_001 for Triple negative breast cancer patient-derived xenograft SA609X7XB03505	Illumina HiSeq 2500	1
EGAD00001011220	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0172_002 for Triple negative breast cancer sample SA609X7XB03554	Illumina HiSeq 2500	1
EGAD00001011221	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0148_001 for Triple negative breast cancer patient-derived xenograft SA535X4XB02498	Illumina HiSeq 2500	1
EGAD00001011222	10x Single Cell Gene Expression library TENX048 for Triple negative breast cancer sample SA535X5XB02895	NextSeq 500	1
EGAD00001011223	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0146_006 for Triple negative breast cancer sample SA535X6XB03099	Illumina HiSeq 2500	1
EGAD00001011224	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0175_001 for Triple negative breast cancer patient-derived xenograft SA535X6XB03101	BGISEQ-500 Illumina HiSeq 2500	1
EGAD00001011225	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0182_001 for Triple negative breast cancer patient-derived xenograft SA535X7XB03304	Illumina HiSeq 2500	1
EGAD00001011226	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0189_001 for Triple negative breast cancer patient-derived xenograft SA535X7XB03448	Illumina HiSeq 2500	1
EGAD00001011227	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0184_001 for Triple negative breast cancer sample SA535X7XB03305	Illumina HiSeq 2500	1
EGAD00001011228	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0146_004 for Triple negative breast cancer sample SA535X7XB03305	Illumina HiSeq 2500	1
EGAD00001011229	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0189_002 for Triple negative breast cancer patient-derived xenograft SA535X8XB03663	Illumina HiSeq 2500	1
EGAD00001011230	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0182_002 for Triple negative breast cancer sample SA535X8XB03431	Illumina HiSeq 2500	1
EGAD00001011231	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0184_002 for Triple negative breast cancer patient-derived xenograft SA535X8XB03434	Illumina HiSeq 2500	1
EGAD00001011232	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0173_002 for Triple negative breast cancer patient-derived xenograft SA535X9XB03617	Illumina HiSeq 2500	1
EGAD00001011233	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0173_001 for Triple negative breast cancer patient-derived xenograft SA535X9XB03616	Illumina HiSeq 2500	1
EGAD00001011234	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0195_001 for Triple negative breast cancer patient-derived xenograft SA535X8XB03664	Illumina HiSeq 2500	1
EGAD00001011235	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0206_002 for Triple negative breast cancer patient-derived xenograft SA535X10XB03696	Illumina HiSeq 2500	1
EGAD00001011236	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0206_001 for Triple negative breast cancer sample SA535X10XB03693	Illumina HiSeq 2500	1
EGAD00001011237	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0208_002 for Triple negative breast cancer patient-derived xenograft SA535X9XB03776	Illumina HiSeq 2500	1
EGAD00001011238	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0142_002 for Triple negative breast cancer sample SA1035X4XB02879	Illumina HiSeq 2500	1
EGAD00001011239	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0071_000 for Triple negative breast cancer sample SA1035X5XB03015	Illumina HiSeq 2500	1
EGAD00001011240	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0076_000 for Triple negative breast cancer sample SA1035X5XB03021	Illumina HiSeq 2500	1
EGAD00001011241	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0079_001 for Triple negative breast cancer sample SA1035X6XB03216	HiSeq X Ten Illumina HiSeq 2500	1
EGAD00001011242	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0079_002 for Triple negative breast cancer patient-derived xenograft SA1035X6XB03211	HiSeq X Ten Illumina HiSeq 2500	1
EGAD00001011243	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0142_004 for Triple negative breast cancer sample SA1035X6XB03209	Illumina HiSeq 2500	1
EGAD00001011244	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0145_001 for Triple negative breast cancer patient-derived xenograft SA1035X7XB03338	NextSeq 500	1
EGAD00001011245	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0145_002 for Triple negative breast cancer patient-derived xenograft SA1035X7XB03340	NextSeq 500	1
EGAD00001011246	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0162_002 for Triple negative breast cancer sample SA1035X7XB03502	Illumina HiSeq 2500	1
EGAD00001011247	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0151_001 for Triple negative breast cancer patient-derived xenograft SA1035X8XB03425	Illumina HiSeq 2500	1
EGAD00001011248	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0162_001 for Triple negative breast cancer sample SA1035X8XB03420	Illumina HiSeq 2500	1
EGAD00001011249	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0175_002 for Triple negative breast cancer patient-derived xenograft SA1035X8XB03631	Illumina HiSeq 2500	1
EGAD00001011250	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0164_002 for Triple negative breast cancer sample SA530X3XB03295	Illumina HiSeq 2500	1
EGAD00001011251	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0014_001 for Triple negative breast cancer patient-derived xenograft SA604X6XB01979	Illumina HiSeq 2500	1
EGAD00001011252	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0015_001 for Triple negative breast cancer patient-derived xenograft SA604X6XB01979	Illumina HiSeq 2500	1
EGAD00001011253	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0019_001 for Triple negative breast cancer patient-derived xenograft SA604X7XB02089	NextSeq 500	1
EGAD00001011254	10x Single Cell Gene Expression library SCRNA10X_SA_CHIP0020_002 for Triple negative breast cancer patient-derived xenograft SA604X8XB02164	Illumina HiSeq 2500	1
EGAD00001011255	Data supporting: "Understanding the malignant potential of gastric metaplasia of the oesophagus and its relevance to Barrett’s Oesophagus surveillance: individual-level data analysis" Black et al (WGS BOs/normals)	unspecified	28
EGAD00001011256	Dataset contains WGS sequencing data from bulk sorted therapy-related myeloid neoplasms (t-MN) and reference cells (MSCs/B cells/T cells). In addition, from 4 patients, also WGS data from single hematopoietic stem and progenitor cells, obtained from samples of t-MN diagnosis, are included. These are either clonally expanded before WGS, or DNA was directly amplified via the primary template-directed amplification (PTA) protocol (mentioned in the sample name).	Illumina NovaSeq 6000	72
EGAD00001011257	WGS data of clonally expanded HSPCs from a Li-Fraumeni patient at the time of second cancer (Burkitt lymphoma and <5% t-MN) after primary osteosarcoma diagnosis and a reference MSC bulk.	Illumina NovaSeq 6000	10
EGAD00001011258	We identified a T-cell receptor (TCR) reactive to the recurrent FLT3 D835Y mutation in the tyrosine-kinase domain. To validate the elimination efficacy of leukemia cells, we transplanted human acute myeloid leukemia (AML) cells with FLT3 D835Y mutations into NSG-SGM3 mice and treated either with TCR FLT3 D835Y redirected T cells, or control TCR (TCR 1G4). After treatment, we performed flow sorting of AML blasts (CD3-CD19-) and primary T cells (CD3+CD8+orCD4+CD19-CD33-) and performed whole-exome sequencing.	NextSeq 550	4
EGAD00001011259	10x genomics single-cell RNAseq of an isogenic human iPSC model for SMA and control. The transcriptomic analysis was performed at 3 timepoints, day 4, day 20 and day 40. The analysis of this dataset was reported in the manuscript "An isogenic human iPSC model unravels neurodevelopmental abnormalities in SMA" from Grass et al.:	Illumina NovaSeq 6000	8
EGAD00001011260	A prospective study of individuals with suspicion of a hereditary cancer syndrome for whom previous clinical targeted genetic testing was either not informative or was not available. To identify pathogenic disease-causing variants explaining participant presentation, germline whole-genome sequencing and a comprehensive cancer virtual gene panel analysis were undertaken.	Illumina NovaSeq 6000	182
EGAD00001011264	This dataset contains the count matrices and corresponding metadata for our study on bronchial epithelial cells response to RSV in healthy and in asthma. This scRNAseq data is from primary cells, that have been differentiated in ALI cultures and infected with RSV.	Illumina NovaSeq 6000	8
EGAD00001011265	Data from sequencing of microbiopsies of keratinocytes isolated via laser capture microscopy from lesional and non-lesional skin biopsies from psoriasis patients.	Illumina NovaSeq 6000	1211
EGAD00001011267	ATAC-seq (Illumina TDE1 Transposase) to profile accessible chromatin regions of cohesin-mutated (STAG2 or RAD21 mutations) and -wildtype adult AMLs.	NextSeq 500	1
EGAD00001011268	10x Single Cell Gene Expression library TENX063 for Triple negative breast cancer patient-derived xenograft SA609X3XB01584	NextSeq 500	1
EGAD00001011269	Data supporting: "Mutational signature dynamics shaping the evolution of oesophageal adenocarcinoma" Abbas et al (RNA for 197 samples)	Illumina HiSeq 2000	1
EGAD00001011271	Whole genome sequencing on DNA from snap frozen tumour sample and matched whole blood of patient #130. Analysis used the QIAamp DNA Mini Kit. Libraries were prepared using the Illumina TruSeq Nano library method using 200ng of DNA. Extracted DNA was sheared using the Covaris M220 Focused-ultrasonicator with a target fragment length of 550bp through bead size selection. The libraries were sequenced at depth of 40x for germline DNA and 80-100x for tumour DNA using paired 150bp reads.	Illumina NovaSeq 6000	2
EGAD00001011272	Whole Exome Sequencing of blood and FFPE tumour of patient #368.  150–300 ng of DNA was fragmented to approximately 200 bp using a focal acoustic device. Libraries were prepared with the Kapa Hyper Prep Kit and SureSelectXT adaptors. Hybridisation capture was performed with SureSelect Clinical Research Exome V2 baits following the SureSelectXT recommended protocol (Agilent). Indexed libraries were sequenced on an Illumina NovaSeq 6000 to generate paired-end 150 bp reads with average of 70-fold base coverage for the germline sample and 330-fold coverage for tumour sample.	Illumina NovaSeq 6000	2
EGAD00001011273	Single RNA-Seq of CD11b Beads selected tumor associated macrophages (TAMs) of 3 glioblastoma patients treated with small molecule inhibitors. The sequencing was done on HiSeq 4000 with the SmarTer Ultra Low Input RNA v4 and NEBNext ChIP-Seq Kit. The TAMs were treated with GW2580, BLZ945 and PLX3397 and DMSO as control.	Illumina HiSeq 4000	12
EGAD00001011274	Tumor Organoids from glioblastoma, 2 patients, treated with different small molecule inhibitors. Paired RNA-Seq was done on NovaSeq 6000 with the Illumina TruSeq stranded mRNA Kit. The small inhibitors GW2580, BLZ945 and PLX3397 were used. DMSO was used as control.	Illumina NovaSeq 6000	8
EGAD00001011275	This dataset contains scRNA-seq data from 8 co-cultures of GSCCs and macrophages, and 1 monoculture of macrophages. Samples were individually labeled and pooled using MULTI-seq technology and processed with 10x Genomics Technology.	Illumina NovaSeq 6000	1
EGAD00001011276	This dataset contains DNA sequencing data from 7 GSCCs (data for BT569 is not available). We performed focused exome sequencing (43 most mutated genes in GBM), combined with OneSeq analysis (Agilent) on the GSCCs and identified both shared and GSCC-specific mutations. For CME038, whole genome sequencing was performed.	unspecified	7
EGAD00001011277	Single cell RNA paired-end sequencing was performed on CD34+ progenitor cells derived from 4 SLE patients, 6 healthy control samples and 2 umbilical cord blood samples using Illumina MiSeq/NextSeq 500.	Illumina MiSeq NextSeq 500	426
EGAD00001011278	This dataset contains paired-end whole-exome sequencing data (2x50 bp) from the normal samples, the primary tumors and the recurrences/metastases of 8 head and neck cancer patients.	Illumina NovaSeq 6000	27
EGAD00001011279	This dataset contains paired-end RNA sequencing data (2x50 bp) from the primary tumors and the recurrences/metastases of 6 head and neck cancer patients.	Illumina HiSeq 2500	15
EGAD00001011281	Single-cell CITE(cellular indexing of transcriptomes and epitopes)-seq from MDS (n = 2, MDS02 in 2 replicates). cDNA from 10x Genomics 3' V3.	Illumina NovaSeq 6000	6
EGAD00001011282	Single-cell long-read (ONT) transcriptome sequencing from CH (n = 3), MDS (n = 5, MDS02 in 2 replicates) and AML (n =1, in 2 replicates) samples with mutations in splicing factors (SF3B1 - n = 8, U2AF1 n = 1) or transcription regulators (DNMT3A - CH04). Full length cDNA from 10x Genomics 3' V3.	MinION PromethION	11
EGAD00001011283	Single-cell RNA sequencing from CH (n = 2), MDS (n = 6, MDS02 in 2 replicates) and AML (n = 1, in 2 replicates) samples with mutations in splicing factors (SF3B1 - n = 8, U2AF1 n = 1). cDNA from 10x Genomics 3' V3.	Illumina NovaSeq 6000	19
EGAD00001011284	Single-cell targeted amplicon RNA sequencing from CH (n = 3), MDS (n = 6, MDS02 in 2 replicates) and AML (n =1, in 2 replicates) samples with mutations in splicing factors (SF3B1 - n = 8 [2 -CH, 6 MDS], U2AF1 n = 1) or transcription regulators (DNMT3A - CH04). The transcripts are targeted to detect a particular CH mutation in the listed genes. U2AF1 GoT are full-length cDNA, while the rest follow the 10x 3' V3 protocol.	Illumina NovaSeq 6000 MinION NextSeq 500	21
EGAD00001011285	Multiple large-scale genomic profiling efforts have been undertaken in osteosarcoma to define the genomic drivers of tumorigenesis, therapeutic response, and disease recurrence. The spatial and temporal intratumor heterogeneity could also play a role in promoting tumor growth and treatment resistance. Here, we conducted longitudinal whole-genome sequencing of 37 tumor samples from eight patients with osteosarcoma that relapsed or became refractory to initial therapy. We found that the chemoresistant population in recurrent osteosarcoma is subclonal at diagnosis, emerges at the time of primary resection due to selective pressure from neoadjuvant chemotherapy, and is characterized by unique oncogenic amplifications.	Illumina NovaSeq 6000	37
EGAD00001011286	AD Samples 1 and 2. AD1 has 2 runs of additional sequencing, AD2 has 1 run of additional sequencing.	Illumina HiSeq 4000	2
EGAD00001011288	Single cell transcriptomics (10x 3') of human adrenal gland. The adrenal gland of this particular individual is characterized by the presence of mutant clone characterized by aneuploidy in chromosomes 8, 9, 13 and 22, occupying the zona glomerulosa and zona fasciculata. This dataset contains the primary sequencing data generated from the 10x genomics 3' library in fastq format.	Illumina HiSeq 4000	1
EGAD00001011289	RNA seq of iPSC-derived cardiomyocytes cultured in lipid-enriched maturation medium (MM), in MM on nanopattern-surfaces (NP) and under the influence of MM, NP and electrical Stimulation (ES, 2Hz, 14 days). 9 Samples from 3 Independent batches/differentiations of cardiomyocytes from 2 different iPSC-lines (iWTD2.1 and isWT7.22).	Illumina NovaSeq 6000	9
EGAD00001011290	We performed Whole Exome (WXS) and RNASeq sequencing on samples obtained from the phase 3 randomized, double-blinded, placebo-controlled study in patients with locally advanced squamous cell carcinoma of the head and neck JAVELIN Head and Neck 100 (NCT02952586). There are 471 WXS Tumor samples and 346 RNASeq tumor samples. All tumor samples have matching normal samples.	Illumina NovaSeq 6000	346
EGAD00001011291	This dataset contains sequencing data from VLP-enriched fecal microbiome from LLNEXT project	HiSeq X Ten	206
EGAD00001011293	This dataset contains sequencing data from fecal microbiome from LLNEXT project	HiSeq X Ten	321
EGAD00001011294	RNASeq files for paper titled "Proposal of a new genomic framework for categorization of pediatric acute myeloid leukemia associated with prognosis"	Illumina HiSeq 2000	307
EGAD00001011295	WGS files for paper titled "Proposal of a new genomic framework for categorization of pediatric acute myeloid leukemia associated with prognosis"	Illumina HiSeq 2000	264
EGAD00001011296	WXS files for paper titled "Proposal of a new genomic framework for categorization of pediatric acute myeloid leukemia associated with prognosis"	Illumina HiSeq 2000	215
EGAD00001011297	Dataset consists of 216 samples, with each sample having a BAM, BAI and VCF file.	Ion Torrent S5	216
EGAD00001011300	116 single cell ATAC runs and 23 single cell multiome (ATAC+RNA) runs on various brain regions during human first-trimester development.	Illumina NovaSeq 6000	116
EGAD00001011301	Dataset is described in doi: https://doi.org/10.1101/2022.12.13.22283363	Illumina HiSeq 2500	108
EGAD00001011302	We performed Whole Exome (WXS) and RNASeq sequencing on samples obtained from the same site before and during therapy from our prospective clinical trial (CA209-153, NCT02066636) of nivolumab in advanced Non-small cell lung cancer (NSCLC) patients that progressed on chemotherapy. There are 58 pre and 42 on therapy WXS samples and 24 pre and 12 on therapy RNASeq samples. All WXS tumor samples have matching normal samples.	Illumina HiSeq 4000	200
EGAD00001011303	In this study nanopore sequencing was applied to obtain sparse DNA methylation profiles from pediatric CNS tumor samples. A neural network was used to classify the tumor based on the obtained methylation profile.	MinION PromethION	62
EGAD00001011304	Whole genome sequencing data of 26 high-grade serous carcinoma (HGSC) patients (87 samples) sequenced with MGISEQ-2000 and HiSeq X Ten.	HiSeq X Ten unspecified	84
EGAD00001011305	As part of the study "Systematic analysis of paralogous regions in 41,755 exomes uncovers clinically relevant variation", we wanted to identify which SNVs are located in STRC and which in STRCP1. For this we applied targeted long-read sequencing.	Ion Torrent Proton	7
EGAD00001011309	Diffuse large B-cell lymphoma (DLBCL) is the most common non-Hodgkin lymphoma (NHL), comprising 25-30% of all NHL in developed countries with an annual incidence in the USA of 7 cases/100000 persons/year. Collectively, DLBCL is classified based on a common morphological appearance of diffuse growth of large transformed B-cells, immunophenotype, high proliferation rate and aggressive behaviour. Despite these similarities, DLBCLs are a heterogeneous collection of malignancies with distinct clinical and molecular characteristics that do not always correlate with immunohistological features. This gene expression dataset includes transcriptomes of ABC-DLBCLs and of GCB-DLBCLs where cell of origin is determined by the HTG-EdgeSeq quantitative nuclease protection assay. Also included are clonality results from BCR profiling from high-grade B-cell lymphomas sequenced using a NOVA sequencer	unspecified	8
EGAD00001011311	We performed single cell RNA- and TCR-sequencing (10x Genomics) on immune infiltrates (CD45+ cells) from 18 HNSCC patients enrolled in the IMCISION trial (Vos et al. 2021). Viable immune cells were isolated from pre-treatment and post-treatment primary tumor biopsies of 10 patients responding (1 partial pathological response and 9 major pathological responses) and 7 patients non-responding to anti-PD-1 and anti-CTLA4 combination immunotherapy. One patient treated with anti-PD-1 monotherapy (1 major pathological response) was included in the dataset. Bulk TCR-seq was performed on the PBMCs of responding patients, pre- and post-treatment.	unspecified	137
EGAD00001011312	This dataset contains raw sequencing data from multi-timepoint cell-free methylated DNA immunoprecipitation and sequencing (cfMeDIP-seq) of plasma samples from the INSPIRE study (NCT02644369). Details about the study, including inclusion/exclusion criteria and interventions are available at https://clinicaltrials.gov/study/NCT02644369. Briefly, five cohorts were included, INS-A (head & neck squamous cell carcinoma), INS-B (triple-negative breast cancer), INS-C (high-grade serous ovarian cancer), INS-D (melanoma), and INS-E (mixed solid tumors). All patients in the study were diagnosed with advanced cancer and received treatment with pembrolizumab. Plasma samples were collected at baseline and every 3 cycles until disease progression, death, loss of follow-up, or completion of the study. Plasma samples were first processed for cell-free DNA mutations and remaining samples were then processed by cfMeDIP-seq, prioritizing baseline and post-cycle 3 samples. In total, data from 204 timepoints from 87 distinct patients are deposited.	Illumina NovaSeq 6000	204
EGAD00001011315	We profiled 16 patient tumor samples by single-cell or single-nuclei RNA-seq using 10X Chromium 3'. It includes 4 low-grade gliomas and 12 ependymomas. The raw fastqs are provided.	Illumina HiSeq 4000 Illumina NovaSeq 6000	16
EGAD00001011317	Total RNA sequencing of olfactory mucosa (OM) cells derived from cognitively healthy individuals exposed to traffic-related ultrafine particles (UFPs) for 24h and 72h in submerged cultures. The UFPs used for exposures were: A0, A20 and Euro6. Exposures were compared to the corresponding blank samples.		1
EGAD00001011318	Multiomics data for a cohort of 20 COVID-19 patients (10 patients mild, 3 patients moderate, 4 patients severe, 3 patients critical) obtained from peripheral blood mononuclear cells (PBMCs) sampled longitudinally at hospital admission, discharge, and 1 month thereafter. The data were obtained using a multiwell-based single-cell technology (BD Rhapsody) that includes the analysis of PBMCs' whole transcriptome and a set of 52 surface proteins. The samples of the different patients at different collection times were labelled using a cell hashing strategy with the BD Single-Cell Multiplexing Kit (6 samples for each run).	Illumina NovaSeq 6000	13
EGAD00001011319	Multiomics data for a cohort of 20 COVID-19 patients (10 patients mild, 3 patients moderate, 4 patients severe, 3 patients critical) obtained from peripheral blood mononuclear cells (PBMCs) sampled longitudinally at hospital admission, discharge, and 1 month thereafter. The data were obtained using a multiwell-based single-cell technology (BD Rhapsody) that includes the targeted expression of BD Immune Response Targeted Panel, the TCR/BCR profiling and a set of 52 surface proteins. The samples of the different patients at different collection times were labelled using a cell hashing strategy with the BD Single-Cell Multiplexing Kit (6 samples for each run).	Illumina NovaSeq 6000	13
EGAD00001011320	This dataset includes WGS data of 12 ancient individuals (97–688 years BP) from Zambia and South Africa, presented in Fortes-Lima et al. Nature 2023. Further details like C14-dates and archaeological descriptions were reported elsewhere (Meyer et al. Azania 2021; Steyn et al. African Archaeological Review 2022).	Illumina NovaSeq 6000	12
EGAD00001011321	We performed Whole Exome (WXS) and RNASeq sequencing on samples obtained from the phase 3 randomized, double-blinded, placebo-controlled study in patients with locally advanced squamous cell carcinoma of the head and neck JAVELIN Head and Neck 100 (NCT02952586). There are 471 WXS Tumor samples and 346 RNASeq tumor samples. All tumor samples have matching normal samples.	Illumina HiSeq 4000 Illumina NovaSeq 6000	421
EGAD00001011322	Raw molecular data of Vd2 T cells upon treatment with different stimuli and mevalonate pathway inhibitors.	unspecified	20
EGAD00001011323	The whole study comprises two patient cohorts. Screening cohort: 40 patients from Germany; validation cohort: 40 patients from Asia. Further, bile duct and CCA cell lines have been analyzed. This dataset contains targeted DNA sequencing data of 40 tumor/normal pairs from the validation cohort plus 10 tumor/normal pairs of patients from the screening cohort for technical validation. Data was generated on Illumina HiSeq 2000 device in paired-end mode and is stored in BAM file format.	Illumina HiSeq 2000	100
EGAD00001011324	The whole study comprises two patient cohorts. Screening cohort: 40 patients from Germany; validation cohort: 40 patients from Asia. Further, bile duct and CCA cell lines have been analyzed. This dataset contains whole exome sequencing data of 37 tumor/normal pairs from the screening cohort plus an additional relapse tumor of one of those 37 patients. Data was generated on Illumina HiSeq 2000 device in paired-end mode and is stored in BAM file format.	Illumina HiSeq 2000	75
EGAD00001011325	The whole study comprises two patient cohorts. Screening cohort: 40 patients from Germany; validation cohort: 40 patients from Asia. Further, bile duct and CCA cell lines have been analyzed. This dataset contains 44 samples derived from RNA-sequencing data of bile duct and CCA cell lines. Data was generated on Illumina NovaSeq 6000 device in paired-end mode and is stored in compressed FASTQ file format.	Illumina NovaSeq 6000	44
EGAD00001011326	The whole study comprises two patient cohorts. Screening cohort: 40 patients from Germany; validation cohort: 40 patients from Asia. Further, bile duct and CCA cell lines have been analyzed. This dataset contains fusion gene analysis using multiplex single primer extension-based RNA-sequencing for a subset of 25 patients of the screening cohort. Data was generated on Illumina NextSeq 550 device in paired-end mode and is stored in compressed FASTQ file format.	NextSeq 550	25
EGAD00001011327	10x Genomics 5' library scRNA-seq data for 85 B-ALL patients with different subtypes	Illumina NovaSeq 6000	85
EGAD00001011328	scWGS-seq of flow sorted blast and normal cells from SJE2A063 with 80 high quality cells sequenced (75 blast and 5 normal)	Illumina NovaSeq 6000	80
EGAD00001011329	scWGS-seq of flow sorted blast and normal cells from SJE2A066 with 69 high quality cells sequenced (62 blast and 7 normal)	Illumina NovaSeq 6000	69
EGAD00001011330	scWGS-seq of flow sorted blast and normal cells from SJE2A067 with 78 high quality cells sequenced (73 blast and 5 normal)	Illumina NovaSeq 6000	78
EGAD00001011331	HGSC cases were selected for which matched fresh frozen and FFPE samples were available from the same tissue specimens. Fresh frozen, FFPE and normal blood were subject to WGS for the purpose of assessing the possibility of using FFPE WGS in place of fresh frozen for somatic mutation calling.	Illumina HiSeq X	6
EGAD00001011333	This dataset consists of RNA sequencing data (FASTQs) from intestinal mucosal biopsies from 9 IBD patients. All patients had endoscopically active disease and were not receiving immunosuppressive or biologic therapies. All biopsies (6 per donor) were collected from a single inflamed site. Biopsies were cultured for 18 hours at an air-liquid interface in media containing either DMSO (vehicle control), PD-0325901 (0.5uM) or infliximab (10ug/ml; MSD) - two biopsies per condition. After 18 hours, biopsies were harvested and snap frozen. After lysis, RNA was extracted using an AllPrep DNA/RNA Mini Kit (Qiagen). Sequencing libraries were prepared from 10ng RNA using the SMARTer Stranded Total RNA-Seq Kit v2 - Pico Input Mammalian (Takara) following the manufacturer’s instructions. The quality and molarity of all libraries was assessed using a BioAnalyzer 2100 and the libraries were sequenced on a NovaSeq 6000 (100bp, PE reads).	Illumina NovaSeq 6000	27
EGAD00001011334	CTD-ILD BALF and blood scRNA-seq dataset consists of single-cell transcriptome data of bronchoalveolar lavage fluid (BALF) and blood derived from 30 connective tissue disease-associated interstitial lung disease (CTD-ILD) and 12 idiopathic interstitial pneumonia (IIP) patients.	Illumina HiSeq 1000	161
EGAD00001011335	RNA-Seq and ATAC-Seq of iPSC derived neurons under baseline and KCl stimulation conditions from 10 distinct donors, including 5 healthy controls and 5 schizophrenic individuals. scATAC of human post mortem prefrontal cortex from 4 adult individuals including 2 neurotypical individuals and 2 schizophrenic individuals.	Illumina HiSeq 4000 Illumina NovaSeq 6000	44
EGAD00001011336	Intestinal organoids treated with either busulfan, fludarabine or clofarabine for 24h	NextSeq 500	12
EGAD00001011337	This dataset comprises raw RNA sequencing from inflammatory (TPP) macrophages that were treated with the MEK inhibitor PD-0325901 (100nM or 500nM) or vehicle control (n=3 donors). MEK inhibitor or vehicle control was added on day 4 and cells were harvested on day 6 and lysed. RNA was extracted from cell lysates using an AllPrep DNA/RNA Mini Kit (Qiagen). Sequencing libraries were prepared from 10ng RNA using the SMARTer Stranded Total RNA-Seq Kit v2 - Pico Input Mammalian (Takara) following the manufacturer’s instructions. The quality and molarity of all libraries was assessed using a BioAnalyzer 2100 and the libraries were sequenced on a NextSeq500. Raw data are provided as 50 bp paired-end Illumina reads.	NextSeq 500	9
EGAD00001011338	This dataset comprises raw RNA sequencing data (FASTQs) from primary inflammatory (TPP) macrophages that were unedited (non-targeting control, NTC), edited to delete the disease-associated chr21q22 enhancer region (n=5), or edited to disrupt ETS2 with 1 of 2 independent gRNAs (n=9). We also performed RNA-sequencing in NTC or ETS2-edited TPP macrophages that were treated for 12 hours with vehicle (DMSO) or roxadustat (30 uM, n=3). Macrophages were detached using Accutase and lysed. RNA was extracted from cell lysates using an AllPrep DNA/RNA Mini Kit (Qiagen). Sequencing libraries were prepared from 10ng RNA using the SMARTer Stranded Total RNA-Seq Kit v2 - Pico Input Mammalian (Takara) following the manufacturer’s instructions. The quality and molarity of all libraries was assessed using a BioAnalyzer 2100 and the libraries were sequenced on a NextSeq500. Raw data are provided as 50 bp paired-end Illumina reads.	NextSeq 500	38
EGAD00001011339	Fastq files are deposited for single-cell transcriptomes of patient H3-K27M diffuse midline gliomas, generated using the SMART-seq2 method.	NextSeq 500	13
EGAD00001011340	The dataset consists of 12 samples of monocytes, which were analyzed using RNA-sequencing (RNA-seq). These samples were categorized into four distinct groups, each comprising three samples. The dataset was designed to investigate the transcriptomic and metabolic profiles of monocytes across different age groups and stimulation conditions. 3 neonatal controls, 3 neonatal LPS stimulated, 3 adult controls and 3 adult LPS stimulated samples.	Illumina HiSeq 1500	12
EGAD00001011341	This dataset consists of RNA sequencing data from an ETS2 overexpression experiment in M0 macrophages. Controlled overexpression of ETS2 mRNA or control mRNA (an equivalent amount of mRNA encoding the reverse complement of ETS2 – thereby controlling for the quantity, length and purine/pyrimidine composition of the transfected RNA but with a transcript that would not be translated) was induced in resting, non-activated (M0) macrophages by transfecting predefined amounts of in vitro transcribed mRNA. To minimise non-specific activation due to the transfected RNA, in vitro transcription was performed using co-transcriptional capping (to minimise uncapped products), and incorporating modified, minimally immunogenic nucleotides (replacing uridine with N1-methyl-pseudouridine and cytidine with methylcytidine). After 18 hours, transfected cells were activated with low dose LPS and harvested for RNA-sequencing 6 hours later. Sequencing libraries were prepared from 10ng RNA using the SMARTer Stranded Total RNA-Seq Kit v2 - Pico Input Mammalian (Takara) following the manufacturer’s instructions. The quality and molarity of all libraries was assessed using a BioAnalyzer 2100 and the libraries were sequenced on a NovaSeq6000. Raw data are provided as 100 bp paired-end Illumina reads from n = 8 donors.	Illumina NovaSeq 6000	32
EGAD00001011342	Single-cell RNA sequencing of 43 bronchoalveolar lavage fluid (BALF) samples from 15 CAPA and 22 COVID-19 mechanically ventilated patients.	Illumina NovaSeq 6000	43
EGAD00001011343	Whole genome sequencing for tumour/normal matched pairs from 78 samples	Illumina HiSeq 2500	-
EGAD00001011345	scRNAseq and scTCRseq of serial peripheral blood mononuclear cell (PBMC) samples (n=72) taken at various timepoints before and during treatment (Week 0 (W0), Week 3 (W3), Week 6 (W6)). PBMC samples were pooled together into 37 pools, loading two or three samples per lane in the 10X Genomics chip, in equal proportions, according to a pre-designed pooling matrix.	Illumina NovaSeq 6000	70
EGAD00001011346	Single-cell TotalSeqC protein data from serial peripheral blood mononuclear cell (PBMC) samples taken from advanced HCC patients. TotalSeqC data are available for 41 of the 72 PBMC samples included in the study; these samples were combined into 20 pools.	Illumina NovaSeq 6000	20
EGAD00001011347	Single-cell RNA and TCR sequencing of 40 advanced HCC pre-treatment biopsies from 38 patients.	Illumina NovaSeq 6000	80
EGAD00001011348	TrypanoGEN+ phenotype data containing 183 samples from DRC, Malawi and Uganda.		1
EGAD00001011349	This dataset consists of sequencing data from an ETS2 CUT&RUN experiment in primary inflammatory (TPP) macrophages (n = 2). Pre-cultured TPP macrophages were harvested and processed immediately using the CUT&RUN Assay kit (Cell Signaling) according to the manufacturer’s instructions with the following modifications (essentially, avoiding the use of ConA-coated beads). Anti-ETS2 (ThermoFisher) or IgG control (Cell Signaling) antibodies were used for targeted digestion of chromatin. For each donor, 5x10^5 cells were pelleted, washed in Wash Buffer, and resuspended in Antibody Binding buffer. Cells were incubated with antibodies (1:100 dilution for anti-ETS2) for 2h at 4°C. After washing in Digitonin Buffer, cells were incubated with pA/G-MNase for 1h at 4°C. Cells were washed twice in Digitonin Buffer, resuspended in the same buffer and cooled for 5 minutes on ice. Calcium chloride was added to activate pA/G-MNase digestion and cells were incubated for 30 minutes at 4°C before Stop Buffer was added, and cells were incubated for 10 min at 37°C to release cleaved chromatin fragments. Supernatants were collected by centrifugation and DNA extracted using DNA Purification Buffers and Spin Columns (Cell Signaling). Library preparation was performed according to a protocols.IO protocol (dx.doi.org/10.17504/protocols.io.bagaibse) using the NEBNext Ultra II DNA Library Prep Kit. Size selection was performed using AMPure XP beads (Beckman Coulter) and fragment sizes were determined using an Agilent 2100 Bioanalyzer (High Sensitivity DNA kit). Equimolar pools of indexed libraries were sequenced on an Illumina NovaSeq 6000 (100bp PE reads). Raw data are provided as 100 bp paired-end Illumina reads.	Illumina NovaSeq 6000	4
EGAD00001011350	We performed scRNA-seq on 15 fresh lymph node core biopsies from 7 Lymphoma patients treated with CD20xCD3 bispecific antibodies. We also performed whole-exome sequencing (WES) and bulk RNA-seq on formalin-fixed paraffin embedded (FFPE) tumor samples from these patients. WES was additionally performed on matched germline samples (blood) for each patient.	Illumina NovaSeq 6000	29
EGAD00001011351	This dataset consists of H3K27ac ChIP-sequencing in inflammatory (TPP) macrophages from 2 minor allele homozygotes and 2 major allele homozygotes at rs2836882. Monocytes were positively selected from PBMC using CD14 Microbeads and inflammatory macrophage differentiation performed using conditions that model chronic inflammation (TPP): 3 days GM-CSF (50ng/mL) followed by 3 days GM-CSF, TNFa (50ng/mL), PGE2 (1mg/mL), and Pam3CSK4 (1mg/mL). After harvesting, cells were cross-linked, quenched, lysed, and sheared. Immunoprecipitation of histone-DNA complexes was performed overnight at 4C with rotation using an anti-H3K27ac antibody and the SimpleChIP Plus Sonication ChIP kit (Cell Signaling Technology). Following reverse cross-linking, 50ng of immunoprecipitated DNA or input DNA were used to prepare sequencing libraries using the iDeal Library Preparation kit (Diagenode), according to manufacturer instructions. 10 PCR cycles were used for the amplification step and size selection was not performed. The quality and molarity of all libraries was assessed using a BioAnalyzer 2100 (Agilent) and the libraries were sequenced in pools of 8, with each pool being sequenced in 2 lanes of an Illumina HiSeq2500 high output flow-cell (50bp, single-end reads). Raw data are provided as raw and aligned single-end sequencing reads from H3K27ac-bound DNA and the input chromatin. Raw reads were trimmed using Trim Galore and aligned to the reference human genome (hg19) using Burrows-Wheeler Aligner (v0.7.12) with default parameters. Aligned reads were converted to BAM files, sorted, and technical duplicates merged before indexing – all using SAMtools (v1.4). PCR duplicates were identified using Picard tools (v2.18.1) and removed together with unmapped reads using SAMtools (v1.4). The resulting BAM files were re-sorted and indexed after filtering.	Illumina HiSeq 2500	8
EGAD00001011352	Includes 4 Datasets: 1. 35 Control Plasma WGS samples sequenced on NovaSeq V1.0 Chemistry at the New York Genome Center. Denoted by CTRL-2XX naming scheme. 2. Plasma WGS from 17 patients with Small Cell Lung Cancer. Samples extracted at either Pretreatment or Postoperative at weeks 2 or 3. Denoted by SCLC-XX naming scheme. 3. Plasma WGS from a synthetic mixing study of a high-burden melanoma plasma sample with plasma bag, at estimated tumor concentrations of 10e-3, 10e-4, 10e-5, and 10e-6 for 2 replicates. Denoted by SM-repX naming scheme. 4. Assorted Plasma, Tumor, and Normal WGS from patients with NSCLC expressing high-burden for use in model training. Denoted by NSCLC-2XX naming scheme.	Illumina NovaSeq 6000	83
EGAD00001011353	Whole genome sequencing data of 6 high-grade serous carcinoma (HGSC) patients (11 samples) sequenced with MGISEQ-2000	unspecified	11
EGAD00001011354	RNAseq fastq files for 254 samples for the neoALTTO study of lapatinib, trastuzumab or combination in HER2+ breast cancer patients. Those are pre-treatment baseline samples.	Illumina HiSeq 2500	254
EGAD00001011358	Whole-genome sequencing of 60 individuals from 15 Himalayan populations using the Illumina-B HiSeq X platform. These data are part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/.	HiSeq X Ten	1
EGAD00001011359	The dataset for the study "Early ctDNA molecular response captures therapeutic response in the first stage of CCTG BR.36 ctDNA-directed, multi-center phase II study of molecular response adaptive immunotherapy in non-small cell lung cancer", includes 134 bam files from hybrid capture targeted error-correction next-generation sequencing (PGDx Elio plasma resolve) from plasma cell-free DNA and matched white blood cell DNA from 35 individuals with non-small cell lung cancer on the BR.36 trial, alongside 11 bam files from targeted next generation sequencing (PGDx Elio tissue complete) of tumor DNA from 11 individuals with non-small cell lung cancer on the BR.36 trial.	NextSeq 550	153
EGAD00001011360	Abstract: Classic Hodgkin lymphoma (cHL) is a largely MHC class I-negative tumor with recurrent 9p24.1/ PD-L1/ PD-L2 copy gains and the highest reported response rates to PD-1 blockade. We utilized scRNA sequencing to characterize the peripheral immune response to PD-1 blockade and more broadly define non-CD8+ dependent mechanisms of immune evasion in cHL. Peripheral blood mononuclear cells were obtained from 20 patients with relapsed/refractory (R/R) cHL treated with PD-1 blockade (nivolumab) on the CheckMate 205 clinical trial (paired samples from cycle 1 day 1 [C1D1] and cycle 4 day 1 [C4D1]), 11 patients with newly diagnosed, previously untreated cHL and 13 healthy donors.	Illumina NovaSeq 6000	128
EGAD00001011361	Single-cell RNA-seq data of angioimmunoblastic T-cell lymphoma		-
EGAD00001011362	Single-cell dataset of human airways after in vitro regeneration at the Air-Liquid Interface. Samples from 2 donors were used for the establishment of Human Bronchial Epithelial Cell cultures (HBECs) and were analyzed during proliferation, at the onset of differentiation (ALI0) and after full differentiation (ALI28), in 4 distinct culture media. Single-cell RNAseq was performed with 10x Genomics 3’ v2 RNA-seq using hashtag antibodies for multiplexing, and sequenced on a NextSeq 500. Samples from 3 additional donors were used for the establishment of Human Nasal Epithelial Cell cultures (HNECs) and were analyzed during proliferation in three distinct culture media. Single-cell RNAseq was performed with 10x Genomics 3’ v3 RNA-seq without multiplexing, and sequenced on a NextSeq 2000.	NextSeq 500 unspecified	11
EGAD00001011363	We generated a dataset consisting of 79 VCF files, and respective FASTQ and CRAM files, methodically generated using the GLIMPSE1 imputation algorithm leveraging the 1000 Genomes Project Phase 3 dataset as the reference panel of haplotypes. In total this dataset is composed of approximately 325 GB of FASTQ data, 156 GB of CRAM data, and 6 GB of VCF data. Our samples were specifically derived from sequenced DNA from a highly selective cohort of patients, mostly comprised of Iberian Populations in Spain (IBS) individuals but also containing some individuals with other genetic backgrounds, who presented severe COVID-19 symptoms during the initial wave of the SARS-CoV-2 pandemic in Madrid, Spain. On average, each VCF file in this rich dataset contains 9.49 million high-confidence single nucleotide variants [95%CI: 9.37 million - 9.61 million].	unspecified	80
EGAD00001011364	This dataset includes the 2 extra normal samples adjacent to PTC tumors that were multiplexed and profiled using kits from different batches.	Illumina NovaSeq 6000	2
EGAD00001011365	This dataset includes all spatial transcriptomics experiments. Samples coming from the same patient were sequenced on the same flowcell. Patients PTC4 to PTC9 were also sequenced on the same flowcell, as well as ATC1 and ATC2 on another one, and ATC3A and ATC3B on another one.	Illumina NovaSeq 6000	37
EGAD00001011366	This dataset includes the first 9 PTC samples and 6 ATC samples profiled on the same sequencing flowcell.	Illumina NovaSeq 6000	15
EGAD00001011367	This dataset contains metagenomic sequencing of stool samples of babies from CS Baby Biome project, and their mothers	Illumina HiSeq 2000	195
EGAD00001011368	Whole genome sequencing data of 35 high-grade serous carcinoma (HGSC) patients (112 samples) sequenced with Illumina NovaSeq 6000	Illumina NovaSeq 6000	112
EGAD00001011369	221 patient samples including 164 initial tumor (TI) samples (53/164 fresh frozen TI samples, and 111/164 formalin-fixed paraffin-embedded (FFPE) TI samples), 22 paired matched normal samples, and 35 unpaired normal samples from healthy donors; FASTQ file format, Agilent SureSelect Human All Exon V6 Kit (Agilent Technologies, Inc., Santa Clara, California, USA)	Illumina NovaSeq 6000	1
EGAD00001011370	We profiled 31 osteosarcoma tumor patient samples, 18 blood and 1 saliva control samples by exome sequencing, for a total of 50 samples. The raw fastqs are provided.	unspecified	50
EGAD00001011371	We profiled 23 patient osteosarcoma tumor samples by bulk RNA-seq. The raw fastqs are provided.	unspecified	23
EGAD00001011372	We profiled 2 patient osteosarcoma tumor samples as well as the blood of the same patients by whole genome sequencing, for a total of 4 samples. The raw fastqs are provided.	unspecified	4
EGAD00001011373	BAM files for two families recruited to the HICF2 genome sequencing project due to craniosynostosis. One family is a singleton and the other is an affected mother-daughter duo.	Illumina HiSeq 2500 Illumina HiSeq 4000	3
EGAD00001011374	HNF1A haploinsufficiency causes decreased insulin expression, dysregulation of pancreatic progenitor signature genes and affects chromatin accessibility	Illumina NovaSeq 6000 NextSeq 500	-
EGAD00001011376	RNA-seq data	unspecified	7
EGAD00001011377	Whole-genome sequencing data	unspecified	23
EGAD00001011378	Paired tumor-normal whole genome sequencing data from primary tumors of patients diagnosed with neuroblastoma, Ewing sarcoma, Wilms tumor, hepatoblastoma and rhabdomyosarcoma		1
EGAD00001011379	HLA sequence data and final calls for VaccGene and 1000Gp3 African populations		1
EGAD00001011581	Exome sequencing data for study of the microenvironment of angioimmunoblastic T-cell lymphoma		1
EGAD00001011645	Embryogenesis is a vulnerable time. Mutations in developmental cells can result in the seeding of cells predisposed to disease within mature organs, creating a field effect. We characterise an embryonic cancer mutation that drives multifocal, multiphenotypic renal tumours in a 14-year-old girl. Their shared MTOR mutation, absent from normal tissues, increases protein flexibility which enables a FAT domain hinge to dramatically increase mTORC1 activity. Developmental mutations, not usually detected in traditional genetic screening, have vital clinical importance in guiding prognosis, targeted treatment, and family screening decisions for paediatric tumours.	Illumina NovaSeq 6000	1
EGAD00001011646	Embryogenesis is a vulnerable time. Mutations in developmental cells can result in the seeding of cells predisposed to disease within mature organs, creating a field effect. We characterise an embryonic cancer mutation that drives multifocal, multiphenotypic renal tumours in a 14-year-old girl. Their shared MTOR mutation, absent from normal tissues, increases protein flexibility which enables a FAT domain hinge to dramatically increase mTORC1 activity. Developmental mutations, not usually detected in traditional genetic screening, have vital clinical importance in guiding prognosis, targeted treatment, and family screening decisions for paediatric tumours.	Illumina HiSeq 2500	1
EGAD00001011647	Embryogenesis is a vulnerable time. Mutations in developmental cells can result in the seeding of cells predisposed to disease within mature organs, creating a field effect. We characterise an embryonic cancer mutation that drives multifocal, multiphenotypic renal tumours in a 14-year-old girl. Their shared MTOR mutation, absent from normal tissues, increases protein flexibility which enables a FAT domain hinge to dramatically increase mTORC1 activity. Developmental mutations, not usually detected in traditional genetic screening, have vital clinical importance in guiding prognosis, targeted treatment, and family screening decisions for paediatric tumours.	Illumina NovaSeq 6000	1
EGAD00001011648	Bam files of IBC whole exome sequencing, including tumor and matching normal	Illumina HiSeq 2000 Illumina HiSeq 2500	38
EGAD00001011649	RNA sequencing of IBC patients	Illumina HiSeq 2500	38
EGAD00001011666	TrypanoGEN+ data containing fastq files of 183 samples from DRC, Malawi and Uganda using NextSeq500.	NextSeq 500	183
EGAD00001011667	297 members of the LBC1921 were sequenced using the Illumina HiSeq X platform. This dataset contains the bam files.	HiSeq X Ten	1
EGAD00001011676	Germline BAMs from blood/saliva samples from patients diagnosed with both uveal and cutaneous patients. Reads have been aligned, deduplicated and recalibrated.	Illumina HiSeq 2000 Illumina NovaSeq 6000	81
EGAD00001011677	RNA-sequencing data of 5 human thyroid cancer cell lines cultured in control conditions	Illumina NovaSeq 6000	5
EGAD00001011678	RNA-sequencing data: 5 normal thyroid tissues, 14 papillary thyroid carcinomas, 2 lymph node metastases, 19 poorly differentiated thyroid carcinomas and 17 anaplastic thyroid carcinomas; Targeted DNA-sequencing of the 165 genes included in the “Solid and Haematological tumors” panel (BRIGHTCore, Brussels, Belgium): 2 normal thyroid tissues, 2 poorly differentiated thyroid carcinomas and 7 anaplastic thyroid carcinomas; 2 FASTQ files for each sample (paired).	Illumina NovaSeq 6000	68
EGAD00001011679	Blood plasma samples (n=168) and matched diagnostic formalin-fixed paraffin-embedded (FFPE) tissue samples (n=69) of DLBCL patients, PMBCL patients and healthy controls were collected between 2016-2021. Plasma samples were collected at diagnosis, at interim evaluation, after treatment, and in case of refractory or relapsed disease. RNA was extracted from 200 µl plasma using the miRNeasy serum/plasma kit and from FFPE tissue using the miRNeasy FFPE kit. RNA was subsequently sequenced on a NovaSeq 6000 instrument using the SMARTer Stranded Total RNA-seq pico v3 library preparation kit.	Illumina NovaSeq 6000	172
EGAD00001011680	We performed Whole Exome (WXS) and RNASeq sequencing on samples obtained from the phase 3 randomized, double-blinded, placebo-controlled study in patients with locally advanced squamous cell carcinoma of the head and neck JAVELIN Head and Neck 100 (NCT02952586). There are 471 WXS Tumor samples and 346 RNASeq tumor samples. All tumor samples have matching normal samples.	Illumina HiSeq 4000	421
EGAD00001011812	The dataset includes 43 high coverage (30x) whole genome samples mostly from the Sahelian belt.		1
EGAD00001011813	WES of LUAD	Illumina NovaSeq 6000	32
EGAD00001011815	Primary sclerosing cholangitis (PSC) is a T-cell mediated, chronic inflammatory condition of the biliary tree that is strongly associated with inflammatory bowel disease. Genome-wide association studies have identified 22 non-HLA genetic risk variants associated with PSC. Identifying the genes impacted by these variants has proven difficult as the majority lie in non-coding regions of the genome. Knowledge of the genes and biological pathways these non-coding variants are perturbing is vital to understanding the disease biology. One means of assessing the impact of non-coding variants within disease associated loci upon genes is via colocalisation with eQTL. Many eQTL are cell-type specific, requiring the analysis of disease relevant cell types to detect colocalisation. We have collected PSC-relevant T-cell-subtypes from the peripheral blood of PSC patients via fluorescence activated cell sorting in preparation for RNA sequencing and mapping of eQTL. Samples were collected at the Norfolk and Norwich University Hospital, for which local ethical approval has been granted. Lysed cell samples will be transferred to WTSI and DNA/RNA will be extracted from lysed cell samples by T143 before genotyping (DNA) and custom library preparation and sequencing (RNA). This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2500	12
EGAD00001011816	This data set includes serial biopsies from 75 patients with DLBCL. Tumour tissue was preserved either in FFPE or frozen. Each biopsy was sequenced with either whole genome or exome. A custom targeted sequencing data set is available to match most whole genome samples. RNAseq data were available for a subset of biopsies.	unspecified	237
EGAD00001011817	Plasma samples from patients with melanoma (stage II/III/IV) and breast cancer (stage IV) as well as healthy individuals were subjected to low-coverage whole-genome sequencing (less than 10x average depth). This dataset contains raw fastq files from 39 breast cancer, 127 melanoma and 42 healthy control plasma samples.	Illumina NovaSeq 6000	208
EGAD00001011819	Study describing the dynamics of chromatin organization within malignant rhabdoid tumors. This study describes how this chromatin organization changes upon SMARCB1 rescue within patient-derived organoid models from malignant rhabdoid tumors. Identification of a novel super-enhancer of MYC and identification of patient-specific enhancer utilization to activate MYC expression in these tumors.	NextSeq 550	8
EGAD00001011820	Study describing the dynamics of chromatin organization within malignant rhabdoid tumors. This study describes how this chromatin organization changes upon SMARCB1 rescue within patient-derived organoid models from malignant rhabdoid tumors. Identification of a novel super-enhancer of MYC and identification of patient-specific enhancer utilization to activate MYC expression in these tumors.	Illumina HiSeq 2500 Illumina NovaSeq 6000 NextSeq 550	8
EGAD00001011821	Study describing the dynamics of chromatin organization within malignant rhabdoid tumors. This study describes how this chromatin organization changes upon SMARCB1 rescue within patient-derived organoid models from malignant rhabdoid tumors. Identification of a novel super-enhancer of MYC and identification of patient-specific enhancer utilization to activate MYC expression in these tumors.	Illumina HiSeq 2500 Illumina NovaSeq 6000 NextSeq 2000	5
EGAD00001011822	The dataset for the study “Circulating tumor DNA, pathological and immunologic responses to neoadjuvant nivolumab or nivolumab plus relatlimab and chemoradiotherapy in resectable esophageal/gastroesophageal junction cancer” includes 173 bam files from hybrid capture targeted error-correction next-generation sequencing (TEC-Seq) from plasma cell-free DNA and matched white blood cell DNA from 32 individuals with esophageal/gastroesophageal junction cancer, who received immunotherapy-containing regimens.	Illumina HiSeq 2500 NextSeq 550	173
EGAD00001011989	Dataset contains one paired-end Whole Exome sequencing sample. One normal blood sample is also included.		2
EGAD00001011990	Dataset contains one paired-end RNA-seq sample.		1
EGAD00001011991	The dataset for the study Elucidating the heterogeneity of immunotherapy response and immune-related toxicities by longitudinal ctDNA and immune cell compartment tracking in lung cancer includes 207 bam files from hybrid capture targeted error-correction next-generation sequencing (TEC-Seq) from plasma cell-free DNA and matched white blood cell DNA from 30 individuals with non-small cell lung cancer, alongside 46 bam files from whole exome sequencing of tumor and matched normal DNA for 21 individuals with non-small cell lung cancer who received immunotherapy-containing regimens.	Illumina HiSeq 2500	253
EGAD00001011992	CRAM files of 340 human genomes from Angola and Mozambique. Paired-end reads were generated in an Illumina-X Ten and were mapped against the human reference genome build hg19/GRCh37. More details about the sequencing and the samples in Tallman et al. 2023. Nature Communications.	HiSeq X Ten	1
EGAD00001011994	Shallow-whole genome sequencing for copy numbers in resectable gastric cancer treated with surgery alone	Illumina HiSeq 4000	269
EGAD00001011997	Visium spatial transcriptomics (10X Genomics) performed on 4 CCA samples. Each sample has two paired-end sequencing runs: the first (I1 & I2) are a pair reading indexes; the second (R1 & R2) are a pair reading inserts, with R1 additionally reading 10X barcodes. For histology images, please contact authors.	Illumina NovaSeq 6000	4
EGAD00001011998	Fastq files for RNA-seq for 60 CCAs, 6 normal bile duct tissues, 14 CCA cell-lines (including replicates), and 2 normal cholangiocyte cell-lines (including replicates). RNA was extracted using the Qiagen RNeasy Mini kit. Illumina Tru-Seq Stranded Total RNA kit (Illumina, San Diego, California, USA) was used to prepare RNA libraries from 1 µg of total RNA. Paired-end 150 bp sequencing was performed using Illumina HiSeq4000 sequencer with the paired-end 150 bp read option.	Illumina HiSeq 4000	82
EGAD00001012100	Fastq files for WGS for 16 CCAs (with matched normal tissues), and 4 cell lines in an AA-treatment experiment (0, 10, 20, 40ug AA treatment after 180 days; 2 runs each for 10ug and 40ug experiments). Genomic DNA was extracted using DNeasy Blood and Tissue Kit (Qiagen). Sequencing libraries were prepared from DNA extracted using the SureSelect XT2 Target Enrichment System for the Illumina Multiplexed Sequencing platform (Illumina) according to the manufacturer’s instructions. Whole genome sequencing was performed on Illumina HiSeq4000 sequencer with paired-end sequencing.	Illumina HiSeq 4000	36
EGAD00001012101	Telomere fusions (TFs) can trigger the accumulation of oncogenic alterations leading to malignant transformation and drug resistance. Despite their relevance in tumour evolution, our understanding of the patterns and consequences of TFs in human cancers remains limited. Here, we characterize the rates and spectrum of somatic TFs across >30 cancer types using whole-genome sequencing data. TFs are pervasive in human tumours with rates varying markedly across and within cancer types. In addition to end-to-end fusions, we find novel patterns of TFs that we mechanistically link to the activity of the alternative lengthening of telomeres (ALT) pathway. We show that TFs can be detected in the blood of cancer patients, which enables cancer detection with high specificity and sensitivity even for early-stage tumours and cancers of high unmet clinical need. Overall, we report a novel genomic footprint that enables characterization of the telomere maintenance mechanism of tumours and liquid biopsy analysis.	Illumina NovaSeq 6000	1
EGAD00001012102	Geographic variation of mutagenic exposures in kidney cancer genomes – sequence data (Mutographs)		1
EGAD00001012103	Fastq files for H3K27ac ChIP-seq (with matched input-DNA control) for 63 CCAs, 8 normal bile duct tissues, 16 CCA cell-lines (including replicates), and 3 normal cholangiocyte cell-lines (including replicates). Library prep was performed with the NEBNext ChIP-seq library preparation kit. Each library (including matching input DNA) was sequenced to an average depth of 20 to 30 million raw reads on Illumina HiSeq4000 sequencer, with paired-end sequencing (except for 3 normal bile tissues done with single-end sequencing).	Illumina HiSeq 4000	180
EGAD00001012116	The dataset includes RNA sequencing data on pre-treatment biopsies of lymph node metastasis (n=80). The technology used for sequencing is Illumina HiSeq 2500	Illumina NovaSeq 6000	80
EGAD00001012117	Viably frozen blood MNCs from 9 cGVHD patients were sorted with BD Influx Cell sorter, to enrich for CD45+ cells. Gene and V(D)J transcript profiles were studied with 10x Genomics Chromium Single Cell Immune Profiling platform. The Chromium Single Cell 5’RNAseq run and library preparation were done using the Chromium Next GEM Single Cell Immune Profiling version 1.1 chemistry. The raw data was processed using Cell Ranger v3.1 pipelines.	Illumina NovaSeq 6000	96
EGAD00001012120	This dataset contains 8 batches of bone marrow or peripheral blood cells of patients with aplastic anemia and bone marrow cells of healthy controls. The samples from different experimental conditions have been multiplexed with Totalseq-C antibodies (feature barcoding). To demultiplex the samples, see instructions in 10.5281/zenodo.2590196 and https://github.com/janihuuh/aa_manu. Each sample is analysed with Chromium V(D)J and 5' Gene Expression Platform v2 (10X Genomics). The raw data includes fastq files for Gene expression, feature barcodes and V(D)J Expression. The processed data have been deposited in the ArrayExpress database at EMBL-EBI (www.ebi.ac.uk/arrayexpress).	Illumina NovaSeq 6000	174
EGAD00001012121	This dataset contains 10 samples from 9 patients with chronic graft-versus-host disease (GVHD). Each sample is analysed with Chromium V(D)J and 5' Gene Expression Platform v1.1 (10X Genomics). The raw data includes fastq files for Gene expression and fastq files for V(D)J Expression. The processed data have been deposited in the ArrayExpress database at EMBL-EBI (www.ebi.ac.uk/arrayexpress) under accession number E-MTAB-13419.	Illumina NovaSeq 6000	120
EGAD00001012222	Geographic variation of mutagenic exposures in kidney cancer genomes – filtered vcf files (Mutographs)		1
EGAD00001012223	Geographic variation of mutagenic exposures in kidney cancer genomes – patient metadata files (Mutographs)		1
EGAD00001012227	Single-cell profiling of sero-negative and sero-positive humans that were inoculated with SARS-CoV-2. The cellular response during SARS-CoV-2 is profiled using single-cell transcriptomics, CITE-seq and single cell immune profiling, by sampling PBMCs and nasal swabs before and at multiple time points during SARS-CoV-2 infection. This one-of-a-kind cellular map will give unique temporal resolution of how nasal and immune cells respond to SARS-CoV-2 exposure and infection.	Illumina NovaSeq 6000	1
EGAD00001012228	Data from Representation of genomic intratumor heterogeneity in multi-region non-small cell lung cancer patient-derived xenograft models	Illumina HiSeq 2000	237
EGAD00001012229	1 sample is pure plasmid DNA and 8 samples are cell pellets for genomic DNA extraction. CRISPR PCR1 and PCR2 indexing - Please use standard Kosuke primers.	Illumina HiSeq 2500	1
EGAD00001012230	1 sample is pure plasmid DNA and 10 samples are cell pellets for genomic DNA extraction. CRISPR PCR1 and PCR2 indexing - Please use standard Kosuke primers.	Illumina HiSeq 2500	1
EGAD00001012231	1 sample is pure plasmid DNA and 10 samples are cell pellets for genomic DNA extraction. CRISPR PCR1 and PCR2 indexing - Please use standard Kosuke primers.	Illumina HiSeq 2500	10
EGAD00001012232	8 cell pellet samples for genomic DNA extraction. CRISPR PCR1 and PCR2 indexing - Please use standard Kosuke primers.	Illumina HiSeq 2500	8
EGAD00001012233	8 cell pellet samples for genomic DNA extraction. CRISPR PCR1 and PCR2 indexing - Please use standard Kosuke primers.	Illumina HiSeq 2500	1
EGAD00001012234	8 samples are cell pellets for genomic DNA extraction. CRISPR PCR1 and PCR2 indexing - Please use standard Kosuke primers.	Illumina HiSeq 2500	1
EGAD00001012235	This dataset contains RNA Seq, 10x scRNA Seq and Exome sequencing of glioblastoma samples. Sequencing was performed on an Illumina NovaSeq 6000 and Illumina HiSeq 4000. The sequencing was always paired.	Illumina HiSeq 4000 Illumina NovaSeq 6000	61
EGAD00001012437	Fresh peripheral blood mononuclear cells of four human donors were cultured together with either lung adenocarcinoma A549 cancer cells or A549-expressing H1N1 Sialidase cancer cells. These treatments induced the differentiation of donor cells into immunosuppressive MDSC-like cells, which were further subjected to bulk RNA sequencing. RNA-seq TruSeq libraries were generated from polyA-enriched mRNA isolated from the samples, and sequenced in paired-end mode on 4 lanes of an Illumina NextSeq 500 flow-cell	NextSeq 500	32
EGAD00001012638	Re-aligned BAM files for manuscript titled Discrepancies in Tumour Mutation Burden (TMB) reporting from sequential Endobronchial ultrasound trans bronchial needle aspiration (EBUS TBNA) samples within single lymph node stations for Copy Number Variant Calling.	NextSeq 550	45
EGAD00001012639	Stitched BAM files for manuscript titled Discrepancies in Tumour Mutation Burden (TMB) reporting from sequential Endobronchial ultrasound trans bronchial needle aspiration (EBUS TBNA) samples within single lymph node stations for SNP and INDEL Variant Calling.	NextSeq 550	45
EGAD00001012841	This dataset contains 15 TCRab sequencing samples from 6 CML patients before and after TKI-cessation. The raw data is available as fastq files.	Illumina HiSeq 2500	240
EGAD00001012842	This dataset contains 15 single-cell RNA sequencing samples from 6 CML patients before and after TKI-cessation. The raw data is available as fastq files.	Illumina NovaSeq 6000	272
EGAD00001013726	Geographic variation of mutagenic exposures in kidney cancer genomes – structural variation VCF files (Mutographs)		-
EGAD00001013727	Geographic variation of mutagenic exposures in kidney cancer genomes – copy number variants (Mutographs)		-
EGAD00001014787	Cutaneous leiomyoma (cLM) and leiomyosarcoma (cLMS) are rare benign and malignant soft tissue neoplasms showing smooth muscle differentiation, respectively, that arise from mesenchymal cells in the dermis and subcutis. Through whole exome sequencing of cLM and cLMS cases, we observed distinct differences between the somatic mutational profile of these tumour types. FH was identified as a driver gene in cLM with genetic alterations of FH occurring via somatic point mutation, somatic copy number loss, biallelic inactivation and germline point mutations. TP53 and RB1 were identified as driver genes in the cLMS cohort, with genetic alterations of TP53 occurring via somatic and germline point mutations, copy number loss and biallelic inactivation. Using RNA-sequencing, we identified recurrent gene fusions, including CRTC1/3-MAML2 in cLMS and a novel MYLK-MAP3K2 fusion. Analysis of the cell types present in the tumour microenvironment revealed a significantly increased presence of macrophages and decreased presence of myeloid dendritic cells in the cLMS cohort relative to the cLM cohort. Additionally, we identified common driver genes between cLMS and LMS from other sites. Thus, we provide the first in-depth profile of the genetic landscape of cLM and cLMS.	Illumina NovaSeq 6000	89
EGAD00001014788	Cutaneous leiomyoma (cLM) and leiomyosarcoma (cLMS) are rare benign and malignant soft tissue neoplasms showing smooth muscle differentiation, respectively, that arise from mesenchymal cells in the dermis and subcutis. Through whole exome sequencing of cLM and cLMS cases, we observed distinct differences between the somatic mutational profile of these tumour types. FH was identified as a driver gene in cLM with genetic alterations of FH occurring via somatic point mutation, somatic copy number loss, biallelic inactivation and germline point mutations. TP53 and RB1 were identified as driver genes in the cLMS cohort, with genetic alterations of TP53 occurring via somatic and germline point mutations, copy number loss and biallelic inactivation. Using RNA-sequencing, we identified recurrent gene fusions, including CRTC1/3-MAML2 in cLMS and a novel MYLK-MAP3K2 fusion. Analysis of the cell types present in the tumour microenvironment revealed a significantly increased presence of macrophages and decreased presence of myeloid dendritic cells in the cLMS cohort relative to the cLM cohort. Additionally, we identified common driver genes between cLMS and LMS from other sites. Thus, we provide the first in-depth profile of the genetic landscape of cLM and cLMS.	Illumina NovaSeq 6000	52
EGAD00001014789	Cutaneous leiomyoma (cLM) and leiomyosarcoma (cLMS) are rare benign and malignant soft tissue neoplasms showing smooth muscle differentiation, respectively, that arise from mesenchymal cells in the dermis and subcutis. Through whole exome sequencing of cLM and cLMS cases, we observed distinct differences between the somatic mutational profile of these tumour types. FH was identified as a driver gene in cLM with genetic alterations of FH occurring via somatic point mutation, somatic copy number loss, biallelic inactivation and germline point mutations. TP53 and RB1 were identified as driver genes in the cLMS cohort, with genetic alterations of TP53 occurring via somatic and germline point mutations, copy number loss and biallelic inactivation. Using RNA-sequencing, we identified recurrent gene fusions, including CRTC1/3-MAML2 in cLMS and a novel MYLK-MAP3K2 fusion. Analysis of the cell types present in the tumour microenvironment revealed a significantly increased presence of macrophages and decreased presence of myeloid dendritic cells in the cLMS cohort relative to the cLM cohort. Additionally, we identified common driver genes between cLMS and LMS from other sites. Thus, we provide the first in-depth profile of the genetic landscape of cLM and cLMS.	Illumina NovaSeq 6000	72
EGAD00001014790	Cutaneous leiomyoma (cLM) and leiomyosarcoma (cLMS) are rare benign and malignant soft tissue neoplasms showing smooth muscle differentiation, respectively, that arise from mesenchymal cells in the dermis and subcutis. Through whole exome sequencing of cLM and cLMS cases, we observed distinct differences between the somatic mutational profile of these tumour types. FH was identified as a driver gene in cLM with genetic alterations of FH occurring via somatic point mutation, somatic copy number loss, biallelic inactivation and germline point mutations. TP53 and RB1 were identified as driver genes in the cLMS cohort, with genetic alterations of TP53 occurring via somatic and germline point mutations, copy number loss and biallelic inactivation. Using RNA-sequencing, we identified recurrent gene fusions, including CRTC1/3-MAML2 in cLMS and a novel MYLK-MAP3K2 fusion. Analysis of the cell types present in the tumour microenvironment revealed a significantly increased presence of macrophages and decreased presence of myeloid dendritic cells in the cLMS cohort relative to the cLM cohort. Additionally, we identified common driver genes between cLMS and LMS from other sites. Thus, we provide the first in-depth profile of the genetic landscape of cLM and cLMS.	Illumina NovaSeq 6000	41
EGAD00001015012	BAMs from deep sequencing using a custom panel for the study 'Early evolutionary branching across spatial domains predisposes to clonal replacement under chemotherapy in neuroblastoma'	Illumina NovaSeq 6000	69
EGAD00001015157	Molecular characterization of 41 tumors from 17 individuals with CMMRD to gain a better understanding of mutational processes driving subsequent tumor development. The molecular characterization includes the investigation of tumor mutational load and mutational signatures.		1
EGAD00001015158	Molecular characterization of 41 tumors from 17 individuals with CMMRD to gain a better understanding of mutational processes driving subsequent tumor development. The molecular characterization includes the investigation of tumor mutational load and mutational signatures.		1
EGAD00001015178	This dataset contains whole genome sequencing data from 22 samples of FACS-purified bone marrow CD34+ haematopoietic stem and progenitor cells and matched hair follicle controls collected from individuals undergoing hip replacement surgery. Additionally, it contains whole genome sequencing data from unseparated bone marrow mononuclear cells, CD3-CD34- bone marrow mononuclear cells, and peripheral blood granulocytes of the same 14 samples. Haematopoietic samples were sequenced to a target coverage of 90-120x; 19 of them were re-sequenced to a total target coverage of 270x. Hair follicle controls were sequenced to a target coverage of 30x.	Illumina NovaSeq 6000	59
EGAD00001015241	Shallow whole genome sequencing of 196 formalin-fixed paraffin-embedded p53abn endometrial cancers.	Illumina HiSeq 4000	196
EGAD00001015249	Background: Ultraviolet radiation (UV) is used as a treatment for psoriasis, but UV can also induce mutations which may lead to development of skin cancer. Information on the mutagenicity of narrowband UVB (NBUVB) would help inform clinicians and patients who are concerned about the potential risks of this treatment.	Illumina NovaSeq 6000	48
EGAD00001015250	Background: Ultraviolet radiation (UV) is used as a treatment for psoriasis, but UV can also induce mutations which may lead to development of skin cancer. Information on the mutagenicity of narrowband UVB (NBUVB) would help inform clinicians and patients who are concerned about the potential risks of this treatment.	Illumina NovaSeq 6000	15
EGAD00001015251	The mutational landscape of haematopoietic cells will be characterized by WGS following amplification of DNA and preparation of libraries by PTA (primary template-directed amplification). Samples have been sourced from the Cambridge Biobank.	Illumina NovaSeq 6000	145
EGAD00001015252	The transcriptional landscape of haematopoietic cells will be characterized by RNA Seq following amplification of RNA/cDNA and preparation of libraries by PTA (primary template-directed amplification). Samples have been sourced from the Cambridge Biobank.	Illumina NovaSeq 6000	111
EGAD00001015255	T-cell lymphoblastic lymphoma (T-LBL) is a common pediatric malignancy accounting for approximately 20% of the non-Hodgkin lymphomas during childhood. Survival rates of T-LBL are ~80%, but outcome after relapse is dismal, with salvage rates reaching only ~15%. Considering the extremely poor prognosis after relapse and absence of clinically relevant high-risk genetics, there is an urgent need for the identification of molecular risk factors and new prognostic biomarkers in T-LBL, as well as identification of new therapeutic strategies. In this study we present a novel entity of high-risk pediatric T-LBL patients characterized by previously unknown NOTCH1 gene fusions and highly elevated blood TARC levels.		1
EGAD00001015256	Authors: Charlotte King, Emilie Abbie, Joanna C. Fowler, Irina Abnizova, Roshan K. Sood, Swee Hoe Ong, Michael W. J. Hall, Faye Lynch-Williams, Benjamin A. Hall, Philip H. Jones. Abstract: In cancer evolution, genome alterations often occur in a specific order, implying selection depends on the prior clonal genotype. It is unknown if similar constraints operate in normal epithelia. Here, we mapped mutations in normal mid-esophagus of aged UK subjects. Mutant NOTCH1 clones colonized most of the epithelium by age 60, and above age 70 the tissue was saturated with mutants under strong competitive selection. Mutant TP53 was more strongly selected in donors over 60 years of age. Samples predominantly mutant for NOTCH1 showed increased selection of NOTCH2 mutants and weaker selection of mutant TP53 compared with samples that were mostly NOTCH1 wild type. In mouse esophagus lacking Notch1 we observed strong selection of mutant Notch2 and other genes not selected in wild type esophagus. In normal ageing esophagus, the first driver mutation may change the trajectory of subsequent somatic evolution by altering mutant selection.	Illumina HiSeq 2500	-
EGAD00001015257	Authors: Charlotte King, Emilie Abbie, Joanna C. Fowler, Irina Abnizova, Roshan K. Sood, Swee Hoe Ong, Michael W. J. Hall, Faye Lynch-Williams, Benjamin A. Hall, Philip H. Jones. Abstract: In cancer evolution, genome alterations often occur in a specific order, implying selection depends on the prior clonal genotype. It is unknown if similar constraints operate in normal epithelia. Here, we mapped mutations in normal mid-esophagus of aged UK subjects. Mutant NOTCH1 clones colonized most of the epithelium by age 60, and above age 70 the tissue was saturated with mutants under strong competitive selection. Mutant TP53 was more strongly selected in donors over 60 years of age. Samples predominantly mutant for NOTCH1 showed increased selection of NOTCH2 mutants and weaker selection of mutant TP53 compared with samples that were mostly NOTCH1 wild type. In mouse esophagus lacking Notch1 we observed strong selection of mutant Notch2 and other genes not selected in wild type esophagus. In normal ageing esophagus, the first driver mutation may change the trajectory of subsequent somatic evolution by altering mutant selection.	HiSeq X Ten	1
EGAD00001015258	Authors: Charlotte King, Emilie Abbie, Joanna C. Fowler, Irina Abnizova, Roshan K. Sood, Swee Hoe Ong, Michael W. J. Hall, Faye Lynch-Williams, Benjamin A. Hall, Philip H. Jones. Abstract: In cancer evolution, genome alterations often occur in a specific order, implying selection depends on the prior clonal genotype. It is unknown if similar constraints operate in normal epithelia. Here, we mapped mutations in normal mid-esophagus of aged UK subjects. Mutant NOTCH1 clones colonized most of the epithelium by age 60, and above age 70 the tissue was saturated with mutants under strong competitive selection. Mutant TP53 was more strongly selected in donors over 60 years of age. Samples predominantly mutant for NOTCH1 showed increased selection of NOTCH2 mutants and weaker selection of mutant TP53 compared with samples that were mostly NOTCH1 wild type. In mouse esophagus lacking Notch1 we observed strong selection of mutant Notch2 and other genes not selected in wild type esophagus. In normal ageing esophagus, the first driver mutation may change the trajectory of subsequent somatic evolution by altering mutant selection.	Illumina HiSeq 2500	1
EGAD00001015259	Authors: Charlotte King, Emilie Abbie, Joanna C. Fowler, Irina Abnizova, Roshan K. Sood, Swee Hoe Ong, Michael W. J. Hall, Faye Lynch-Williams, Benjamin A. Hall, Philip H. Jones. Abstract: In cancer evolution, genome alterations often occur in a specific order, implying selection depends on the prior clonal genotype. It is unknown if similar constraints operate in normal epithelia. Here, we mapped mutations in normal mid-esophagus of aged UK subjects. Mutant NOTCH1 clones colonized most of the epithelium by age 60, and above age 70 the tissue was saturated with mutants under strong competitive selection. Mutant TP53 was more strongly selected in donors over 60 years of age. Samples predominantly mutant for NOTCH1 showed increased selection of NOTCH2 mutants and weaker selection of mutant TP53 compared with samples that were mostly NOTCH1 wild type. In mouse esophagus lacking Notch1 we observed strong selection of mutant Notch2 and other genes not selected in wild type esophagus. In normal ageing esophagus, the first driver mutation may change the trajectory of subsequent somatic evolution by altering mutant selection.	Illumina NovaSeq 6000	1
EGAD00001015260	Authors: Charlotte King, Emilie Abbie, Joanna C. Fowler, Irina Abnizova, Roshan K. Sood, Swee Hoe Ong, Michael W. J. Hall, Faye Lynch-Williams, Benjamin A. Hall, Philip H. Jones. Abstract: In cancer evolution, genome alterations often occur in a specific order, implying selection depends on the prior clonal genotype. It is unknown if similar constraints operate in normal epithelia. Here, we mapped mutations in normal mid-esophagus of aged UK subjects. Mutant NOTCH1 clones colonized most of the epithelium by age 60, and above age 70 the tissue was saturated with mutants under strong competitive selection. Mutant TP53 was more strongly selected in donors over 60 years of age. Samples predominantly mutant for NOTCH1 showed increased selection of NOTCH2 mutants and weaker selection of mutant TP53 compared with samples that were mostly NOTCH1 wild type. In mouse esophagus lacking Notch1 we observed strong selection of mutant Notch2 and other genes not selected in wild type esophagus. In normal ageing esophagus, the first driver mutation may change the trajectory of subsequent somatic evolution by altering mutant selection.	Illumina NovaSeq 6000	31
EGAD00001015261	Authors: Charlotte King, Emilie Abbie, Joanna C. Fowler, Irina Abnizova, Roshan K. Sood, Swee Hoe Ong, Michael W. J. Hall, Faye Lynch-Williams, Benjamin A. Hall, Philip H. Jones. Abstract: In cancer evolution, genome alterations often occur in a specific order, implying selection depends on the prior clonal genotype. It is unknown if similar constraints operate in normal epithelia. Here, we mapped mutations in normal mid-esophagus of aged UK subjects. Mutant NOTCH1 clones colonized most of the epithelium by age 60, and above age 70 the tissue was saturated with mutants under strong competitive selection. Mutant TP53 was more strongly selected in donors over 60 years of age. Samples predominantly mutant for NOTCH1 showed increased selection of NOTCH2 mutants and weaker selection of mutant TP53 compared with samples that were mostly NOTCH1 wild type. In mouse esophagus lacking Notch1 we observed strong selection of mutant Notch2 and other genes not selected in wild type esophagus. In normal ageing esophagus, the first driver mutation may change the trajectory of subsequent somatic evolution by altering mutant selection.	HiSeq X Ten Illumina NovaSeq 6000	6
EGAD00001015262	spatial transcriptomics	Illumina NovaSeq 6000	1
EGAD00001015263	Genome and transcriptome sequence data from an infantile fibrosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015264	Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015265	Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015266	Genome and transcriptome sequence data from a neurofibromatosis type 1 (NF1) patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015267	Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015268	Genome and transcriptome sequence data from a CNS sarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015269	Genome and transcriptome sequence data from an ocular melanoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015270	Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015271	Genome and transcriptome sequence data from a fibrovascular brain tumor patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015272	Genome and transcriptome sequence data from an angiosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015273	Genome and transcriptome sequence data from a craniopharyngioma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015274	Genome and transcriptome sequence data from an NHL large B-cell patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015275	Genome and transcriptome sequence data from a malignant granular cell tumor patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015276	Genome and transcriptome sequence data from a papillary thyroid carcinoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015277	Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015278	Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015279	Genome and transcriptome sequence data from an aggressive fibromatosis patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015280	Genome and transcriptome sequence data from a pineoblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015281	Genome and transcriptome sequence data from a multifocal glioblastoma multiforme patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015282	Genome and transcriptome sequence data from a progressive facial plexiform neurofibroma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015283	Genome and transcriptome sequence data from a plexiform neurofibroma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015284	Genome and transcriptome sequence data from a diffuse Intrinsic Pontine Glioma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015285	Genome and transcriptome sequence data from an acute lymphoblastic leukemia patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015286	Genome and transcriptome sequence data from an Ewing sarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015287	Genome and transcriptome sequence data from an ependymoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015288	Genome and transcriptome sequence data from a glioblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015289	Genome and transcriptome sequence data from a NUT midline carcinoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015290	Genome and transcriptome sequence data from an angiosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015291	Genome and transcriptome sequence data from a pre-B ALL (2nd relapse in CNS) patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015292	Genome and transcriptome sequence data from a gliomatosis cerebri anaplastic astrocytoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015293	Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015294	Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015295	Genome and transcriptome sequence data from a metastatic alveolar rhabdomyosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015296	Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015297	Genome and transcriptome sequence data from a minimally invasive adenocarcinoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015298	Genome and transcriptome sequence data from an aggressive fibromatosis patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015299	Genome and transcriptome sequence data from an aggressive fibromatosis patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015300	Genome and transcriptome sequence data from a glioblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015301	Genome and transcriptome sequence data from a neurofibromatosis type 1 patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015302	Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015303	Genome and transcriptome sequence data from a papillary thyroid carcinoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015304	Genome and transcriptome sequence data from a relapsed Wilms tumor patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015305	Genome and transcriptome sequence data from a plexiform neurofibroma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015306	Genome and transcriptome sequence data from a metastatic osteosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015307	Genome and transcriptome sequence data from an embryonal rhabdomyosarcoma of the nasopharynx patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015308	Genome and transcriptome sequence data from a diffuse large B-cell lymphoma (relapse) patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015309	Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015310	Genome and transcriptome sequence data from a malignant rhabdoid tumour patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015311	Genome and transcriptome sequence data from a relapsed osteosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015312	Genome and transcriptome sequence data from a relapsed blastic plasmacytoid dendritic cell neoplasm patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015313	Genome and transcriptome sequence data from a synovial sarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015314	Genome and transcriptome sequence data from a recurrent nasopharyngeal rhabdomyosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015315	Genome and transcriptome sequence data from an Ewing sarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015316	Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015317	Genome and transcriptome sequence data from a pineal parenchymal tumor patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015318	Genome and transcriptome sequence data from a rhabdomyosarcoma, alveolar patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015319	Genome and transcriptome sequence data from an anaplastic astrocytoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015320	Genome and transcriptome sequence data from a rosette-forming glioneuronal tumor (RGNT) patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015321	Genome and transcriptome sequence data from a choroid plexus carcinoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015322	Genome and transcriptome sequence data from a metastatic malignant peripheral nerve sheath tumor patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015323	Genome and transcriptome sequence data from a malignant rhabdoid tumor patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015324	Genome and transcriptome sequence data from an embryonal rhabdomyosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015325	Genome and transcriptome sequence data from a NUT midline carcinoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015326	Genome and transcriptome sequence data from a diffuse midline glioma, H3K27 mutant patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015327	Genome and transcriptome sequence data from a CNS non-germinoma germ cell tumour patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015328	Genome and transcriptome sequence data from a GBM (H3 K27M mutant) patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015329	Genome and transcriptome sequence data from an alveolar rhabdomyosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015330	Genome and transcriptome sequence data from an embryonal rhabdomyosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015331	Genome and transcriptome sequence data from an acute myeloid leukemia patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015332	Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015333	Genome and transcriptome sequence data from a pilomyxoid astrocytoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015334	Genome and transcriptome sequence data from a Wilms tumor patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015335	Genome and transcriptome sequence data from a rhabdomyosarcoma, alveolar patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015336	Genome and transcriptome sequence data from a high-grade glioma, glioblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015337	Genome and transcriptome sequence data from a neuroblastoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015338	Genome and transcriptome sequence data from an osteosarcoma patient, generated as part of the BC Cancer Agency's Pediatric Personalized Onco-Genomics study		1
EGAD00001015339	In developed countries, ~10% of individuals are exposed to systemic chemotherapy for cancer and other diseases. Many chemotherapeutic agents act by increasing DNA damage in cancer cells, hence triggering cell death. However, there is limited understanding of the extent and consequences of collateral DNA damage to normal tissues. To investigate the impact of chemotherapy on mutation burdens and cell population structure of a normal tissue we sequenced blood cell genomes from 23 individuals, aged 3-80 years, treated with a range of chemotherapy regimens. Substantial additional mutation loads with characteristic mutational signatures were imposed by some chemotherapeutic agents, but there were differences in burden between different classes of agent, different agents of the same class and different blood cell types. Chemotherapy also induced premature changes in the cell population structure of normal blood, similar to those of normal ageing. The results constitute an initial survey of the long-term biological consequences of cytotoxic agents to which a substantial fraction of the population is exposed during the course of their disease management, raising mechanistic questions and highlighting opportunities for mitigation of adverse effects.	HiSeq X Ten Illumina NovaSeq 6000	1
EGAD00001015340	In developed countries, ~10% of individuals are exposed to systemic chemotherapy for cancer and other diseases. Many chemotherapeutic agents act by increasing DNA damage in cancer cells, hence triggering cell death. However, there is limited understanding of the extent and consequences of collateral DNA damage to normal tissues. To investigate the impact of chemotherapy on mutation burdens and cell population structure of a normal tissue we sequenced blood cell genomes from 23 individuals, aged 3-80 years, treated with a range of chemotherapy regimens. Substantial additional mutation loads with characteristic mutational signatures were imposed by some chemotherapeutic agents, but there were differences in burden between different classes of agent, different agents of the same class and different blood cell types. Chemotherapy also induced premature changes in the cell population structure of normal blood, similar to those of normal ageing. The results constitute an initial survey of the long-term biological consequences of cytotoxic agents to which a substantial fraction of the population is exposed during the course of their disease management, raising mechanistic questions and highlighting opportunities for mitigation of adverse effects.	Illumina NovaSeq 6000	1
EGAD00001015341	The atlas provides a comprehensive exploration of genetic and transcriptomic landscapes within HCC, offering insights into key genomic alterations and gene expression patterns. The integration of DNA sequencing and RNAseq data enhances our understanding of the molecular complexity underlying HCC, potentially paving the way for targeted therapeutic strategies and biomarker discovery in the context of hepatocellular carcinoma.	Illumina HiSeq 2000 Illumina HiSeq 4000 Illumina NovaSeq 6000	41
EGAD00001015342	The atlas provides a comprehensive exploration of genetic and transcriptomic landscapes within HCC, offering insights into key genomic alterations and gene expression patterns. The integration of exome sequencing and RNAseq data enhances our understanding of the molecular complexity underlying HCC, potentially paving the way for targeted therapeutic strategies and biomarker discovery in the context of hepatocellular carcinoma.	Illumina HiSeq 2000 Illumina HiSeq 4000 Illumina NovaSeq 6000	23
EGAD00001015343	The atlas aims to unravel the intricate genetic landscape of HCC, providing a detailed characterization of genomic alterations, transcriptomic profiles, and key mutations associated with liver tumors. The integration of WGS, RNAseq, and WES data offers a holistic perspective, facilitating a deeper understanding of the molecular mechanisms driving HCC pathogenesis. The resulting atlas serves as a valuable resource for researchers, clinicians, and the broader scientific community, contributing to advancements in HCC diagnostics, prognostics, and therapeutic interventions.	HiSeq X Five Illumina NovaSeq 6000	42
EGAD00001015344	This dataset included cfMethyl-Seq data of 15 plasma cfDNA samples from 15 lung cancer patients and RRBS data of 58 lung tumor tissue samples from 58 lung cancer patients. The data were generated following the standard protocols.	Illumina NovaSeq 6000	73
EGAD00001015345	In this study, we investigate differences in the cellular landscape and functionality of ex vivo cultured nasal epithelial cells in response to SARS-CoV-2 infection across different age groups: paediatric (<12y), adult (30-50y), and older adults (>70y). We find that, while ciliated cells serve as primary sites for viral replication consistently across all age groups, a distinctive goblet inflammatory subtype emerges in infected paediatric cultures, characterized by heightened expression of interferon-stimulated genes and incomplete viral replication. Conversely, older adult cultures infected with SARS-CoV-2 exhibit a proportional surge in basaloid-like cells, which not only facilitate viral dissemination but also demonstrate associations with altered epithelial repair pathways.	Illumina NovaSeq 6000	17
EGAD00001015347	Contains fast5 data for each of the 10 samples sequenced.	PromethION	10
EGAD00001015349	Targeted panel sequencing of 188 formalin-fixed paraffin-embedded p53abn endometrial cancer samples.	NextSeq 2000	188
EGAD00001015350	Each gene contains individual nanopore long-read amplicon sequencing FASTQ files for: individual (IND) 01-05 and brain regions: Brodmann Area (BA): 10, 24, 9, 46, caudate (CAUD), cerebellum (CBM) and temporal cortex (TCX).	GridION	31
EGAD00001015351	The landscapes of somatic mutation in normal cells inform on the processes of mutation and selection operative throughout life, permitting insight into embryogenesis, normal ageing and the earliest stages of cancer development. Here, by whole-genome sequencing and targeted panel sequencing of microdissections from 30 individuals, including 18 with gastric cancer, we elucidate the developmental trajectories of normal and malignant gastric epithelium.	HiSeq X Ten Illumina NovaSeq 6000	18
EGAD00001015352	The landscapes of somatic mutation in normal cells inform on the processes of mutation and selection operative throughout life, permitting insight into embryogenesis, normal ageing and the earliest stages of cancer development. Here, by whole-genome sequencing and targeted panel sequencing of microdissections from 30 individuals, including 18 with gastric cancer, we elucidate the developmental trajectories of normal and malignant gastric epithelium.	Illumina HiSeq 4000	72
EGAD00001015353	We sequence >1000 whole genomes from 9 patients with CML, providing the largest sequencing dataset for this cancer. We reconstruct phylogenetic trees using somatic mutations and infer BCR::ABL1 timing and tumour growth rates. We correlate mutation landscapes and clonal trajectories with clinical features.	Illumina NovaSeq 6000	-
EGAD00001015356	This data set includes 27 full-length transcript sequences generated from PacBio IsoSeq that were used to verify the cancer-specific exons identified in three genes: FN1, COL6A3 and TNC. The data were generated from PDX models of osteosarcoma patients.	PacBio RS	1
EGAD00001015357	Due to the lower incidence of T-LBL and difficulties in obtaining diagnostic T-LBL material, extensive research on T-LBL has been hampered whereas genetic aberrations in T-ALL are thoroughly characterized. Given the similarities and differences between T-LBL and T-ALL, the question has been raised whether T-LBL and T-ALL represent two different diseases or different manifestations of the same disease. This study aims to identify the genomic and transcriptomic landscape of T-LBL and compare the findings to what is found in T-ALL. Comparison of the molecular aberrations between T-LBL and T-ALL can provide insights into the overlap and differences in malignant development between the two entities, which could lead to improved risk stratification in T-LBL in order to eventually adapt T-LBL treatment protocols based on molecular-genetic prognostic factors.		1
EGAD00001015358	CEBPA/PU.1/TCF7 ChIP-seq of 6 primary samples derived from human acute leukemias, namely AML, T-ALL and mixed myeloid/lymphoid leukemias with CpG Island Methylator Phenotype (CIMP). Low-coverage whole genome sequencing (ChIP input) of the same samples is also included as a control to be used in peak calling.	Illumina NovaSeq 6000	1
EGAD00001015360	This batch is a subset of the full DETECT-A dataset, containing 25210 fastq files generated from 3040 subjects. All sequencing was conducted using Illumina HiSeq 4000 and Illumina MiSeq platforms. Note that the division into batches follows no specific criteria and that the sequencing data for each subject has multiple files which may span multiple datasets. Thus, for a comprehensive analysis, it is recommended to request access to all datasets that comprise this study.	Illumina HiSeq 4000 Illumina MiSeq	1
EGAD00001015361	STAG1 and STAG2-ChIP-seq in RAD21-mutant adult acute myeloid leukemia	NextSeq 500	1
EGAD00001015362	In 43 patients pretreatment tumor biopsies, resected tumors and normal tissue of sufficient quality and quantity were obtained to longitudinally explore the mutational profiles of a comprehensive set of cancer-related genes. For tumor samples, one to four FFPE sections (10 µm thickness, number depending on sample size) were lysed for genomic DNA isolation. Isolation was performed semi-automatically on the Maxwell purification system (Maxwell RSC DNA FFPE Kit, AS1450, Promega) as specified by the manufacturer. DNA was eluted in 50 µl RNase-free water and quantified fluorescently for library preparation using a Qubit 2.0 fluorometer (Life Technology) with its appertaining DNA broad-range assay. Corresponding normal DNA was isolated from blood or PBMCs using routinely available QIAGEN technology. DNA was stored at -20°C before use. Whole-exome sequencing (WES) was performed using the Twist Human Core + RefSeq + Mitochondrial Panel (Twist Bioscience), and 2 x 100 bp fragment sizes were sequenced using a NovaSeq6000 (Illumina). Demultiplexing of sequenced reads was achieved using bcl2fastq (version 2.2).	Illumina NovaSeq 6000	145
EGAD00001015363	Repli-seq data for "Replication timing alterations are associated with mutation acquisition during tumour evolution in breast and lung cancer"	Illumina HiSeq 4000	10
EGAD00001015364	Colorectal cancer – unmapped reads (Mutographs)		1
EGAD00001015365	Wnt signalling must be 'just right' to promote tumour growth. Basal cell adenoma (BCA) and basal cell adenocarcinoma (BCAC) of the salivary gland are rare tumours that can be difficult to distinguish from each other and other salivary gland tumour subtypes. Due to their rarity, the genomic profiles of BCA and BCAC have not been explored. Using whole-exome and transcriptome sequencing of BCA and BCAC cohorts, we identify a novel recurrent FBXW11 missense mutation (p.F517S) in BCA, that was mutually exclusive with the previously reported CTNNB1 p.I35T gain-of-function (GoF) mutation. These driver events collectively accounted for 94% of BCAs. In vitro, mutant FBXW11 had a dominant negative effect, characterised by defective binding to β-catenin and the accumulation of β-catenin in cells. This was consistent with the nuclear expression of β-catenin observed in BCA cases harbouring the FBXW11 p.F517S mutation and activation of the Wnt/β-catenin pathway and defines a novel mechanism of Wnt pathway control. The genomic profiles of BCAC were distinct from BCA, with hotspot DICER1 and HRAS mutations and putative driver mutations affecting PI3K/AKT and NF-κB signalling pathway genes. A single BCAC, which may represent a malignant transformation of BCA, harboured the recurrent FBXW11 mutation. These findings have important implications for the diagnosis and treatment of BCA and BCAC, which, despite histopathologic overlap, may be unrelated entities.	Illumina NovaSeq 6000	124
EGAD00001015366	Wnt signalling must be 'just right' to promote tumour growth. Basal cell adenoma (BCA) and basal cell adenocarcinoma (BCAC) of the salivary gland are rare tumours that can be difficult to distinguish from each other and other salivary gland tumour subtypes. Due to their rarity, the genomic profiles of BCA and BCAC have not been explored. Using whole-exome and transcriptome sequencing of BCA and BCAC cohorts, we identify a novel recurrent FBXW11 missense mutation (p.F517S) in BCA, that was mutually exclusive with the previously reported CTNNB1 p.I35T gain-of-function (GoF) mutation. These driver events collectively accounted for 94% of BCAs. In vitro, mutant FBXW11 had a dominant negative effect, characterised by defective binding to β-catenin and the accumulation of β-catenin in cells. This was consistent with the nuclear expression of β-catenin observed in BCA cases harbouring the FBXW11 p.F517S mutation and activation of the Wnt/β-catenin pathway and defines a novel mechanism of Wnt pathway control. The genomic profiles of BCAC were distinct from BCA, with hotspot DICER1 and HRAS mutations and putative driver mutations affecting PI3K/AKT and NF-κB signalling pathway genes. A single BCAC, which may represent a malignant transformation of BCA, harboured the recurrent FBXW11 mutation. These findings have important implications for the diagnosis and treatment of BCA and BCAC, which, despite histopathologic overlap, may be unrelated entities.	Illumina NovaSeq 6000	68
EGAD00001015367	Sebaceous tumours are a rare cutaneous cancer with potential for aggressive behaviour. However, limited information is available on these cancers with few published cases. Here we wish to exome sequence these cancers to define the first genomic landscape for this malignancy. We will extract DNA from formalin-fixed, paraffin-embedded (FFPE) cores. Cores may be obtained from lesional and non-lesional tissues of primaries as well as matching metastases. The extracted DNA will be used for exome sequencing. Please note that the dataset was last revised on October 16, 2025.	Illumina HiSeq 4000 Illumina NovaSeq 6000	384
EGAD00001015368	Sebaceous tumours are a rare cutaneous cancer with potential for aggressive behaviour. However, limited information is available on these cancers with few published cases. Here we wish to exome sequence these cancers to define the first genomic landscape for this malignancy. We will extract RNA from formalin-fixed, paraffin-embedded (FFPE) cores. Cores may be obtained from lesional and non-lesional tissues of primaries as well as matching metastases. The extracted RNA will be used for RNA sequencing.	Illumina NovaSeq 6000	319
EGAD00001015369	This cohort comprises a subset of patients enrolled in the Genomic Advances in Sepsis (GAinS) study, an established biobank of adult sepsis patients. Sepsis is defined as life-threatening organ dysfunction caused by a dysregulated host response to infection. Patients with sepsis due to community acquired pneumonia or faecal peritonitis were recruited from 35 hospitals across the UK from 2005-2018, with samples for functional genomics and detailed clinical information collected over the first five days of ICU admission to investigate how host genetics affects the individual response to sepsis. DNA was extracted from buffy coat or whole blood samples using the Qiagen DNA extraction protocol, the automated Maxwell Blood purification kit (Promega), or the QIAamp Blood Midi kit protocol (Qiagen). Genotyping data were generated using the Illumina HumanOmniExpress BeadChip (295 patients), the Infinium CoreExome BeadChip (655 patients), and the Infinium Global Screening Array BeadChip (307 patients). Genotyping QC and imputation into the Haplotype Reference Consortium was performed within each batch. The datasets were combined and following post-imputation filtering data were available on 1168 samples.		1
EGAD00001015370	Birth cohort studies involve repeated surveys of large numbers of individuals from birth and throughout their lives. They collect information useful for a wide range of life course research domains, and biological samples which can be used to derive data from an increasing collection of omic technologies. This rich source of longitudinal data, when combined with genomic data, offers the scientific community valuable insights from population genetics to rare disease associations. Avon Longitudinal Study of Parents and Children (ALSPAC) recruited 14,775 babies of predominantly White ethnicity in the Avon county of south-west England between 1991 and 1992. Born in Bradford (BiB) is similarly focused on a particular local area, the city of Bradford in the north of England, and recruited 13,858 babies between 2007 and 2011, of whom ~41% self-report as white British and ~59% as other ethnicities, predominantly Pakistani. Millennium Cohort Study (MCS) is a national cohort that recruited 18,827 children born between 2000 and 2002, intentionally over-sampling areas with high child poverty, large ethnic minority populations, and smaller UK nations (Wales, Scotland and Northern Ireland). Available here is a subset of exome-sequenced parents and children from these studies (CRAMs and post-QC VCFs) as detailed in https://doi.org/10.12688/wellcomeopenres.22697. Phenotypic data is also available by submitting an application to the corresponding cohort: https://borninbradford.nhs.uk/	Illumina NovaSeq 6000	-
EGAD00001015371	Birth cohort studies involve repeated surveys of large numbers of individuals from birth and throughout their lives. They collect information useful for a wide range of life course research domains, and biological samples which can be used to derive data from an increasing collection of omic technologies. This rich source of longitudinal data, when combined with genomic data, offers the scientific community valuable insights from population genetics to rare disease associations. Avon Longitudinal Study of Parents and Children (ALSPAC) recruited 14,775 babies of predominantly White ethnicity in the Avon county of south-west England between 1991 and 1992. Born in Bradford (BiB) is similarly focused on a particular local area, the city of Bradford in the north of England, and recruited 13,858 babies between 2007 and 2011, of whom ~41% self-report as white British and ~59% as other ethnicities, predominantly Pakistani. Millennium Cohort Study (MCS) is a national cohort that recruited 18,827 children born between 2000 and 2002, intentionally over-sampling areas with high child poverty, large ethnic minority populations, and smaller UK nations (Wales, Scotland and Northern Ireland). Available here is a subset of exome-sequenced parents and children from these studies (CRAMs and post-QC VCFs) as detailed in https://doi.org/10.12688/wellcomeopenres.22697. Phenotypic data is also available by submitting an application to the corresponding cohort: https://www.bristol.ac.uk/alspac/researchers/our-data/	Illumina NovaSeq 6000	-
EGAD00001015372	Birth cohort studies involve repeated surveys of large numbers of individuals from birth and throughout their lives. They collect information useful for a wide range of life course research domains, and biological samples which can be used to derive data from an increasing collection of omic technologies. This rich source of longitudinal data, when combined with genomic data, offers the scientific community valuable insights from population genetics to rare disease associations. Avon Longitudinal Study of Parents and Children (ALSPAC) recruited 14,775 babies of predominantly White ethnicity in the Avon county of south-west England between 1991 and 1992. Born in Bradford (BiB) is similarly focused on a particular local area, the city of Bradford in the north of England, and recruited 13,858 babies between 2007 and 2011, of whom ~41% self-report as white British and ~59% as other ethnicities, predominantly Pakistani. Millennium Cohort Study (MCS) is a national cohort that recruited 18,827 children born between 2000 and 2002, intentionally over-sampling areas with high child poverty, large ethnic minority populations, and smaller UK nations (Wales, Scotland and Northern Ireland). Available here is a subset of exome-sequenced parents and children from these studies (CRAMs and post-QC VCFs) as detailed in https://doi.org/10.12688/wellcomeopenres.22697. Phenotypic data is also available by submitting an application to the corresponding cohort: https://cls.ucl.ac.uk/cls-studies/millennium-cohort-study/	Illumina NovaSeq 6000	15173
EGAD00001015373	Dataset for manuscript titled: Spatial Intra-Tumour Heterogeneity and Treatment-Induced Genomic Evolution in Oesophageal Adenocarcinoma: Implications for Prognosis and Therapy	HiSeq X Ten	1
EGAD00001015374	This dataset includes raw nanopore, base-called, and 6mA frequency data for EcoGII-treated NA12878 and MCF7 chromatin samples. It also includes raw nanopore data for the HG002 EcoGII-treated DNA.	PromethION	3
EGAD00001015376	Eccrine poroma (EP) and porocarcinoma (EPC) are rare benign and malignant adnexal neoplasms of the terminal sweat gland duct, respectively. Both can arise de novo; however, EPCs can also arise from a pre-existing EP. To date, genetic investigation of these tumors has involved studies with small sample sizes and/or limited analyses. To comprehensively compare the driver events and mutational landscape of these tumors, we performed a retrospective multi-institutional whole-exome sequencing and RNA sequencing study on the largest cohort of EPs and EPCs to date (n=90). We uncovered novel events and delineated different pathways of tumorigenesis underlying these tumors, with EPs driven largely by fusion genes, and EPCs driven largely by somatic mutations, with rare YAP1 and frequent PAK gene novel fusions.	Illumina NovaSeq 6000	1
EGAD00001015377	Eccrine poroma (EP) and porocarcinoma (EPC) are rare benign and malignant adnexal neoplasms of the terminal sweat gland duct, respectively. Both can arise de novo; however, EPCs can also arise from a pre-existing EP. To date, genetic investigation of these tumors has involved studies with small sample sizes and/or limited analyses. To comprehensively compare the driver events and mutational landscape of these tumors, we performed a retrospective multi-institutional whole-exome sequencing and RNA sequencing study on the largest cohort of EPs and EPCs to date (n=90). We uncovered novel events and delineated different pathways of tumorigenesis underlying these tumors, with EPs driven largely by fusion genes, and EPCs driven largely by somatic mutations, with rare YAP1 and frequent PAK gene novel fusions.	Illumina NovaSeq 6000	1
EGAD00001015378	Eccrine poroma (EP) and porocarcinoma (EPC) are rare benign and malignant adnexal neoplasms of the terminal sweat gland duct, respectively. Both can arise de novo; however, EPCs can also arise from a pre-existing EP. To date, genetic investigation of these tumors has involved studies with small sample sizes and/or limited analyses. To comprehensively compare the driver events and mutational landscape of these tumors, we performed a retrospective multi-institutional whole-exome sequencing and RNA sequencing study on the largest cohort of EPs and EPCs to date (n=90). We uncovered novel events and delineated different pathways of tumorigenesis underlying these tumors, with EPs driven largely by fusion genes, and EPCs driven largely by somatic mutations, with rare YAP1 and frequent PAK gene novel fusions.	Illumina NovaSeq 6000	1
EGAD00001015379	Eccrine poroma (EP) and porocarcinoma (EPC) are rare benign and malignant adnexal neoplasms of the terminal sweat gland duct, respectively. Both can arise de novo; however, EPCs can also arise from a pre-existing EP. To date, genetic investigation of these tumors has involved studies with small sample sizes and/or limited analyses. To comprehensively compare the driver events and mutational landscape of these tumors, we performed a retrospective multi-institutional whole-exome sequencing and RNA sequencing study on the largest cohort of EPs and EPCs to date (n=90). We uncovered novel events and delineated different pathways of tumorigenesis underlying these tumors, with EPs driven largely by fusion genes, and EPCs driven largely by somatic mutations, with rare YAP1 and frequent PAK gene novel fusions.	Illumina NovaSeq 6000	1
EGAD00001015380	Immune cells sense and respond to external stimuli, initiating an inflammatory response, with genetic variants potentially altering these responses. Traditional QTL mapping often uses naïve cells, missing condition-specific variants detectable post-stimulation. Our study uses a high-throughput platform to map eQTLs across 24 conditions using iPSC-derived macrophages, identifying that 76% of eQTLs in stimulated conditions were also present in naïve cells. We found response eQTLs (reQTLs) vary widely across conditions, with rare single-condition reQTLs being overrepresented among disease-colocalizing eQTLs. This study nominates 21.7% more disease effector genes at GWAS loci through reQTL colocalization, with 38.6% not found in the GTEx catalogue, highlighting the importance of condition-specific regulatory variation in understanding disease risk alleles. Our findings underscore the value of condition-specific studies in elucidating the genetic mechanisms underlying complex diseases. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/		1
EGAD00001015381	Refractory cancers may arise either through the acquisition of resistance mechanisms by cancer cells or represent distinct diseases. The origin of childhood T-cell acute lymphoblastic leukaemia (T-ALL) that does not respond to initial treatment, i.e. refractory disease, is unknown. Refractory T-ALL carries a poor prognosis and cannot be predicted at diagnosis. Here, we performed single cell mRNA sequencing of T-ALL from children who did or did not respond to initial treatment. We identified a transcriptionally distinctive blast population, exhibiting features of innate-like lymphocytes, as the major source of refractory disease. Evidence of such blasts at diagnosis heralded refractory disease across independent datasets and was associated with survival in a large, contemporary trial cohort. Our findings portray refractory T-ALL as a distinct disease. They may have immediate clinical utility.	Illumina NovaSeq 6000	1
EGAD00001015382	Exome capturing was performed using xGen Exome Research Panel v1.0 based on standard protocols. Paired-end sequencing (2 x 151 bp) was performed using Illumina NovaSeq6000.	Illumina NovaSeq 6000	2
EGAD00001015383	Single cell encapsulation and DNA libraries were prepared by Chromium™ Single Cell DNA Reagent Kits and Chromium™ Single Cell C and D Chip Kits.	Illumina NovaSeq 6000	1
EGAD00001015384	T cells develop from circulating precursor cells, which enter the thymus and migrate through specialised sub-compartments that support their maturation and selection. In humans, this process starts in early fetal development and is highly active until thymic involution in adolescence. To map the micro-anatomical underpinnings of this process in pre- and early postnatal stages, we established a novel quantitative morphological framework for the thymus, the Cortico-Medullary Axis, and used it to perform a spatially resolved analysis. By applying this framework to a curated multimodal single-cell atlas, spatial transcriptomics, and high-resolution multiplex imaging data, we demonstrate establishment of the lobular cytokine network, canonical thymocyte trajectories and thymic epithelial cell distributions within the first trimester of fetal development. We pinpoint tissue niches of thymic epithelial cell progenitors and distinct subtypes associated with Hassall’s corpuscles and uncover divergence in the timing of medullary entry between CD4 vs. CD8 T cell lineages. These findings provide a basis for a detailed understanding of T lymphocyte development and are complemented with a holistic toolkit for cross-platform imaging data analysis, annotation, and Organ Axis construction (TissueTag), which can be applied to any tissue.	Illumina NovaSeq 6000	-
EGAD00001015386	The complexity of tobacco smoke induced mutagenesis in head and neck cancer - sequence data (Mutographs)		1
EGAD00001015387	The complexity of tobacco smoke induced mutagenesis in head and neck cancer - patient metadata files (Mutographs)		1
EGAD00001015388	The complexity of tobacco smoke induced mutagenesis in head and neck cancer - filtered vcf files (Mutographs)		1
EGAD00001015389	The complexity of tobacco smoke induced mutagenesis in head and neck cancer - structural variation vcf files (Mutographs)		1
EGAD00001015390	The complexity of tobacco smoke induced mutagenesis in head and neck cancer - copy number variants (Mutographs)		1
EGAD00001015391	Patient-matched normal kidney organoid (103H) and MRT tumoroid (103T2) models were treated for 24h with either DMSO (ctrl), 400nM MTX, or 50nM BAY to investigate the direct effects of drug treatment on the expression of key metabolic enzymes in the nucleotide biosynthesis pathways.	Illumina NovaSeq X	1
EGAD00001015393	Short-read WGS datasets of 15 eHHV-6B-positive Japanese subjects (Illumina WGS) and long-read WGS datasets of 3 eHHV-6B-positive Japanese subjects with SLE (PacBio 30x HiFi long-read sequencing).	HiSeq X Ten Illumina NovaSeq 6000 Sequel II	18
EGAD00001015395	DNA Whole Exome Sequence for manuscript titled: Evaluation of Endobronchial Ultrasound-Guided Transbronchial Needle Aspiration (EBUS-TBNA) Samples from Advanced Non-Small Cell Lung Cancer for Whole Genome, Whole Exome and Comprehensive Panel Sequencing	Illumina NovaSeq 6000	1
EGAD00001015396	Illumina TSO500 DNA Dataset for Manuscript titled: Evaluation of Endobronchial Ultrasound-Guided Transbronchial Needle Aspiration (EBUS-TBNA) Samples from Advanced Non-Small Cell Lung Cancer for Whole Genome, Whole Exome and Comprehensive Panel Sequencing	NextSeq 550	1
EGAD00001015397	Illumina TSO500 RNA Dataset for Manuscript titled: Evaluation of Endobronchial Ultrasound-Guided Transbronchial Needle Aspiration (EBUS-TBNA) Samples from Advanced Non-Small Cell Lung Cancer for Whole Genome, Whole Exome and Comprehensive Panel Sequencing	NextSeq 550	1
EGAD00001015398	Cancer predisposition syndromes mediated by recessive cancer genes generate tumours via somatic variants (second hits) in the unaffected allele. Second hits may or may not be sufficient for neoplastic transformation. Here, we performed whole genome and exome sequencing on 479 tissue biopsies from a child with neurofibromatosis type 1, a multi-system cancer-predisposing syndrome mediated by constitutive monoallelic NF1 inactivation. We identified multiple independent NF1 driver variants in histologically normal tissues, but not in 610 biopsies from two non-predisposed children. We corroborated this finding using targeted duplex sequencing, including a further nine adults with the same syndrome. Overall, truncating NF1 mutations were under positive selection in normal tissues from individuals with neurofibromatosis type 1. We demonstrate that normal tissues in neurofibromatosis type 1 commonly harbour second hits in NF1, the extent and pattern of which may underpin the syndrome's cancer phenotype.	Illumina NovaSeq 6000	-
EGAD00001015399	DNA WGS Short Read Sequence (Illumina NovaSeq) for manuscript titled: "Performance of Somatic Structural Variant Calling in Lung Cancer using Oxford Nanopore Sequencing Technology"	Illumina NovaSeq 6000	1
EGAD00001015400	DNA WGS Long Read Sequence (PromethION) for manuscript titled: "Performance of Somatic Structural Variant Calling in Lung Cancer using Oxford Nanopore Sequencing Technology"	PromethION	1
EGAD00001015401	Pediatric acute lymphoblastic leukemia (ALL) is marked by low mutational load at initial diagnosis, which increases at relapse. The elevated mutational load at relapse can partly be explained by at least two therapy-related effects and a combination of therapy and underlying mismatch repair deficiency. However, our understanding of the type and timing of mutational mechanisms in relapsed ALL is limited, and it is unclear to what extent mutational processes contribute to disease progression. We collected a cohort of 29 Dutch ALL patients across multiple treatment protocols who had multiple relapses. Using whole genome sequencing of the sequential tumor samples of each patient we were able to distinguish the mutational processes active in relapsed ALL and could track the activity of mutational processes over time. This allowed us to investigate whether subtype-specific mutational processes at diagnosis can continue in relapse or emerge at relapse if absent in initial diagnosis. Furthermore, we assessed whether the activity of mutational processes contributed to disease development and relapse.	Illumina NovaSeq 6000	135
EGAD00001015403	HSPCs were sorted as CD34+ cells from bone marrow of patients with Hodgkin Lymphoma (HL). Control samples were isolated before initiating treatment. These were compared to samples from two additional HL patients who had relapsed and were being treated with nivolumab. Nivolumab was administered at a dose of 240 mg every two weeks, with BM samples collected the day before the scheduled dose. The patients receiving nivolumab were in remission at the time of BM collection. RNA was extracted using the Arcturus™ PicoPure™ RNA Isolation Kit (Thermo Fisher Scientific, #12204-01) according to the manufacturer's instructions. Paired-end sequencing reads were generated for each sample on the Illumina NovaSeq 6000 system.	Illumina NovaSeq 6000	4
EGAD00001015404	We provide a single-nuclei RNA-sequencing (snRNA-seq) dataset derived from four COVID-19 patients, generated using the 10x Chromium Next GEM Single Cell v3.1 kit. For our study, these snRNA-seq data were integrated with 12 publicly available sc/snRNA-seq datasets, comprising organ donor lung samples (n=89) and COVID-19 lung tissue samples (n=51). Additionally, we provide a spatial transcriptomic dataset characterizing different histopathological stages of diffuse alveolar damage across a cohort of 33 COVID-19 patients. This was generated using the Nanostring Whole Transcriptome Assay (WTA) with regions of interest (ROIs) sized 400 µm² each. This integrated multi-omics analysis of snRNA-seq, histology, and spatial transcriptomics data is available for exploration and download via our web portal: https://covid19-multiomicatlas.cellgeni.sanger.ac.uk/.	Illumina NovaSeq 6000	4
EGAD00001015405	We have collected RNA samples from whole blood of Kenyan children exposed to malaria in the Kilifi region of Kenya. Collections were performed each year from 2015 until 2018. This is a follow-up study to that described in Bediako et al. (in preparation). The SIMS consortium is seeking to identify the underlying reasons why some children are more susceptible to malaria than others. In this study we hope to track changes in children’s immune systems over time which relate to the number of malaria episodes they experience. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 2500	651
EGAD00001015406	A paradigm of childhood cancers is that they have a low mutation burden, with some ostensibly bearing fewer mutations than the normal tissues from which they derive. We set out to resolve this paradox by examining paediatric renal cancers with exceptionally few mutations using high resolution, high depth sequencing approaches. We found that apparent hypomutation was the result of unusual clonal architecture due to a normal tissue-like mode of tumour evolution, raising the possibility that the mutation burden of some cancers has been systematically misjudged.	HiSeq X Ten Illumina NovaSeq 6000	1
EGAD00001015407	This is a study of 100 patients with the aim to provide transcriptomic sequencing data analysis of cells isolated from bronchioalveolar lavage and blood samples from critically ill patients with pneumonia on an intensive care unit in order to investigate the host contribution to disease. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 4000	1
EGAD00001015408	Pleural mesothelioma (PM) requires new treatments. Drug repurposing offers a potential approach to finding effective therapies. This study aimed to explore new therapeutic options for PM using RNA sequencing. RNA sequencing was performed on 58 patient-derived PM cell lines. Drug screening of over 1300 compounds was conducted on 11 lines, with further testing of selected drugs on 48 lines. Top candidates were validated in 3D culture and mouse models.	Illumina HiSeq 4000 Illumina NovaSeq 6000	55
EGAD00001015409	We performed exome sequencing on the largest cohort of patient-derived pleural mesothelioma (PM) cell lines (n=58) to explore their molecular heterogeneity. The analysis recapitulated key features of PM tumors and uncovered novel mutations specific to this cohort. These findings provide insight into the genetic landscape of PM, supporting the identification of potential therapeutic targets and advancing the understanding of its molecular diversity.	Illumina NovaSeq 6000	55
EGAD00001015410	Paired-end fastq files from whole genome sequencing on Illumina HiSeq 4000 generated using bcl2fastq.	Illumina HiSeq 4000	21
EGAD00001015411	Characterising the evolutionary dynamics of cancer proliferation in single-cell clones with SPRINTER		1
EGAD00001015412	WGS, RNAseq, Methylation, variant calling analysis for a Novel paediatric case of a spinal high-grade astrocytoma with piloid features in a patient with Noonan Syndrome	Illumina NovaSeq 6000	3
EGAD00001015413	Neuroblastoma (NB) is one of the most lethal childhood cancers due to its propensity for treatment resistance. By spatial mapping of subclone geographies before and after chemotherapy across 89 tumor regions from 12 NBs, we find that densely packed territories of closely related subclones present at diagnosis are replaced under effective treatment by islands of distantly related survivor subclones, originating from a different most recent ancestor compared to lineages dominating before treatment. Conversely, in tumors that progressed under treatment, ancestors of subclones dominating later in disease are present already at diagnosis. Chemotherapy-treated xenografts and cell culture models replicate these two contrasting scenarios and show branching evolution to be a constant feature of proliferating NB cells. Phylogenies based on whole genome sequencing of 505 individual NB cells indicate that a rich repertoire of parallel subclones emerges already with the first oncogenic mutations and lays the foundation for clonal replacement under treatment.	Illumina NovaSeq 6000	1
EGAD00001015414	Neuroblastoma (NB) is one of the most lethal childhood cancers due to its propensity for treatment resistance. By spatial mapping of subclone geographies before and after chemotherapy across 89 tumor regions from 12 NBs, we find that densely packed territories of closely related subclones present at diagnosis are replaced under effective treatment by islands of distantly related survivor subclones, originating from a different most recent ancestor compared to lineages dominating before treatment. Conversely, in tumors that progressed under treatment, ancestors of subclones dominating later in disease are present already at diagnosis. Chemotherapy-treated xenografts and cell culture models replicate these two contrasting scenarios and show branching evolution to be a constant feature of proliferating NB cells. Phylogenies based on whole genome sequencing of 505 individual NB cells indicate that a rich repertoire of parallel subclones emerges already with the first oncogenic mutations and lays the foundation for clonal replacement under treatment.	Illumina NovaSeq 6000	1416
EGAD00001015416	Cell-free RNA from maternal plasma obtained at around 12, 20, 28 and 36 weeks of gestational age from cases of preeclampsia combined with fetal growth restriction (n=39) and their matched controls (n=156). All samples are from the Pregnancy Outcome Prediction (POP) study.	Illumina NovaSeq 6000	755
EGAD00001015417	Creation of living organoid biobank for Ewing and Ewing-like sarcomas		1
EGAD00001015418	Creation of living organoid biobank for Ewing and Ewing-like sarcomas	Illumina NovaSeq 6000	11
EGAD00001015419	Creation of living organoid biobank for Ewing and Ewing-like sarcomas		1
EGAD00001015420	Whole exome sequencing data generated from organoid cultures established from normal gastrointestinal organoids, long-term, spheroid and blood leukocyte DNA.	HiSeq X Ten Illumina HiSeq 1500 Illumina NovaSeq 6000	107
EGAD00001015421	RNASeq data generated from organoid cultures established from normal gastrointestinal organoids, long-term and paired tumor frozen tissues.	Illumina HiSeq 1500 Illumina NovaSeq 6000	109
EGAD00001015422	Single cell encapsulation and cDNA libraries were prepared by Chromium™ Single Cell 5’ Reagent Kits and Chromium™ Single Cell A Chip Kit.	Illumina NovaSeq 6000	25
EGAD00001015423	Single cell encapsulation and DNA libraries were prepared by Chromium™ Single Cell DNA Reagent Kits and Chromium™ Single Cell C and D Chip Kits.	Illumina NovaSeq 6000	1
EGAD00001015424	Sequencing of tissue samples and their derived organoids. This dataset contains a subset of colorectal and colorectal liver metastasis samples.	HiSeq X Ten Illumina NovaSeq 6000	1
EGAD00001015425	Sequencing of tissue samples and their derived organoids. This dataset contains a subset of colorectal and colorectal liver metastasis samples.	Illumina HiSeq 4000	1
EGAD00001015426	This batch is a subset of the full DETECT-A dataset, containing 12740 fastq files generated from 3211 subjects. All sequencing was conducted using Illumina HiSeq 4000 and Illumina MiSeq platforms. Note that the division into batches follows no specific criteria and that the sequencing data for each subject has multiple files which may span multiple datasets. Thus, for a comprehensive analysis, it is recommended to request access to all datasets that comprise this study.	Illumina HiSeq 4000 Illumina MiSeq	1
EGAD00001015427	This work aims to identify homogeneous and robust molecular subtypes in HCC based on a large, homogenous and well-annotated cohort. We performed RNA sequencing (RNA-seq) on 529 HCC from 461 patients collected mainly in France. Based on the consistency between genomic and transcriptomic data, we identified 9 robust HCC subtypes mainly based on driver mutations. We further characterized HCC subtypes using transcriptomic and clinicopathological features. Our study provided a robust molecular classification based on a large HCC dataset from a Western country, improving our understanding of the mechanisms of carcinogenesis and facilitating the development of genome-based precision medicine in HCC.	Illumina HiSeq 4000 Illumina NovaSeq 6000	45
EGAD00001015428	This work aims to identify homogeneous and robust molecular subtypes in HCC based on a large, homogenous and well-annotated cohort. We performed whole genome/exome sequencing (WGS/WES) on 529 HCC from 461 patients collected mainly in France. Based on the genomic data, we identified 9 robust HCC subtypes primarily driven by key mutations. We further characterized these subtypes using genomic and clinicopathological features. Among the 9 subtypes, 5 belonged to chromosome instable tumors, while 3 belonged to chromosome stable tumors. Our subtypes were associated with prognosis and showed distinct distributions across features like etiology and gender. This study offers a comprehensive molecular classification of HCC, enhancing our understanding of hepatocarcinogenesis and supporting the development of genome-based precision medicine for liver cancer.	Illumina HiSeq 2000 Illumina HiSeq 4000 Illumina NovaSeq 6000	67
EGAD00001015429	Bacteriophage Immunoprecipitation Sequencing (PhIP-Seq) datasets of three eHHV-6B-positive SLE patients (three technical replicates), six eHHV-6B-negative SLE patients (three technical replicates), and a negative control lacking sera or plasma (12 technical replicates; Illumina NextSeq) and sequencing data of HHV-6 peptide phage library (two technical replicates; Illumina NextSeq).	NextSeq 2000	43
EGAD00001015430	Somatic variants accumulate in non-malignant tissues with age. Functional variants leading to clonal advantage of hepatocytes accumulate in the liver from patients with acquired chronic liver disease (CLD). Whether somatic variants are common to CLD from differing aetiologies is unknown. We analysed liver somatic variants in patients with genetic CLD from alpha-1 anti-trypsin (A1AT) deficiency or haemochromatosis. We show that somatic variants in SERPINA1, the gene encoding A1AT, are strongly selected for in A1AT deficiency, with evidence of convergent evolution. Acquired SERPINA1 variants are clustered leading to truncation or coding change at the C-terminus of A1AT. In vitro and in vivo, C-terminal truncation variants reduce disease-associated Z-A1AT polymer accumulation and disruption of the endoplasmic reticulum, supporting the C-terminal domain swap mechanism. Therefore, somatic escape variants from a deleterious germline variant are selected for in A1AT deficiency, suggesting functional somatic variants are disease-specific in CLD and point to disease-associated mechanisms.	Illumina NovaSeq 6000	1
EGAD00001015431	Somatic variants accumulate in non-malignant tissues with age. Functional variants leading to clonal advantage of hepatocytes accumulate in the liver from patients with acquired chronic liver disease (CLD). Whether somatic variants are common to CLD from differing aetiologies is unknown. We analysed liver somatic variants in patients with genetic CLD from alpha-1 anti-trypsin (A1AT) deficiency or haemochromatosis. We show that somatic variants in SERPINA1, the gene encoding A1AT, are strongly selected for in A1AT deficiency, with evidence of convergent evolution. Acquired SERPINA1 variants are clustered leading to truncation or coding change at the C-terminus of A1AT. In vitro and in vivo, C-terminal truncation variants reduce disease-associated Z-A1AT polymer accumulation and disruption of the endoplasmic reticulum, supporting the C-terminal domain swap mechanism. Therefore, somatic escape variants from a deleterious germline variant are selected for in A1AT deficiency, suggesting functional somatic variants are disease-specific in CLD and point to disease-associated mechanisms.	Illumina NovaSeq 6000	1
EGAD00001015432	This dataset contains fastq files derived from whole genome sequencing of primary bone marrow samples from acute myeloid leukemia patients with different DNMT3A mutational status (wildtype, single mutant or double mutant).	Illumina NovaSeq 6000	44
EGAD00001015433	The GASCAD-II dataset from the Singapore Gastric Cancer Consortium includes paired tumor-blood whole exome sequencing data for 209 gastric cancer (GC) patients, along with whole transcriptome sequencing data for 125 GC samples. Whole exome sequencing was conducted using Agilent SureSelect Human All Exon V6 kits. For RNA sequencing, total RNA was isolated using the RNeasy Mini Kit, and libraries were prepared with the TruSeq Stranded Total RNA with Ribo-Zero Gold protocol (Illumina). Aligned BAM files for both exome and RNAseq data are included in this dataset.	Illumina NovaSeq 6000	1
EGAD00001015434	Data supporting: "Utility of ctDNA assessment after six weeks of immunotherapy to predict radiological response in advanced oesophageal cancer" Linossi et al	Illumina NovaSeq 6000 unspecified	33
EGAD00001015435	Data supporting: "Genomic and epidemiological similarities between phenotypically distinct esophageal adenocarcinoma suggest a single entity" Zamani et al	HiSeq X Five Illumina HiSeq 2000 Illumina NovaSeq 6000 unspecified	2
EGAD00001015437	Geographic and age-related variations in mutational processes in colorectal cancer - patient metadata files (Mutographs)		1
EGAD00001015440	Core phenotype data for AWI-Gen Phase 2 Microbiome Project for 1824 samples.		1
EGAD00001015441	We generated and characterized tumoroids from small cell carcinoma of the ovary, hypercalcemic type. Furthermore, we identified a drug that is selective and effective against SCCOHT tumoroids.	NextSeq 2000	12
EGAD00001015442	We generated and characterized tumoroids from small cell carcinoma of the ovary, hypercalcemic type. Furthermore, we identified a drug that is selective and effective against SCCOHT tumoroids.		1
EGAD00001015443	WGS files for paper titled "Fusion oncoproteins and cooperating mutations define disease phenotypes in NUP98-rearranged leukemia" PMID: 39974131, PMCID: PMC11838931, DOI: 10.1101/2025.01.21.25320683	Illumina HiSeq 2000	3
EGAD00001015444	WXS files for paper titled "Fusion oncoproteins and cooperating mutations define disease phenotypes in NUP98-rearranged leukemia" PMID: 39974131, PMCID: PMC11838931, DOI: 10.1101/2025.01.21.25320683	Illumina HiSeq 2000	3
EGAD00001015445	RNASeq files for paper titled "Fusion oncoproteins and cooperating mutations define disease phenotypes in NUP98-rearranged leukemia" PMID: 39974131, PMCID: PMC11838931, DOI: 10.1101/2025.01.21.25320683	Illumina HiSeq 2000	10
EGAD00001015446	This dataset includes both the anchor scRNA-seq and snRNA-seq datasets used to build the Human Endometrial Cell Atlas (HECA). HECA provides a comprehensive definition of endometrial cell types and states throughout the menstrual cycle in donors with and without endometriosis. It identifies consensus cell types across datasets generated by teams worldwide, as well as previously unreported cell types, all of which are validated and mapped in situ using spatial transcriptomics. Processed data is available from ArrayExpress with accession number E-MTAB-14039. See Marečková, M., Garcia-Alonso, L., Moullet, M. et al. An integrated single-cell reference atlas of the human endometrium. Nat Genet 56, 1925–1937 (2024) https://doi.org/10.1038/s41588-024-01873-w for more information.	Illumina NovaSeq 6000	1
EGAD00001015447	The thyroid gland produces hormones essential for health from embryogenesis to adulthood. Thyroid disorders, including congenital hypothyroidism and thyroid carcinoma, are prevalent and pose significant health challenges. Congenital hypothyroidism, which often results from thyroid dysgenesis or impaired hormone synthesis, is particularly prevalent in trisomy 21 (T21), while thyroid carcinoma is the most frequent endocrine malignancy, affecting both paediatric and adult populations. Understanding the molecular basis of these conditions requires deeper insights into fetal thyroid development. We generated a spatiotemporal atlas of the human thyroid during early pregnancy, revealing key cell types, including hormone-producing thyrocytes. Thyroid follicular cells are heterogeneous, with two functional cell states (fTFC1, fTFC2) that persist into adulthood, with fTFC2 characterised by elevated PAX8 expression. We demonstrated that T21 thyroids displayed dysgenesis with disrupted follicular morphology and altered extracellular matrix interactions, and that the fTFC2 signature was enriched in paediatric papillary thyroid cancer compared to adults. These findings uncover thyrocyte heterogeneity across the lifespan, advancing understanding of thyroid development and disease.	Illumina NovaSeq 6000	1
EGAD00001015448	The thyroid gland produces hormones essential for health from embryogenesis to adulthood. Thyroid disorders, including congenital hypothyroidism and thyroid carcinoma, are prevalent and pose significant health challenges. Congenital hypothyroidism, which often results from thyroid dysgenesis or impaired hormone synthesis, is particularly prevalent in trisomy 21 (T21), while thyroid carcinoma is the most frequent endocrine malignancy, affecting both paediatric and adult populations. Understanding the molecular basis of these conditions requires deeper insights into fetal thyroid development. We generated a spatiotemporal atlas of the human thyroid during early pregnancy, revealing key cell types, including hormone-producing thyrocytes. Thyroid follicular cells are heterogeneous, with two functional cell states (fTFC1, fTFC2) that persist into adulthood, with fTFC2 characterised by elevated PAX8 expression. We demonstrated that T21 thyroids displayed dysgenesis with disrupted follicular morphology and altered extracellular matrix interactions, and that the fTFC2 signature was enriched in paediatric papillary thyroid cancer compared to adults. These findings uncover thyrocyte heterogeneity across the lifespan, advancing understanding of thyroid development and disease.	Illumina NovaSeq 6000	1
EGAD00001015449	AWI-Gen Phase 2 Microbiome Project for 1820 samples.	Illumina NovaSeq 6000	1
EGAD00001015450	Our study utilizes two novel techniques, cyclical immunofluorescence imaging and single-cell spatial transcriptomics, to spatially map the tumor microenvironment of a large sample set of pediatric high-grade gliomas (pHGG). Using these methods, we have identified an abundant immunosuppressive myeloid cell population that has not been described in the context of pediatric high-grade gliomas. We validate our findings using an in vitro assay, spatial analysis on additional pHGG biopsies, independent spatial transcriptomic data, bulk RNA sequencing data of an expanded cohort of in-house patients, and publicly available bulk RNA sequencing data of an even larger cohort of pHGG patients.		1
EGAD00001015452	Abstract: Children with Down syndrome have a 150-fold increased risk of developing myeloid leukaemia (ML-DS). Unusually for a childhood leukaemia, ML-DS arises from a preleukaemic state, termed transient abnormal myelopoiesis (TAM), and via a conserved sequence of mutations. Here, we examined the relationship between the genetic and transcriptional evolution of ML-DS from a rich collection of primary patient samples through single cell mRNA sequencing, complemented by phylogenetic analyses in progressive disease. We distilled the transcriptional consequence of each genetic step in the evolution of ML-DS, showing that TAM-defining GATA1 mutations account for most of the ML-DS transcriptome, including those of progressive disease. This transcriptional backbone may thus represent a common vulnerability in ML-DS. We extracted the transcriptional difference between TAM and ML-DS which, unexpectedly, reflected features shared across childhood leukaemia. Our approach delineates the transcriptional evolution of ML-DS and provides an analytical blueprint for distilling consequences of mutations directly from patient samples.	Illumina HiSeq 4000 Illumina NovaSeq 6000	1
EGAD00001015453	Abstract: Children with Down syndrome have a 150-fold increased risk of developing myeloid leukaemia (ML-DS). Unusually for a childhood leukaemia, ML-DS arises from a preleukaemic state, termed transient abnormal myelopoiesis (TAM), and via a conserved sequence of mutations. Here, we examined the relationship between the genetic and transcriptional evolution of ML-DS from a rich collection of primary patient samples through single cell mRNA sequencing, complemented by phylogenetic analyses in progressive disease. We distilled the transcriptional consequence of each genetic step in the evolution of ML-DS, showing that TAM-defining GATA1 mutations account for most of the ML-DS transcriptome, including those of progressive disease. This transcriptional backbone may thus represent a common vulnerability in ML-DS. We extracted the transcriptional difference between TAM and ML-DS which, unexpectedly, reflected features shared across childhood leukaemia. Our approach delineates the transcriptional evolution of ML-DS and provides an analytical blueprint for distilling consequences of mutations directly from patient samples.	Illumina NovaSeq 6000	1
EGAD00001015455	Primary human cells cultured in organoid format have great promise as potential regenerative cellular therapies. However, their immunogenicity and mutagenic profile remain unresolved, impeding effective long-term translation to the clinic. In this study we report, for the first time, the generation of human leukocyte antigen (HLA)-I and HLA-II knock-out human expandable primary cholangiocyte organoids (PCOs) using CRISPR-Cas9 as a potential ‘universal’ low-immunogenic therapy for bile duct disorders. HLA-edited PCOs (ePCOs) displayed the same phenotypic and functional characteristics as parental unedited PCOs. Despite minimal off-target edits, duplex sequencing approaches demonstrated that ePCOs and PCOs acquire mutations in culture at similar rates, but without evident selection for cancer-driver mutations. ePCOs induced reduced T cell-mediated immunity and donor-dependent NK cell cytotoxicity in vitro and evaded cytotoxic responses with increased graft survival in humanized mice in vivo. Our findings have important implications for assessment of safety and immunogenicity of primary cell-derived organoid cellular therapies.	Illumina NovaSeq 6000	1
EGAD00001015456	Primary human cells cultured in organoid format have great promise as potential regenerative cellular therapies. However, their immunogenicity and mutagenic profile remain unresolved, impeding effective long-term translation to the clinic. In this study we report, for the first time, the generation of human leukocyte antigen (HLA)-I and HLA-II knock-out human expandable primary cholangiocyte organoids (PCOs) using CRISPR-Cas9 as a potential ‘universal’ low-immunogenic therapy for bile duct disorders. HLA-edited PCOs (ePCOs) displayed the same phenotypic and functional characteristics as parental unedited PCOs. Despite minimal off-target edits, duplex sequencing approaches demonstrated that ePCOs and PCOs acquire mutations in culture at similar rates, but without evident selection for cancer-driver mutations. ePCOs induced reduced T cell-mediated immunity and donor-dependent NK cell cytotoxicity in vitro and evaded cytotoxic responses with increased graft survival in humanized mice in vivo. Our findings have important implications for assessment of safety and immunogenicity of primary cell-derived organoid cellular therapies.	Illumina NovaSeq 6000	5
EGAD00001015457	Primary human cells cultured in organoid format have great promise as potential regenerative cellular therapies. However, their immunogenicity and mutagenic profile remain unresolved, impeding effective long-term translation to the clinic. In this study we report, for the first time, the generation of human leukocyte antigen (HLA)-I and HLA-II knock-out human expandable primary cholangiocyte organoids (PCOs) using CRISPR-Cas9 as a potential ‘universal’ low-immunogenic therapy for bile duct disorders. HLA-edited PCOs (ePCOs) displayed the same phenotypic and functional characteristics as parental unedited PCOs. Despite minimal off-target edits, duplex sequencing approaches demonstrated that ePCOs and PCOs acquire mutations in culture at similar rates, but without evident selection for cancer-driver mutations. ePCOs induced reduced T cell-mediated immunity and donor-dependent NK cell cytotoxicity in vitro and evaded cytotoxic responses with increased graft survival in humanized mice in vivo. Our findings have important implications for assessment of safety and immunogenicity of primary cell-derived organoid cellular therapies.	Illumina NovaSeq 6000	1
EGAD00001015458	This batch is a subset of the full DETECT-A dataset, containing 9 fastq files generated from 8 subjects. All sequencing was conducted using Illumina HiSeq 4000 and Illumina MiSeq platforms. Note that the division into batches follows no specific criteria and that the sequencing data for each subject has multiple files which may span multiple datasets. Thus, for a comprehensive analysis, it is recommended to request access to all datasets that comprise this study.	Illumina HiSeq 4000	1
EGAD00001015460	Transcriptomic data for manuscript titled: Human proximal tubular epithelial cell interleukin-1 receptor signalling triggers cell cycle arrest during hypoxic kidney injury.	NextSeq 2000	16
EGAD00001015461	This dataset contains raw paired-end fastq files from 10x Genomics Visium spatial transcriptomics platform of four early-stage VSCC cases (PMID: 39787746)	NextSeq 2000	16
EGAD00001015464	TrypanoGen+ data containing fastq files from 138 samples.	Illumina Genome Analyzer	125
EGAD00001015465	Combination immune checkpoint inhibition (ICI) with anti-CTLA-4 and anti-PD-1 blockade has demonstrated significant clinical activity across tumour types. Rare cancer patients have limited treatment options due to the scarcity of studies of novel treatment options, so they are often offered chemotherapies untested in their disease. ONJ2016-001/CA209-538 is a prospective, multicentre clinical trial of combination ICI (ipilimumab and nivolumab) in patients with advanced rare cancers, including biliary tract, adrenocortical, rare gynaecological and neuroendocrine cancers (https://clinicaltrials.gov/ct2/show/NCT02923934). The primary endpoint is clinical benefit rate (complete response + partial response + stable disease >3 months) by RECIST 1.1. This study involves generation and analysis of pre-treatment formalin fixed paraffin embedded (FFPE) tumour DNA sequencing data, paired with germline DNA sequencing data, to potentially identify genomic biomarkers that may associate with clinical benefit or other clinical efficacy / toxicity metrics.	Illumina NovaSeq 6000	1
EGAD00001015467		unspecified	13
EGAD00001015468	Targeted gene sequencing data from cancer patient-derived organoids and their matched tumour tissue samples. Part of a study to include genomic and transcriptomic sequencing data, along with CRISPR screening data from cancer patient-derived organoids and matched tumour tissues. Some organoids were developed as part of the Human Cancer Models Initiative (HCMI) in collaboration with Cancer Research UK (CRUK). The study includes organoids derived from colorectal, oesophageal, ovarian, pancreatic, and stomach cancer samples.	HiSeq X Ten Illumina HiSeq 4000 Illumina NovaSeq 6000	1
EGAD00001015469	Whole genome sequencing data from cancer patient-derived organoids and their matched tumour tissue samples. Part of a study to include genomic and transcriptomic sequencing data, along with CRISPR screening data from cancer patient-derived organoids and matched tumour tissues. Some organoids were developed as part of the Human Cancer Models Initiative (HCMI) in collaboration with Cancer Research UK (CRUK). The study includes organoids derived from colorectal, oesophageal, ovarian, pancreatic, and stomach cancer samples.	HiSeq X Ten Illumina HiSeq 4000 Illumina NovaSeq 6000	1
EGAD00001015470	RNA sequencing data from cancer patient-derived organoids. Part of a study to include genomic and transcriptomic sequencing data, along with CRISPR screening data from cancer patient-derived organoids and matched tumour tissues. Some organoids were developed as part of the Human Cancer Models Initiative (HCMI) in collaboration with Cancer Research UK (CRUK). The study includes organoids derived from colorectal, oesophageal, ovarian, pancreatic, and stomach cancer samples.	Illumina HiSeq 4000 Illumina NovaSeq 6000	1
EGAD00001015471	The succession of somatic genetic events associated with the conversion of a normal colorectal epithelial cell into a colorectal carcinoma constitutes a paradigmatic model of cancer development. Familial Adenomatous Polyposis (FAP) is caused by constitutional inactivating mutations in APC, the central gatekeeper gene of colorectal cancer, and is associated with a substantially increased lifetime-risk of colorectal cancer. To investigate the earliest stages of neoplastic change due to APC inactivation, we microdissected and individually whole genome sequenced 279 histologically normal and abnormal colorectal crypts from 15 individuals with FAP. Histologically normal crypts generally exhibited similar mutation burdens and mutational signatures to normal crypts from wild-type individuals of the same age, with 1/110 carrying a somatic inactivating APC mutation. By contrast, 9/18 aberrant crypt foci carried somatic APC mutations and exhibited modestly increased burdens of some mutational signatures found in normal crypts. 12/13 diminutive adenomatous polyps (less than 5mm diameter) showed somatic APC mutations and carried substantially increased mutation loads of most mutational signatures present in normal crypts. Phylogenetic trees of crypts from aberrant crypt foci and adenomatous polyps revealed that some had acquired their initiating somatic APC mutations decades previously during the first few years of life. The results catalogue the changes in somatic mutation rates, mutational processes and “driver” mutations in cancer genes during the earliest stages of colorectal neoplastic transformation initiated by APC inactivation and highlight the long periods of clonal evolution required for a cancer to develop.	HiSeq X Ten Illumina NovaSeq 6000	1
EGAD00001015472	This dataset contains 13 RNA-seq samples in CRAM format, generated using the Illumina NovaSeq 6000 platform. The Roche_mRNA_UniversalAdapterUniquePrime protocol was used for library preparation. The aim of the study is to determine the effect of Short-Term Fasting on the abundance and phenotype of different immune cell populations.		1
EGAD00001015474	WES and RNAseq data from Clonal driver neoantigen loss under EGFR TKI and immune selection pressures	Illumina HiSeq 4000	10
EGAD00001015475	RNASeq files for paper titled "Preclinical Pediatric Molecular Analysis for Therapy Choice (MATCH)"	Illumina HiSeq 2000	578
EGAD00001015476	Transcriptomic Data for Manuscript with title: Comprehensive genomic profiling in esophageal adenocarcinoma unmasks potential precision therapies	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000 NextSeq 2000	57
EGAD00001015477	The succession of somatic genetic events associated with the conversion of a normal colorectal epithelial cell into a colorectal carcinoma constitutes a paradigmatic model of cancer development. Familial Adenomatous Polyposis (FAP) is caused by constitutional inactivating mutations in APC, the central gatekeeper gene of colorectal cancer, and is associated with a substantially increased lifetime-risk of colorectal cancer. To investigate the earliest stages of neoplastic change due to APC inactivation, we microdissected and individually whole genome sequenced 279 histologically normal and abnormal colorectal crypts from 15 individuals with FAP. Histologically normal crypts generally exhibited similar mutation burdens and mutational signatures to normal crypts from wild-type individuals of the same age, with 1/110 carrying a somatic inactivating APC mutation. By contrast, 9/18 aberrant crypt foci carried somatic APC mutations and exhibited modestly increased burdens of some mutational signatures found in normal crypts. 12/13 diminutive adenomatous polyps (less than 5mm diameter) showed somatic APC mutations and carried substantially increased mutation loads of most mutational signatures present in normal crypts. Phylogenetic trees of crypts from aberrant crypt foci and adenomatous polyps revealed that some had acquired their initiating somatic APC mutations decades previously during the first few years of life. The results catalogue the changes in somatic mutation rates, mutational processes and “driver” mutations in cancer genes during the earliest stages of colorectal neoplastic transformation initiated by APC inactivation and highlight the long periods of clonal evolution required for a cancer to develop.		1
EGAD00001015478	CUTRUN files for paper titled "Fusion oncoproteins and cooperating mutations define disease phenotypes in NUP98-rearranged leukemia"	Illumina HiSeq 2000	30
EGAD00001015479	scRNASeq files for paper titled "Fusion oncoproteins and cooperating mutations define disease phenotypes in NUP98-rearranged leukemia" PMID: 39974131, PMCID: PMC11838931, DOI: 10.1101/2025.01.21.25320683	Illumina HiSeq 2000	9
EGAD00001015480	Virtually no other tumour type is associated with so many different forms as skin cancer. Histologically, tumours of the skin may arise from epithelium, including epidermis, hair follicle, sebaceous or sweat gland, melanocytes, dermal-associated mesenchymal structures or tissue resident immune cells, making for a diversity of clinical presentations. Importantly, many skin tumour types have an extremely poor prognosis. Many skin tumour subtypes have never undergone molecular profiling, or if they have, targeted sequencing has been used and the number of cases analyzed has been so limited that deriving firm conclusions about the profile of driver genes, DNA mutational signatures and germline alleles has not been possible. For 70 key skin tumour subtypes defined by the World Health Organization, that have not undergone extensive genetic analysis previously, we propose to perform whole exome and transcriptome sequencing, from a range of body sites, to build a genomic atlas of dermatological tumours, including detailed maps of SNVs, copy number alterations, genome-wide methylation and expression profiles.	Illumina NovaSeq 6000	1
EGAD00001015481	Virtually no other tumour type is associated with so many different forms as skin cancer. Histologically, tumours of the skin may arise from epithelium, including epidermis, hair follicle, sebaceous or sweat gland, melanocytes, dermal-associated mesenchymal structures or tissue resident immune cells, making for a diversity of clinical presentations. Importantly, many skin tumour types have an extremely poor prognosis. Many skin tumour subtypes have never undergone molecular profiling, or if they have, targeted sequencing has been used and the number of cases analyzed has been so limited that deriving firm conclusions about the profile of driver genes, DNA mutational signatures and germline alleles has not been possible. For 70 key skin tumour subtypes defined by the World Health Organization, that have not undergone extensive genetic analysis previously, we propose to perform whole exome and transcriptome sequencing, from a range of body sites, to build a genomic atlas of dermatological tumours, including detailed maps of SNVs, copy number alterations, genome-wide methylation and expression profiles.	Illumina NovaSeq 6000	1
EGAD00001015482	The SETBP1 gene encodes a DNA-binding nuclear protein that regulates gene expression across multiple tissues. Precise control of SETBP1 dosage is essential for normal cellular function, as both loss- and gain-of-function variants of SETBP1 can lead to severe phenotypic consequences. De novo germline gain-of-function point mutations that prolong SETBP1 half-life result in Schinzel-Giedion syndrome (SGS), an ultra-rare and severe congenital disorder associated with extensive developmental abnormalities and health complications. These SGS-associated variants cluster within a hotspot in the SKI-homology domain of SETBP1, overlapping with somatic mutations recurrently observed in myeloid leukaemia with poor prognosis. Thus far, the precise mechanisms by which these SETBP1 mutations drive disease remain largely unclear. Here, through single-cell and single-nucleus mRNA sequencing of a rare collection of primary patient samples with SGS and myeloid leukaemia, we examined the transcriptional consequences of SETBP1 gain-of-function variants in haematopoietic cells, comparing their impacts when acquired in the germline and somatically in malignancies.	Illumina NovaSeq 6000	1
EGAD00001015484	WXS files for paper titled "Preclinical Pediatric Molecular Analysis for Therapy Choice (MATCH)"	Illumina HiSeq 2000	752
EGAD00001015485	Geographic and age-related variations in mutational processes in colorectal cancer - sequence data (Mutographs)		1
EGAD00001015486	Geographic and age-related variations in mutational processes in colorectal cancer - filtered vcf files (Mutographs)		1
EGAD00001015487	Geographic and age-related variations in mutational processes in colorectal cancer - copy number variants (Mutographs)		1
EGAD00001015488	Bulk RNASeq data of Duodenal organoids, sorted ILC3 and duodenal tissue	Illumina NovaSeq 6000	84
EGAD00001015489	The dataset comprises plasma and bronchoalveolar lavage (BAL) fluid samples from immunocompromised pediatric patients. Sequencing was performed on the NovaSeq 6000 platform, generating 2x150bp paired-end reads. The samples are provided as raw reads, post-demultiplexing, with no additional processing.	Illumina NovaSeq 6000	46
EGAD00001015490	WGS files for Klco paper titled "Genomic Landscape and Clonal Architecture in Pediatric Myeloid Neoplasms with Chromosome 7 Deletions"	Illumina HiSeq 2000	8
EGAD00001015491	WXS files for Klco paper titled "Genomic Landscape and Clonal Architecture in Pediatric Myeloid Neoplasms with Chromosome 7 Deletions"	Illumina HiSeq 2000	8
EGAD00001015492	RNASeq files for Klco paper titled "Genomic Landscape and Clonal Architecture in Pediatric Myeloid Neoplasms with Chromosome 7 Deletions"	Illumina HiSeq 2000	11
EGAD00001015493	MissionBio files for Klco paper titled "Genomic Landscape and Clonal Architecture in Pediatric Myeloid Neoplasms with Chromosome 7 Deletions"	Illumina HiSeq 2000	8
EGAD00001015494	This dataset focuses on mitochondrial variant detection using the TAMITO-seq protocol. A pre-designed mitochondrial panel (https://designer.missionbio.com/catalogpanels/Virtual-mtDNA) was spiked into the scTAMseq workflow targeting 200 CpGs and 167 genotyping loci. Data were generated from a peripheral blood sample of a healthy donor, donor X.2, multiplexed with data from another donor.	Illumina NovaSeq 6000	1
EGAD00001015495	This dataset comprises targeted single-cell RNA expression profiles derived from the same CD34+ bone marrow sample (patient X.1) used for scTAMARA_DNA. 120 RNA targets were selected using LASSO regression to capture key transcripts predictive of cell state within the hematopoietic compartment.	Illumina NovaSeq 6000	1
EGAD00001015496	This dataset features single-cell DNA methylation and nuclear genotyping data obtained using the scTAMARA-seq protocol, a combination of SDR-seq and scTAM-seq. The targeted panel, comprising 200 DNA methylation targets (a subset of the original 448 CpG targets) and 167 genotyping targets, was applied to a CD34+ enriched bone marrow sample from patient X.1. The data was generated on the Mission Bio Tapestri platform.	Illumina NovaSeq 6000	2
EGAD00001015497	This dataset contains single-cell surface marker expression profiles generated by integrating the Cite-seq protocol with scTAMseq. Samples were stained with an antibody panel targeting 45 (donors A.6 & A.7) and 56 (donors A.1-A.5, B.1-B.5, and X.2) surface proteins to enable robust phenotypic characterization.	Illumina NovaSeq 6000	10
EGAD00001015498	This dataset comprises single-cell DNA methylation and nuclear genotyping profiles of 7 total bone marrow aspirates (donors A.1-A.7) and 5 CD34+-enriched bone marrow samples (donors B.1-B.5) generated using the scTAMseq protocol. Sequencing was performed on the Mission Bio Tapestri platform, targeting a panel of 448 CpGs to assess epigenetic patterns. We also included 147 genomic regions commonly mutated in clonal hematopoiesis and 20 regions targeting chromosome Y. Samples B.1 & B.5, B.2 & B.4 and A.2 & A.5 were multiplexed, in pairs, into single Tapestri lanes.	Illumina NovaSeq 6000	9
EGAD00001015501	30X Illumina HiSeqX whole genome sequencing of 200 samples from the CROATIA-Korcula Study. FASTQ files are deposited	HiSeq X Ten	200
EGAD00001015502	Key objective: Are the prognostic transcriptomic G1/G2 gene expression signature, MYC overexpression, and MYC amplification replicable stratifying biomarkers for future clinical trials in high-grade osteosarcoma? Knowledge gathered: In an unselected cohort, the G2 gene expression signature and MYC overexpression, but not MYC amplification, were independently associated with poor event-free and overall survival. Relevance: Transcriptomic biomarkers may serve as stratifying factors that guide the management of patients with high-grade osteosarcoma. Current data underline the importance of prospective validation of the G1/G2 signature and MYC overexpression in an international, multicenter study.		1
EGAD00001015503	Key objective: Are the prognostic transcriptomic G1/G2 gene expression signature, MYC overexpression, and MYC amplification replicable stratifying biomarkers for future clinical trials in high-grade osteosarcoma? Knowledge gathered: In an unselected cohort, the G2 gene expression signature and MYC overexpression, but not MYC amplification, were independently associated with poor event-free and overall survival. Relevance: Transcriptomic biomarkers may serve as stratifying factors that guide the management of patients with high-grade osteosarcoma. Current data underline the importance of prospective validation of the G1/G2 signature and MYC overexpression in an international, multicenter study.		1
EGAD00001015504	Multiome sequencing of patient tissue and patient derived organoids in colon cancer	Illumina NovaSeq 6000	22
EGAD00001015505	This dataset contains panel sequencing data of 228 samples from patients with glioblastoma. Sequencing has been performed on Illumina NovaSeq 6000 and NextSeq 500. The sequencing was always paired.	Illumina NovaSeq 6000 NextSeq 500	227
EGAD00001015508	In this study, we demonstrate that primary AML cells harboring the chromosomal translocation t(8;21) are critically dependent on the corresponding fusion gene, RUNX1::RUNX1T1, to suppress differentiation and maintain stemness. Silencing RUNX1::RUNX1T1 expression using siRNA-loaded lipid nanoparticles induces significant changes in chromatin accessibility, redirecting the leukemia-associated transcriptional network towards a myeloid differentiation program. Single-cell analyses reveal that this transcriptional reprogramming is associated with the depletion of immature stem and progenitor-like cell populations, accompanied by an expansion of granulocytic and eosinophilic/mast cell-like populations with impaired self-renewal capacity.		1
EGAD00001015509	In this study, we demonstrate that primary AML cells harboring the chromosomal translocation t(8;21) are critically dependent on the corresponding fusion gene, RUNX1::RUNX1T1, to suppress differentiation and maintain stemness. Silencing RUNX1::RUNX1T1 expression using siRNA-loaded lipid nanoparticles induces significant changes in chromatin accessibility, redirecting the leukemia-associated transcriptional network towards a myeloid differentiation program. Single-cell analyses reveal that this transcriptional reprogramming is associated with the depletion of immature stem and progenitor-like cell populations, accompanied by an expansion of granulocytic and eosinophilic/mast cell-like populations with impaired self-renewal capacity.	Illumina NovaSeq 6000	6
EGAD00001015510	In this study, we demonstrate that primary AML cells harboring the chromosomal translocation t(8;21) are critically dependent on the corresponding fusion gene, RUNX1::RUNX1T1, to suppress differentiation and maintain stemness. Silencing RUNX1::RUNX1T1 expression using siRNA-loaded lipid nanoparticles induces significant changes in chromatin accessibility, redirecting the leukemia-associated transcriptional network towards a myeloid differentiation program. Single-cell analyses reveal that this transcriptional reprogramming is associated with the depletion of immature stem and progenitor-like cell populations, accompanied by an expansion of granulocytic and eosinophilic/mast cell-like populations with impaired self-renewal capacity.	Illumina NovaSeq 6000	3
EGAD00001015514	Solve-RD SR-WGS data generated for 3 samples provided by the group tuh-kounap	DNBSEQ-G400	1
EGAD00001015515	WGS bam files for paired diagnostic and remission samples	Illumina NovaSeq 6000	30
EGAD00001015516	Abstract: Cancer cells display highly heterogeneous and plastic states in glioblastoma, an incurable brain tumour. However, how these malignant states arise and whether they follow a tractable cellular trajectory across tumours is poorly understood. Here, we generated a deep single cell and spatial multi-omic atlas of human glioblastoma that pairs transcriptomic, epigenomic and genomic profiling of 12 tumours across multiple regions. We identify that glioblastoma heterogeneity is driven by spatially-patterned transitions of cancer cells from developmental-like states towards those defined by a glial injury response and hypoxia. This cancer cell trajectory regionalizes tumours into distinct tissue niches and manifests in a conserved manner across tumours as well as genetically distinct tumour subclones. Moreover, using a new deep learning framework to jointly map cancer cell states and clones in situ, we show that tumour subclones are finely spatially intermixed through glioblastoma tissue niches. Finally, we show that this cancer cell trajectory is intimately linked to myeloid heterogeneity and unfolds across regionalised myeloid signalling environments. Our findings define a stereotyped trajectory of cancer cells in glioblastoma and unify glioblastoma tumour heterogeneity into a tractable cellular and tissue framework.	Illumina NovaSeq 6000	61
EGAD00001015519	Single-cell RNA-seq (Smart-Seq2) from 3 distant biopsies (costal, mediastinal, diaphragmatic) of a pleural mesothelioma case.	Illumina HiSeq 2000	384
EGAD00001015526	GBM-Space: Joint Transcriptome and Chromatin Accessibility Profiling of Glioblastoma (10x Genomics - Multiome)	Illumina NovaSeq 6000	1
EGAD00001015527	Cancer cells display heterogeneous and dynamic states in glioblastoma, but how these malignant states arise and whether they follow a tractable cellular trajectory across tumours is poorly understood. Here, we generate a deep single cell and spatial multi-region atlas of 12 isocitrate dehydrogenase wild-type (IDH-wt) primary glioblastomas that integrates transcriptomic, epigenomic and genomic analysis to comprehensively characterise their tumour heterogeneity. The datasets in this study include sequencing data from Visium spatial transcriptomic (10x Genomics) profiling of these tumours.	Illumina NovaSeq 6000	1
EGAD00001015533	This dataset contains low-coverage whole-genome sequencing (lcWGS) data generated from three GIAB reference samples (HG001–HG003) for method benchmarking and concordance analysis in FFPE-GWAS workflows. The data were generated using Illumina NovaSeq with paired-end 150bp reads and downsampled to 2x coverage.	Illumina NovaSeq 6000	1
EGAD00001015534	TRACERx (TRAcking Cancer Evolution through therapy (Rx)) is a prospective cohort study designed to investigate intratumor heterogeneity (ITH) in relation to clinical outcome, and to determine the clonal nature of driver events and evolutionary processes in early stage non-small cell lung cancer (NSCLC).	Illumina HiSeq 2500 NextSeq 2000	124
EGAD00001015535	Fragmentomic features of cell-free DNA represent promising non-invasive biomarkers for cancer diagnosis. However, a lack of systematic evaluation of biases in feature quantification has hindered the adoption of such applications. We compared features derived from whole-genome sequencing of ten healthy donors using nine library kits and ten data-processing routes, and validated them in 1,182 plasma samples from published studies. Our results clarify the variations resulting from library preparation and feature quantification methods. We designed the Trim Align Pipeline and the cfDNAPro R package as unified interfaces for data pre-processing, feature extraction, and visualisation, aiming to standardise multimodal feature engineering and integration for machine learning.	Illumina NovaSeq 6000	82
EGAD00001015536	Nanopore Sequencing Data	PromethION	2
EGAD00001015537	Whole exome, RNA sequencing and TCR sequencing of organoid samples derived from TRACERx patients. The dataset also includes whole exome sequencing from the tumour samples from which the organoids were generated, as well as whole exome sequencing from PDX derived from said tumours.	Illumina HiSeq 2000 Illumina HiSeq 4000 Illumina NovaSeq 6000	125
EGAD00001015538	The dataset for “Validation of cfDNA fragmentome analyses for early detection of liver cancer” includes 380 cram files from whole genome next-generation sequencing on the Illumina NovaSeq 6000 with 10x coverage. The samples analyzed include plasma samples from patients with cancer and liver disease.	Illumina NovaSeq 6000	380
EGAD00001015539	UCL Lung PCA - Whole exome multiregion sequencing data from human preinvasive airway biopsies	Illumina NovaSeq 6000	12
EGAD00001015542	This dataset contains transcriptome sequencing for 12 samples from atypical teratoid rhabdoid tumors (ATRTs). The sequencing was performed on Illumina HiSeq 4000. The sequencing was always paired.	Illumina HiSeq 4000	12
EGAD00001015543	H3K4me3, IgG, and Input ChIP-seq in overexpression of pLV Control, CS-FL and CS-ΔEx4 in SW1116 cells	Illumina HiSeq 2000	6
EGAD00001015544	Study describing the differentiation trajectory of atypical teratoid rhabdoid tumors (ATRTs) at single-cell level. This study describes how the differentiation trajectory differs per subtype of ATRTs, and how those findings could contribute to develop so-called "maturation therapy" tailored towards ATRTs.	Illumina NovaSeq X	16
EGAD00001015545	Study describing the differentiation trajectory...	Illumina NovaSeq 6000	6
EGAD00001015546	Study describing the differentiation trajectory...	Illumina NovaSeq 6000	6
EGAD00001015547	Transcriptomic data generated by RNA sequencing of 30 adults with acute leukemia of ambiguous lineage (ALAL)	Illumina NovaSeq 6000	6
EGAD00001015548	HCA Organoids | Colon - Cancer, Whole Exome Sequencing (WES)	Illumina NovaSeq 6000	23
EGAD00001015583	Genome and transcriptome sequence data from a malignant peripheral nerve sheath tumor patient, generated as part of the BC Cancer Agency's Personalized OncoGenomics (POG) study		1
EGAD00001015590	Mutations that occur during human ageing in the cell lineages of sperm or eggs have the potential to be transmitted to offspring. In males, positive selection of driver mutations during spermatogenesis is known to increase the birth prevalence of certain developmental disorders and cancer predisposition syndromes. However, direct observation of the scope of this selection in sperm has been limited by the error rates of sequencing technologies. Using the duplex sequencing method known as NanoSeq, we sequenced bulk sperm samples from individuals aged 24 to 75 years. Our findings revealed a linear accumulation of mutations per year and mutational signatures consistent with pedigree studies. Deep targeted and deep exome NanoSeq of sperm samples identified genes subject to significant positive selection in the male germline, which are linked to diverse cellular pathways including RAS/MAPK and BMP signaling. Strikingly, nearly all positively selected genes are associated with developmental or cancer predisposition disorders in children. We find that this effect drives an elevated risk of known likely disease-causing mutations in sperm across a wide age range. This implies that the disorders attributed to these genes likely exhibit elevated birth prevalence in human populations, particularly among older fathers. These findings shed light on the dynamics of germline mutations and highlight an increased disease risk for children born to fathers of advanced age.	Illumina NovaSeq 6000	1
EGAD00001015591	Mutations that occur during human ageing in the cell lineages of sperm or eggs have the potential to be transmitted to offspring. In males, positive selection of driver mutations during spermatogenesis is known to increase the birth prevalence of certain developmental disorders and cancer predisposition syndromes. However, direct observation of the scope of this selection in sperm has been limited by the error rates of sequencing technologies. Using the duplex sequencing method known as NanoSeq, we sequenced bulk sperm samples from individuals aged 24 to 75 years. Our findings revealed a linear accumulation of mutations per year and mutational signatures consistent with pedigree studies. Deep targeted and deep exome NanoSeq of sperm samples identified genes subject to significant positive selection in the male germline, which are linked to diverse cellular pathways including RAS/MAPK and BMP signaling. Strikingly, nearly all positively selected genes are associated with developmental or cancer predisposition disorders in children. We find that this effect drives an elevated risk of known likely disease-causing mutations in sperm across a wide age range. This implies that the disorders attributed to these genes likely exhibit elevated birth prevalence in human populations, particularly among older fathers. These findings shed light on the dynamics of germline mutations and highlight an increased disease risk for children born to fathers of advanced age.	Illumina NovaSeq 6000	11
EGAD00001015592	Mutations that occur during human ageing in the cell lineages of sperm or eggs have the potential to be transmitted to offspring. In males, positive selection of driver mutations during spermatogenesis is known to increase the birth prevalence of certain developmental disorders and cancer predisposition syndromes. However, direct observation of the scope of this selection in sperm has been limited by the error rates of sequencing technologies. Using the duplex sequencing method known as NanoSeq, we sequenced bulk sperm samples from individuals aged 24 to 75 years. Our findings revealed a linear accumulation of mutations per year and mutational signatures consistent with pedigree studies. Deep targeted and deep exome NanoSeq of sperm samples identified genes subject to significant positive selection in the male germline, which are linked to diverse cellular pathways including RAS/MAPK and BMP signaling. Strikingly, nearly all positively selected genes are associated with developmental or cancer predisposition disorders in children. We find that this effect drives an elevated risk of known likely disease-causing mutations in sperm across a wide age range. This implies that the disorders attributed to these genes likely exhibit elevated birth prevalence in human populations, particularly among older fathers. These findings shed light on the dynamics of germline mutations and highlight an increased disease risk for children born to fathers of advanced age.	Illumina NovaSeq 6000	1
EGAD00001015593	The dataset consists of a single sample whereby the genomic DNA was analysed using PacBio WGS at 30x coverage to assess the breakpoint sites of complex chromosomal rearrangements.	PacBio RS II	1
EGAD00001015598	Extensive characterization of mutational signature SBS7a in B-cell precursor acute lymphoblastic leukemia (BCP-ALL) samples. We aimed to describe the presentation of SBS7a in BCP-ALL, identify other pediatric cancers presenting with SBS7a, look into the link of SBS7a and UV-light in the context of BCP-ALL and cancers with provable UV exposure, and pinpoint when SBS7a was acquired.		1
EGAD00001015599	Extensive characterization of mutational signature SBS7a in B-cell precursor acute lymphoblastic leukemia (BCP-ALL) samples. We aimed to describe the presentation of SBS7a in BCP-ALL, identify other pediatric cancers presenting with SBS7a, look into the link of SBS7a and UV-light in the context of BCP-ALL and cancers with provable UV exposure, and pinpoint when SBS7a was acquired.		1
EGAD00001015600	We characterized B-cell precursor acute lymphoblastic leukemia (BCP-ALL) patients that present with the mutational signature SBS7a. We found clear subtype specificity, but presence of SBS7a did not seem linked to heterogeneity within subtypes. To further distinguish the incidence of SBS7a in pediatric cancer we analyzed a pan-cancer WGS cohort and identified frequent SBS7a in anaplastic large cell lymphomas. As SBS7a is commonly found in skin cancers and has been linked to UV-light induced DNA damage, we compared the features of UV-light induced DNA damage in skin cancer and SBS7a in ALL, and found high similarity. We found no defects in UV damage response pathways in SBS7a-positive samples, indicating no additional UV vulnerability. Additionally, transcriptomic data showed no differences between samples with and without SBS7a. Finally we looked into the moment of exposure to the source of SBS7a by deep sequencing of diagnosis-relapse pairs and single cell whole genome sequencing of diagnosis samples with high SBS7a.	Illumina NovaSeq 6000	22
EGAD00001015601	Extensive characterization of mutational signature SBS7a in B-cell precursor acute lymphoblastic leukemia (BCP-ALL) samples. We aimed to describe the presentation of SBS7a in BCP-ALL, identify other pediatric cancers presenting with SBS7a, look into the link of SBS7a and UV-light in the context of BCP-ALL and cancers with provable UV exposure, and pinpoint when SBS7a was acquired.	Illumina NovaSeq 6000	2
EGAD00001015602	We provide a single-nuclei RNA-sequencing (snRNA-seq) dataset derived from four COVID-19 patients, generated using the 10x Chromium Next GEM Single Cell v3.1 kit. For our study, these data were integrated with 12 publicly available sc/snRNA-seq datasets, comprising organ donor lung samples (n=89) and COVID-19 lung tissue samples (n=51).	Illumina NovaSeq 6000	75
EGAD00001015603	Bam files are from matched normal + "very early" plasma (diagnostic) samples, sequenced on Illumina NovaSeq 6000 instrument. After bwa alignment, duplicates were removed by picard markduplicate, and base quality scores recalibrated using gatk baserecalibrator + apply bqsr.	Illumina NovaSeq 6000	14
EGAD00001015606	The dataset consists of whole exome and genome sequencing data from infertile males and their parent(s) and/or sibling(s)	Illumina NovaSeq 6000 Illumina NovaSeq X Plus	131
EGAD00001015608	This dataset contains WES and RNASeq data of 31 CDS and ES tumor and control samples. Sequencing was performed on Illumina NovaSeq 6000. The sequencing was performed paired in bulk.	Illumina HiSeq 4000 Illumina NovaSeq 6000	31
EGAD00001015609	Mapping of runs back to samples for PDN, PDO and TIS snRNASeq data. 4 sets of matched PDN, PDO and TIS samples (12 unique samples total) underwent pooled snRNASeq (Parse Biosciences). 2 sublibraries were created from this and sequenced at separate times	Illumina NovaSeq X	2
EGAD00001015611	This study leverages PacBio HiFi long-read sequencing technology to comprehensively analyze the genomic architecture of rare diseases. By generating highly accurate, long-read sequences, we aim to resolve complex structural variants, repeat expansions, and other genomic features that are often missed by short-read sequencing. Our approach facilitates improved variant detection and interpretation, advancing the understanding of rare disease etiology and enabling more precise genetic diagnoses.	Sequel IIe	10
EGAD00001015613	Formalin fixed, paraffin-embedded human breast cancer tumor samples had RNA extracted using the Thermo Scientific KingFisher Flex instrument and the Applied Biosystems MagMAX FFPE DNA/RNA Ultra Kit; Total RNA libraries were created using the Bravo Automated Liquid-Handling Platform and the TruSeq Stranded Total RNA Library Prep Gold Kit; libraries were sequenced on the Illumina NovaSeq 6000 machine using a 2x50 bp paired-end configuration to target a read depth of 120 million clusters per library.	Illumina NovaSeq 6000	328
EGAD00001015614	This dataset contains transcriptome and whole genome bisulfite sequencing for 40 samples with glioblastoma (GBM). The sequencing was performed on Illumina NextSeq 2000 and Illumina NextSeq 550. The sequencing was always paired.	NextSeq 2000 NextSeq 550	40
EGAD00001015615	Illumina platform sequencing of whole genome libraries prepared from normal and cancer samples and whole transcriptome libraries from cancer samples from 45 donors	Illumina NovaSeq 6000	135
EGAD00001015618	Targeted NanoSeq in buccal epithelium obtained longitudinally from a number of donors. A custom-designed panel of ~200 genes will be used to achieve high depth coverage (aiming for ~1000X at panel sites). As we age, many tissues become colonised by microscopic clones carrying somatic driver mutations. Some of these clones represent a first step towards cancer whereas others may contribute to ageing and other diseases. However, our understanding of the clonal landscapes of human tissues, and their impact on cancer risk, ageing and disease, remains limited due to the challenge of detecting somatic mutations present in small numbers of cells. Here, we introduce a new version of nanorate sequencing (NanoSeq), a duplex sequencing method with error rates of less than 5 per billion base pairs, which is compatible with whole-exome and targeted gene sequencing. Deep sequencing of polyclonal samples with single-molecule sensitivity enables the simultaneous detection of mutations in large numbers of clones, yielding accurate somatic mutation rates, mutational signatures and driver mutation frequencies in any tissue. Applying targeted NanoSeq to 1,042 non-invasive samples of oral epithelium and 371 samples of blood from a twin cohort, we found an unprecedentedly rich landscape of selection, with 46 genes under positive selection driving clonal expansions in the oral epithelium, over 62,000 driver mutations, and evidence of negative selection in some genes. The high number of positively selected mutations in multiple genes provides high-resolution maps of selection across coding and non-coding sites, a form of in vivo saturation mutagenesis. Multivariate regression models enable mutational epidemiology studies on how carcinogenic exposures and cancer risk factors, such as age, tobacco or alcohol, alter the acquisition and selection of somatic mutations. Accurate single-molecule sequencing has the potential to unveil the polyclonal landscape of any tissue, providing a powerful tool to study early carcinogenesis, cancer prevention and the role of somatic mutations in ageing and disease.	Illumina NovaSeq 6000	1
EGAD00001015619	Utilising Targeted NanoSeq, we hope to explore the driver landscape in unprecedented detail in donors from TwinsUK. These donors have paired buccal swabs and data from this will help contextualise that work. As we age, many tissues become colonised by microscopic clones carrying somatic driver mutations. Some of these clones represent a first step towards cancer whereas others may contribute to ageing and other diseases. However, our understanding of the clonal landscapes of human tissues, and their impact on cancer risk, ageing and disease, remains limited due to the challenge of detecting somatic mutations present in small numbers of cells. Here, we introduce a new version of nanorate sequencing (NanoSeq), a duplex sequencing method with error rates of less than 5 per billion base pairs, which is compatible with whole-exome and targeted gene sequencing. Deep sequencing of polyclonal samples with single-molecule sensitivity enables the simultaneous detection of mutations in large numbers of clones, yielding accurate somatic mutation rates, mutational signatures and driver mutation frequencies in any tissue. Applying targeted NanoSeq to 1,042 non-invasive samples of oral epithelium and 371 samples of blood from a twin cohort, we found an unprecedentedly rich landscape of selection, with 46 genes under positive selection driving clonal expansions in the oral epithelium, over 62,000 driver mutations, and evidence of negative selection in some genes. The high number of positively selected mutations in multiple genes provides high-resolution maps of selection across coding and non-coding sites, a form of in vivo saturation mutagenesis. Multivariate regression models enable mutational epidemiology studies on how carcinogenic exposures and cancer risk factors, such as age, tobacco or alcohol, alter the acquisition and selection of somatic mutations. 
Accurate single-molecule sequencing has the potential to unveil the polyclonal landscape of any tissue, providing a powerful tool to study early carcinogenesis, cancer prevention and the role of somatic mutations in ageing and disease.	Illumina NovaSeq 6000	1
EGAD00001015620	Exome NanoSeq in buccal swab samples obtained from patients with a wide range of clinical phenotypes. As we age, many tissues become colonised by microscopic clones carrying somatic driver mutations. Some of these clones represent a first step towards cancer whereas others may contribute to ageing and other diseases. However, our understanding of the clonal landscapes of human tissues, and their impact on cancer risk, ageing and disease, remains limited due to the challenge of detecting somatic mutations present in small numbers of cells. Here, we introduce a new version of nanorate sequencing (NanoSeq), a duplex sequencing method with error rates of less than 5 per billion base pairs, which is compatible with whole-exome and targeted gene sequencing. Deep sequencing of polyclonal samples with single-molecule sensitivity enables the simultaneous detection of mutations in large numbers of clones, yielding accurate somatic mutation rates, mutational signatures and driver mutation frequencies in any tissue. Applying targeted NanoSeq to 1,042 non-invasive samples of oral epithelium and 371 samples of blood from a twin cohort, we found an unprecedentedly rich landscape of selection, with 46 genes under positive selection driving clonal expansions in the oral epithelium, over 62,000 driver mutations, and evidence of negative selection in some genes. The high number of positively selected mutations in multiple genes provides high-resolution maps of selection across coding and non-coding sites, a form of in vivo saturation mutagenesis. Multivariate regression models enable mutational epidemiology studies on how carcinogenic exposures and cancer risk factors, such as age, tobacco or alcohol, alter the acquisition and selection of somatic mutations. 
Accurate single-molecule sequencing has the potential to unveil the polyclonal landscape of any tissue, providing a powerful tool to study early carcinogenesis, cancer prevention and the role of somatic mutations in ageing and disease.	Illumina NovaSeq 6000	1
EGAD00001015621	Utilising restriction enzyme NanoSeq, we hope to explore the somatic mutation burden and mutational signatures in buccal cells from TwinsUK donors. Some donors have paired blood samples and data from this will help contextualise that work. All donors have paired Targeted NanoSeq data from buccal swabs. As we age, many tissues become colonised by microscopic clones carrying somatic driver mutations. Some of these clones represent a first step towards cancer whereas others may contribute to ageing and other diseases. However, our understanding of the clonal landscapes of human tissues, and their impact on cancer risk, ageing and disease, remains limited due to the challenge of detecting somatic mutations present in small numbers of cells. Here, we introduce a new version of nanorate sequencing (NanoSeq), a duplex sequencing method with error rates of less than 5 per billion base pairs, which is compatible with whole-exome and targeted gene sequencing. Deep sequencing of polyclonal samples with single-molecule sensitivity enables the simultaneous detection of mutations in large numbers of clones, yielding accurate somatic mutation rates, mutational signatures and driver mutation frequencies in any tissue. Applying targeted NanoSeq to 1,042 non-invasive samples of oral epithelium and 371 samples of blood from a twin cohort, we found an unprecedentedly rich landscape of selection, with 46 genes under positive selection driving clonal expansions in the oral epithelium, over 62,000 driver mutations, and evidence of negative selection in some genes. The high number of positively selected mutations in multiple genes provides high-resolution maps of selection across coding and non-coding sites, a form of in vivo saturation mutagenesis. Multivariate regression models enable mutational epidemiology studies on how carcinogenic exposures and cancer risk factors, such as age, tobacco or alcohol, alter the acquisition and selection of somatic mutations. 
Accurate single-molecule sequencing has the potential to unveil the polyclonal landscape of any tissue, providing a powerful tool to study early carcinogenesis, cancer prevention and the role of somatic mutations in ageing and disease.	Illumina NovaSeq 6000	1
EGAD00001015622	Development of a targeted methylation assay to determine the cell-type composition of a sample. As we age, many tissues become colonised by microscopic clones carrying somatic driver mutations. Some of these clones represent a first step towards cancer whereas others may contribute to ageing and other diseases. However, our understanding of the clonal landscapes of human tissues, and their impact on cancer risk, ageing and disease, remains limited due to the challenge of detecting somatic mutations present in small numbers of cells. Here, we introduce a new version of nanorate sequencing (NanoSeq), a duplex sequencing method with error rates of less than 5 per billion base pairs, which is compatible with whole-exome and targeted gene sequencing. Deep sequencing of polyclonal samples with single-molecule sensitivity enables the simultaneous detection of mutations in large numbers of clones, yielding accurate somatic mutation rates, mutational signatures and driver mutation frequencies in any tissue. Applying targeted NanoSeq to 1,042 non-invasive samples of oral epithelium and 371 samples of blood from a twin cohort, we found an unprecedentedly rich landscape of selection, with 46 genes under positive selection driving clonal expansions in the oral epithelium, over 62,000 driver mutations, and evidence of negative selection in some genes. The high number of positively selected mutations in multiple genes provides high-resolution maps of selection across coding and non-coding sites, a form of in vivo saturation mutagenesis. Multivariate regression models enable mutational epidemiology studies on how carcinogenic exposures and cancer risk factors, such as age, tobacco or alcohol, alter the acquisition and selection of somatic mutations. 
Accurate single-molecule sequencing has the potential to unveil the polyclonal landscape of any tissue, providing a powerful tool to study early carcinogenesis, cancer prevention and the role of somatic mutations in ageing and disease.	Illumina NovaSeq 6000	1
EGAD00001015623	Development of a targeted methylation assay to determine the cell-type composition of a sample. As we age, many tissues become colonised by microscopic clones carrying somatic driver mutations. Some of these clones represent a first step towards cancer whereas others may contribute to ageing and other diseases. However, our understanding of the clonal landscapes of human tissues, and their impact on cancer risk, ageing and disease, remains limited due to the challenge of detecting somatic mutations present in small numbers of cells. Here, we introduce a new version of nanorate sequencing (NanoSeq), a duplex sequencing method with error rates of less than 5 per billion base pairs, which is compatible with whole-exome and targeted gene sequencing. Deep sequencing of polyclonal samples with single-molecule sensitivity enables the simultaneous detection of mutations in large numbers of clones, yielding accurate somatic mutation rates, mutational signatures and driver mutation frequencies in any tissue. Applying targeted NanoSeq to 1,042 non-invasive samples of oral epithelium and 371 samples of blood from a twin cohort, we found an unprecedentedly rich landscape of selection, with 46 genes under positive selection driving clonal expansions in the oral epithelium, over 62,000 driver mutations, and evidence of negative selection in some genes. The high number of positively selected mutations in multiple genes provides high-resolution maps of selection across coding and non-coding sites, a form of in vivo saturation mutagenesis. Multivariate regression models enable mutational epidemiology studies on how carcinogenic exposures and cancer risk factors, such as age, tobacco or alcohol, alter the acquisition and selection of somatic mutations. 
Accurate single-molecule sequencing has the potential to unveil the polyclonal landscape of any tissue, providing a powerful tool to study early carcinogenesis, cancer prevention and the role of somatic mutations in ageing and disease.	Illumina NovaSeq 6000	1
EGAD00001015624	Bottleneck sequencing of human tissue including neurons, cord blood, sperm This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/. As we age, many tissues become colonised by microscopic clones carrying somatic driver mutations. Some of these clones represent a first step towards cancer whereas others may contribute to ageing and other diseases. However, our understanding of the clonal landscapes of human tissues, and their impact on cancer risk, ageing and disease, remains limited due to the challenge of detecting somatic mutations present in small numbers of cells. Here, we introduce a new version of nanorate sequencing (NanoSeq), a duplex sequencing method with error rates of less than 5 per billion base pairs, which is compatible with whole-exome and targeted gene sequencing. Deep sequencing of polyclonal samples with single-molecule sensitivity enables the simultaneous detection of mutations in large numbers of clones, yielding accurate somatic mutation rates, mutational signatures and driver mutation frequencies in any tissue. Applying targeted NanoSeq to 1,042 non-invasive samples of oral epithelium and 371 samples of blood from a twin cohort, we found an unprecedentedly rich landscape of selection, with 46 genes under positive selection driving clonal expansions in the oral epithelium, over 62,000 driver mutations, and evidence of negative selection in some genes. The high number of positively selected mutations in multiple genes provides high-resolution maps of selection across coding and non-coding sites, a form of in vivo saturation mutagenesis. 
Multivariate regression models enable mutational epidemiology studies on how carcinogenic exposures and cancer risk factors, such as age, tobacco or alcohol, alter the acquisition and selection of somatic mutations. Accurate single-molecule sequencing has the potential to unveil the polyclonal landscape of any tissue, providing a powerful tool to study early carcinogenesis, cancer prevention and the role of somatic mutations in ageing and disease.	Illumina NovaSeq 6000	32
EGAD00001015625	Samples not in other datasets for the paper titled "consHLA: a Next Generation Sequencing Consensus-based HLA Typing Workflow"	Illumina NovaSeq 6000	8
EGAD00001015628	Dataset for Manuscript: Cancer genome standards for long-read sequencing using cancer cell line mixtures	PromethION	21
EGAD00001015629	RNA sequencing of 44 AML cases with diverse 3q26 rearrangements. This dataset is complementary to EGAD00001000726 (which only contains t(3;3)/inv(3)) and to EGAD00001006123.	Illumina HiSeq 2500 Illumina NovaSeq 6000	1
EGAD00001015631	This dataset contains raw single-cell RNA sequencing (scRNA-seq) data for five new kidney allograft nephrectomy samples used to construct the described in the following study: https://www.biorxiv.org/content/10.1101/2022.10.28.514222v2.full. The full description of processing is described in the manuscript. Briefly, samples were digested into a single cell suspension and run according to the 10X Genomics Chromium 5’ single cell kit (v2). Sequencing was performed using an Illumina Novaseq instrument. Samples were designated as TXN2 (EGAN00002519570, mycotic pseudoaneurysm, control dataset), TXN3 (EGAN00002235628, renal vein thrombus, control dataset), TXN5 (EGAN00002519573, chronic transplant rejection), TXN6 (EGAN00002519572, chronic pyelonephritis and renal artery stenosis, non-alloimmune graft injury dataset), TXN7 (EGAN00002519576, recurrent focal segmental glomerulosclerosis, non-alloimmune graft injury dataset).	Illumina HiSeq 4000 Illumina NovaSeq 6000	1
EGAD00001015632	This dataset contains the FASTQ files from the Single-Cell GEx+TCR 10X Genomics data from the study "Spatiotemporal T-cell tracking for personalized T-cell receptor T-cell therapy designs in childhood cancer" (EGAS00001008174)	Illumina NovaSeq 6000	19
EGAD00001015634	This dataset comprises single-cell RNA sequencing (scRNA-seq) and immune receptor (TCR/BCR) profiling of peripheral blood mononuclear cells (PBMCs) collected from individuals infected with dengue virus (DENV) exhibiting different clinical severities: asymptomatic dengue (AD), dengue fever (DF), and dengue hemorrhagic fever (DHF). PBMCs were collected during the acute phase and, in some cases, during convalescence. The data provide a resource for understanding protective versus pathogenic immune responses in dengue infection.	Illumina NovaSeq 6000	1
EGAD00001015635	Molecular characterization of 41 tumors from 17 individuals with CMMRD to gain a better understanding of the mutational processes driving subsequent tumor development. The molecular characterization includes the investigation of tumor mutational load and mutational signatures.		1
EGAD00001015636	This dataset provides the first high-coverage (~30X) whole-genome sequencing (WGS) data from Sudan, encompassing 125 individuals from five ethnolinguistic groups: Copts, Beja, Fulani, Fur, and Mahas. These populations represent three major linguistic families: Afro-Asiatic, Nilo-Saharan, and Niger-Congo, offering a detailed view of the country’s genetic diversity.	HiSeq X Ten	1
EGAD00001015637	This dataset contains single-cell RNA sequencing (Smart-seq2) and paired TCR repertoire data of sorted dengue virus (DENV)-specific CD8⁺ T cells from individuals with differing clinical outcomes: asymptomatic dengue (AD), dengue fever (DF), and dengue hemorrhagic fever (DHF). Samples were obtained from HLA-A11⁺ and HLA-A24⁺ individuals during acute and convalescent phases, with antigen-specific CD8⁺ T cells isolated using HLA tetramers loaded with immunodominant DENV epitopes. The study identifies phenotypically distinct T cell subsets associated with disease severity, TCR avidity, and antigen reactivity to current or prior DENV serotypes. The dataset supports the understanding of CD8⁺ T cell heterogeneity, TCR features, and potential mechanisms of CD8⁺ T cell-mediated protection versus immunopathogenesis in dengue infection.	Illumina HiSeq 4000	1
EGAD00001015638	WXS dataset of Oncogenic and immunological targets for matched therapy of pediatric blood cancer patients: Dutch iTHER study experience		1
EGAD00001015639	RNA-seq dataset of Oncogenic and immunological targets for matched therapy of pediatric blood cancer patients: Dutch iTHER study experience		1
EGAD00001015640	Whole genome sequencing of 288 single-cell-derived blood colonies from 3 individuals with splicing factor-mutated clonal hematopoiesis.	Illumina NovaSeq 6000	1
EGAD00001015641	Cherry angiomas are common benign blood vessel growths on the skin. In this study we characterise the mutational profile of a cherry angioma and adjacent skin biopsied from a patient with multiple lesions with segmental presentation.		1
EGAD00001015642	This dataset contains transcriptome sequencing for 15 samples from chordoma cells. The sequencing was performed on Illumina HiSeq 4000 and Illumina NovaSeq 6000. The sequencing was always paired.	Illumina HiSeq 4000 Illumina NovaSeq 6000	15
EGAD00001015644	This dataset contains FASTQ files from pooled single-cell RNA sequencing (scRNA-seq) of neural stem cells (NSCs) derived from induced pluripotent stem cells (iPSCs). The iPSCs were reprogrammed using the non-integrative Sendai virus system (CytoTune-iPS 2.0), introducing OCT3/4, SOX2, KLF4, and cMYC. Donor somatic cells included keratinocytes, PBMCs, and lymphoblastoid cell lines from individuals with ADHD and age-matched healthy controls (6–18 years). iPSCs underwent standard quality control, including immunocytochemistry, qRT-PCR, mycoplasma testing, and SNP genotyping. NSCs were produced via neural induction and validated by qRT-PCR and immunostaining for PAX6, SOX2, NESTIN, TUJ1, and FOXG1. Sequencing was performed on the Illumina NovaSeq 6000.	Illumina NovaSeq 6000	2
EGAD00001015645	Detection of human brain cancers using genomic and immune cell characterization of cerebrospinal fluid through CSF-BAM	Illumina NovaSeq 6000	754
EGAD00001015647	RNA-seq dataset of Neoantigen Peptides derived from V(D)J-recombined Immunoglobulins Drive Outgrowth of Cytolytic CD8+ T-cells		1
EGAD00001015648	Single cell transcriptomics study of thymic transplant biopsies. Allogeneic thymus transplantation is the only curative therapy for complete DiGeorge Syndrome (cDGS), a rare severe primary immunodeficiency characterised by athymia. GOSH is one of only two centres worldwide to offer this treatment. Despite a lack of major histocompatibility complex (MHC)-matching between donor and host, transplanted thymus becomes repopulated by recipient bone marrow derived precursor cells and supports development of functional T-cells. The mechanisms underlying thymopoiesis in this context are poorly understood, but over time we observe reconstitution of T-cell immunity, with the ability to produce host naïve T-cells showing a broad T-cell receptor (TCR) repertoire and to generate MHC-restricted T-cell proliferative responses. Although lifesaving, the achieved immunological reconstitution is typically not complete with circulating T-cell numbers usually remaining below the age related normal ranges. Additionally, we observe persistence of donor-derived T-cells of unknown clinical significance. To gain more insight into the mechanisms by which MHC-mismatched transplanted thymus supports T-cell development with self-tolerance, as well as into the basis of suboptimal T-cell immunity, we now aim to investigate immune reconstitution after thymus transplantation in further detail by using single-cell transcriptomics, applied to thymic transplant biopsies and peripheral blood samples collected during standard post-transplant patient care. By identifying which lineages of host-derived cells repopulate the thymic tissue after transplantation, we will be able to address the role of MHC in positive and negative T-cell selection during T-cell differentiation. We will also be able to clarify the exact ontogeny of the persistent donor T-cells, as well as their possible role. 
Understanding the mechanisms of action of HLA-mismatched transplanted thymus will contribute to treatment optimisation. Additionally, our research provides a unique opportunity to further investigate key immunological concepts, such as tolerance and autoimmunity, challenging existing paradigms in thymus immunology. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/. This dataset contains all the data available for this study on 2025-07-22.	Illumina NovaSeq 6000	1
EGAD00001015659	Spatial transcriptomics study of thymic transplant biopsies. This dataset contains all the data available for this study on 2025-07-28.	Illumina NovaSeq 6000	1
EGAD00001015660	Single cell RNA sequencing on thymic cells from children with syndromic diseases such as Trisomy 21 and inflammatory/autoimmune diseases such as myasthenia gravis. Our group has previously performed atlasing of the healthy human thymus using single cell transcriptomics and obtained an unprecedented understanding of human thymus throughout development. In this study, we intend to survey diseased human thymus and compare this to healthy to obtain a better understanding of thymic dysfunction in human diseases. The diseases we will survey include congenital conditions such as Down's syndrome, CHARGE syndrome, DiGeorge syndrome as well as acquired diseases such as myasthenia gravis and thymic hyperplasia. We intend to perform genetic and genomic study on the samples, including expression analysis. This will not only help us to understand the cells and molecular pathways affected in dysfunctional thymus; but may also lead to generalisable understandings in human genetic diseases and inflammatory/autoimmune diseases. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/. This dataset contains all the data available for this study on 2025-07-28.	Illumina NovaSeq 6000	1
EGAD00001015661	A total of 54 samples derived from 9 patients and frozen at 6 timepoints (T0 - immediately after resection, T1 - 10 minutes after resection, T2 - 20 minutes after resection, T3 - 30 minutes after resection, T4 - 45 minutes after resection, T5 - 60 minutes after resection) to assess the impact of ischemia on gene expression.	HiSeq X Ten	54
EGAD00001015662	Sequencing data of leukemic samples included in study. Including WES, WGS and targeted sequencing files from patients at diagnosis, remission and at relapse. The dataset includes samples from 52 pediatric patients, totaling 311 samples.	Illumina HiSeq X Illumina MiSeq Illumina NovaSeq 6000 NextSeq 500	302
EGAD00001015663	Dual snRNAseq and ATACseq of replicate PBMC samples for 10X beta testing. This dataset contains all the data available for this study on 2025-07-28.	Illumina NovaSeq 6000	1
EGAD00001015664	Engineered cartilage: deriving design principles from human developmental pathways. This dataset contains all the data available for this study on 2025-07-30.	Illumina NovaSeq 6000	1
EGAD00001015665	This study involves atlasing the development of the postnatal gut nervous system in order to elucidate the pathogenic mechanisms of Hirschsprung disease. This dataset contains all the data available for this study on 2025-07-31.	Illumina NovaSeq 6000	1
EGAD00001015666	Spatial transcriptome analysis of the human heart. This dataset contains all the data available for this study on 2025-07-31.	Illumina NovaSeq 6000	1
EGAD00001015667	This study involves atlasing the development of the postnatal gut nervous system in order to elucidate the pathogenic mechanisms of Hirschsprung disease. This dataset contains all the data available for this study on 2025-07-31.	Illumina NovaSeq 6000	1
EGAD00001015668	Cell Atlas of the diseased human heart. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/. This dataset contains all the data available for this study on 2025-07-31.	Illumina NovaSeq 6000	1
EGAD00001015669	Lymphomas are types of blood cancer derived from lymphocytes, and can be classified into two main categories, Hodgkin's lymphomas (HL) and non-Hodgkin lymphomas (NHL). T-cell lymphoma is a type of NHL that develops from T lymphocytes and accounts for 10-15% of total NHL cases. As a rare disease, T-cell lymphoma is functionally, pathologically, and clinically complex and heterogeneous. The 2016 revision of the WHO classification has listed 29 subtypes of mature T/NK-cell neoplasms, among which over 25 are T lymphocyte in origin. These subtypes of T-cell lymphomas have different biological behaviours and clinical prognosis. Clinically, T-cell lymphoma ranges from indolent (slow-growing) to aggressive (fast-growing and spreading) disease. According to the guideline booklet provided by the Lymphoma Research Foundation (https://lymphoma.org/aboutlymphoma/nhl/), indolent T-cell lymphoma subtypes include adult T-cell leukaemia/lymphoma (ATLL), cutaneous T-cell lymphoma (CTCL), and mycosis fungoides (MF), while the aggressive subtypes include anaplastic large cell lymphoma (ALCL), angioimmunoblastic T-cell lymphoma (AITL), peripheral T-cell lymphoma (PTCL), and T lymphoblastic lymphoma (TLBL). PTCL and ALCL represent the most common subtypes of aggressive T-cell lymphoma whereas TLBL is a rare and aggressive subtype mostly diagnosed in children. In this study, we will use single cell RNA sequencing and single cell multiomic methods to study three aggressive T-cell lymphoma subtypes (ALCL, PTCL and TLBL) and try to gain a deeper insight into the genotypic and phenotypic features of these T-cell lymphomas. This dataset contains all the data available for this study on 2025-07-31.	Illumina NovaSeq 6000	1
EGAD00001015670	Transcriptomic analysis of skeletal muscle, to understand the mechanisms of muscle ageing. This dataset contains all the data available for this study on 2025-07-31.	Illumina NovaSeq 6000	1
EGAD00001015673	Refractory cancers may arise either through the acquisition of resistance mechanisms or represent distinct disease states. The origin of childhood T-cell acute lymphoblastic leukaemia (T-ALL) that does not respond to initial treatment, i.e. refractory disease, is unknown. Refractory T-ALL carries a poor prognosis and cannot be predicted at diagnosis. Here, we perform single cell mRNA sequencing of T-ALL from 58 children (84 samples) who did, or did not respond to initial treatment. We identify a transcriptionally distinctive blast population, exhibiting features of innate-like lymphocytes, as the major source of refractory disease. Evidence of such blasts at diagnosis heralds refractory disease across independent datasets and is associated with survival in a large, contemporary trial cohort. Our findings portray refractory T-ALL as a distinct disease with the potential for immediate clinical utility. This dataset contains 56 single-cell RNA sequencing data files with corresponding single-cell T cell receptor sequencing data files. The data here form the validation cohort in our paper.	Illumina NovaSeq 6000	1
EGAD00001015674	This dataset contains optical genome mapping (OGM) data from 56 samples, processed for structural variation and copy number analysis. The data includes .xmap, .cmap, and .smap files representing high-resolution genome maps and structural variant calls. These samples were used to identify complex rearrangements and reconstruct genome karyotypes.		1
EGAD00001015675	Two samples derived from a colorectal cancer female patient - tumor tissue and normal adjacent tissue analyzed via Visium Spatial Gene Expression.	HiSeq X Ten	2
EGAD00001015678	Paired-end RNA sequencing data from 33 metastatic breast cancer samples. Sequencing was performed on the Illumina NovaSeq 6000 using NEBNext Single Cell and Low Input RNA Library Prep Kit for Illumina.	Illumina NovaSeq 6000	1
EGAD00001015679	Single-cell profiling of healthy adult volunteers that were inoculated with SARS-CoV-2. This dataset contains all the data available for this study on 2025-08-11.	Illumina NovaSeq 6000	1
EGAD00001015680	Dataset for Synchronous Endometrial and Ovarian Cancer	NextSeq 2000	54
EGAD00001015681	We have a case of two siblings (twins) who were born with nodular blue rashes over their body. The first twin had a large periorbital subcutaneous mass which was diagnosed as infantile undifferentiated sarcoma. This baby was too systemically unwell for chemotherapy and died at 12 days of age. The second sibling received two cycles of emergency chemotherapy but died due to disease progression and secondary haemorrhage at 2 months of age. Post-mortem normal and tumour samples have been taken. In addition to placental tissue and parental germline DNA, the goal will be to investigate intra-uterine transfer of sarcoma and the phylogenetic relationship of tumours in both twins.	Illumina NovaSeq 6000	10
EGAD00001015682	This dataset contains paired tumor and matched normal whole-genome sequencing (WGS) data from longitudinally collected samples of patients enrolled in an Australian clinical study involving multiple cancer types. For each patient, one or more tumor biopsies were obtained at different treatment time points (Scr, Day 0, Day 15, Day 22, Day 33), and matched normal samples were collected either from blood or non-tumor tissue.	Illumina NovaSeq 6000	56
EGAD00001015683	Between January 1st, 2016, and December 31st, 2021, patients with glioblastoma IDH-wt (CNS WHO grade 4) were included. RNA-seq was performed in a subset of patients.	Illumina NovaSeq 6000	58
EGAD00001015684	CUTseq data for study EGAS00001008261	NextSeq 2000	87
EGAD00001015687	Exome data for study EGAS00001008261	NextSeq 2000	9
EGAD00001015688	This dataset contains transcriptome sequencing for 16 samples of forebrain organoids from Lissencephaly patients. The sequencing was performed on Illumina NovaSeq 6000. The sequencing was always paired.	Illumina NovaSeq 6000	10
EGAD00001015689	Transposable elements (TEs), once regarded as parasitic genomic remnants, are now recognized as key regulators of gene expression and genome evolution, yet the functional specificity of individual TE subfamilies remains largely unexplored. This dataset investigates the transcriptional consequences of targeted repression of MER57E3 and LTR10B2 elements using CRISPR interference (CRISPRi) in human induced pluripotent stem cells (hiPSCs). hiPSCs expressing CRISPRi machinery (n = 2 biological replicates) were transduced with guide RNAs targeting individual or grouped copies of MER57E3 and LTR10B2, as well as the ZNF678 promoter or a lacZ non-targeting control. Transduced cells were subsequently differentiated into neural progenitor cells (NPCs), and total RNA was extracted for mRNA library preparation and sequencing. The dataset comprises 24 single-end mRNA-seq FASTQ files generated from these NPCs and wild-type controls.	NextSeq 2000	24
EGAD00001015692	Biopsies from the terminal ileum of healthy individuals and of Crohn's disease patients are collected and processed as single cells for single-cell RNA-sequencing using 10X Genomics 3' v3 and v3.1 and Illumina sequencing. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/	Illumina HiSeq 4000 Illumina NovaSeq 6000	2
EGAD00001015695	RNA-Seq of 11 primary pancreatic neuroendocrine tumors mutated in MEN1/DAXX/ATRX used to identify molecular subtypes	Illumina NovaSeq 6000 Illumina NovaSeq X	3
EGAD00001015697	WGS files for paper titled "Preclinical Pediatric Molecular Analysis for Therapy Choice (MATCH)"	Illumina HiSeq 2000	664
EGAD00001015698	This dataset contains bulk RNA-seq and single-cell RNA-seq (scRNA-seq) of bone marrow obtained from 2 patients diagnosed with severe congenital neutropenia (SCN) attributed to LCP1 mutations. This is the first dataset of its kind. Moreover, the dataset also contains scRNA-seq of bone marrow derived from a healthy control.	Illumina NovaSeq 6000	4
EGAD00001015700	Genotyping data of BMI model markers generated using Global Screening Array (Illumina Inc.)		1
EGAD00001015701	This dataset contains whole genome sequencing data for 148 samples, transcriptome sequencing for 66 samples and exome sequencing for 123 samples from pediatric solid tumors. Sequencing was performed on Illumina NovaSeq 6000 and Illumina HiSeq 4000. The sequencing was paired.	Illumina HiSeq 4000 Illumina NovaSeq 6000	337
EGAD00001015702	This dataset contains transcriptome sequencing for 5 samples of MCC tumors. The sequencing was performed on Illumina NovaSeq 6000. The sequencing was always paired.	Illumina NovaSeq 6000	1
EGAD00001015703	This raw data contains 5' scRNA-seq with TCR enrichment data from 18 individual patients analysing cells taken from tumour-involved lymph nodes (LN), malignant seromas and patient-derived xenografts (PDX) including systemic/nodal ALK+ ALCL (n=3 PDX), BIA-ALCL (n=2 seromas), PTCL-NOS (n=5 nodal tumours), nodal T follicular helper lymphoma - angioimmunoblastic type (nTFHL-AI) (n=1), nTFHL, not otherwise specified (nTFHL, NOS, n=1), TLBL (n=5) and 1 case of Sezary Syndrome (SS, cutaneous tumour). Droplet-based 5’ scRNA-seq with TCR enrichment (10x Chromium platform) was performed for all live cells, or the viable CD45+CD3- and CD45+CD3+ fractions following cell sorting by flow cytometry.	Illumina NovaSeq 6000	1
EGAD00001015704	10x Visium section (lesional skin - atopic dermatitis) used to map location of skin fibroblasts.	Illumina NovaSeq 6000	1
EGAD00001015705	Aim: to comprehensively characterize the cellular and transcriptional landscape of pediatric pilocytic astrocytomas across anatomical tumor locations	Illumina NovaSeq 6000	6
EGAD00001015706	This dataset contains 9 RNA sequencing samples of hiPSC-derived forebrain organoid ARHGAP11A knockout and hiPSC-derived forebrain organoid control cells. Sequencing was performed on Illumina NovaSeq 6000. The sequencing was always paired.	Illumina NovaSeq 6000	9
EGAD00001015707	This study investigates high-risk rhabdomyosarcoma (RMS) using multiple single-cell and spatial genomic technologies. We generated and analysed single-cell and single-nucleus RNA-sequencing, chromatin accessibility, and spatial transcriptomics data from primary tumours and validation samples. These datasets characterise cellular diversity within rhabdomyosarcoma and identify cell states associated with aggressive disease. The data support research into tumour biology, risk stratification, and therapeutic target discovery. This repository houses the whole-genome sequencing of RMS tumoroids data. This dataset contains all the data available for this study on 2025-09-30.	Illumina NovaSeq 6000	1
EGAD00001015708	This study investigates high-risk rhabdomyosarcoma (RMS) using multiple single-cell and spatial genomic technologies. We generated and analysed single-cell and single-nucleus RNA-sequencing, chromatin accessibility, and spatial transcriptomics data from primary tumours and validation samples. These datasets characterise cellular diversity within rhabdomyosarcoma and identify cell states associated with aggressive disease. The data support research into tumour biology, risk stratification, and therapeutic target discovery. This repository houses the single-cell RNA sequencing of RMS tumouroids data. This dataset contains all the data available for this study on 2025-09-30.	Illumina NovaSeq 6000	1
EGAD00001015709	This study investigates high-risk rhabdomyosarcoma (RMS) using multiple single-cell and spatial genomic technologies. We generated and analysed single-cell and single-nucleus RNA-sequencing, chromatin accessibility, and spatial transcriptomics data from primary tumours and validation samples. These datasets characterise cellular diversity within rhabdomyosarcoma and identify cell states associated with aggressive disease. The data support research into tumour biology, risk stratification, and therapeutic target discovery. This repository houses the whole-genome sequencing of RMS tumours data. This dataset contains all the data available for this study on 2025-09-30.	Illumina NovaSeq 6000	1
EGAD00001015710	This study investigates high-risk rhabdomyosarcoma (RMS) using multiple single-cell and spatial genomic technologies. We generated and analysed single-cell and single-nucleus RNA-sequencing, chromatin accessibility, and spatial transcriptomics data from primary tumours and validation samples. These datasets characterise cellular diversity within rhabdomyosarcoma and identify cell states associated with aggressive disease. The data support research into tumour biology, risk stratification, and therapeutic target discovery. This repository houses the single-cell RNA sequencing of RMS tumours data. This dataset contains all the data available for this study on 2025-09-30.	Illumina NovaSeq 6000	1
EGAD00001015711	This study investigates high-risk rhabdomyosarcoma (RMS) using multiple single-cell and spatial genomic technologies. We generated and analysed single-cell and single-nucleus RNA-sequencing, chromatin accessibility, and spatial transcriptomics data from primary tumours and validation samples. These datasets characterise cellular diversity within rhabdomyosarcoma and identify cell states associated with aggressive disease. The data support research into tumour biology, risk stratification, and therapeutic target discovery. This repository houses the single-cell ATAC sequencing of RMS tumours data. This dataset contains all the data available for this study on 2025-09-30.	Illumina NovaSeq 6000	1
EGAD00001015712	This study investigates high-risk rhabdomyosarcoma (RMS) using multiple single-cell and spatial genomic technologies. We generated and analysed single-cell and single-nucleus RNA-sequencing, chromatin accessibility, and spatial transcriptomics data from primary tumours and validation samples. These datasets characterise cellular diversity within rhabdomyosarcoma and identify cell states associated with aggressive disease. The data support research into tumour biology, risk stratification, and therapeutic target discovery. This repository houses the whole-genome sequencing of RMS tumours data. This dataset contains all the data available for this study on 2025-09-30.	HiSeq X Ten Illumina HiSeq 4000 Illumina NovaSeq 6000	1
EGAD00001015713	This study investigates high-risk rhabdomyosarcoma (RMS) using multiple single-cell and spatial genomic technologies. We generated and analysed single-cell and single-nucleus RNA-sequencing, chromatin accessibility, and spatial transcriptomics data from primary tumours and validation samples. These datasets characterise cellular diversity within rhabdomyosarcoma and identify cell states associated with aggressive disease. The data support research into tumour biology, risk stratification, and therapeutic target discovery. This repository houses the single-cell RNA sequencing of RMS tumours data. This dataset contains all the data available for this study on 2025-09-30.	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000	1
EGAD00001015714	This study investigates high-risk rhabdomyosarcoma (RMS) using multiple single-cell and spatial genomic technologies. We generated and analysed single-cell and single-nucleus RNA-sequencing, chromatin accessibility, and spatial transcriptomics data from primary tumours and validation samples. These datasets characterise cellular diversity within rhabdomyosarcoma and identify cell states associated with aggressive disease. The data support research into tumour biology, risk stratification, and therapeutic target discovery. This repository houses the bulk RNA sequencing of RMS tumours data. This dataset contains all the data available for this study on 2025-09-30.	Illumina NovaSeq 6000	1
EGAD00001015715	This study investigates high-risk rhabdomyosarcoma (RMS) using multiple single-cell and spatial genomic technologies. We generated and analysed single-cell and single-nucleus RNA-sequencing, chromatin accessibility, and spatial transcriptomics data from primary tumours and validation samples. These datasets characterise cellular diversity within rhabdomyosarcoma and identify cell states associated with aggressive disease. The data support research into tumour biology, risk stratification, and therapeutic target discovery. This repository houses the single-cell ATAC sequencing of RMS tumours data. This dataset contains all the data available for this study on 2025-09-30.	Illumina NovaSeq 6000	1
EGAD00001015716	This study investigates high-risk rhabdomyosarcoma (RMS) using multiple single-cell and spatial genomic technologies. We generated and analysed single-cell and single-nucleus RNA-sequencing, chromatin accessibility, and spatial transcriptomics data from primary tumours and validation samples. These datasets characterise cellular diversity within rhabdomyosarcoma and identify cell states associated with aggressive disease. The data support research into tumour biology, risk stratification, and therapeutic target discovery. This repository houses the whole-genome sequencing of RMS tumours data. This dataset contains all the data available for this study on 2025-09-30.	Illumina NovaSeq 6000	1
EGAD00001015717	This study investigates high-risk rhabdomyosarcoma (RMS) using multiple single-cell and spatial genomic technologies. We generated and analysed single-cell and single-nucleus RNA-sequencing, chromatin accessibility, and spatial transcriptomics data from primary tumours and validation samples. These datasets characterise cellular diversity within rhabdomyosarcoma and identify cell states associated with aggressive disease. The data support research into tumour biology, risk stratification, and therapeutic target discovery. This repository houses the single-cell RNA sequencing of RMS tumours data. This dataset contains all the data available for this study on 2025-09-30.	Illumina NovaSeq 6000	1
EGAD00001015718	This study investigates high-risk rhabdomyosarcoma (RMS) using multiple single-cell and spatial genomic technologies. We generated and analysed single-cell and single-nucleus RNA-sequencing, chromatin accessibility, and spatial transcriptomics data from primary tumours and validation samples. These datasets characterise cellular diversity within rhabdomyosarcoma and identify cell states associated with aggressive disease. The data support research into tumour biology, risk stratification, and therapeutic target discovery. This repository houses the bulk RNA sequencing of RMS tumours data. This dataset contains all the data available for this study on 2025-09-30.	Illumina NovaSeq 6000	1
EGAD00001015719	This study investigates high-risk rhabdomyosarcoma (RMS) using multiple single-cell and spatial genomic technologies. We generated and analysed single-cell and single-nucleus RNA-sequencing, chromatin accessibility, and spatial transcriptomics data from primary tumours and validation samples. These datasets characterise cellular diversity within rhabdomyosarcoma and identify cell states associated with aggressive disease. The data support research into tumour biology, risk stratification, and therapeutic target discovery. This repository houses the single-cell ATAC sequencing of RMS tumours data. This dataset contains all the data available for this study on 2025-09-30.	Illumina NovaSeq 6000	1
EGAD00001015720	T cells are central to adaptive immunity and thus crucial for understanding and treating human disease. While extensively studied in model organisms, translating this knowledge to humans can sometimes be limited by their rapid cross-species evolution and diverse . This dataset contains all the data available for this study on 2026-03-10.	Illumina NovaSeq 6000	1
EGAD00001015721	Single cell transcriptomics study of thymic transplant biopsies. This dataset contains all the data available for this study on 2025-10-02.	Illumina NovaSeq 6000	1
EGAD00001015722	Single cell sequencing will be carried out by multiome profiling (RNAseq and ATACseq). Additional modalities may be examined. Spatial profiling may involve Visium, Curio and other technologies. This data set will feed into a larger analysis of the human lungs over development, that aims to detail all the cell types of the human lungs and airways and will extend current knowledge by providing chromatin accessibility data that will allow us to link GWAS data and identify regulatory networks and then compare these against fetal and adult data sets. This dataset contains all the data available for this study on 2025-10-02.	Illumina NovaSeq 6000	22
EGAD00001015723	Single cell sequencing will be carried out by multiome profiling (RNAseq and ATACseq). Additional modalities may be examined. Spatial profiling may involve Visium, Curio and other technologies. This data set will feed into a larger analysis of the human lungs over development, that aims to detail all the cell types of the human lungs and airways and will extend current knowledge by providing chromatin accessibility data that will allow us to link GWAS data and identify regulatory networks and then compare these against fetal and adult data sets. This dataset contains all the data available for this study on 2025-10-02.	Illumina NovaSeq 6000	1
EGAD00001015724	Engineered cartilage: deriving design principles from human developmental pathways. This dataset contains all the data available for this study on 2025-10-02.	Illumina NovaSeq 6000	1
EGAD00001015725	Leveraging single-cell sequencing technologies to shed light on the immune aetiology of chronic inflammatory demyelinating polyradiculoneuropathy (CIDP). This dataset contains all the data available for this study on 2025-10-02.	Illumina NovaSeq 6000	1
EGAD00001015727	This dataset contains 75 WGS samples of T-ALL (including matched remission samples where available).		1
EGAD00001015728	1230 participants from both South Africa and The Democratic Republic of Congo underwent whole exome sequencing on an Illumina NovaSeq 6000 at Wellcome Trust Sanger Institute.		1230
EGAD00001015737	Knowledge about abnormal organ development is important to understand pathology and to develop novel treatment approaches for individuals with congenital and acquired disease. Most of our current understanding is based on examination of tissues from the embryo and early foetus, collected from women undergoing termination of pregnancy in the first trimester (first third) of pregnancy. There is very little known about normal and abnormal organ development from a developmental perspective during the crucial last two-thirds of pregnancy when much remodelling of foetal tissues occurs. This study will generate a single-cell atlas of late-foetal lungs, blood, heart, bone and immune organs. This dataset contains all the data available for this study on 2025-10-14.	Illumina NovaSeq 6000	27
EGAD00001015738	Samples for this project are taken from adult human hearts which have been preserved using 2 different methods (cold static storage on ice, and hypothermic perfusion). Samples have been acquired at various time points during post-preservation normothermic perfusion. The aim of the project is to assess the transcriptional changes caused by these differing methods of preservation. This dataset contains all the data available for this study on 2025-10-14.	Illumina NovaSeq 6000	16
EGAD00001015739	Animal studies have demonstrated that resident memory T (Trm) cells provide enhanced protective responses to a broad array of tissue-tropic pathogens, thus making Trm cells promising targets for novel vaccination strategies. However, the biological pathways that enable the long-term survival of Trm cells are poorly understood in humans. Here, we will employ a unique human intestinal transplantation setting that allows us to study the retention of persistent T cells in the grafts and the temporal development of resident T-cell populations from recruited recipient T cells. We will integrate high resolution transcriptomics, epigenetics, proteomics and immune repertoire single-cell data from purified intestinal T cells. These single-cell multiomics approaches will uncover the diversity and differentiation of the T cell populations in the human intestine, allowing us to temporally resolve the generation and maintenance of gut resident T cells. This dataset contains all the data available for this study on 2025-10-14.	Illumina NovaSeq 6000	40
EGAD00001015750	Clonal haematopoiesis (CH) arises from the expansion of hematopoietic stem cells (HSCs) carrying leukaemia-associated somatic mutations. CH is linked to pathological immune dysregulation and a greater risk of age-related inflammatory diseases. Yet, how CH mutations impact HSC differentiation into immune effector cells remains understudied. Here, we report a single-cell resolution functional and multi-omic investigation of HSC clonal and differentiation dynamics in individuals with DNMT3A-R882 CH. DNMT3A-R882 reshapes the clonal architecture of haematopoiesis towards an aged phylogenetic structure. Functionally, DNMT3A-R882 HSCs produce decreased monocytic output but more abundant and mature neutrophil progeny compared to WT HSCs in the same individual. Whereas DNMT3A-R882 myeloid progenitors display attenuated inflammatory transcriptional programmes, DNMT3A-R882 mature neutrophils acquire proinflammatory and immunomodulatory features typical of maladaptive immunity and CH co-morbidities. Our findings, validated in humanised mice, identify aberrant DNMT3A-R882 HSC-driven neutropoiesis as a key link between CH, immune dysregulation and risk of inflammatory disease.	Illumina NovaSeq 6000	1
EGAD00001015751	Human Cell Atlas - WGS Adult Heart. This dataset contains all the data available for this study on 2025-10-16.	Illumina NovaSeq 6000	1
EGAD00001015752	Phenotype data for 1230 participants from both South Africa and The Democratic Republic of Congo		1230
EGAD00001015753	Paired fastqs for DNA and RNA from Illumina TSO500 targeted sequencing from PDX.	Illumina NovaSeq X	16
EGAD00001015754	Raw sgRNA sequencing data from a pilot CRISPR screening library to test the dual guide vector system (8,914 guide pairs), and from a large scale genetic interaction screen (~100,136 guide pairs) in HT-29 cells. This data refers to Burgold et al. Nature Communications 2025.	Illumina HiSeq 2500 Illumina MiSeq	1
EGAD00001015755	Acral melanoma, which is not ultraviolet (UV)-associated, is the most common type of melanoma in several low- and middle-income countries including Mexico. Latin American samples are significantly underrepresented in global cancer genomics studies, which directly affects patients in these regions as it is known that cancer risk and incidence may be influenced by ancestry and environmental exposures. To address this, we characterise the genome and transcriptome of 123 acral melanoma tumours from 92 Mexican patients, a population notable because of its genetic admixture. Compared with other studies of melanoma, we found fewer frequent mutations in classical driver genes such as BRAF, NRAS or NF1. While most patients had predominantly Amerindian genetic ancestry, those with higher European ancestry had increased frequency of BRAF mutations and a lower median number of structural variants. The tumours with activating BRAF mutations have a transcriptional profile more similar to cutaneous non-volar melanocytes, suggesting that acral melanomas in these patients may arise from a distinct cell of origin compared to other tumours arising in these locations. KIT mutations were found in a subset of these tumours, and quadruple wild-type samples (non BRAF/NRAS/NF1/KIT) differed from mutated samples in their structural genomic profile and overall and recurrence-free survival patterns. Transcriptional profiling defined three expression clusters; these characteristics were associated with recurrence-free and overall survival. We highlight potential novel low-frequency drivers, such as PTPRJ, NF2 and RDH5. Our study enhances knowledge of this understudied disease and underscores the importance of including samples from diverse ancestries in cancer genomics studies.	Illumina HiSeq 4000	1
EGAD00001015756	Acral melanoma, which is not ultraviolet (UV)-associated, is the most common type of melanoma in several low- and middle-income countries including Mexico. Latin American samples are significantly underrepresented in global cancer genomics studies, which directly affects patients in these regions as it is known that cancer risk and incidence may be influenced by ancestry and environmental exposures. To address this, we characterise the genome and transcriptome of 123 acral melanoma tumours from 92 Mexican patients, a population notable because of its genetic admixture. Compared with other studies of melanoma, we found fewer frequent mutations in classical driver genes such as BRAF, NRAS or NF1. While most patients had predominantly Amerindian genetic ancestry, those with higher European ancestry had increased frequency of BRAF mutations and a lower median number of structural variants. The tumours with activating BRAF mutations have a transcriptional profile more similar to cutaneous non-volar melanocytes, suggesting that acral melanomas in these patients may arise from a distinct cell of origin compared to other tumours arising in these locations. KIT mutations were found in a subset of these tumours, and quadruple wild-type samples (non BRAF/NRAS/NF1/KIT) differed from mutated samples in their structural genomic profile and overall and recurrence-free survival patterns. Transcriptional profiling defined three expression clusters; these characteristics were associated with recurrence-free and overall survival. We highlight potential novel low-frequency drivers, such as PTPRJ, NF2 and RDH5. Our study enhances knowledge of this understudied disease and underscores the importance of including samples from diverse ancestries in cancer genomics studies.	Illumina HiSeq 4000	1
EGAD00001015757	This dataset contains whole exome sequencing data for 343 samples collected from 118 male donors, including 66 Alzheimer disease patients and 52 controls. DNA for sequencing was extracted from FACS-isolated CD4 T cells, NK cells and myeloid cells (monocytes or granulocytes).	Illumina NovaSeq 6000	153
EGAD00001015763	PEACE (Posthumous Evaluation of Advanced Cancer Environment) whole exome sequencing data	Illumina HiSeq 4000	529
EGAD00001015768	Multiple myeloma (MM) is consistently preceded by monoclonal gammopathy of undetermined significance (MGUS) and smoldering multiple myeloma (SMM). While these precursor conditions are asymptomatic, they are not entirely benign and carry a lifelong risk of progression to MM. Unlike other cancers defined by pathology, malignant transformation from MGUS or SMM to MM has so far relied on demonstration of clinical end-organ damage as morphology and cytogenetics cannot reliably distinguish them. In this study, using genomic data from 374 patients with MGUS or SMM (277 training, 97 validation), to our knowledge, we demonstrate for the first time the ability to identify malignant transformation in MGUS and SMM. We introduce the concept of genomic MM and genomic MGUS to differentiate the subsets of MGUS and SMM that are biologically malignant with genomic features indistinguishable from MM from the subset that is premalignant and unlikely to progress to malignancy. Importantly, we find that most SMM has biological features of malignant transformation indistinguishable from MM. As expected, this subset that we consider having genomic MM is associated with a high risk of progression to MM although some patients remained progression-free beyond 5 years. Conversely, 60% of MGUS and 10% of SMM have no evidence of malignant transformation (genomic MGUS), with no progression during follow-up. Integration of genomic features with the 2/20/20 International Myeloma Working Group model significantly improved the prediction of progression among genomic MM. These findings support the use of genomic criteria to refine the classification and the risk stratification in myeloma precursor conditions.	Illumina NovaSeq 6000	1
EGAD00001015769	This dataset contains single-cell RNA sequencing data from two samples: L1 (breast cancer metastasis to lung) and N3 (adjacent normal lung tissue) from a female patient. The samples were processed using 10X Chromium 3' technology and sequenced on Illumina NextSeq 500. This data supports the study "Endocrine therapy reprogramming of breast cancer facilitates metastatic escape via upregulation of P-Rex1/Rac1 signalling" which investigates how endocrine therapy reprograms breast cancer cells, leading to upregulation of P-Rex1/Rac1 signalling pathways that facilitate metastatic escape. The single-cell analysis reveals cellular reprogramming mechanisms in endocrine therapy-resistant breast cancer cells that form metastases in the lung.	NextSeq 500	2
EGAD00001015770	Mechanisms of clonal evolution in myeloid neoplasms remain incompletely understood. Darwinian theory predicts that the (micro)environment of clone-propagating stem cells may contribute to clonal selection. Here, we provide data fitting this model, establishing a relationship between stromal niche inflammation, inflammatory stress in HSPCs, clonal resistance and leukemic evolution in human MDS.	Illumina NovaSeq 6000	1
EGAD00001015771	To profile transcriptional alterations of HSPCs in LR-MDS patients, 7AAD−CD45+Lin-CD235a−CD34+ cells were isolated from healthy donors and LR-MDS patients for RNA sequencing.	Illumina NovaSeq 6000	1
EGAD00001015780	Heterogeneity is a hallmark of clear cell renal cell carcinoma (ccRCC). In this study, hyperpolarized [1-13C]pyruvate MRI (HP-13C-MRI) was used to probe metabolic heterogeneity non-invasively in 6 ccRCC patients from whom 58 tumor and healthy tissue biopsies were acquired postoperatively. MRI parameters were correlated with 146 metabolite concentrations and the expression of 2523 metabolic genes across 34 metabolic pathways. Using metabologram projections and pathway consensus scoring, several metabolic pathways were correlated with the metabolic MRI, including glycolysis, pentose phosphate pathway, and the TCA cycle. A simple thresholding approach applied to the imaging metrics was sufficient to provide meaningful metabolic pathway information. We also show that tissue metabolic heterogeneity increases in regions with higher pyruvate-to-lactate conversion on imaging. These results provide validation of the biological relevance of HP-13C-MRI and support the possibility of its clinical use to improve disease characterization, stratification, targeting of biopsies, and providing novel methods to monitor treatment response.	Illumina HiSeq 4000	1
EGAD00001015781	Testing the differences between FFPE and fresh frozen derived single cell results.	Illumina NovaSeq 6000 Illumina NovaSeq X Plus	1
EGAD00001015783	Thyroid hormones are essential for health, but human thyroid development and cellular heterogeneity are poorly understood. We generated a high-resolution, spatiotemporal transcriptional atlas of human fetal thyroid, and examined its perturbations in trisomy 21 (T21), where aberrant thyroid development is associated with congenital hypothyroidism, and in papillary thyroid carcinoma (PTC). Normal tissue revealed two functional follicular cell states (fTFC1, fTFC2) with divergent PAX8 expression levels, persisting into adulthood. Additionally, the rare C-cell population was comprehensively profiled at single-cell resolution for the first time. T21 fetal thyroid was hypoplastic, with disrupted follicular morphology and altered expression of genes mediating cellular topology and extracellular matrix interactions. Finally, the transcriptional signals of both fetal follicular cell states were found to be perturbed in PTC. Overall, we defined a reference atlas for normal human fetal thyrocyte heterogeneity and provided novel insights into thyroid disorders.	Illumina NovaSeq 6000	1
EGAD00001015795	This dataset contains Chromium single-cell RNA-seq data and demultiplexing support for the PCA Atlas study EGAS00001008332. Included data objects: - 16 Chromium scRNA-seq FASTQ runs (EGAR accessions) for captures PCa1–PCa16. - 16 aligned BAM analyses (one Cell Ranger possorted_genome_bam.bam per capture). - 16 capture-level genotype VCF analyses derived from the Axiom UK Biobank array, used as demultiplexing panels. - demux_map.csv (EGAZ00001945520): donor–capture–sample mapping for genotype-based demultiplexing. - chromium.csv (EGAZ00001945537): run-level mapping linking captures, EGA samples, runs (EGAR), experiments (EGAX) and BAM analyses. Together these objects provide a complete view of the Chromium scRNA-seq data and the genotype-based demultiplexing support needed to reproduce donor assignment and downstream analyses. See also https://zenodo.org/records/17372603	Illumina NovaSeq 6000	16
EGAD00001015796	This dataset contains Visium spatial transcriptomics data for the PCA Atlas study EGAS00001008332. It includes: - Raw Visium FASTQ runs (EGAR00004172272–EGAR00004172283) for all libraries. - spaceranger alignment BAM analyses for those libraries where BAM output was generated. - visium.csv: a slide-level mapping table linking Visium slide samples to EGA sample accessions, sequencing runs/experiments and spaceranger BAM analyses. - visium_tma.csv: a TMA mapping table describing how mini-TMA slides relate slide-level Visium libraries to tissue-level samples/cores. Together these objects provide the raw and aligned Visium data and the technical/biological mappings needed to understand and reuse the spatial transcriptomics component of the PCA Atlas. High resolution histology images are available from the following Zenodo repository, along with additional spaceranger outs for each slide: - https://zenodo.org/records/17411292	Illumina NovaSeq 6000	4
EGAD00001015797	This study involves atlasing the development of the postnatal gut nervous system in order to elucidate the pathogenic mechanisms of Hirschsprung disease. This dataset contains all the data available for this study on 2026-01-15.	Illumina NovaSeq 6000	1
EGAD00001015798	Spatial transcriptome analysis of thymic cells from children with syndromic diseases such as Trisomy 21 and inflammatory/autoimmune diseases such as myasthenia gravis. This dataset contains all the data available for this study on 2026-01-19.	Illumina NovaSeq 6000	1
EGAD00001015799	This dataset contains phenotypic metadata from participants in the BIOCLOCK endurance exercise intervention and epigenetic aging study. It includes intervention status, cardiorespiratory fitness (VO2 max), body composition, estimated leukocyte proportions, and other relevant traits linked to epigenetic aging outcomes.		1
EGAD00001015800	WGS files for paper titled "Patient-derived pediatric brain tumor organoids faithfully recapitulate primary tumors"	Illumina HiSeq 2000	62
EGAD00001015801	WXS files for paper titled "Patient-derived pediatric brain tumor organoids faithfully recapitulate primary tumors"	Illumina HiSeq 2000	32
EGAD00001015802	RNASeq files for paper titled "Patient-derived pediatric brain tumor organoids faithfully recapitulate primary tumors"	Illumina HiSeq 2000	60
EGAD00001015811	The dataset includes short-read whole-genome sequencing (SR-WGS) data, SR-RNAseq data, OGM (Bionano) data, LR-WGS data, Epigenetics (RBSS) data and deep-WES data generated within the project Solve-RD from biosamples submitted by the group chud-faivre. The cohort consists of 111 individuals recruited as part of Solve-RD cohorts 2-4 and aligns with ERN-ITHACA related disease entities (for a description of the Solve-RD cohorts see www.solve-rd.eu/fact-sheet-solve-rd-cohorts).	DNBSEQ-G400 DNBSEQ-T7 Illumina NovaSeq 6000	1
EGAD00001015815	10ng from each cfDNA sample was used as the input material. For gDNA samples, 1.5-2ug each was first sonicated by Covaris R230 system (Woburn, MA). Next, size-selection with 0.8x-1.5x two-step Ampure XP beads was performed to enrich the 150-200bp fragments. 200ng of the size-selected gDNA was used as input material. The pre-capture library construction was performed with NEB UltraII DNA library prep kit for Illumina (Ipswich, MA) and the adapter was the methylated EM-seq adapter. Before the first PCR amplification, the gDNA samples were subjected to bisulfite conversion by QIAGEN Epitect kit, and cfDNA samples were subjected to enzymatic conversion by NEBNext enzymatic methyl-seq kit if the hybrid capture panel designed for converted DNA was used downstream. Twist panel hybrid capture was performed with their standard protocol (Twist Bioscience, South San Francisco, CA). The concentration of the post-capture library after the second PCR amplification was measured by Qubit 1xdsDNA HS kit (ThermoFisher, Waltham, MA). Their quality was examined by TBE-UREA PAGE and Bioanalyzer before sequencing. Libraries were then sequenced with 150bp paired-end reads on Illumina machines by Genewiz, Inc. (South Plainfield, NJ, USA). This dataset contains MethylScan data of the plasma samples from 248 noncancer individuals and 27 cancer patients (17 liver and 10 GI-tract cancers).	Illumina NovaSeq X	1
EGAD00001015817	Rituximab, a CD20+ B cell depletion therapy, is frequently used in the treatment of systemic lupus erythematosus (SLE). However, variability in patient response highlights the need for a deeper understanding of the underlying immune cell dynamics of B cell depletion and repopulation. In this study, we conducted longitudinal single-cell profiling of nine SLE patients treated with rituximab from pretreatment to up to 15 months post-treatment. These were compared to eight healthy controls. We profiled PBMCs via 10X Genomics single-cell RNA, surface protein (CITE-seq), B cell receptor (BCR), and T cell receptor (TCR) sequencing and sequenced bulk BCR repertoires in parallel. For single cell sequencing, 10 pools were created with an equal number of cells from 4 samples each. Samples collected at different timepoints from the same patient were distributed across different pools to allow demultiplexing of individuals using genotypes captured from the scRNA-seq data. Libraries were pooled using a ratio of GEX:CITE:TCR:BCR=9:2:1:1 and were sequenced across two lanes of the NovaSeq 6000. For bulk BCR sequencing, individual samples were demultiplexed using primer barcodes and paired-end BCR amplicon reads were merged prior to submission; data are provided as unmapped single-end reads.	Illumina MiSeq Illumina NovaSeq 6000	33
EGAD00010000050	Matched tumor-negative pancreas tissues	Affymetrix SNP 6.0	15
EGAD00010000051	Cell line derived from microdissected primary pancreatic ductal adenocarcinoma tissues	Affymetrix SNP 6.0	15
EGAD00010000052	Monozygotic twins that are discordant for schizophrenia (Genotyping)	CompleteGenomics build 1.4.2.8 - CG Build 1.4.2.8	36
EGAD00010000096	DBA case samples using 250K Nsp	Affymetrix_250K(Nsp) - gtype	27
EGAD00010000124	Psoriasis cases as part of WTCCC2 phase 2	Illumina_670k - Illuminus	2622
EGAD00010000130	Cerebellar ataxia, mental retardation, and disequilibrium syndrome (CAMRQ) samples	Illumina 300 Duo V2 - Bead Studio	2
EGAD00010000144	Healthy volunteer collection of European Ancestry	Illumina OmniExpress v1.0 - Illumina GenomeStudio	288
EGAD00010000148	tumour samples using Affymetrix Genome-Wide SNP6.0 arrays	Affymetrix_GenomeWide_SNP6.34	104
EGAD00010000150	WTCCC2 project samples from Ankylosing spondylitis Cohort	Illumina_670k - Illuminus	2005
EGAD00010000158	Affymetrix 6.0 cel files	Affymetrix SNP 6.0	473
EGAD00010000160	Illumina HT 12 IDATS		-
EGAD00010000162	Illumina HT 12 IDATS	Illumina HT 12	-
EGAD00010000164	Affymetrix 6.0 CEL files	Affymetrix SNP 6.0	-
EGAD00010000202	Case samples (Illumina_660K & Illumina_670K)	Illumina_660K/Illumina_670K	1478
EGAD00010000210	Normalized expression data; discovery set	Illumina HT 12	1
EGAD00010000211	Normalized expression data; validation set	Illumina HT 12	-
EGAD00010000212	Normalized expression data; normals	Illumina HT 12	-
EGAD00010000213	Segmented (CBS) copy number aberrations (CNA); discovery set	Affymetrix SNP 6.0	-
EGAD00010000214	Segmented (CBS) copy number variants (CNV); discovery set	Affymetrix SNP 6.0	-
EGAD00010000215	Segmented (CBS) copy number aberrations (CNA); validation set	Affymetrix SNP 6.0	-
EGAD00010000216	Segmented (CBS) copy number variants (CNV); validation set	Affymetrix SNP 6.0	-
EGAD00010000217	Segmented (HMM) copy number aberrations (CNA); discovery set	Affymetrix SNP 6.0	-
EGAD00010000220	Ovarian & matched normal (Genotypes)	Complete Genomics - CG Build 1.4.2.8	2
EGAD00010000230	WTCCC2 samples from Hypertension Cohort	- Illuminus	2943
EGAD00010000232	WTCCC2 samples from Type 2 Diabetes Cohort	- Illuminus	2975
EGAD00010000234	WTCCC2 samples from 1958 British Birth Cohort	Illumina HumanExome-12v1_A-GenCall, zCall	12241
EGAD00010000236	WTCCC2 samples from Coronary Artery Disease Cohort	- Illuminus, GenoSNP	3125
EGAD00010000238	CLL Expression array	Affymetrix GeneChip Human Genome U133 plus 2.0	64
EGAD00010000246	Coeliac disease cases and control samples. (1958BC samples excluded)	GenoSNP Illumina ImmunoBeadChip - Illuminus	10758
EGAD00010000248	1958BC control samples	GenoSNP Illumina ImmunoBeadChip - Illuminus	6812
EGAD00010000250	NBS control samples	GenoSNP Illumina ImmunoBeadChip - Illuminus	3030
EGAD00010000252	CLL Expression Arrays	Affymetrix U219	137
EGAD00010000254	CLL Methylation Arrays	Illumina HumanMethylation450	165
EGAD00010000260	PNET genotyping	Illumina OmniQuad 2.5 - CNVpartition	77
EGAD00010000262	WTCCC2 project Schizophrenia (SP) samples	Affymetrix 6.0 - CHIAMO	3019
EGAD00010000264	WTCCC2 project samples from Ischaemic Stroke Cohort	Illumina_670k - Illuminus	4205
EGAD00010000266	Metabric breast cancer samples (Genotype raw data)	Affymetrix SNP 6.0	543
EGAD00010000268	Metabric breast cancer samples (Expression raw data)	Illumina HT 12	543
EGAD00010000270	Metabric breast cancer samples (Images)	Aperio image - H&E stained tissue_section	564
EGAD00010000272	Colon tumour samples	Illumina_2.5M	75
EGAD00010000274	Colon matched tumour samples	Illumina_2.5M	74
EGAD00010000276	SCLC tumor genotypes	Illumina_2.5M	56
EGAD00010000278	SCLC matched normal genotypes	Illumina_2.5M	51
EGAD00010000280	CLL Expression array	Affymetrix snp 6.0	4
EGAD00010000282	Pharmacogenomic response to Statins samples (Genotypes/Phenotypes)	Affymetrix 6.0 - CHIAMO	4134
EGAD00010000284	NBS control samples only (Hap300)	Illumina (Various)	2500
EGAD00010000286	All cases and controls (Hap550)		11950
EGAD00010000288	All cases and Finnish, Dutch, Italian control samples (Hap550)		6313
EGAD00010000290	NBS control samples only (Hap550)		2276
EGAD00010000292	All cases and Finnish, Dutch, Italian control samples (Hap300)		10339
EGAD00010000294	1958BC control samples only (Hap300)		2436
EGAD00010000296	1958BC control samples only (Hap550)		2224
EGAD00010000298	All cases and controls (Hap300)		13761
EGAD00010000300	Summary statistics from Haemgen RBC GWAS	Affymetrix Illumina Perlegen	1
EGAD00010000371	Case and control samples (Genotypes)	Infinium_370k - GenomeStudio	170
EGAD00010000377	DNA methylation analysis of 6 primary lymphoma samples	HumanMethylation450k Bead Chip - Genome Studio	6
EGAD00010000379	DNA methylation analysis of 2 peripheral blood samples	HumanMethylation450k Bead Chip - Genome Studio	2
EGAD00010000381	MRCE sample using 300K	Illumina 300K - GenomeStudio	543
EGAD00010000383	MRCA sample using 100K	Illumina 100K - GenomeStudio	1
EGAD00010000385	MRCA sample using 300K	Illumina 300K - GenomeStudio	394
EGAD00010000387	Cambridge control samples using a 1.2M genotyping chip from Illumina	Illumina Human 1.2M Duo custom BeadChips v1 - Genome Studio	188
EGAD00010000389	Cambridge control samples using a 24k expression array from Illumina	Illumina Human-Ref 8 v3.0 expression array	395
EGAD00010000391	Cambridge control samples using a 660K genotyping chip from Illumina	Illumina Human 660K Quad BeadChips - Illuminus	232
EGAD00010000395	Myeloma case sample genotype using Affymetrix SNP6.0	Affymetrix_SNP6	19
EGAD00010000417	Han Chinese samples using Illumina OMNIExpress (cases)	Illumina OMNIExpress	62
EGAD00010000419	Han Chinese samples using Affymetrix (cases)	Affymetrix_6.0	62
EGAD00010000421	Han Chinese samples using Affymetrix (controls)	Affymetrix_6.0	187
EGAD00010000423	Han Chinese samples using Illumina OMNIExpress (controls)	Illumina OMNIExpress	213
EGAD00010000425	Han Chinese samples using Immunochip	HanChinese_Immunochip	192
EGAD00010000427	DNA methylation analysis of 4 peripheral blood samples	HumanMethylation450k Bead Chip - Genome Studio	4
EGAD00010000429	DNA methylation analysis of 4 primary lymphoma samples	HumanMethylation450k Bead Chip - Genome Studio	4
EGAD00010000434	Normalised mRNA expression	Illumina HT 12	1302
EGAD00010000436	Illumina HT 12 IDAT files	Illumina HT 12	1302
EGAD00010000438	Normalized miRNA expression data	Agilent ncRNA 60k	1480
EGAD00010000440	Segmented copy number data	Affymetrix_SNP6_raw	1302
EGAD00010000442	Affymetrix SNP 6.0 CEL files	Affymetrix_SNP6_raw	1302
EGAD00010000444	Agilent ncRNA 60k txt files	Agilent ncRNA 60k	1480
EGAD00010000446	Monocyte Gene Expression	Illumina Human-Ref-8 v3 beadchip	758
EGAD00010000448	Macrophage Gene Expression	Illumina Human-Ref-8 v3 beadchip	758
EGAD00010000450	Genome Wide Genotype Data	Illumina Human Custom 1,2M and Human 610 Quad Custom arrays	758
EGAD00010000452	Chondrosarcoma case sample genotype using Affymetrix SNP6.0	Affymetrix_SNP6	36
EGAD00010000456	Leukemia samples using 450K DNA methylation		800
EGAD00010000458	Controls using 450K DNA methylation		151
EGAD00010000460	GENCORD2 DNA methylation		294
EGAD00010000462	SJLGG Case samples using Gene Expression Array	Affymetrix_U133v2	75
EGAD00010000464	Down syndrome SNP genotyping data	Illumina 550K - Illumina Genome Studio	338
EGAD00010000466	Down syndrome CNV genotyping data	NimbleGen 135K aCGH - NimbleScan	108
EGAD00010000468	Uveal melanoma matched Tumour and blood samples	Illumina HumanOmni2.5	24
EGAD00010000470	CLL Expression Array	GPL570	20
EGAD00010000472	CLL Expression Array	Affymetrix U219	219
EGAD00010000474	blood-based gene expression from breast cancer cases and age-matched controls in case-control series 2 (CC2)	Illumina	98
EGAD00010000476	blood-based gene expression from breast cancer cases and age-matched controls in case-control series 1 (CC1)	Illumina	110
EGAD00010000478	blood-based gene expression from breast cancer cases and age-matched controls in case-control series 3 (CC3)	Illumina	118
EGAD00010000480	ccRCC case samples using 250K Nsp	Affymetrix_250K(Nsp) - gtype	240
EGAD00010000482	ccRCC case samples using methylation array	Illumina Infinium HumanMethylation 450K - GenomeStudio	1
EGAD00010000484	ccRCC control samples using 250K Nsp	Affymetrix_250K(Nsp) - gtype	234
EGAD00010000486	ccRCC case samples using expression array	Agilent Human Whole Genome 4x44k v2 - Feature Extraction	101
EGAD00010000488	Chondroblastoma case sample genotype using Affymetrix SNP6.0	Affymetrix_SNP6-	7
EGAD00010000490	Affymetrix Genome-Wide Human SNP Array 6.0 data	Affymetrix 6.0-	19
EGAD00010000492	Cases_Human660W-Quad_v1_A	Illumina_Human660W-Quad_v1_A-Not supplied	4
EGAD00010000494	Controls_Human660W-Quad_v1_A	Illumina_Human660W-Quad_v1_A-Not supplied	4
EGAD00010000496	Genome-wide SNP genotyping of African rainforest hunter-gatherers and neighbouring agriculturalists	Illumina HumanOmni1-Quad-Illumina GenomeStudio	260
EGAD00010000498	Affymetrix SNP6.0 genotype data for prostate cancer patients	Affymetrix_SNP6-	18
EGAD00010000500	Case samples using U133 Plus 2.0 Array	Affymetrix_U133plus2-	35
EGAD00010000502	Case samples using SNP Array 6.0	Affymetrix_U133plus2-	35
EGAD00010000504	Control samples using SNP Array 6.0	Affymetrix_U133plus2-	35
EGAD00010000506	WTCCC2 BO (Barretts oesophagus) samples	Illumina_670k-Illuminus	1991
EGAD00010000508	Matched control samples using SNP 6.0 Array	GenomeWideSNP_6-BirdseedV2	12
EGAD00010000510	Matched control samples using HumanOmni1-Quad	GenomeWideSNP_6-BirdseedV2	12
EGAD00010000512	Case samples using HumanOmni1-Quad	GenomeWideSNP_6-BirdseedV2	12
EGAD00010000514	Case samples using SNP 6.0 Array	GenomeWideSNP_6-BirdseedV2	12
EGAD00010000516	Samples from the Pomak Villages in Greece, Pomak isolate	HumanExome_12v1.1_A -GenCall, zCall	1046
EGAD00010000518	Samples from the Greek island of Crete, MANOLIS cohort	HumanExome_12v1.1_A -GenCall, zCall	1280
EGAD00010000520	Healthy volunteer collection of European Ancestry	Illumina OmniExpress v1.0-Illumina GenomeStudio	144
EGAD00010000522	Samples from the Greek island of Crete, MANOLIS cohort	HumanOmniExpress-12 v1.1 BeadChip-GenCall	1364
EGAD00010000526	SNP 6.0 arrays of small cell lung cancer	Affymetrix_SNP_6.0-	63
EGAD00010000528	Illumina HumanHT-12 v4 array		-
EGAD00010000532	Illumina Human Omni1-Quad SNP genotyping array		-
EGAD00010000534	Illumina HumanMethylation450 BeadChip		-
EGAD00010000536	21 unlinked autosomal microsatellite loci for 30 Central Asian populations	Applied Biosystems 3100 automated sequencer-GeneMarker v.1.6 (Softgenetics)	1702
EGAD00010000538	28 unlinked autosomal microsatellite loci for 20 African and 4 Philippine populations	Applied Biosystems 3100 automated sequencer-GeneMarker v.1.6 (Softgenetics)	1702
EGAD00010000542	Cushing's syndrome normal samples using 250K	Affymetrix 250K Nsp-GTYPE	16
EGAD00010000544	Cushing's syndrome tumor samples using 250K	Affymetrix 250K Nsp-GTYPE	16
EGAD00010000546	SNP 6.0 arrays of carcinoid samples	Affymetrix_SNP_6.0-	74
EGAD00010000552	Neuroblastoma samples		130
EGAD00010000554	SNP 6.0 arrays of small cell lung cancer		1032
EGAD00010000556	SNP 6.0 arrays of small cell lung cancer		1
EGAD00010000558	SNP 6.0 arrays of small cell lung cancer	Affymetrix SNP 6.0	54
EGAD00010000560	SNP array of 7 HCCs and matched background liver in children with bile salt export pump deficiency	Illumina HumanOmniExpress-12 v1.	14
EGAD00010000562	Medulloblastoma DNA methylation	Illumina_HumanMethylation450	115
EGAD00010000564	HipSci - Healthy Normals - Expression Array - May 2014		120
EGAD00010000566	HipSci - Healthy Normals - Genotyping Array - May 2014		120
EGAD00010000568	HipSci - Healthy Normals - Methylation Array - May 2014		-
EGAD00010000570	Imputation-based meta-analysis of severe malaria in Kenya.		3343
EGAD00010000572	Imputation-based meta-analysis of severe malaria in Gambia.		2870
EGAD00010000574	Pleuropulmonary blastoma samples using 250K		14
EGAD00010000578	Gencode case samples using 550K		249
EGAD00010000580	Gencode control samples using 550K		217
EGAD00010000584	WTCCC2 Glaucoma samples using Illumina 670k array	Illumina 670k (custom Illumina Human660W-Quad)	2765
EGAD00010000594	SCOOP severe early-onset obesity cases		1720
EGAD00010000596	PCGP Ph-likeALL GEA		837
EGAD00010000598	PCGP Ph-likeALL SNP6		1724
EGAD00010000600	Prostate Adenocarcinomas samples using 450K	Illumina450K	80
EGAD00010000602	WTCCC2 Reading and Mathematics ability (RM) samples from UK using the Affymetrix 6.0 array		3665
EGAD00010000604	DNA methylation data using Illumina 450K		2195
EGAD00010000606	SNP6 data for matched normal samples		8
EGAD00010000608	SNP6 data for seminoma samples		8
EGAD00010000610	Samples from the Greek island of Crete, MANOLIS cohort		221
EGAD00010000612	Celiac disease North Indian samples using Immunochip		-
EGAD00010000614	40 Druze Trios		120
EGAD00010000616	HumanOmni1-Quad genotyping array		230
EGAD00010000618	Ischemic stroke cases		3682
EGAD00010000620	Controls		3683
EGAD00010000622	SNP array data for gastric cancer cell lines		30
EGAD00010000624	A new beta-globin mutation responsible for a beta-thalassemia (HbVar database ID 2928) was observed in 8 unrelated French families. The mutation carriers originated from Nord-Pas-de-Calais, a Northern French region where the chief town is Lille. 5 unrelated mutation carriers were genotyped for a set of 12 microsatellites from chromosome 11, around the beta-globin gene. Among the 5 mutation carriers, 4 were genotyped for 97 European Ancestry Informative SNPs (EAIMs).		-
EGAD00010000626	A new beta-globin mutation responsible for a beta-thalassemia (HbVar database ID 2928) was observed in 8 unrelated French families. The mutation carriers originated from Nord-Pas-de-Calais, a Northern French region where the chief town is Lille. 5 unrelated mutation carriers were genotyped for a set of 12 microsatellites from chromosome 11, around the beta-globin gene. Among the 5 mutation carriers, 4 were genotyped for 97 European Ancestry Informative SNPs (EAIMs).		37
EGAD00010000628	The TEENAGE study target population comprised adolescent students aged 13 to 15 years attending the first three classes of public secondary schools located in the wider Athens area of Attica.		1
EGAD00010000630	The TEENAGE study target population comprised adolescent students aged 13 to 15 years attending the first three classes of public secondary schools located in the wider Athens area of Attica.		436
EGAD00010000632	WTCCC2 People of the British Isles (POBI) samples using Illumina 1.2M array		2912
EGAD00010000634	WTCCC2 People of the British Isles (POBI) samples using Affymetrix 6.0 array		2930
EGAD00010000636	WTCCC2 Visceral Leishmaniasis samples from Brazil using Illumina 670k	0	1
EGAD00010000638	WTCCC2 Visceral Leishmaniasis samples from India using Illumina 670k	0	1
EGAD00010000640	WTCCC2 Visceral Leishmaniasis samples from Sudan using Illumina 670k	0	1
EGAD00010000642	CLL Expression Array		1
EGAD00010000644	Affymetrix SNP6.0 cancer cell line exome sequencing data		1022
EGAD00010000646	DNA methylation analysis of 35 prostate tumor and 6 normal prostate samples		41
EGAD00010000648	nccRCC tumor/normal genotypes		1
EGAD00010000650	Genotypes from Omni2.5 chip		1213
EGAD00010000652	Genotyped samples using Illumina HumanOmni2.5		402
EGAD00010000654	Control samples using SNP 6.0 Arrays		1
EGAD00010000656	Case samples using SNP 6.0 Array		1
EGAD00010000658	DLBCL 148 SNP 6.0 Cohort		1
EGAD00010000662	Finnish population cohort genotyping		-
EGAD00010000664	Finnish population cohort genotyping_B		-
EGAD00010000666	Purified plasma cells from tonsil of Healthy donor		1
EGAD00010000668	Purified plasma cells from bone marrow of Monoclonal gammopathy of unknown significance patient		1
EGAD00010000670	Purified plasma cells from bone marrow of Pooled healthy donors		1
EGAD00010000672	Purified plasma cells from bone marrow of Multiple myeloma patient		1
EGAD00010000674	ELSA genome-wide genotypes, excluding estimated related individuals. There are 3 files: .fam, .bim, .bed		7412
EGAD00010000676	ELSA genome-wide genotypes, including estimated related individuals. There are 3 files: .fam, .bim, .bed		7452
EGAD00010000678	Tumor sample SNP arrays	Illumina SNP array	11
EGAD00010000680	Tumor sample CGH arrays	Agilent CGH array	4
EGAD00010000682	glioma samples tumor using 250K		762
EGAD00010000684	glioma normal samples using cytoscan		3
EGAD00010000686	glioma samples tumor using cytoscan		5
EGAD00010000688	glioma normal samples using 250K		119
EGAD00010000690	Genome-wide SNP genotyping of African rainforest hunter-gatherers and neighbouring agriculturalists by Illumina HumanOmniExpress		160
EGAD00010000692	Genome-wide DNA methylation epigenotyping of African rainforest hunter-gatherers and neighbouring agriculturalists by Illumina HumanMethylation450		372
EGAD00010000694	HCC array for cnv		55
EGAD00010000696	PCGP ETP ALL SNP6		-
EGAD00010000698	PCGP INF ALL SNP6		-
EGAD00010000702	SNP-chip genotyping data for one proband in the DDD study (Ref : Carvalho AJHG 2015)		1
EGAD00010000704	610k genotyping imputed on Hapmap 3 and 1000G Phase 1 CEU		714
EGAD00010000708	Human samples typed on Illumina Omni 5M		-
EGAD00010000710	ATRT genotyping blood		11
EGAD00010000712	ATRT genotyping		40
EGAD00010000714	aplastic anemia samples tumor using 250K	Affymetrix 250K Nsp-GTYPE	440
EGAD00010000716	BLUEPRINT DNA Methylation of different B-cell subpopulations		35
EGAD00010000718	BLUEPRINT Gene expression of different B-cell subpopulations		42
EGAD00010000722	Pilot experiment on functional genomics in osteoarthritis (coreex)		1
EGAD00010000724	Pilot experiment on functional genomics in osteoarthritis (methyl)		-
EGAD00010000730	WTCCC2 Psychosis Endophenotype samples from UK, Germany, Holland, Spain and Australia using the Affymetrix 6.0 array		1
EGAD00010000736	AAD case and control samples from UK and Norway		117
EGAD00010000738	Generation Scotland APOE data		18336
EGAD00010000740	Osteoarthritis cases genotyped on Illumina HumanOmniExpress from the arcOGEN Consortium (http://www.arcogen.org.uk/) with broader consent.		674
EGAD00010000742	Subset 1 of osteoarthritis cases genotyped on Illumina610k from the arcOGEN Consortium (http://www.arcogen.org.uk/) with broader consent.		5383
EGAD00010000744	Subset 2 of osteoarthritis cases genotyped on Illumina 610k from the arcOGEN Consortium (http://www.arcogen.org.uk/) with consent for osteoarthritis studies only.		2326
EGAD00010000748	Genotyping using Illumina Human OmniExpress12v1.0		1
EGAD00010000750	German glioma control germline genotypes using Illumina HumanExome-12v1_A array	Illumina HumanExome-12v1_A	2391
EGAD00010000752	German glioma case germline genotypes using Illumina HumanExome-12v1_A array	Illumina HumanExome-12v1_A	899
EGAD00010000754	UK glioma case germline genotypes using Illumina HumanExome-12v1_A array	Illumina HumanExome-12v1_A	596
EGAD00010000756	French glioma control germline genotypes using Illumina HumanExome-12v1_A array	Illumina HumanExome-12v1_A	699
EGAD00010000758	French glioma case germline genotypes using Illumina HumanExome-12v1_A array	Illumina HumanExome-12v1_A	906
EGAD00010000764	Ovarian tumor samples using Illumina		1
EGAD00010000766	We have established a mechanism for the collection of postal DNA samples from consenting National Joint Registry for England and Wales (NJR) patients and have carried out genotyping genome-wide in 903 patients with the condition Developmental Dysplasia of the Hip (DDH) on the Illumina CoreExome array		903
EGAD00010000768	Replication data for HipSci normal samples using both HumanCoreExome-12_v1 and HumanOmni2.5-8 BeadChips		-
EGAD00010000771	HipSci - Healthy Normals - Methylation Array - April 2015		-
EGAD00010000773	HipSci - Healthy Normals - Genotyping Array - November 2014	Illumina	580
EGAD00010000775	HipSci - Healthy Normals - Expression Array - November 2014	Illumina	580
EGAD00010000777	HipSci - Bardet-Biedl Syndrome - Genotyping Array - November 2014		-
EGAD00010000779	HipSci - Monogenic Diabetes - Genotyping Array - November 2014	Illumina	9
EGAD00010000781	HipSci - Bardet-Biedl Syndrome - Methylation Array - April 2015		-
EGAD00010000783	HipSci - Bardet-Biedl Syndrome - Expression Array - November 2014		-
EGAD00010000785	HipSci - Monogenic Diabetes - Expression Array - November 2014		-
EGAD00010000787	Epigen-Brasil samples using HumanOmni2.5		6487
EGAD00010000789	ATRT expression	Illumina Human HT6-v3 Array	4
EGAD00010000790	ATRT expression	Illumina Human HT6-v3 Array	41
EGAD00010000791	Illumina HumanOmni2.5-8 BeadChip		1
EGAD00010000807	Illumina HumanCoreExome genotyping data from the British Society for Surgery of the Hand Genetics of Dupuytren's Disease consortium (BSSH-GODD consortium) collection		4201
EGAD00010000811	ATL tumor samples using Illumina 610K SNP array		1
EGAD00010000813	ATL tumor samples using Illumina 450K Methylation array		1
EGAD00010000815	ATL tumor samples using Affymetrix 250K SNP array		1
EGAD00010000817	HipSci - Monogenic Diabetes - Methylation Array - April 2015		-
EGAD00010000819	Summary statistics from meta-analysis for BP phenotypes		-
EGAD00010000823	Results of SNP arrays on synchronous CRC samples		1
EGAD00010000827	Illumina Infinium 450K array data		1
EGAD00010000829	Illumina Infinium 450K array data		70
EGAD00010000831	BLUEPRINT EpiMatch: harnessing epigenetics for haematopoietic stem cell transplantation		85
EGAD00010000847	Genotyping using Affymetrix SNP6.0		49
EGAD00010000850	BLUEPRINT DNA methylation profiles of monocytes, neutrophils and T cells from healthy donors		525
EGAD00010000853	VeraCode GoldenGate GT Assay technology		147
EGAD00010000854	WTCCC3 UK maternal cases of pre-eclampsia	Illumina Human670-QuadCustom_v1	3980
EGAD00010000858	Achalasia cases & controls		8151
EGAD00010000859	Smad3	Illumina ChIP-Sequencing	16
EGAD00010000860	Pol2	Illumina ChIP-Sequencing	16
EGAD00010000862	H3K27me3	Illumina ChIP-Sequencing	16
EGAD00010000863	H3K27Ac	Illumina ChIP-Sequencing	16
EGAD00010000865	MBDSEQ	Illumina MBD-Sequencing	16
EGAD00010000867	Expression Arrays	Illumina beadarray	16
EGAD00010000868	Targeted bisulfite sequencing	Illumina Bisulfite-Sequencing	16
EGAD00010000869	RNA expression microarray	Illumina_HumanHT-12v4	62
EGAD00010000870	DNA methylation microarray	Illumina_Infinium_HumanMethylation450	48
EGAD00010000871	CLL and normal B cell samples using 450K		226
EGAD00010000872	Genotyped case and control samples using HumanExome Beadchip		1610
EGAD00010000874	Understanding Society Sequenom genotypes	Sequenom	8590
EGAD00010000875	CLL Expression Array	Affymetrix U219	-
EGAD00010000881	Digital images of ovarian cancer sections	Aperio	91
EGAD00010000883	The ARGO-Larissa GWAS.	Illumina HumanCoreExome-24v1-0	859
EGAD00010000886	samples using Affymetrix HG_U133_+2	Affymetrix HG_U133_+2	99
EGAD00010000887	Freeze 1 of the RP3 project	Illumina Human Methylation 450k BeadChip	3898
EGAD00010000889	Gencode control samples using SNP6.0	SNP6.0	183
EGAD00010000890	Understanding Society GWAS, all samples	Illumina HumanCoreExome-12v1-0	20926
EGAD00010000891	Understanding Society GWAS, samples that passed quality control	Illumina HumanCoreExome-12v1-0	19888
EGAD00010000892	Healthy individuals from Italy	Illumina	300
EGAD00010000897	Infinium 450K in Rhabdomyosarcoma	Infinium HumanMethylation450 BeadChip	53
EGAD00010000901	Russian Tuberculosis samples using Affymetrix 6.0	Affymetrix Genome-Wide Human SNP Array 6.0 Genotypes	11937
EGAD00010000902	Genome-wide study of resistance to severe malaria in eleven worldwide populations:Gambia	Illumina Omni 2.5M	5594
EGAD00010000903	Genome-wide study of resistance to severe malaria in eleven worldwide populations:Malawi	Illumina Omni 2.5M	3088
EGAD00010000904	Genome-wide study of resistance to severe malaria in eleven worldwide populations:Kenya	Illumina Omni 2.5M	3865
EGAD00010000908	Illumina SNP-arrays for matching retinoblastoma-blood pairs and retinoblastoma cell lines.	HumanOmni1 Quad BeadChip	132
EGAD00010000909	HipSci - Embryonic Stem Cells - Methylation Array - April 2016	Illumina	2
EGAD00010000910	HipSci - Embryonic Stem Cells - Expression Array - April 2016	Illumina	2
EGAD00010000911	HipSci - Embryonic Stem Cells - Genotyping Array - April 2016	Illumina	2
EGAD00010000912	SEA 610K	Illumina 610K	1
EGAD00010000913	SEA 660K	Illumina 660K	3
EGAD00010000915	Affymetrix SNP6.0 breast cancer genome sequencing data	Affymetrix SNP6.0	344
EGAD00010000916	BASIS breast cancer DNA methylation Illumina 450k	Illumina 450k	457
EGAD00010000917	399 tumors profiled using Agilent miRNA microarrays (Product Number G4872A, design ID 046064). The arrays are based on miRBase release 19.0 and 2006 human miRNAs are represented. 150 ng total RNA was used as input.	Agilent miRNA microarrays	399
EGAD00010000918	Understanding Society GWAS, samples that passed quality control, imputed to UK10K + 1000 Genomes combined reference panel	Illumina HumanCoreExome-12v1-0 chip, UK10K + 1000 Genomes combined reference panel imputed	19888
EGAD00010000919	samples using Illumina HUMANOMNI1QUAD	HUMANOMNI1QUAD	2
EGAD00010000920	samples using Illumina HUMANOMNIEXPRESS	HUMANOMNIEXPRESS	50
EGAD00010000921	samples using Affymetrix CYTOSCANHD	CYTOSCANHD	12
EGAD00010000922	Subset 1 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-24v1-0 with broader consent.	Illumina HumanCoreExome-24v1-0	494
EGAD00010000923	Subset 2 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-12v1-0 with consent for osteoarthritis studies only.	Illumina HumanCoreExome-12v1-0	463
EGAD00010000924	Subset 2 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-12v1-1 with consent for osteoarthritis studies only.	Illumina HumanCoreExome-12v1-1	991
EGAD00010000925	Subset 1 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-12v1-0 with broader consent.	Illumina HumanCoreExome-12v1-0	855
EGAD00010000926	Subset 1 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-12v1-1 with broader consent.	Illumina HumanCoreExome-12v1-1	3075
EGAD00010000927	Subset 2 of osteoarthritis cases from the arcOGEN Consortium (http://www.arcogen.org.uk/) genotyped on HumanCoreExome-24v1-0 with consent for osteoarthritis studies only.	Illumina HumanCoreExome-24v1-0	248
EGAD00010000928	WTCCC3_Primary Biliary Cirrhosis Replication Post-QC	Illumina ImmunoChip	2861
EGAD00010000929	WTCCC3_Primary Biliary Cirrhosis Replication	Illumina ImmunoChip	2981
EGAD00010000934	Agilent miRNA dataset	Agilent SurePrint Human miRNA Microarray	2
EGAD00010000935	ACGH 244K dataset	Agilent 244K	10
EGAD00010000936	Affymetrix Exon Array dataset	Affymetrix GeneChip Human Exon 1.0 ST	2
EGAD00010000937	ACGH 180K dataset	Agilent 180K	5
EGAD00010000938	mRNA Array Agilent 44K dataset	Agilent 44K	16
EGAD00010000939	Illumina 1M SNP Array dataset	Illumina 1M SNP Array	2
EGAD00010000940	Gambian specimens with trachomatous scarring WHO grade C2/C3	Illumina Omni 2.5	1531
EGAD00010000941	Gambian specimens without trachomatous scarring	Illumina Omni 2.5	1531
EGAD00010000942	Breast lesions assayed with Affymetrix SNP 6.0	Affymetrix SNP 6.0	125
EGAD00010000943	Sahel population study using 2.5M	Illumina HumanOmni2.5	161
EGAD00010000944	Genotyping data from Southeast Borneo individuals	Illumina Human Omni Express Bead Chip-24 v1.0	41
EGAD00010000946	Human samples, 450k analysis	Illumina 450k	127
EGAD00010000947	Lymphoma samples using CytoSNP	Illumina CytoSNP	35
EGAD00010000948	Lymphoma samples using 450k	Illumina 450k	95
EGAD00010000949	Lymphoma samples using HumanOmni	Illumina HumanOmni2.5	104
EGAD00010000950	WTCCC2 Bacteraemia Susceptibility (BS) samples using Affymetrix 6.0	Affymetrix 6.0	4924
EGAD00010000951	SNP array data for 668 cancer cell lines	Illumina 2.5M	668
EGAD00010000952	Where Are You From? samples typed at 517K SNP loci	Illumina HumanOmniExpress-24 BeadChip	598
EGAD00010000953	Healthy adult volunteers and newborns recruited in various countries across Oceania.	HumanCore-24 BeadChip	937
EGAD00010000954	Healthy volunteers recruited in New Caledonia	HumanCore-24 BeadChip	356
EGAD00010000955	Rheumatic heart disease cases recruited in Fiji with higher density genotyping	HumanOmniExpressExome-8 BeadChip	32
EGAD00010000956	Rheumatic heart disease cases recruited in New Caledonia with higher density genotyping	HumanOmniExpressExome-8 BeadChip	34
EGAD00010000957	Rheumatic heart disease cases recruited in New Caledonia	HumanCore-24 BeadChip	465
EGAD00010000958	Healthy volunteers recruited in Fiji with higher density genotyping	HumanOmniExpressExome-8 BeadChip	32
EGAD00010000959	Healthy volunteers recruited in Fiji	HumanCore-24 BeadChip	854
EGAD00010000960	Definite and borderline rheumatic heart disease cases and patients with mild non-diagnostic valvulopathy recruited in Samoa	HumanCore-24 BeadChip	126
EGAD00010000961	Rheumatic heart disease cases recruited in Fiji	HumanCore-24 BeadChip	535
EGAD00010000962	Healthy volunteers and missing phenotype individuals recruited in New Caledonia with higher density genotyping	HumanOmniExpressExome-8 BeadChip	30
EGAD00010000963	Healthy volunteers recruited in Samoa	HumanCore-24 BeadChip	24
EGAD00010000965	Array data from 4778 individuals from general population of rural Uganda		4778
EGAD00010000983	MeDIP-seq RPM chromosome BED files for Peripheral Blood from EPITWIN Project (Columns 4-4353 represent samples)	MeDIP-seq	4350
EGAD00010001001	Primary renal cell carcinoma (RCC), RCC metastases and cell lines by Illumina 450K	Illumina 450K	62
EGAD00010001003	This data set contains two data files. The first data file (file name: PREDO_GA_EGA_methylation_data.csv) includes methylation data from 485512 sites across the human genome from 96 individuals, acquired from the Illumina 450K chip. The other data file (file name: PREDO_GA_EGA_phenotypes.csv) contains the gestational ages and the genders of the 96 samples.	Illumina 450K-chip (methylation data)	96
EGAD00010001004	WTCCC1 project samples from 1958 British Birth Cohort	Infinium 550K	1504
EGAD00010001005	Illumina HumanCoreExome-12v1-1_A chip typing in a Greek adolescent population	Illumina Human Core Exome 12v1.1	120
EGAD00010001006	Proteomics LC-MS MS dataset	Liquid chromatography–mass spectrometry	8
EGAD00010001012	BLUEPRINT DNA Methylation 450K data of mantle cell lymphoma	Illumina HumanMethylation 450K	86
EGAD00010001025	BLUEPRINT DNA methylation profiles of monocytes, T cells and B cells in type 1 diabetes-discordant monozygotic twins	Illumina 450K	302
EGAD00010001029	Summary statistics for a multi-cohort epigenome-wide association study. This includes summary statistics (effect-size, standard error, p-value) for 470,000 methylation markers.		-
EGAD00010001032	RNA Expression using Illumina HT12 v3	Illumina HT12 v3	153
EGAD00010001034	WTCCC3 Anorexia Nervosa GWAS	Illumina Human670-QuadCustom_v1_A	1696
EGAD00010001040	Methylation changes in OA patients with chronic exposure to cobalt and chromium	Illumina HumanMethylation450	68
EGAD00010001043	WTCCC3 Anorexia Nervosa Infinium-HumanCoreExome	Illumina HumanCoreExome-12v1-0_A and HumanCoreExome-24v1-0_A	925
EGAD00010001045	APCDR AGV Project: Array data from 99 Igbo. Raw data, intensity files and post-QC Plink files.	Illumina HumanOmni2.5-4v1_B	-
EGAD00010001046	APCDR AGV Project: Array data from 86 Sotho. Raw data, intensity files and post-QC Plink files.	Illumina HumanOmni2-5_8v1_A	-
EGAD00010001047	APCDR AGV Project: Array data from 107 Ethiopians (Amhara, Oromo, Somali; subset of Ethiopian Genome Project Genotyping). Raw data, intensity files and post-QC Plink files.	Illumina HumanOmni2-5_8v1_A	-
EGAD00010001048	APCDR AGV Project: Array data from 79 Jola. Raw data, intensity files and post-QC Plink files.	Illumina HumanOmni2-5_8v1_A	-
EGAD00010001049	APCDR AGV Project: Array data from 99 Kikuyu. Raw data, intensity files and post-QC Plink files.	Illumina HumanOmni2.5-4v1_B	-
EGAD00010001050	APCDR AGV Project: Array data from 78 Wolof. Raw data, intensity files and post-QC Plink files.	Illumina HumanOmni2-5_8v1_A	-
EGAD00010001051	APCDR AGV Project: Array data from 97 Barundi. Raw data, intensity files and post-QC Plink files.	Illumina HumanOmni2-5_8v1_A	-
EGAD00010001052	APCDR AGV Project: Array data from 100 Kalenjin. Raw data, intensity files and post-QC Plink files.	Illumina HumanOmni2.5-4v1_B	-
EGAD00010001053	APCDR AGV Project: Array data from 100 Banyarwanda. Raw data, intensity files and post-QC Plink files.	Illumina HumanOmni2.5-4v1_B and HumanOmni2-5_8v1_A	-
EGAD00010001054	APCDR AGV Project: Array data from 74 Fula	Illumina HumanOmni2-5_8v1_A	-
EGAD00010001055	APCDR AGV Project: Array data from 100 Baganda. Raw data, intensity files and post-QC Plink files.		-
EGAD00010001056	APCDR AGV Project: Array data from 100 Zulu. Raw data, intensity files and post-QC Plink files.	Illumina HumanOmni2.5-4v1_B and HumanOmni2-5_8v1_A	-
EGAD00010001057	APCDR AGV Project: Array data from 88 Mandinka. Raw data, intensity files and post-QC Plink files.	Illumina HumanOmni2-5_8v1_A	-
EGAD00010001062	blood-based gene expression from breast cancer cases and age-matched controls	IlluminaHuman AWG-6 and HT12	455
EGAD00010001063	blood-based gene expression from breast cancer cases	IlluminaHuman AWG-6 and HT12	173
EGAD00010001064	tumor-based gene expression from breast cancer cases	IlluminaHuman HT12	173
EGAD00010001074	Rare CNVs from schizophrenia cases and controls	Multiple CNV platforms	1
EGAD00010001075	Argentine samples using 250K	Illumina Exome 250K	391
EGAD00010001079	Affymetrix SNP6.0 array breast cancer data	Affymetrix SNP6.0	66
EGAD00010001081	Summary statistics for Malaria Genomic Epidemiology Network, "A novel locus of resistance to severe malaria in a region of ancient balancing selection", Nature (2015)	Illumina Omni 2.5M	1
EGAD00010001099	Digital images of ovarian cancer metastases	Aperio	127
EGAD00010001101	Genotype data from Chad, Lebanon, and Yemen	Illumina HumanOmni2.5-8 v1.1 B	-
EGAD00010001102	Genotype data from Chad, Lebanon, and Yemen	Illumina HumanOmni2.5-8 v1.2 A	-
EGAD00010001103	Genotype data from Chad, Lebanon, and Yemen	Illumina HumanOmni2.5-8 v1.1 B	-
EGAD00010001131	The 100 European-descent (EUB) and 100 African-descent (AFB) Belgians studied were genotyped for a total of 4,301,332 SNPs on the Illumina HumanOmni5-Quad BeadChips. Whole-exome sequencing was carried out for the same 200 individuals with the Nextera Rapid Capture Expanded Exome kit, on the Illumina HiSeq 2000 platform, with 100-bp paired-end reads. This kit delivers 62 Mb of genomic content per individual, including exons, untranslated regions (UTR), and microRNAs. Omni5 and exome datasets were merged, yielding a concordance rate between platforms of 99.93%.	Illumina HumanOmni5-Quad and exome sequencing	200
EGAD00010001139	HipSci - Healthy Normals - Methylation Array - October 2016	Illumina	181
EGAD00010001141	Summary data from Meta-analysis of Genome-Wide-Association Studies for plasma levels of Coagulation Factor XI (FXI)		-
EGAD00010001143	HipSci - Healthy Normals - Expression Array - September 2016	Illumina	613
EGAD00010001145	HipSci - Bardet-Biedl Syndrome - Methylation Array - October 2016	Illumina	45
EGAD00010001147	HipSci - Healthy Normals - Genotyping Array - September 2016	Illumina	613
EGAD00010001149	HipSci - Monogenic Diabetes - Methylation Array - October 2016	Illumina	35
EGAD00010001153	Family Trios on aCGH 8x60K	Agilent 8x60K	138
EGAD00010001155	Crohn's disease DNA samples genotyped using UK Biobank Axiom array	Axiom UKB	1676
EGAD00010001157	Genotyping of additional Inflammatory Bowel Disease cases - 2014 (QC pass samples)	Illumina Human Core Exome 12v1-1_a	9247
EGAD00010001158	Genotyping of additional Inflammatory Bowel Disease cases - 2014 (all samples)	Illumina Human Core Exome 12v1-1_a	11767
EGAD00010001161	Oncotrack metastatic samples using 450K. The shared AF analysis files oncotrackDNAmAnalysis.R and oncotrackDNAmBetaScores.txt which were applied for both Oncotrack_450K_tumor (EGAD00010001162) and Oncotrack_450K_metastatic (EGAD00010001161) datasets are included on Oncotrack_450K_tumor (EGAD00010001162) dataset.	Illumina 450K	15
EGAD00010001162	Oncotrack primary tumor samples using 450K. The dataset includes shared AF analysis files oncotrackDNAmAnalysis.R and oncotrackDNAmBetaScores.txt which were applied for both Oncotrack_450K_tumor (EGAD00010001162) and Oncotrack_450K_metastatic (EGAD00010001161) datasets.	Illumina 450K	67
EGAD00010001176	This dataset contains 15 control SNP-array dataset from 15 EGFR mutant lung adenocarcinoma patients.	Illumina	15
EGAD00010001177	This dataset contains 61 tumors SNP-array dataset from 15 EGFR mutant lung adenocarcinoma patients.	Illumina	61
EGAD00010001179	Tissue samples using Illumina HumanOmniExpress-FFPE-12 v1.0 BeadChip	Illumina HumanOmniExpress-FFPE-12 v1.0 BeadChip	22
EGAD00010001184	This data set includes the following summary level data file used for the imputation data: imputation.sv.assoc.txt: results from single variant association analysis in imputed samples		-
EGAD00010001185	This data set includes the following summary level data files used for the GoT2D WGS analysis: wgs.assoc.samples.list: list of samples to keep for association analysis wgs.assoc.variants.list: list of variants to keep for association analysis wgs.sv.assoc.txt: single variant association results		-
EGAD00010001187	This data set includes the following summary level data file used for the exome chip analysis: exome_chip.sv.assoc.txt: results from single variant association analysis in exome chip		-
EGAD00010001188	This data set includes the following summary level data files used for the 13k analysis of T2D-GENES data: wes.variants.list: list of variants to keep for any analysis of the exomes data wes.assoc.samples.list: list of samples to keep for association analysis wes.assoc.variants.list: list of variants to keep for association analysis wes.sv.assoc.txt: single variant association analysis results wes.gene.ptv.variants.list.txt: list of protein truncating variants to use in gene-level analysis wes.gene.ptv.assoc.txt: results from gene-level tests of protein truncating variants wes.gene.nsstrict.variants.list.txt: list of NSstrict variants to use in gene-level analysis wes.gene.nsstrict.assoc.txt: results from gene-level tests of NSstrict variants wes.gene.nsbroad.variants.list.txt: list of NSbroad variants to use in gene-level analysis wes.gene.nsbroad.assoc.txt: results from gene-level tests of NSbroad variants wes.gene.ns.variants.list.txt: list of non synonymous variants to use in gene-level analysis wes.gene.ns.assoc.txt: results from gene-level tests of non synonymous variants		-
EGAD00010001192	Germline genotype data on 56,479 ovarian cancer cases and controls	Illumina OncoArray	56479
EGAD00010001196	Raw Array data from the CPCGene BRCA study	Affymetrix OncoScan FFPE Express	48
EGAD00010001198	Case control samples using Infinium Omni2.5	Infinium Omni2.5M	274
EGAD00010001200	Genotyping data from Indonesian sea nomad and surrounding populations	Illumina Omni 5	105
EGAD00010001202	Human genotyping data for patients infected by hepatitis C virus	Affymetrix UKBiobank Array	563
EGAD00010001204	MacTel Project consortium case and control genotypes from Illumina Omni5 chip	Illumina Omni5	1
EGAD00010001209	Genome-wide SNP genotyping data for 1,235 western Africans by Illumina HumanOmniExpress-12 array, used in the EGAS00001002078 study	Illumina HumanOmniExpress-12	1235
EGAD00010001211	Inverse variance weighted fixed effect meta-analysis of three European GWAS studies of the offspring of Pre-eclampsia affected births (2658 Cases and 308267 Controls).		-
EGAD00010001212	Genetic studies of pregnancy-related cardiometabolic disorders in Central Asian, Northern European, and Colombian populations	Illumina HumanOmniExpress-12v1_J	-
EGAD00010001216	Melanoma cell lines CNV by SNP6	SNP6	22
EGAD00010001218	Raw Array data from the CPCGene 200PG study	Affymetrix OncoScan FFPE Express	502
EGAD00010001221	Illumina Omni 2.5M SNPchip data (build37) of Ethiopian samples from the Pagani et al. 2015 AJHG paper (doi: http://dx.doi.org/10.1016/j.ajhg.2015.04.019)	Illumina HumanOmni2-5_8v1_A	124
EGAD00010001223	Illumina Omni 2.5M SNPchip data (build37) of Egyptian samples from the Pagani et al. 2015 AJHG paper (doi: http://dx.doi.org/10.1016/j.ajhg.2015.04.019)	Illumina HumanOmni2-5M-8v1-1_B	100
EGAD00010001228	Primary and PDX SqCC samples using Infinium OmniExpress-24	Infinium_OmniExpress-24v1.0	24
EGAD00010001232	CN/LOH-profile of Translocation-negative FL_8	Affymetrix SNP 6.0	1
EGAD00010001233	CN/LOH-profile of Translocation-negative FL_5	Affymetrix SNP 6.0	1
EGAD00010001234	CN/LOH-profile of Translocation-negative FL_9	Affymetrix SNP 6.0	1
EGAD00010001235	CN/LOH-profile of Translocation-negative FL_11	Affymetrix SNP 6.0	1
EGAD00010001236	CN/LOH-profile of Translocation-negative FL_4	Affymetrix SNP 6.0	1
EGAD00010001237	CN/LOH-profile of Translocation-negative FL_10	Affymetrix SNP 6.0	1
EGAD00010001238	CN/LOH-profile of Translocation-negative FL_2	Affymetrix SNP 6.0	1
EGAD00010001239	CN/LOH-profile of Translocation-negative FL_6	Affymetrix SNP 6.0	1
EGAD00010001240	CN/LOH-profile of Translocation-negative FL_1	Affymetrix SNP 6.0	1
EGAD00010001241	CN/LOH-profile of Translocation-negative FL_7	Affymetrix SNP 6.0	1
EGAD00010001243	UK TGCT control samples using the Infinium 1.2M array	Illumina Infinium 1.2M array	4946
EGAD00010001246	UK TGCT control samples using the Infinium OncoArray-500K BeadChip	Infinium OncoArray-500K BeadChip	7422
EGAD00010001247	UK TGCT case samples using the Infinium OncoArray-500K BeadChip	Infinium OncoArray-500K BeadChip	3206
EGAD00010001249	TGCT - GWAS loci Hi-C data	Illumina HiSeq 2000	1
EGAD00010001251	Epigenome of 36 rainforest hunter-gathering Baka of Cameroon by Illumina HumanMethylation450 array, used in the EGAS00001002226 study	Illumina HumanMethylation450	38
EGAD00010001253	Affymetrix SNP 6.0	Affymetrix SNP 6.0	245
EGAD00010001255	Autosomal STR genotypes using 15 Identifiler loci	Applied Biosystems	990
EGAD00010001258	Pilot study on the interplay between genetic, epigenetic, and environmental risk factors for obesity and related cardiometabolic diseases with 973 samples from South Africa genotyped on Illumina Human MetaboChip array.	Human Cardio Metabochip	973
EGAD00010001260	DNAm Case samples using Illumina Infinium 450K	Illumina 450K array	33
EGAD00010001261	DNAm Case samples using Illumina Infinium 450K	Illumina 450K array	33
EGAD00010001262	DNAm Case samples using Illumina Infinium 450K	Illumina 450K array	32
EGAD00010001265	original population (oMSC) and highly migrative subpopulation (sMSC) of murine eGFP+ bone marrow MSC	Affymetrix Mouse Gene ST 2.0	6
EGAD00010001273	Affymetrix GeneChip® Human Transcriptome Array 2.0	Affymetrix GeneChip® Human Transcriptome Array 2.0	34
EGAD00010001274	Expression profiling by Nanostring cancer immune	Nanostring Cancer Immune	30
EGAD00010001275	Affymetrix GeneChip® Human Transcriptome Array 2.0	Affymetrix GeneChip® Human Transcriptome Array 2.0	34
EGAD00010001276	Expression profiling by Nanostring cancer pathway	Nanostring cancer pathway	30
EGAD00010001278	ATRX SNP6 data on Affymetrix 600k	Affymetrix 600K	-
EGAD00010001280	Transcriptome array dataset	Affymetrix HG_U133_+2	25
EGAD00010001281	SNP array dataset	HUMANOMNIEXPRESS	50
EGAD00010001283	Illumina HumanOmni5-Quad BeadChips	Illumina	229
EGAD00010001285	Genotyping of knee osteoarthritis patients who have undergone total joint replacement	Illumina InfiniumCoreExome-24v1-1_A	17
EGAD00010001287	Array methylation profiling of knee osteoarthritis patients who have undergone total joint replacement	Illumina HumanMethylation450K	68
EGAD00010001289	Resolving the Genetic Architecture of Aseptic Loosening After Total Hip Replacement	Illumina InfiniumCoreExome-24v1-1_A	2880
EGAD00010001291	Methylation profiling of hip osteoarthritis patients who have undergone total joint replacement	Illumina HumanMethylation450K	27
EGAD00010001292	Genotyping of hip osteoarthritis patients who have undergone total joint replacement	Illumina InfiniumCoreExome-24v1-1_A	9
EGAD00010001294	Methylation data using 450K	Illumina 450k	1128
EGAD00010001296	DNA methylation analysis from primary human JMML and normal blood samples using 450K	Illumina_450K	-
EGAD00010001298	primary human ACC and normal samples using 450K	Illumina_450K	110
EGAD00010001300	Medulloblastoma expression profiling	Affymetrix expression array	146
EGAD00010001301	Medulloblastoma expression profiling	Affymetrix expression array	246
EGAD00010001304	Genotyping data from Comorian individuals	Illumina Human Omni5 Bead Chip	49
EGAD00010001307	iOmics gene expression data using Expression Array	Affymetrix Human Gene 1.0 ST Array	269
EGAD00010001308	iOmics miRNA data via qPCR quantification	patented mSMRT-qPCR miRNA assay (MIRXES)	351
EGAD00010001309	iOmics genomic data using 2.5M and Exome array	Illumina 2.5M and Illumina Exome array	323
EGAD00010001310	iOmics lipid data via mass spectrometry (MS)	Agilent 1200 LC system	359
EGAD00010001315	Single cell transcriptomics of PBMCs of 47 donors from the Lifelines Deep cohort (general population, Northern part of the Netherlands). Cells of five or six different donors were pooled together in one sample pool, resulting in eight different sample pools. In total, 28,855 cells were captured and their transcriptomes were sequenced to an average depth of 74k. Genotype data was available for each donor, which allowed us to use the Demuxlet method that uses variable SNPs between the pooled individuals to determine which cell belongs to which individual. Because genotype information was lacking for 2 individuals, the transcriptomes of only 45 individuals could be retrieved.	Illumina HiSeq4000	45
EGAD00010001319	Medulloblastoma methylation profiling	Illumina Infinium HumanMethylation450 BeadChip	345
EGAD00010001323	Medulloblastoma methylation profiling	Illumina Infinium HumanMethylation450 BeadChip	911
EGAD00010001326	Papuan Genotyping	Illumina Multi-EthnicGlobal_A1	380
EGAD00010001328	HipSci - Healthy Normals - Genotyping Array - July 2017	Illumina	-
EGAD00010001330	HipSci - Healthy Normals - Expression Array - July 2017	Illumina	1
EGAD00010001332	HipSci - Bardet-Biedl Syndrome - Genotyping Array - July 2017	Illumina	1
EGAD00010001334	HipSci - Monogenic Diabetes - Genotyping Array - July 2017	Illumina	1
EGAD00010001340	HipSci - Bardet-Biedl Syndrome - Expression Array - July 2017	Illumina	1
EGAD00010001342	HipSci - Monogenic Diabetes - Expression Array - July 2017	Illumina	1
EGAD00010001344	HipSci - Hereditary Cerebellar Ataxias - Genotyping Array - July 2017	Illumina	1
EGAD00010001346	HipSci - Hereditary Spastic Paraplegia - Genotyping Array - July 2017	Illumina	1
EGAD00010001348	HipSci - Kabuki Syndrome - Genotyping Array - July 2017	Illumina	1
EGAD00010001350	HipSci - Usher Syndrome - Genotyping Array - July 2017	Illumina	1
EGAD00010001352	HipSci - Alport Syndrome - Genotyping Array - July 2017	Illumina	1
EGAD00010001354	HipSci - Congenital Hyperinsulinia - Genotyping Array - July 2017	Illumina	1
EGAD00010001356	HipSci - Hypertrophic Cardiomyopathy - Genotyping Array - July 2017	Illumina	1
EGAD00010001358	HipSci - Primary Immune Deficiency - Genotyping Array - July 2017	Illumina	23
EGAD00010001360	HipSci - Bleeding and Platelet Disorders - Genotyping Array - July 2017	Illumina	1
EGAD00010001362	HipSci - Macular Dystrophy - Genotyping Array - July 2017	Illumina	1
EGAD00010001364	HipSci - Retinitis Pigmentosa - Genotyping Array - July 2017	Illumina	1
EGAD00010001366	HipSci - Battens Disease - Genotyping Array - July 2017	Illumina	1
EGAD00010001368	HipSci - Hereditary Cerebellar Ataxias - Expression Array - July 2017	Illumina	1
EGAD00010001370	HipSci - Hereditary Spastic Paraplegia - Expression Array - July 2017	Illumina	1
EGAD00010001372	HipSci - Kabuki Syndrome - Expression Array - July 2017	Illumina	1
EGAD00010001374	HipSci - Usher Syndrome - Expression Array - July 2017	Illumina	1
EGAD00010001376	HipSci - Alport Syndrome - Expression Array - July 2017	Illumina	1
EGAD00010001378	HipSci - Congenital Hyperinsulinia - Expression Array - July 2017	Illumina	1
EGAD00010001380	HipSci - Hypertrophic Cardiomyopathy - Expression Array - July 2017	Illumina	1
EGAD00010001382	HipSci - Primary Immune Deficiency - Expression Array - July 2017	Illumina	1
EGAD00010001384	HipSci - Bleeding and Platelet Disorders - Expression Array - July 2017	Illumina	1
EGAD00010001386	HipSci - Macular Dystrophy - Expression Array - July 2017	Illumina	1
EGAD00010001388	HipSci - Retinitis Pigmentosa - Expression Array - July 2017	Illumina	1
EGAD00010001390	HipSci - Battens Disease - Expression Array - July 2017	Illumina	1
EGAD00010001392	Genotyping data from Swahili individuals	Illumina Human Omni5 Bead Chip	91
EGAD00010001395	A replication cohort consisting of 1428 adult survivors of any non-ALL pediatric cancer	Genome-Wide Human SNP Array 6.0 - Thermo Fisher Scientific	1428
EGAD00010001396	A discovery cohort of 856 adult survivors of pediatric ALL	Genome-Wide Human SNP Array 6.0 - Thermo Fisher Scientific	856
EGAD00010001400	Difference in gene expression values between case and control, log2 values. Blood transcriptome from women participating in the Norwegian Women and Cancer study (NOWAC) Post-genome Cohort taken up to eight years before breast cancer diagnosis. Illumina HumanWG-6 version 3 or Illumina HumanHT-12 expression bead chip, combined on identical nucleotide universal identifiers.	Illumina HumanWG-6	467
EGAD00010001403	Gene expression read counts	Illumina HiSeq2000	132
EGAD00010001406	Breast cancer tissue and controls	Exiqon 7th generation miRCURY LNA microRNA microarray system	149
EGAD00010001408	Illumina Infinium 450K array data	Illumina 450K	34
EGAD00010001410	Genotyped samples using Illumina Infinium HumanCoreExome Beadchip	Illumina Infinium HumanCoreExome Beadchip	502
EGAD00010001412	Blood transcriptome from women participating in the Norwegian Women and Cancer study (NOWAC)	Illumina HumanWG-6 version 3 or Illumina HumanHT-12 expression bead chip, combined on identical nucleotide universal identifiers.	920
EGAD00010001414	Raw Array data from the PRAD-CA for ICGC DCC Release26	Affymetrix OncoScan FFPE Express	1
EGAD00010001416	BBMRI - BIOS project - Freeze 2 - methylation	Illumina Human Methylation 450k BeadChip	4386
EGAD00010001418	HumanOmni25M-8v1-1	Illumina	24
EGAD00010001420	Read counts determined using HTSeq-count for the BBMRI BIOS Freeze 2 RNAseq data	RNAseq	3560
EGAD00010001422	1000G Phase 3 Imputed cases and controls from NSAID-induced PUD study	Illumina Omni 2.5	676
EGAD00010001424	Codelink Human Whole Genome from Blood taken at 72 hours after birth (11 cases)	Codelink Human Whole Genome Bioarray	11
EGAD00010001425	Codelink Human Whole Genome from Blood taken at 72 hours after birth (9 controls)	Codelink Human Whole Genome Bioarray	9
EGAD00010001427	Cardio-Metabochip genotypes for B99 cohort	Illumina	1336
EGAD00010001428	Cardio-Metabochip genotypes for IHIT cohort	Illumina	2791
EGAD00010001430	Gene expression analysis from primary human JMML samples using Illumina Human HT-12 v4	Illumina_HumanHT-12_V4	15
EGAD00010001433	ATRT methylation	Illumina HumanMethylation450 BeadChip	162
EGAD00010001443	SNP array	Affymetrix SNP6.0	154
EGAD00010001447	Array-based association data	Illumina Omni Express/Illumina Core Exome	784
EGAD00010001449	Methylation Control samples using 450K Array	Illumina_450K	22
EGAD00010001450	Methylation JMML samples using 450K Array	Illumina_450K	92
EGAD00010001452	Genome-wide SNP genotyping data for 102 Pakistani individuals by Illumina HumanOmni2.5-8 array, used in the EGAS00001002558 study	Illumina HumanOmni2.5-8	102
EGAD00010001455	Illumina 450K	450K	1347
EGAD00010001457	These are the log2CPM (log2 counts per million) fragments per gene counts associated with the BAM files in EGAD00001003806, in tab separated format. Counts for 36 postmortem brain samples from 9 non-demented control subjects and 9 Hereditary cerebral hemorrhage with amyloidosis-Dutch type subjects are included (1 Frontal cortex sample and 1 Occipital cortex sample per subject). RNA samples were depleted for ribosomal RNA with the Ribo Zero Gold Human kit (Illumina) and strand specific RNA-Seq libraries were generated. Paired-end sequencing was performed on a HiSeq2500 Illumina system (2x50bp reads). Alignments were performed using GSNAP v2014-12-23 with setting "--npaths 1" on GRCh38 reference genome without the alternative contigs. Fragment per gene counting was performed using HTSeq-count v0.6.1p1 with setting "--stranded reverse". The gene annotation used for quantification were UCSC RefSeq genes for GRCh38 downloaded on 2015-07-13.	Illumina HiSeq 2500	36
EGAD00010001461	Illumina 450K	450K	472
EGAD00010001463	Genotype cases using Illumina HumanOmni5	Illumina HumanOmni5	279
EGAD00010001470	Himalayan population genetic study raw data (Himalaya)	Illumina HumanOmniExpress-12-v1-0	170
EGAD00010001471	Himalayan population genetic study QC filtered data	Illumina HumanOmni1-Quad_v1-0, HumanOmniExpress-12-v1-0, humanomniexpress-24-v1-1, HumanOmni25-8v1-2_A1	738
EGAD00010001472	Himalayan population genetic study raw data (Himalaya)	Illumina HumanOmni1-Quad_v1-0	565
EGAD00010001473	Himalayan population genetic study raw data (Tibet)	Illumina humanomniexpress-24-v1-1	148
EGAD00010001479	SNP data for 991 Irish individuals	Illumina	991
EGAD00010001481	CONTROL SAMPLES USING QuantStudio 12K Flex Real-Time PCR System (Thermo Fisher Scientific, Waltham, MA, USA)	OpenArray	258
EGAD00010001482	CASE SAMPLES USING QuantStudio 12K Flex Real-Time PCR System (Thermo Fisher Scientific, Waltham, MA, USA)	OpenArray	657
EGAD00010001484	Genetic Overlap between Metabolic and Psychiatric disease	Illumina HumanCoreExome-12v1-0	2611
EGAD00010001486	290 controls	Illumina HumanOmniExpress BeadChip	290
EGAD00010001487	252 dengue fever patients and 159 dengue shock syndrome patients	Illumina Human 660W Quad BeadChip	411
EGAD00010001489	Genotype data for 5,699,237 genotyped and imputed SNPs in the 816 healthy donors of the Milieu Intérieur cohort		816
EGAD00010001491	ADP array data, comprised of 2217 samples of Asian ancestry (excluding the Japanese population from ADP). Samples were genotyped on different Illumina or Affy platforms.	Affymetrix/Illumina	3933
EGAD00010001495	Intensity files for Immunochip genotypes from blood	Illumina Immunochip	314
EGAD00010001499	EXOME ARRAY ANALYSIS OF ADVERSE REACTIONS TO FLUOROPYRIMIDINE-BASED THERAPY FOR GASTROINTESTINAL CANCER	Illumina HumanExome Array	504
EGAD00010001500	miRNA profiling of human plucked hair follicle from frontal and occipital scalp	Affymetrix miRNA 4.0 Array	48
EGAD00010001501	mRNA profiling of human plucked hair follicle from frontal and occipital scalp	Illumina HT12	48
EGAD00010001506	Methylation array dataset	Illumina 450k	38
EGAD00010001509	A WTCCC2 project - replication study for bacteraemia susceptibility in 2518 individuals from Kenya, genotyped on the Illumina Immunochip chip.	Illumina Infinium ImmunoChip	2518
EGAD00010001511	SNP 6.0 arrays of LCNEC samples	Affymetrix SNP 6.0	54
EGAD00010001513	Copy Number Variation as determined on Illumina Omni Arrays	Illumina Beadchip	122
EGAD00010001515	Nanostring PanCancer immune profiling data for The interface of malignant and immunologic clonal dynamics in high-grade serous ovarian cancer	Nanostring	120
EGAD00010001519	Raw Array data from the PRAD-CA for ICGC DCC Release27	Affymetrix OncoScan FFPE Express	110
EGAD00010001521	Bisulfite Converted DNA obtained from Whole Blood analysed on IlluminaHumanMethylationEPIC BeadChip microarrays processed with bigmelon R package	IlluminaHumanMethylationEPIC	1175
EGAD00010001526	DNA for 2482 individuals from Chongqing was extracted from peripheral blood and genotyped by Illumina Omni Zhonghua-8 version 2 gene chips.	Illumina	2482
EGAD00010001527	DNA for 1546 individuals from Chongqing was extracted from peripheral blood and genotyped by Illumina Omni Zhonghua-8 version 1 gene chips.	Illumina	1546
EGAD00010001528	DNA for 2979 individuals from Guangzhou was extracted from peripheral blood and genotyped by Sequenom, with digit-number working memory, visuospatial working memory, recent long-term memory measured.	Sequenom	2979
EGAD00010001533	A cohort of 2886 participants of the Japan PBC-GWAS Study	Affymetrix Axiom Genome-Wide ASI 1 Array	2886
EGAD00010001535	mRNA expression profile of kidney cancer	nanostring	126
EGAD00010001536	kidney cancer tissue sample	Illumina CytoSNP 12 bead array	129
EGAD00010001538	502 genotypes obtained from Illumina DNA-arrays. Available as plink formatted files		502
EGAD00010001540	Oncoscan CHP files for the Mesothelioma Project	Illumina Oncoscan Array	100
EGAD00010001542	Expression data for 42 PMBCL patient samples (32 IL4R WT cases and 10 cases with mutations in IL4R)	Illumina DASL Assay	42
EGAD00010001544	Imputed genetic data for INTERVAL proteomics cohort	Affymetrix Axiom UK Biobank + imputation to 1000GP3 and UK10K	3301
EGAD00010001546	ATRT expression	Illumina HumanHT-12 v4.0 Array	43
EGAD00010001551	The Kibbutzim Family Study (KFS) aimed to investigate the environmental and genetic determinants of cardiometabolic traits (phenotype is LDL-C)	Illumina HumanCoreExome BeadChip array	901
EGAD00010001557	503 genotypes obtained from Illumina DNA-arrays. Available as plink formatted files	Illumina arrays	503
EGAD00010001561	Quantile-normalised and batch corrected	Illumina HT12.4	703
EGAD00010001562	WG mRNA profiling in FFPE primary melanoma	Illumina HT12.4	703
EGAD00010001564	Primary renal cell carcinoma (RCC) by Affymetrix GeneChip miRNA 4.0	Affymetrix GeneChip miRNA 4.0	56
EGAD00010001566	Allelic imbalance data for cell lines derived from RPE1 with TP53 knockout	humanomniexpress-24-v1-1-a	2
EGAD00010001569	Summary statistics from Stage-1 GWAS for blood pressure phenotypes		5
EGAD00010001571	Genomic Landscape of Chordoid Glioma	Illumina HumanCoreExome-24 array	9
EGAD00010001573	Variations on the Y chromosome from 44 samples		44
EGAD00010001574	Alignment including 83 MT AA sequences and 2 reference sequences, rCRS and RSRS		83
EGAD00010001575	This dataset contains the per-chromosome RFMix input and output files for the local ancestry inference of 59 Aboriginal Australian genomes as reported in Malaspinas et al., 2016. Local ancestry was inferred assuming four mixing ancestral populations represented by: Europeans (27 individuals), Asians (29 individuals), Papuans (13 individuals) and Native Australians (7 individuals from the WCD region).		66
EGAD00010001577	RNA from the same tumor sample (n=98) was also processed using the 3' IVT kit (Affymetrix) and hybridized to U133 Plus 2.0 arrays (Affymetrix).	Affymetrix GeneChip Scanner 3000 7g	98
EGAD00010001579	This dataset contains files generated from Affymetrix Oncoscan Arrays. For each sample there are two paired cel files containing the raw data from AT and GT channels. Raw data has been transformed to OSCHP signal files also within this dataset.	Oncoscan Array	157
EGAD00010001581	Copy Number Alterations arrays from 21 patients and 24 samples performed by Affymetrix 6.0, Agilent 1M and Oncoscan CNV platforms	Affymetrix 6.0; Agilent 1M; Oncoscan CNV	24
EGAD00010001582	Gene Expression Profiling from 21 cases: 14 CCND1-negative Mantle Cell Lymphoma and 7 CCND1-positive Mantle Cell Lymphoma	Genechip Human Genome U133 Plus 2.0 array	21
EGAD00010001584	The CentralAfricanCMC_Pemberton dataset encompasses 153,798 SNPs from the Illumina Cardio-MetaboChip (Voight et al. 2012) genotyped in 406 individuals from 19 Central African populations (from Gabon, Cameroon, Central African Republic and Uganda). Phenotypic and cultural information at the individual level for this data set encompasses gender, lifestyle (hunter-gatherer or farmer), and, when available, stature phenotype (standing height in cm, sitting height in cm, and weight in kg). Other cultural, linguistic, and geographical location information about the sampled populations can be found in Pemberton et al., Human Genetics, 2018 (https://doi.org/10.1007/s00439-018-1902-3). This dataset can only be accessed and used for non-commercial research purposes with a finality complying with the informed consent provided by Central African donors for the study of human evolutionary history only.	Illumina Cardio-MetaboChip	406
EGAD00010001586	This data set contains an .Rdata file for all the processed segmentation profiles from 81 lpWGS samples included used in downstream analyses from Github repository Evo_history_CACRC. Lastly, there is an .Rdata object with 50 segmentations for 50 total samples from 25 sporadic SNP adenomas used in the comparison with colitis samples.	Low Pass Whole Genome Sequencing (LP-WGS)	131
EGAD00010001587	This dataset contains 30 idat files each from 15 SNP array runs on patient colitis-associated colorectal cancer tumours. All phenotypes are cancer. See Baker et al. 2018 Supplementary Table 2 for patient details of 12 tumours used in the analyses in the publication.	SNP array	15
EGAD00010001589	Primary renal cell carcinoma (RCC) and RCC metastases by Affymetrix GeneChip HTA 2.0	Affymetrix Human Transcriptome Array 2.0	112
EGAD00010001591	SNPtest association statistics from case-control analysis (includes imputed SNPs) namely : rsID, Chromosome, Position, Beta, SE.	Illumina_OncoArray-500K Bead Array	8169
EGAD00010001593	This dataset includes raw data (.idat) for the Illumina Human450k beadchip and methylation levels (.txt files). Methylation levels were processed by normalization and background subtraction. We removed probes with at least one of the following characteristics: (1) weak signal (p > 0.01) (2128 CpG sites), (2) SNP-enriched sites (4100 sites), (3) out of a CpG context (not on a CG) (3149 sites), or (4) located on sex chromosomes (11,129 sites). A total of 465,071 CpG sites were analyzed initially. Signal was then normalized, first by scaling to the internal controls using the methylumi R package, then by applying the method of subset-quantile within array normalization (SWAN) implemented in the minfi R package.	Illumina 450K	167
EGAD00010001594		Illumina 450K	24
EGAD00010001596	DNA methylation data from patient RMS tumor samples from Illumina 450 K arrays	Illumina EPIC 450 K	32
EGAD00010001598	Batch 1 of unfiltered genotype data for DDD Study patients (N=2,997), some of which were used in the neurodevelopmental disorder discovery GWAS (Niemi et al., Nature 2018). Samples were genotyped on the Illumina HumanCoreExome BeadChip. QC'd data is available in release EGAD00010001604	Illumina HumanCoreExome-24v1-0	3000
EGAD00010001600	Batch 2 of unfiltered genotype data for DDD Study patients (N=8,286), some of which were used in the neurodevelopmental disorder discovery GWAS (Niemi et al., Nature 2018). Samples were genotyped on the Illumina InfiniumCoreExome Beadchip. QC'd data is available in release EGAD00010001604	Illumina HumanCoreExome-24v1-1	8207
EGAD00010001602	Unfiltered genotype data for DDD Study trios (patient and parents) (N=2,166 samples), some of which were used for replication of neurodevelopmental disorder polygenic risk (Niemi et al., Nature 2018). Samples were genotyped on the Illumina HumanOmniExpress BeadChip	Illumina SangerDDD_OmniExPlusv1_15019773	3822
EGAD00010001604	Post-QC (pre-imputation) genotype data for N=6,983 DDD probands included in the neurodevelopmental disorder discovery GWAS (Niemi et al., Nature 2018). Consists of filtered set of samples and variants from EGAD00010001598 and EGAD00010001600. Includes patient HPO phenotype terms and GWAS summary statistics (including imputed variants). Samples were genotyped on the Illumina HumanCoreExome BeadChip and Illumina InfiniumCoreExome Beadchip	Illumina HumanCoreExome-24v1	6987
EGAD00010001606	Post-QC (pre-imputation) genotype data for N=2,166, a subset of trios described in EGAD00010001602. These data form N=722 complete trios in which the proband has a neurodevelopmental phenotype (Niemi et al. Nature 2018). Includes HPO phenotype terms for patients. Samples were genotyped on the Illumina HumanOmniExpress BeadChip	Illumina SangerDDD_OmniExPlusv1_15019773 MiSeq	2225
EGAD00010001608	The T cell Receptor Sequencing dataset contains 84 files related to T cell receptor sequences obtained using ImmunoSeq by Adaptive Biotechnologies and phenotype metadata from 23 patients enrolled on a phase II clinical trial of neoadjuvant immune checkpoint blockade in high-risk resectable melanoma at MD Anderson Cancer Center (NCT02519322). Included are data on baseline and on-treatment samples from tumor and blood.	MiSeq	59
EGAD00010001610	DNA methylation of NF1-glioma	Illumina 850K Epic Array	31
EGAD00010001612	MAGEcontrol samples using omni 2.5M	Genotype	737
EGAD00010001618	Genome-wide DNA methylation profiles of MZ twins clinically discordant for MS generated using Illumina's Infinium MethylationEPIC BeadChip assay (EPIC array)	Illumina Infinium MethylationEPIC BeadChip assay	90
EGAD00010001620	Single cell RNA-seq analysis of human skin.	single cell RNA-seq	12
EGAD00010001622	Human Core Exome Genotyping for 1471 samples from the STudy Into Lean and Thin Subjects (STILTS) cohort	Illumina humancoreexome-12v1-1_a	1471
EGAD00010001623	Human Core Exome Genotyping for 1456 severe early onset obesity cases (SCOOP)	Illumina humancoreexome-12v1-1_a	1456
EGAD00010001624	Genetics of thinness compared to obesity - summary statistics	Illumina humancoreexome-12v1-1_a	2927
EGAD00010001626	mpMRI visible prostate tumour samples (PI-RADSv2 5)	OncoScan	20
EGAD00010001627	mpMRI invisible prostate tumour samples	OncoScan	20
EGAD00010001629	Methylation of anaplastic meningioma samples	Illumina Infinium HumanMethylationEPIC BeadChip array	26
EGAD00010001631	SNP array data of matched cancer-PNE	GeneChip Human Mapping 250K NspI	124
EGAD00010001633	Genotyping data for 32 individuals from a family affected by HPAH	Illumina Infinium CoreExome-24 BeadChip v1.1	32
EGAD00010001635	Over 1.87 million SNP and CNV loci are screened by Affymetrix SNP 6.0 array		415
EGAD00010001636	Over 2.5 million SNP and CNV loci are screened by Illumina Infinium Omni2.5Exome-8 Kit		196
EGAD00010001637	Over 1.87 million SNP and CNV loci are screened by Affymetrix SNP 6.0 array		539
EGAD00010001638	Over 2.5 million SNP and CNV loci are screened by Illumina Infinium Omni2.5Exome-8 Kit		262
EGAD00010001640	The individuals were genotyped for the Illumina Human Omni Express Bead Chip (OmniExpress), containing 741,000 SNPs. 22 samples were excluded with more than 10% missing genotypes		478
EGAD00010001642		Illumina EPIC methylation bead array	25
EGAD00010001643		Illumina 450k methylation bead array	73
EGAD00010001645	Genotype of PTPN22 SNPs in LOTx donors and recipients		290
EGAD00010001647	Genotype data from the Affymetrix 6.0 platform for 4,375 Colombian pre-eclampsia cases and controls	Affymetrix 6.0	4375
EGAD00010001649	This set features unfiltered, aligned, UMI-based single cell RNA sequencing count data for 5290 Blood, intraepithelial ileum (IEL) and lamina propria ileum (LPL) T cells from Crohn's disease patients as published in Uniken Venema et al, Gastroenterology 2019	Smartseq2_adapted	3
EGAD00010001651	The individuals were assayed for genome-wide SNP genotypes using the Illumina Human Omni5 Bead Chip (Illumina), which surveys 4,284,426 single nucleotide markers regularly spaced across the genome	Illumina Human Omni5 Bead Chip (4,284,426 SNPs)	3
EGAD00010001653	Binary Plink files for post-GWAS quality control in 7409 samples genotyped using Axiom 815K Spanish Biobank array (Thermo Fisher)	Axiom 815K Spanish Biobank array	7409
EGAD00010001654	Binary Plink files for pre-GWAS quality control in 7409 samples genotyped using Axiom 815K Spanish Biobank array (Thermo Fisher)	Axiom 815K Spanish Biobank array	7409
EGAD00010001655	Intensity calculation on the pixel values of the DAT file for 7409 samples genotyped using Axiom 815K Spanish Biobank array (Thermo Fisher)	Axiom 815K Spanish Biobank array	7409
EGAD00010001657	Meta-analysis association statistics from case-control analysis (includes imputed SNPs)	NA	30657
EGAD00010001659	Transcriptomics analysis results	Illumina HumanHT-12 v4	36
EGAD00010001660	Methylomics analysis results formatted as a beta matrix	Illumina HumanMethylation-450k	36
EGAD00010001664	4988 samples issued from GCAT cohort, genotyped with MEGAex-Infinium Array, with data for Cr1-22. Plink files with QC and imputed (SHAPEIT+IMPUTE).	Illumina-Genotyping Array	4988
EGAD00010001665	4988 samples issued from GCAT cohort, genotyped with MEGAex-Infinium Array, with data for Cr1-22. Plink files with QC but not imputed.	Illumina-Genotyping Array	4988
EGAD00010001667	33 ETMR samples were genotyped using Illumina HumanOmni2.5M array. Hybridization and scanning was done according to the manufacturer's instructions (Illumina). Copy number (Log R) and B allele frequency estimates were obtained using the Genotyping module (v1.9.4) in GenomeStudio v2011.1 (Illumina). Normalized and log2-transformed copy number measurements were imported from GenomeStudio and analysed using R package CopyNumber to identify segments with similar copy number.	Illumina Omni2.5	33
EGAD00010001669	77 ETMR samples were profiled using methylation array. DNA from frozen tissue and formalin-fixed, paraffin-embedded (FFPE) materials were analyzed with the Illumina Infinium HumanMethylation450 (450k) and MethylationEPIC (EPIC) array according to manufacturer's instructions and with a modified method that was previously described (Torchia et al. Cancer Cell. 2016; Triche et al. Nucleic Acids Res. 2013).	Illumina Infinium HumanMethylation450K	77
EGAD00010001671	Raw sequencing reads from H3K27ac ChIP and input DNA from lymphoblastoid cells of three TET2 mutation carriers and two wild-type family members were quality and adapter trimmed with cutadapt version 1.16 in Trim Galore version 0.3.7 using default parameters. Trimmed reads were aligned to hs37d5 reference genome using Bowtie2 (version 2.1.0). Duplicate reads were removed with samtools rmdup (v1.7). Fragment coverage of paired-end reads was calculated from bam files with BEDtools genomecov (v2.26.0).	Illumina HiSeq 2500	5
EGAD00010001673	Genotype array data from normal tissue	Illumina HumanOmni2.5-8 BeadChip	21
EGAD00010001674	Methylation data from tumor tissue	Agilent SureSelectXT Human Methyl. Seq	21
EGAD00010001675	Genotype array data from tumor tissue	Illumina HumanOmni2.5-8 BeadChip	21
EGAD00010001676	Expression array data from tumor tissue	Thermo Fisher Scientific GeneChip Human Transcriptome Array 2.0	21
EGAD00010001678	Genotype data for 140 present-day individuals from five populations in Pakistan in The first horse herders and the impact of early Bronze Age steppe expansions into Asia DOI: 10.1126/science.aar7711. Sampling details are presented in supplementary section S2.1 Data generation	Infinium OmniExpressExome-8 v.1.3 BeadChip	140
EGAD00010001680	Illumina Infinium Human 450k methylation arrays	Illumina	17
EGAD00010001681	Affymetrix PrimeView Human Gene Expression	Affymetrix	56
EGAD00010001683	Case_control_meta_analysis	Array	1
EGAD00010001685	EXCEED samples imputed to HRC reference panel using Michigan Imputation server	Axiom UK Biobank array	5216
EGAD00010001687	Meningioma Methylation	Illumina	280
EGAD00010001689	Tumor biopsies from LAM disease were retrospectively analyzed by multiple techniques to characterize the alterations in patients, to elucidate the landscape of genetic/genomic alterations.	Affymetrix OncoScan	24
EGAD00010001691	4 AC samples, each with adenoma, carcinoma and normal colon tissue (12 samples in total) were analysed on the Infinium MethylationEPIC BeadChip for copy number alteration analyses.	Beadarray	12
EGAD00010001695	Islet_HumanMethylation450K_ThurnerEtAl	HumanMethylation450K	41
EGAD00010001699	EXCEED genotyping	Axiom UK Biobank array	5216
EGAD00010001701	Total RNA (100ng) from 21 ETMRs with C19MC structural alterations and 28 other PBTs was prepared with nCounter miRNA Sample Prep Kit according to standard protocol. miRNA expression profiling was conducted with human v1, v2, or v3 miRNA panel on nCounter miRNA expression platform (NanoString Technologies, Seattle, WA) according to manufacturer's protocol. Signal normalization was done using nSolver Analysis and batch corrected using ComBat (Johnson et al. Biostatistics. 2007). 565 miRNAs overlapped between all three versions and were used for further analyses. Fold change and supervised t-test with FDR correction were calculated between the ETMRs and other PBTs.	Nanostring	49
EGAD00010001703	RNAseq reads were aligned with STAR 2.5.3a and gene expression was quantified with RSEM 1.3.0		144
EGAD00010001705	ALL SAMPLES USING ClariomD microarray (Affymetrix)	Affymetrix ClariomD	54
EGAD00010001707	Table of gene-level RNA counts from 21 newborn screening dried blood spot (DBS) samples. These DBS samples were obtained from extremely low gestational age newborns, where 10 of them were affected by a fetal inflammatory response (FIR) before birth, and 11 were unaffected. Total RNA was sequenced using an Illumina NextSeq-500 instrument. The sample preparation protocol included the depletion of rRNA and globin mRNA using the Globin Zero Gold rRNA Removal Kit from Illumina. Libraries were prepared using the NEBNext Ultra II Directional RNA Library Prep Kit (New England Biolabs). Rows correspond to genes and columns to samples, where there is an additional column (BS13sub), corresponding to sample BS13, which was downsampled to 1/4 of its original depth.	Illumina NextSeq-500	21
EGAD00010001709	Gene expression for 303 ADME and ADME related genes (averaged log2 signal intensities using Human-WG6v2 Expression BeadChip)	Illumina Human-WG6v2 Expression BeadChip	150
EGAD00010001711	Illumina Infinium MethylationEPIC BeadChip kit (Illumina, Inc., San Diego, CA). Standard Illumina procedures using Illumina iScan scanner.	Illumina Infinium MethylationEPIC BeadChip	120
EGAD00010001713	Illumina Infinium Omni2.5 Genome-Wide Genotyping Array	Illumina Infinium Omni2.5 BeadChip	48
EGAD00010001715	Data on Affymetrix 6.0 arrays for Genome-Wide Association Study of colorectal cancer in the Spanish population. Additionally, geographical origin for each sample is provided, which constitutes the largest to-date Spanish genomic sample population	Affymetrix 6.0 array	1299
EGAD00010001717	This is the Affymetrix gene expression data of the metastatic tumours related to this study.	Human Clariom D Arrays	11
EGAD00010001719	RNA-sequencing data	Paired RNA-sequencing data	20
EGAD00010001720	Data from Infinium EPIC 850K DNA methylation beadchip	Infinium EPIC DNA methylation beadchip	77
EGAD00010001722	GWAS results in epacts format for Danjou et al, Nature Genetics 2015	Illumina arrays	6305
EGAD00010001724	598764 SNPs genotyped for 719 individuals, merged from Illumina Omni1 and Illumina Omni2.5	Illumina Omni1 and Illumina Omni2.5	719
EGAD00010001726	Blastic plasmacytoid dendritic cell neoplasm (BPDCN) is a rare hematologic malignancy that is most similar in expression profiles to plasmacytoid dendritic cells. However, patients often exhibit features of AML and can progress to AML. In this project, we will determine the differentially and commonly expressed genes between BPDCN and AML specimens. Available BPDCN and TET2-mutated AML specimens were taken for transcriptome microarray analysis.	ThermoFisher Scientific ClariomTM D Pico Assay	7
EGAD00010001727	Blastic plasmacytoid dendritic cell neoplasm (BPDCN) is a rare hematologic malignancy that is most similar in expression profiles to plasmacytoid dendritic cells. However, patients often exhibit features of AML and can progress to AML. In this project, we will determine the differentially and commonly expressed genes between BPDCN and AML specimens. Available BPDCN and TET2-mutated AML specimens were taken for transcriptome microarray analysis.	ThermoFisher Scientific ClariomTM D Pico Assay	6
EGAD00010001729	2619 individuals with visual contour perception phenotype scores, in the form of averaged accuracy	NA	2619
EGAD00010001731	DNA methylation in bronchial biopsies of asthmatics, asthma in remission and healthy subjects	Infinium HumanMethylation450 BeadChip array (450k array)	179
EGAD00010001733	EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Vietnam	Illumina Omni 2.5M	1728
EGAD00010001734	EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Ghana	Illumina Omni 2.5M	782
EGAD00010001735	EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:PNG	Illumina Omni 2.5M	815
EGAD00010001736	EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Nigeria	Illumina Omni 2.5M	419
EGAD00010001737	EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Mali	Illumina Omni 2.5M	900
EGAD00010001738	EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Gambia	Illumina Omni 2.5M	5594
EGAD00010001739	EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:BurkinaFaso	Illumina Omni 2.5M	1446
EGAD00010001740	EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Cameroon	Illumina Omni 2.5M	1471
EGAD00010001741	EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Malawi	Illumina Omni 2.5M	3088
EGAD00010001742	EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Kenya	Illumina Omni 2.5M	3865
EGAD00010001743	EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations:Tanzania	Illumina Omni 2.5M	979
EGAD00010001746	Functional genomics approaches to understand osteoarthritis	Illumina HumanCoreExome-24v1-1	77
EGAD00010001748	EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations: phased genotypes	Illumina Omni 2.5M	17960
EGAD00010001755	Skin tumour	Illumina EPIC	7
EGAD00010001757	Tumor biopsies from LAM disease were analyzed by MLPA to characterize the alterations in patients, to elucidate the landscape of genetic/genomic alterations. The dataset includes 44 samples.	3730xl	44
EGAD00010001761	Tumor biopsies from LAM disease were analyzed by Sanger sequencing to characterize the alterations in patients, to elucidate the landscape of genetic/genomic alterations. The dataset includes 21 samples.	Sanger (3730 XL)	21
EGAD00010001763	Association results from Polish cohort	Immunochip	1062
EGAD00010001764	Association results from the Dutch cohort	Immunochip	3378
EGAD00010001765	Association results from Spanish cohort	Immunochip	2325
EGAD00010001766	Association results from the Argentinian cohort two	Immunochip	465
EGAD00010001767	Association results from British cohort	Immunochip	16002
EGAD00010001768	Association results from the Argentinian cohort one	Immunochip	741
EGAD00010001769	Association results from the Irish cohort	Immunochip	848
EGAD00010001770	Results of the celiac disease meta-analysis	Immunochip	27786
EGAD00010001771	Association results from the Italian cohort	Immunochip	2965
EGAD00010001775	Genotypes of Russian people from Ustuyzhna (Vologda Oblast, Russia)	Infinium OmniExpress-24v1-2_A1, iScan+ (Illumina)	46
EGAD00010001776	Genotypes of Nenets people from Yamalo-Nenets Autonomous Okrug (Russia)	Infinium OmniExpress-24v1-2_A1, iScan+ (Illumina)	41
EGAD00010001783	Western Mediterranean Illumina Infinium Omni 2.5 array data	Illumina Infinium Omni2.5M	142
EGAD00010001795	Array data for oesophageal and related samples – kno_paper_methyl_release	Illumina	78
EGAD00010001797	Methylation microarray profiling (Illumina Human Methylation 450k and EPIC platforms) of 60 adult glioblastomas. Tumours were subtyped using the approach from Sturm et al. (https://doi.org/10.1016/j.ccr.2012.08.024): 12 IDH, 18 MES, 12 RTK I, 18 RTK II. DNA was prepared, assayed on the microarrays, and raw data computationally processed as described in Capper et al., "DNA methylation-based classification of central nervous system tumours": https://www.nature.com/articles/nature26000		60
EGAD00010001799	EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations: Sequenom MassArray genotypes	Sequenom MassArray (Agena Bioscience)	40256
EGAD00010001801	Sample genotyped with Axiom InCor BB (Affymetrix) with local ancestry masking of non-Native American ancestry	Axiom InCor BB (Affymetrix)	59
EGAD00010001802	Sample genotyped with Axiom InCor BB (Affymetrix)	Axiom InCor BB (Affymetrix)	83
EGAD00010001803	Sample genotyped with Axiom Human Origins (Affymetrix)	Axiom Human Origins (Affymetrix)	12
EGAD00010001805	CASE AND CONTROL SAMPLES USING Infinium MethylationEPIC	Infinium MethylationEPIC	24
EGAD00010001807	Genotyping of Y chromosome in Polish population	iScan, Illumina	2705
EGAD00010001811	h5 files from 15 single cell PDAC samples described in "Transcription phenotypes of pancreatic cancer are driven by genomic events during tumour evolution"		15
EGAD00010001813	Over 1.87 million SNP and CNV loci are screened by Affymetrix SNP 6.0 array	Affymetrix SNP 6.0 array	91
EGAD00010001814	Over 1.87 million SNP and CNV loci are screened by Affymetrix SNP 6.0 array	Affymetrix SNP 6.0 array	195
EGAD00010001816	The Jerusalem Perinatal Study (JPS) aimed to examine the developmental origins of cardiometabolic risk.	Affymetrix Biobank array	2714
EGAD00010001818	EGAS00001001311: Genome-wide study of resistance to severe malaria in eleven worldwide populations: Gambian trio HLA typing	Sanger Sequencing	96
EGAD00010001822	Array data for oesophageal and related samples – sj_paper_methyl_tumour_release	Illumina	285
EGAD00010001825	Expression measurements in NK cells	Affymetrix Human Gene_1.0ST array	140
EGAD00010001826	Expression measurements in CD4 cells	Affymetrix Human Gene_1.0ST array	123
EGAD00010001827	Expression measurements in monocytes	Affymetrix Human Gene_1.0ST array	131
EGAD00010001828	Expression measurements in B cells	Affymetrix Human Gene_1.0ST array	124
EGAD00010001829	Expression measurements in CD8 cells	Affymetrix Human Gene_1.0ST array	146
EGAD00010001830	Illumina Immunochip genotypes	Illumina Immunochip	170
EGAD00010001834	Array data for oesophageal and related samples – sj_paper_methyl_normal_release	Illumina	100
EGAD00010001838	Array data for oesophageal and related samples – sj_paper_methyl_barretts_release	Illumina	150
EGAD00010001841	Copy Number Alterations arrays from 153 samples performed by Affymetrix 6.0 and Oncoscan CNV platforms	Affymetrix 6.0; Cytoscan	153
EGAD00010001842	Gene Expression Profiling from 44 Mantle Cell Lymphoma cases	Human Genome U219 array plate	44
EGAD00010001844	mtDNA variant positions vcf files for 86 human samples	HiSeq X Ten	86
EGAD00010001846	Whole-genome DNA methylation profiling of CD14+ monocytes obtained from CD-active, CD-remissive and non-CD individuals	Illumina Infinium HumanMethylation 450k BeadChip	25
EGAD00010001848	Genotype data obtained using the coreExome Illumina SNP chip array for all the individuals included in the study of gene expression regulation in human primary regulatory CD4+ T cells (Tregs)	coreExome Illumina SNP chip array	120
EGAD00010001850	Epigenome wide DNA methylation assay of OSCC-GB using Illumina methylation array	Illumina Infinium 450K BeadChip Array; Illumina Infinium EPIC Beadchip Array	174
EGAD00010001852	NanoString raw data for a neoadjuvant combination PD-L1 plus CTLA-4 blockade trial on patients with cisplatin-ineligible operable urothelial carcinoma. All samples were FFPE tumor samples. Raw probe count data (.RCC files) were generated from nCounter Digital Analyzer (4.0.0.3).	Nanostring expression immunology panel	34
EGAD00010001854	Genome-wide DNA methylation profiling of genomic DNA isolated from blood PBMCs of breast cancer patients before the start of therapy.	Infinium MethylationEPIC BeadChip array	8
EGAD00010001857	Circulating tumor cells for comprehensive and multiregional non-invasive genetic characterization of multiple myeloma (arrays set)	Affymetrix Cytoscan HD	71
EGAD00010001859	Epigenome of brainstem gliomas	Illumina Methylation Array	123
EGAD00010001861	Genome-wide DNA methylation profiling of Waldenstrom's macroglobulinemia (WM) patient samples	Infinium MethylationEPIC Kit (Illumina)	48
EGAD00010001863	DNA Copy Number. Milan samples.	Affymetrix SNP 6.0	27
EGAD00010001864	Gene expression. Milan samples.	Agilent Sureprint G3 Human Gene Expression 8x60K microarrays (G4851A)	33
EGAD00010001865	DNA Methylation. Milan samples.	Illumina Infinium HumanMethylation450K	32
EGAD00010001867	Yemen and Chad Genotyping	Unknown HumanOmni25-8v1-2	1
EGAD00010001868	Yemen and Chad Genotyping	Unknown HumanOmni25-8v1-2_A1	258
EGAD00010001870	Lebanon_Genotyping	Unknown HumanOmni25M-8v1-1	126
EGAD00010001872	EPIC methylation arrays on PT1-derived PDXs	EPIC arrays	32
EGAD00010001874	Patients with T1DM genotyped on Illumina HiScan using Illumina Infinium OmniExpress Exome-8 v1.4 arrays	Infinium HD Super Microarray	576
EGAD00010001877	DNA methylation profiling from 70 Mantle Cell Lymphoma cases	Infinium MethylationEPIC BeadChip	70
EGAD00010001879	Control human dermal fibroblasts from patient forearm	Illumina 450k	12
EGAD00010001880	Case human dermal fibroblasts from patient forearm	Illumina 450k	12
EGAD00010001886	This dataset contains PLINK processed (PED and MAP) genotype data, from 1000 samples from the UAE using the Illumina Omni5 Exome Bead Chip	Illumina	1000
EGAD00010001888	Illumina 450K DNA methylation profiles of 314 fresh-frozen colorectal mucosa, adenoma or adenocarcinoma samples.	Illumina 450k	314
EGAD00010001895	Multi-omics profiling of paired primary and recurrent glioblastoma patient tissues	AB GeneChip Scanner 3000 7G System Clariom S	22
EGAD00010001901	This dataset contains CEL files and RMA-normalized expression values for microarray of stage I lung adenocarcinomas from Asian patients. In total, there are 69 patients and 138 samples, including 69 tumor samples and 69 adjacent normal samples.	Affy miRNA 3.0 array	138
EGAD00010001902	This dataset contains PAIR files and processed somatic copy number alteration values for array CGH of stage I lung adenocarcinomas from Asian patients. In total, there are 111 patients and 222 samples, including 111 tumor samples and 111 adjacent normal samples.	NimbleGen HG18 CGH 385K	222
EGAD00010001905	234 samples genotyped at 15 loci by LGC Genomics, Hoddesdon, UK using the PCR-based KASP assay (Semagn et al. (2014). Single nucleotide polymorphism genotyping using Kompetitive Allele Specific PCR (KASP): overview of the technology and its application in crop improvement. Mol Breeding 33, 1–14.)	PCR-based KASP assay	233
EGAD00010001906	234 samples genotyped at 40 loci using the MassArray iPLEX genotyping assay with the iPLEX Gold genotyping kit (Agena Biosciences, cat. 10148-2) (Gabriel, S., Ziaugra, L. and Tabbaa, D. (2009), SNP Genotyping Using the Sequenom MassARRAY iPLEX Platform. Current Protocols in Human Genetics, 60: 2.12.1-2.12.18. doi:10.1002/0471142905.hg0212s60). Products were detected on a MassArray mass spectrophotometer and data were acquired in real time with MassArray RT software 4.0.0.2 (Agena Biosciences). SNP clustering and validation was carried out with Typer 4.0.26.75 software (Agena Biosciences).	MassArrayiPLEX genotyping assay using the iPLEX Gold genotyping kit	233
EGAD00010001909	Microarray data for 230 DLBCL patients	Affymetrix Human Gene 2.0 ST Array	230
EGAD00010001911	Fresh frozen breast cancer H&E tissue images collected and annotated by the International Cancer Genome Consortium (ICGC), that included the BASIS collaboration. Associated with whole genome sequence data as originally described by Nik-Zainal et al, Nature, 2016 (DOI: 10.1038/nature17676) and deposited with ID EGAS00001001178	H and E image	151
EGAD00010001913	Raw data files for 94 Argentinean samples	Affymetrix Axiom LAT1	94
EGAD00010001917	GWAS genotype data for maternal and fetal (baby) cases of preeclampsia and controls from Uzbekistan. This dataset is a component of the InterPregGen FP7 project. DNA samples for this component were collected by InterPregGen Consortium members in Tashkent, Uzbekistan at the Institute of Immunology, Uzbek Academy of Sciences and at the Republic Specialized Scientific Practical Medical Centre of Obstetrics and Gynecology	Unknown HumanOmni25M-8v1-1	180
EGAD00010001918	GWAS genotype data for maternal and fetal (baby) cases of preeclampsia and controls from Uzbekistan. This dataset is a component of the InterPregGen FP7 project. DNA samples for this component were collected by InterPregGen Consortium members in Tashkent, Uzbekistan at the Institute of Immunology, Uzbek Academy of Sciences and at the Republic Specialized Scientific Practical Medical Centre of Obstetrics and Gynecology	Unknown HumanOmni2-5-8-v1-1-C	2658
EGAD00010001919	GWAS genotype data for maternal and fetal (baby) cases of preeclampsia and controls from Uzbekistan. This dataset is a component of the InterPregGen FP7 project. DNA samples for this component were collected by InterPregGen Consortium members in Tashkent, Uzbekistan at the Institute of Immunology, Uzbek Academy of Sciences and at the Republic Specialized Scientific Practical Medical Centre of Obstetrics and Gynecology	Unknown HumanOmni25-8v1-1	58
EGAD00010001921	EPIC array data from 72 tumor samples with muscle invasive bladder cancer.	EPIC BeadChip (Illumina, San Diego, CA)	72
EGAD00010001923	ChIP-seq narrowPeaks. Software: MACS2 v2.1.2	Illumina HiSeq 2500	23
EGAD00010001924	Gene expression gene-level count values from Stringtie processing including alignment to GRCh37 or GRCh38 genome version. Software: HISAT2 v2.1.0;StringTie v1.3.4d.	Illumina HiSeq 2500/NovaSeq 6000	438
EGAD00010001925	CpG methylation. Software: minimap2 v2.16;Nanopolish.	Oxford Nanopore	202
EGAD00010001926	Allelic imbalance (AI) region calls: start and end positions and the measured mBAF and LRR mean of each region after the BAF segmentation algorithm. Software: Illumina GenomeStudio;PennCNV v. 1.0.4;BAF segmentation v1.2.0.	Illumina Infinium HumanCore-24/HumanOmni2.5-8	2186
EGAD00010001927	Haplotype expression counts. Software: phASER v1.1.1.	Illumina HiSeq 2500/NovaSeq 6000	438
EGAD00010001928	ATAC-seq non-overlapping fixed width peaks with score normalization. Software: MACS2 v2.1.2	Illumina HiSeq 4000	31
EGAD00010001930	metastatic ccRCC	Affymetrix HTA 2.0	409
EGAD00010001932	Samples from Puno, Peru	Axiom LAT Array	61
EGAD00010001933	Samples from eastern Polynesia, Taiwan, and Vanuatu	Axiom LAT Array	354
EGAD00010001934	Samples from Magdalena de Cao, Peru	Illumina MEGA Array	20
EGAD00010001936	Gene expression of 12 colon cancer TSCs (sensitive or resistant) after 12h treatment with 3 µM NCT02 or DMSO (control) was analyzed using Illumina microarrays (HumanHT-12 v4 BeadChip).	oligonucleotide beads of HumanHT-12 V4 R2 Expression BeadChips (Illumina)	24
EGAD00010001938	This dataset includes IDAT files from 2,790 blood samples. The samples were profiled using the Illumina Infinium HumanMethylation450 (450k) BeadChip.	Illumina Infinium HumanMethylation450	2790
EGAD00010001940	DNA methylation measures on neutrophils	Illumina MethylationEPIC BeadChip	31
EGAD00010001941	DNA methylation measures on CD34 cells	Illumina MethylationEPIC BeadChip	4
EGAD00010001943	SNP data from 49 paired samples (tumor/germline) with muscle invasive bladder cancer.	Illumina Infinium Human Global Screening Array GSAMD-24v2-0_20024620_a1 BeadChip	98
EGAD00010001945	Preeclampsia (PE) is a syndrome affecting pregnant mothers and fetus/babies characterised by hypertension and proteinuria, and is a leading cause of maternal and fetal death and of premature births worldwide. The InterPregGen Consortium was funded by a European Framework 7 (FP7) grant and grew out of the WTCCC3 GWAS comparing ~2000 UK PE mothers with ~6000 common UK controls. This dataset includes Illumina 2.5-8 genotyping of maternal and fetal PE cases and controls from Kazakhstan. This study is one component of the InterPregGen FP7 project. DNA samples for this component were collected by InterPregGen Consortium collaborators at the Scientific Center of Obstetrics, Gynecology and Perinatology, Almaty, Kazakhstan (Gulnara Svyatova, Principal Investigator).	Illumina 2.5-8	3004
EGAD00010001947	Preeclampsia (PE) is a syndrome affecting pregnant mothers and fetus/babies characterised by hypertension and proteinuria, and is a leading cause of maternal and fetal death and of premature births worldwide. The InterPregGen Consortium was funded by a European Framework 7 (FP7) grant and grew out of the WTCCC3 GWAS comparing ~2000 UK PE mothers with ~6000 common UK controls. This dataset includes OmniExpress genotyping of maternal and fetal PE cases and controls from Kazakhstan. This study is one component of the InterPregGen FP7 project. DNA samples for this component were collected by InterPregGen Consortium collaborators at the Scientific Center of Obstetrics, Gynecology and Perinatology, Almaty, Kazakhstan (Gulnara Svyatova, Principal Investigator).	OmniExpress	2305
EGAD00010001949	Preeclampsia (PE) is a syndrome affecting pregnant mothers and fetus/babies characterised by hypertension and proteinuria, and is a leading cause of maternal and fetal death and of premature births worldwide. The InterPregGen Consortium was funded by a European Framework 7 (FP7) grant and grew out of the WTCCC3 GWAS comparing ~2000 UK PE mothers with ~6000 common UK controls. This dataset includes Infinium GSA genotyping of maternal, paternal and fetal PE cases and controls from Kazakhstan. This study is one component of the InterPregGen FP7 project. DNA samples for this component were collected by InterPregGen Consortium collaborators at the Scientific Center of Obstetrics, Gynecology and Perinatology, Almaty, Kazakhstan (Gulnara Svyatova, Principal Investigator).	Infinium GSA	2321
EGAD00010001951	Called genotypes of samples in batch 1 of CRU303 GWAS	Affymetrix Axiom UKB WCSG	682
EGAD00010001952	Raw data files of samples in batch 2 of CRU303 GWAS	Affymetrix Axiom UKB WCSG	190
EGAD00010001953	Raw data files of samples in batch 2 of CRU303 GWAS	Affymetrix Axiom UKB WCSG	36
EGAD00010001954	Raw data files of samples in batch 1 of CRU303 GWAS	Affymetrix Axiom UKB WCSG	692
EGAD00010001955	Raw data files of samples in batch 1 of CRU303 GWAS	Affymetrix Axiom UKB WCSG	397
EGAD00010001956	Called genotypes of samples in batch 2 of CRU303 GWAS	Affymetrix Axiom UKB WCSG	190
EGAD00010001958	Genome-wide data for 98 Native American individuals from Andes and Amazon	Illumina 2.5M Human Omni array	98
EGAD00010001960	Gene expression after cell culture, 12h Hyper-IL6 stimulated		3
EGAD00010001961	Gene expression after cell culture, 12h unstimulated		3
EGAD00010001962	Gene expression after cell culture, 24h unstimulated		3
EGAD00010001963	Gene expression after cell culture, 24h IL6+sgp130-Fc stimulated		3
EGAD00010001964	Gene expression after mammosphere culture, quiescent single cells		10
EGAD00010001965	Gene expression after cell culture, 24h IL6 stimulated		3
EGAD00010001966	Gene expression after mammosphere culture, non-label-retaining cells		5
EGAD00010001967	Gene expression after cell culture, 12h IL6 stimulated		3
EGAD00010001968	Gene expression after mammosphere culture, label-retaining cells		8
EGAD00010001969	Gene expression after cell culture, 24h Hyper-IL6 stimulated		3
EGAD00010001970	Gene expression after cell culture, 12h IL6+sgp130-Fc stimulated		3
EGAD00010001972	Array data for oesophageal and related samples - aks_paper_methyl_barretts_release	Illumina	107
EGAD00010001974	DLBCL DNA methylation data measured by 450k and EPIC Illumina arrays	450k and EPIC Illumina arrays	67
EGAD00010001975	DNA methylation of ICGC CLL patients measured by Illumina 450k array	Illumina 450k	490
EGAD00010001976	DLBCL gene expression data using 133.plus.2 Affymetrix array	133.plus.2	43
EGAD00010001978	1 cell line and 82 Oncoscan SNP tumor initial samples, zipped Affymetrix CEL file types, Oncoscan CNV FFPE Assay Kit, Thermo Fisher Scientific GeneChip™ Scanner 3000 7G	Affymetrix, Thermo Fisher Scientific	85
EGAD00010001980	Affymetrix SNP6.0 data for 341 DLBCL patients	Affymetrix SNP6.0	341
EGAD00010001983	Fixed effect meta-analysis summary statistics combining GWAS of fetal (baby) preeclampsia cases and controls from Europe (UK, Iceland, Norway, and Denmark) and Central Asia (Kazakhstan and Uzbekistan).		4
EGAD00010001984	Fixed effect meta-analysis summary statistics combining GWAS of maternal preeclampsia cases and controls from Europe (UK, Iceland, Norway, Denmark and Finland).		12
EGAD00010001985	Fixed effect meta-analysis summary statistics combining GWAS of maternal preeclampsia cases and controls from Central Asia (Kazakhstan and Uzbekistan).		4
EGAD00010001986	Fixed effect meta-analysis summary statistics combining GWAS of fetal (baby) preeclampsia cases and controls from Europe (UK, Iceland, Norway, and Denmark).		10
EGAD00010001987	Fixed effect meta-analysis summary statistics combining GWAS of fetal (baby) preeclampsia cases and controls from Central Asia (Kazakhstan and Uzbekistan).		4
EGAD00010001988	Fixed effect meta-analysis summary statistics combining GWAS of maternal preeclampsia cases and controls from Europe (UK, Iceland, Norway, Denmark and Finland) and Central Asia (Kazakhstan and Uzbekistan).		4
EGAD00010001990	Genome-wide data for 59 Native American individuals from Peru	Illumina 2.5M Human Omni array	59
EGAD00010001991	Genome-wide data for 71 Native American individuals from Peru	Illumina 2.5M Human Omni array	71
EGAD00010001992	Genome-wide data for 130 Native American individuals from Peru	Illumina 2.5M Human Omni array	130
EGAD00010001994	hormone receptor-positive early breast cancer by Nanostring BC360 panel	Nanostring panel	612
EGAD00010001996	The samples were genotyped on the H3Africa array (~2.3M SNPs) using the Illumina FastTrack Sequencing Service2. The default Illumina pipeline was used for the genotype calling (build GRCh37/hg19). The data was converted to PLINK using the h3abionet/h3agwas/call2plink pipeline and QC done using the h3abionet/h3agwas/qc pipeline	Illumina FastTrack	10776
EGAD00010001998	DNA methylation analysis of JMML patients from Europe, Japan and USA using EPIC arrays	Infinium HumanMethylation450K and EPIC BeadChip	32
EGAD00010001999	DNA methylation analysis of JMML patients from Europe, Japan and USA using 450k arrays	Infinium HumanMethylation450K and EPIC BeadChip	292
EGAD00010002000	DNA methylation analysis of JMML patients from Europe, Japan and USA using EPIC arrays	Infinium HumanMethylation450K and EPIC BeadChip	47
EGAD00010002002	4 oesophageal cancer derived organoid lines and 2 ovarian cancer derived organoid lines	GSA-MD V3	6
EGAD00010002004	PDAC primary cell lines methylation		7
EGAD00010002005	PDAC PDX methylation		18
EGAD00010002007	Mixed exocrine and purified ductal and de-differentiated acinar human cells obtained from normal pancreases. Ductal and de-differentiated acinar cells were isolated by FACS after 4 days of culture of the exocrine mixed population		9
EGAD00010002017	DNA methylation arrays were performed to molecularly subtype these samples based on Capper D, Jones DTW, Sill M, et al. DNA methylation-based classification of central nervous system tumours. Nature. 2018;555(7697):469-474. doi:10.1038/nature26000		3
EGAD00010002019	This dataset includes data from UK Multiple Sclerosis (MS) cases that were recruited through the University of Cambridge and included in the IMSGC exomechip experiment. Data from UK controls and additional UK cases that were recruited through other UK centres is available by direct application to those respective centres, as described in the original paper	Illumina	4478
EGAD00010002026	Clinical remission (ClinR) was defined as the absence of asthma symptoms and medication for at least 12 months, and complete remission (ComR) was defined as ClinR with normal lung function and absence of airway hyperresponsiveness. We analyzed differential DNA methylation of ClinR and ComR comparing to persistent asthma (PersA) in whole blood samples (n=72) and nasal brushing samples (n=97) in a longitudinal cohort of well characterized asthma patients.	Illumina 450K	169
EGAD00010002028	Array data from a family with high prevalence of psychosis	Infinium Global Screening Array-24 v1.0 (GSA) from Illumina	34
EGAD00010002030	Array data from a family with high prevalence of psychosis	Infinium Global Screening Array-24 v1.0 (GSA) from Illumina	12
EGAD00010002032	The genetic structure of Norway	Illumina OmniExpress 24 v 1.1 chip	6368
EGAD00010002034	CASE SAMPLES USING Affymetrix SNP6.0 technology (Thermo Fisher Scientific company): OncoScan FFPE Assay Kit was used for FFPE tissue samples (designed for degraded DNA) and the Cytoscan HD Array Kit was used for the fresh-frozen tissues	Affymetrix	710
EGAD00010002036	CUP samples using 850k	Illumina 850k	55
EGAD00010002038	SNP data for 473 tumor samples	Illumina Infinium Human Global Screening Array GSAMD-24v2-0_20024620_a1 BeadChip	473
EGAD00010002039	SNP data for 473 germline samples	Illumina Infinium OncoArray-500K	473
EGAD00010002041	Contains test sample 1-26	Illumina Iscan	26
EGAD00010002043	USA Multiple Sclerosis cases and controls	Illumina HumanImmuno v1.0	1830
EGAD00010002044	Germany Multiple Sclerosis cases and normal controls	Illumina HumanImmuno v1.0	1066
EGAD00010002045	Belgium Multiple Sclerosis cases and normal controls	Illumina HumanImmuno v1.0	635
EGAD00010002046	France Multiple Sclerosis cases and normal controls	Illumina HumanImmuno v1.0	741
EGAD00010002047	Australia and New Zealand Multiple Sclerosis case	Illumina HumanImmuno v1.0	1021
EGAD00010002048	Finland Multiple Sclerosis case	Illumina HumanImmuno v1.0	471
EGAD00010002049	This dataset includes data from UK Multiple Sclerosis (MS) cases that were recruited through the University of Cambridge and included in the IMSGC immunochip experiment. Data from UK controls and additional UK cases that were recruited through other UK centres is available by direct application to those respective centres, as described in the original paper.	Illumina HumanImmuno v1.0	3907
EGAD00010002051	Second batch of ChIP-seq narrowPeaks. Software: MACS2 v2.1.2	Illumina HiSeq 2500	48
EGAD00010002053	Data from Infinium EPIC 850K DNA methylation beadchip	Infinium EPIC DNA methylation beadchip	139
EGAD00010002055	Illumina EPIC methylation array of frontal lobe tissue from post-mortem human brains of the RiMod-FTD project	Illumina Infinium MethylationEPIC BeadChip	47
EGAD00010002057	This fileset has 4607 Greenlanders scored on the Illumina MEGA array (1,622,813 sites), and has been put on the plus strand. The data is in PLINK bed/bim/fam format. The Greenlandic individuals originate from two population surveys, B99 and IHIT.	Illumina MEGA array	4607
EGAD00010002059	NIHR BioResource Common Disease Patients 2016. The dataset includes 13489 samples from blood donors, they were not screened for any particular disease, and therefore they are representative of the general population. Genomic data includes 845487 snps collected using the UK BioBank V1 Affymetrix array. Phenotypic data includes gender, age, ethnicity and disease. According to our internal quality check there are 81 duplicates in this dataset.	Genotyped using UK Biobank Axiom Array (Applied Biosystems/Thermofisher), read on GeneTitan Multi Channel System (Affymetrix/ThermoFisher) and analysed with the Axiom Analysis Suite (Applied Biosystems/Thermofisher)	13490
EGAD00010002061	Sample genotyped with Axiom Human Origins (Affymetrix)	Axiom Human Origins (Affymetrix)	37
EGAD00010002063	Genome-wide Genotyping of 620 Arab individuals using Illumina iSelect platform with HumanOmniExpress bead chips	Illumina HumanOmniExpress BeadChip	1
EGAD00010002065	Liverpool Preterm Birth Biomarker Study Transcriptomics	Clariom™ D Human assay	114
EGAD00010002066	Liverpool Preterm Birth Biomarker Study Genomics	UK Biobank Axiom™ array	310
EGAD00010002068	Raw methylation from cervical samples of controls (both HPV+ and HPV-)	Illumina MethylationEPIC Array	527
EGAD00010002069	Raw methylation from cervical samples of individuals who did not develop CIN.	Illumina MethylationEPIC Array	218
EGAD00010002070	Raw methylation from cervical samples of cases (CIN1-3+)	Illumina MethylationEPIC Array	513
EGAD00010002071	Raw methylation from cervical samples of individuals who developed CIN 1-4 years after sampling.	Illumina MethylationEPIC Array	226
EGAD00010002073	Raw methylation from breast biopsies in BRCA mutation carriers or controls before and after 3 months of preventive mifepristone treatment.	Illumina MethylationEPIC Array	77
EGAD00010002074	Raw methylation data from normal breast tissue adjacent to a malignancy (TNBC)	Illumina MethylationEPIC Array	14
EGAD00010002075	Raw methylation data from breast tissue collected during risk-reducing surgery in BRCA1/2 mutation carriers.	Illumina MethylationEPIC Array	14
EGAD00010002076	Raw methylation data from normal breast tissue.	Illumina MethylationEPIC Array	14
EGAD00010002077	Raw methylation data from triple negative breast cancer.	Illumina MethylationEPIC Array	14
EGAD00010002079	Raw methylation data from cervical samples in controls.	Illumina MethylationEPIC Array	1094
EGAD00010002080	Raw methylation data from cervical samples in controls.	Illumina MethylationEPIC Array	202
EGAD00010002081	Raw methylation data from cervical samples in individuals with breast cancer.	Illumina MethylationEPIC Array	442
EGAD00010002082	Raw methylation data from buccal samples in individuals with breast cancer.	Illumina MethylationEPIC Array	200
EGAD00010002084	Raw methylation data from cervical samples in individuals with endometrial cancer.	Illumina MethylationEPIC Array	281
EGAD00010002086	Raw methylation data from cervical samples in individuals with ovarian cancer.	Illumina MethylationEPIC Array	289
EGAD00010002088	1,094 genotyped Philippine samples		1094
EGAD00010002090	HumanCytoSNP 850K on tissue DNA	HumanCytoSNP 850K	2
EGAD00010002091	HumanCytoSNP-12 v2.1 on tissue DNA	HumanCytoSNP12-2-1	13
EGAD00010002093	Dublin Aspirin platelet response genomics cohort	UK Biobank Axiom array	91
EGAD00010002094	Liverpool Aspirin platelet response genomics cohort	UK Biobank Axiom array	91
EGAD00010002096	Genome-wide methylation analysis of upper urinary tract urothelial carcinoma using Infinium MethylationEPIC BeadChip Kit	Illumina	94
EGAD00010002098	Genome-wide copy number analysis of upper urinary tract urothelial carcinoma using GeneChip Human Mapping 250K Nspl	Affymetrix	205
EGAD00010002100	Genotype data for new samples in Lopez et al 2021	Affymetrix Axiom Genome-Wide Human Origins 1 Array	1243
EGAD00010002102	Genome-Wide Human SNP Array 6.0 or the CytoScan HD array, according to the manufacturer’s instructions (Affymetrix, Santa Clara, CA, USA) now part of Thermo Fisher Scientific (Thermo Fisher Scientific, Inc.)	Genome-Wide Human SNP Array 6.0 or the CytoScan HD array	42
EGAD00010002113	Genome-wide data for population genetics analyses	Illumina Infinium H3Africa_2017_20021485_A2	162
EGAD00010002118	This dataset includes data from UK Multiple Sclerosis (MS) cases that were recruited through the University of Cambridge and included in the IMSGC Replicationchip experiment. Data from UK controls and additional UK cases that were recruited through other UK centres is available by direct application to those respective centres, as described in the original paper.	Illumina iSelect	5766
EGAD00010002124	Genotypes generated for Puno cohort Batch 2. Case (PRE) and control (PUN) families were recruited in hospital. Raw genotypes no QC. Includes unrelated genotyping controls (HG).	Affymetrix Axiom LAT	467
EGAD00010002125	Phenotypes from case families listed in medical records and used in analyses.		558
EGAD00010002126	Genotypes generated for Puno cohort Batch 1. Case (PRE) and control (PUN) families were recruited in hospital, and additional unrelated controls were recruited in university (UNA). Raw genotypes no QC.	Affymetrix Axiom LAT	480
EGAD00010002127	Combined genotypes for Batch 1 and 2 after quality control. All individuals included in analyses.	Affymetrix Axiom LAT	877
EGAD00010002132	SOMAscan plasma proteome datasets generated from participants consuming the fiber blend snack prototype (study 2)	SOMAscan 1.3K Proteomic Assay	70
EGAD00010002133	SOMAscan plasma proteome datasets generated from participants consuming the pea fibre snack prototype (study 1)	SOMAscan 1.3K Proteomic Assay	72
EGAD00010002137	SNP measurement. Illumina BeadArray SNP arrays for the study "Molecular characteristics in Burkitt lymphoma over age groups"	Illumina InfiniumOmniExpressExome-8	93
EGAD00010002139	Genotype data for BaYaka hunter-gatherers Congo	Affymetrix Axiom Genome-Wide Human Origins 1 array	-
EGAD00010002140	Genotype data for Agta hunter-gatherers Philippines	Affymetrix Axiom Genome-Wide Human Origins 1 array	-
EGAD00010002141	Genotype data for Palanan farmers Philippines	Affymetrix Axiom Genome-Wide Human Origins 1 array	1
EGAD00010002143	Illumina HumanCytoSNP-12v2.1 BeadChip	BeadChip	7
EGAD00010002146	metabolite levels provided by UM platform (Creative Dynamics Inc, NY, USA) (the data is raw abundance. Mapping was applied on log10 transformed data)		482
EGAD00010002147	Covariate phenotypes, including gender (1=Female/0=Male), age and contraceptive		482
EGAD00010002148	metabolite levels measured by general metabolomics (Boston, USA) (the data is raw abundance. Mapping was applied on log10 transformed data)	flow-injection TOF-M spectrometry.	482
EGAD00010002149	Genotype data from healthy Dutch individuals measured by Illumina humanOmniExpress Exome-8v1.0 SNP chip Calling by Opticall 7.0	Illumina humanOmniExpress Exome-8v1.0 SNP chip	482
EGAD00010002150	metabolite levels measured by Brainshake Metabolomics/Nightingale Health metabolic platform (log2)	Nightingale's technology	482
EGAD00010002152	This resource contains the SV annotations using the AnnotSV tool. The description of annotations can be found in AnnotSV web page https://lbgi.fr/AnnotSV/ or GCAT-BSC web page: http://cg.bsc.es/GCAT_BSC_iberianpanel	Illumina HiSeq 4000	785
EGAD00010002153	This dataset includes the .hap, .legend and .sample files from the GCAT|Panel (Iberian reference panel), built from 785 samples, after QC, from the 808 WGS GCAT cohort, including 30.3M SNVs, 5M Indels and 89K SVs. This resource has been generated using Shapeit4 and WhatsHap software. Technology used HiSeq 4000, read length 150 bp, inner mate distance 300 bp.	Illumina HiSeq 4000	785
EGAD00010002155	Third batch of ChIP-seq narrowPeaks. Software: MACS2 v2.1.2	Illumina HiSeq 2500	5
EGAD00010002157	This dataset includes IDAT files from 6 IDH-mutant, 5 IDH-wild-type glioma patient samples of unmatched initial and recurrent timepoints profiled using the Illumina Infinium MethylationEPIC Array.	Illumina Infinium MethylationEPIC BeadChip	11
EGAD00010002165	Genotyping using Global Screening Array	Global Screening Array	50
EGAD00010002166	Genotyping using Illumina OncoArray BeadChip	Illumina OncoArray BeadChip	332
EGAD00010002168	HapMap samples for haplotyping and copy-number profiling via SNP array	Illumina HumanCytoSNP-12 v2.1	11
EGAD00010002169	PGT samples for haplotyping and copy-number profiling via SNP array	Illumina HumanCytoSNP-12 v2.1	39
EGAD00010002171	Cohort: Raw genotype files for Hostage2 cohort. Genotype Chip: Illumina’s (Illumina Inc., San Diego, U.S.) Global Screening Array-24 Multi Disease (GSA) Version 2.0 B1 genomic build: b37		306
EGAD00010002172	Cohort: Raw genotype files for BRACOVID cohort. Genotype Chip: Axiom_PMRA.r3 array genomic build: b37		348
EGAD00010002173	Cohort: Raw genotype files for Hostage3 cohort. Genotype Chip: Illumina’s (Illumina Inc., San Diego, U.S.) Global Screening Array-24 Multi Disease (GSA) Version 2.0 B1 genomic build: b37		71
EGAD00010002174	Cohort: Raw genotype files for INMUNGEN_CoV2 cohort. Genotype Chip: HumanCore Exome Chip (Illumina) and Axiom Spanish Biobank Array (Thermofisher) genomic build: b37		367
EGAD00010002176	Cohort: Raw genotype files for SPGRX cohort. Genotype Chip: the Illumina Global Screening Array-24 v3.0 genomic build: b38		364
EGAD00010002177	Cohort: Raw genotype files for GEN_COVID cohort. Genotype Chip: Illumina Global Screening Array-24 v3.0 + Multi-Disease beadchip genomic build: b37		1141
EGAD00010002178	Cohort: Raw genotype files for Hostage4 cohort. Genotype Chip: Illumina’s (Illumina Inc., San Diego, U.S.) Global Screening Array-24 Multi Disease (GSA) Version 2.0 B1 genomic build: b37		121
EGAD00010002179	Cohort: Raw genotype files for BelCovid2 cohort. Genotype Chip: Illumina Global Screening Array-24 v3.0 + Multi-Disease beadchip genomic build: b37		392
EGAD00010002180	Cohort: Raw genotype files for Hostage1 cohort. Genotype Chip: Illumina’s (Illumina Inc., San Diego, U.S.) Global Screening Array-24 Multi Disease (GSA) Version 2.0 B1 genomic build: b37		847
EGAD00010002182	Tumor biopsies profiled by DNA methylation array	Illumina Human Methylation EPIC	133
EGAD00010002184	Genotype data of 7,281 individuals with colorectal cancer from the National Study of Colorectal Cancer Genetics (NSCCG) study. Individuals genotyped on the Illumina OncoArray. Data provided in plink format and has not been quality controlled. Control samples used were obtained from the PRACTICAL and BCAC consortia, and are available through the respective Data Access Coordination Committees (http://practical.icr.ac.uk and http://bcac.ccge.medschl.cam.ac.uk/)	Illumina OncoArray	7281
EGAD00010002186	Genotype data of 1,950 individuals from the COIN and COIN-B trials of advanced/metastatic colorectal cancer. Data provided in plink format, and has been quality controlled. Control data used was from the WTCCC2 project National Blood Donors (NBS) Cohort (EGAD00000000024).	Illumina	1950
EGAD00010002188	Genome-wide genotypes of women with misoprostol-induced high fever	Illumina Infinium Global Screening Array	50
EGAD00010002189	Genome-wide genotypes of women with misoprostol-induced high fever	Illumina Infinium Global Screening Array	46
EGAD00010002191	SOMAscan plasma proteome datasets generated from participants consuming the orange fiber snack prototype (study 2)	SOMAscan 1.3K Proteomic Assay	-
EGAD00010002192	SOMAscan plasma proteome datasets generated from participants consuming the pea fiber snack prototype (study 1)	SOMAscan 1.3K Proteomic Assay	-
EGAD00010002194	Raw idat files for 90 RS + DLBCL + CLL samples.	Illumina EPIC microarray	90
EGAD00010002198	(A)FAP Colon Crypt - EPIC Methylation Array		1
EGAD00010002199	Endometrium Gland - EPIC Methylation Array		1
EGAD00010002200	Normal Colon Crypt - EPIC Methylation Array		1
EGAD00010002201	Small Intestine Crypt - EPIC Methylation Array		1
EGAD00010002206	KIR gene content imputation from single-nucleotide polymorphisms in the Finnish population	SNP genotyping array	818
EGAD00010002209	Expression dataset for CD34 sorted primary CML bone marrow samples	Illumina Beadchip HT12v4	34
EGAD00010002210	Expression dataset for DAC+PTC209 treated CD34 sorted primary CML bone marrow samples	Illumina Beadchip HT12v4	44
EGAD00010002211	Expression dataset for DAC treated CD34 sorted primary CML bone marrow samples	Illumina Beadchip HT12v4	48
EGAD00010002213	96 genotyped Philippine samples	Illumina	96
EGAD00010002216	DNA methylation array from primary samples	Illumina 450K	65
EGAD00010002218	TIGER samples PISA genotyping array data	Illumina	127
EGAD00010002220	Single blastomeres from blastocyst and familial samples for haplotyping and copy-number profiling via SNP array	Illumina HumanCytoSNP-12 v2.1	21
EGAD00010002223	Real patient variability Benchmark Dataset for Optimization of DIA data analysis workflows	Orbitrap Eclipse	92
EGAD00010002225	SNP Array from CB1003 using Cytoscan HD, Thermo Fisher Scientific	Cytoscan HD	3
EGAD00010002229	ASD samples using Illumina Infinium Human Core-24 BeadChip platform	Illumina Infinium HumanCore-24 BeadChip platform	139
EGAD00010002231	Raw methylation data from buccal samples.	Illumina MethylationEPIC Array	227
EGAD00010002232	Raw methylation data from cervical samples.	Illumina MethylationEPIC Array	229
EGAD00010002233	Raw methylation data from blood samples.	Illumina MethylationEPIC Array	232
EGAD00010002235	CN Array samples from lymphoma patient tumours on Affymetrix platforms	Affymetrix Oncoscan, Affymetrix SNP6.0, Affymetrix Cytoscan HD	95
EGAD00010002237	Proteome characterization in primary colorectal cancer and corresponding liver metastasis	Qexactive Plus	42
EGAD00010002239	Genomics to select patients with metastatic breast cancer for targeted therapy (microarray_cytoscan)	Cytoscan	749
EGAD00010002241	Genomics to select patients with metastatic breast cancer for targeted therapy (microarray_oncoscan)	Oncoscan	349
EGAD00010002243	Genomics to select patients with metastatic breast cancer for targeted therapy (microarray_agilent)	Agilent	56
EGAD00010002248	Genotypes for 2 human skeletal muscle samples	Illumina Infinium multi-ethnic global-8 v1 kit	2
EGAD00010002250	Genotyping array data for normal mammary gland control samples	Illumina	50
EGAD00010002251	Genotyping array data for breast cancer and matched normal mammary gland samples	Illumina	100
EGAD00010002253	Methylation data of tumors using illumina Infinium MethylationEPIC	Infinium MethylationEPIC	57
EGAD00010002255	A total of 87 microarrays from HCC patients treated with anti-PD1 inhibitors	Clariom S Array, human	87
EGAD00010002257	SNP Array Data for EGAS00001004666	Illumina Global Screening Array-24 V1 HTS GSA+Multi-Disease	100
EGAD00010002259	Myeloma methylation data	Illumina Infinium HumanMethylation450 (450k)	442
EGAD00010002261	Genotypes generated for study investigating signals of selection in Peruvians from three ecological regions. 96 genotypes in plink format after QC filtering (missingness per individual, per variant and minor allele freq). See publication for more details on QC filtering.	Illumina MEGA Array	95
EGAD00010002263	Nasal DNA methylation at three CpG sites predicts childhood allergic disease	Illumina 450K	696
EGAD00010002273	Polynesian genotypes	AxiomLAT	78
EGAD00010002275	Raw methylation array data for tumor samples from patients with newly diagnosed, recurrent intermediate or high-grade sarcoma.	Illumina MethylationEPIC BeadChip Array	48
EGAD00010002277	Whole-genome DNA methylation profiling of PBL obtained from male patients with PSC-UC, or UC alone, or healthy individuals.	Illumina Infinium HumanMethylation EPIC BeadChip	47
EGAD00010002279	Illumina EPIC arrays of human osteoblastomas and their mimics	Illumina Infinium MethylationEPIC BeadChip array	50
EGAD00010002281	Assessment of methylation status of ~850,000 sites	Illumina HT12	20
EGAD00010002283	Genome-wide DNA Methylation Data from Illumina HumanMethylationEPIC arrays for whole blood samples from 403 healthy individuals. Additional raw data (IDAT files) and associated phenotype information are available for all individuals included in this study (n=570) directly from CIBMTR. Data are available under controlled access release upon reasonable request and execution of a data use agreement. Requests should be submitted to CIBMTR at info-request@mcw.edu and include the study reference IB17-04	EPIC BeadChip	403
EGAD00010002285	The compressed file contains plink format file for the Affymetrix Human Origins SNP array data of 452 individuals generated and analyzed in Kutanan, Liu et al 2021 study of 33 ethnolinguistic groups in Thailand and Laos.	Affymetrix Axiom Genome-Wide Human Origins array	452
EGAD00010002287	The compressed file contains plink format file for the Affymetrix Human Origins SNP array data of 260 individuals generated and analyzed in Liu et al 2020 study of 22 ethnolinguistic groups in Vietnam.	Affymetrix Axiom Genome-Wide Human Origins array	260
EGAD00010002289	DNA methylation profiles of samples included in the EORTC 26091 TAVAREC trial	Infinium MethylationEpic BeadChip array	125
EGAD00010002291	Blood samples were obtained from 119 healthy individuals of British ancestry. Genomic DNA was isolated from a suspension of PBMCs from each individual using a DNA isolation kit (Qiagen). Genotyping was then performed using the Infinium CoreExome-24 (v1.3) chip (Illumina).	Infinium CoreExome-24 (v1.3) chip (Illumina)	127
EGAD00010002294	Single Nucleotide Polymorphisms in autosomes of Canary Islanders	Axiom® Genome-Wide Human CEU 1 Array	863
EGAD00010002296	nasopharyngeal carcinoma genome-wide human SNP array data for 4083 NPC cases and 4811 controls	Illumina	8894
EGAD00010002298	nasopharyngeal carcinoma genome-wide human SNP array data for 423 NPC cases and 573 controls	Illumina	996
EGAD00010002302	The tar archive contains unfiltered genotype data from Reich et al AJHG 2011 study in plink format	Affymetrix 6.0 array	262
EGAD00010002304	The tar archive contains a) the txt file with the genotypes, b) illumina annotation file with info on SNPs, c) sample info file unfiltered illumina data, autosomes only data from Pugach et al MBE 2016 The Complex Admixture History and Recent Southern Origins of Siberian Populations	Illumina 660W-Quad arrays	96
EGAD00010002306	The tar archive contains unfiltered genotype data from Pugach et al 2018 in plink format	Affymetrix Axiom Human Origins array	181
EGAD00010002308	Combined genotyping files from 13 PBMC samples	illumina Infinium Omni2.5-8	13
EGAD00010002310	Renal cell carcinoma (RCC) cases, comprising adult patients with histologically proven RCC, were collected through two sources within the UK. First, 856 cases from SORCE, a MRC collection of surgically treated RCC cases ascertained through UK clinical oncology centres. Second, 189 RCC cases collected through the ICR and Royal Marsden NHS Hospitals Trust. Cases included 590 clear cell carcinomas (CCCs), 42 papillary carcinomas (PCs), 33 chromophobe carcinomas (CCs) and 19 mixed or other histological subtypes. DNA was extracted from EDTA-venous blood samples using conventional methods and quantified using PicoGreen (Invitrogen). Cases were genotyped using the Human OmniExpress-12 BeadChip according to the manufacturer's recommendations (Illumina Inc, San Diego, CA, USA). After strict QC, 944 cases were retained. Data provided in plink format. Controls used were data from the Wellcome Trust Case Control Consortium 2 (WTCCC2) 1958 birth cohort and the UK Blood Service Control Group (available as EGAS00000000028).	Illumina Omni Express BeadChip	944
EGAD00010002312	Illumina EPIC arrays Naevus Melanoma Spitz Case	Illumina EPIC array	24
EGAD00010002314	Comparative proteome-based analysis of different autologous bone entities used for alveolar onlay grafting	Qexactive Plus	75
EGAD00010002316	Column 1 rsid: SNP identifier;Column 2 chromosome: name of chromosome on which the SNP is located;Column 3: position: base pair position on the chromosome;Column 4 minor_test_allele: the base that constitutes the minor allele;Column 5 major_allele: the base that constitutes the major allele;Column 6 maf: the frequency of the minor allele, indicated as a fraction of 1;Column 7 allele_freq_cases: the minor allele frequency in cases;Column 8 allele_freq_controls: the minor allele frequency in controls;Column 9 regression_pvalue: the p-value for the difference in allele frequency between cases and controls;Column 10 odds_ratio: the odds ratio, as calculated using logistic regression under an additive model with adjustment for the first ten principal components of ancestry		1
EGAD00010002319	Japanese COVID-19 PLINK file	Infinium Asian Screening Array (Illumina, USA)	2393
EGAD00010002321	Methylation arrays (850K)	EPIC BeadChips (Illumina)	33
EGAD00010002323	GeneChip HTA 2.0 data of primary renal cell carcinoma (RCC) related to Reustle et al, Genome Med 12:2020 32. Preprocessing of microarray data was performed using Robust Multi-array Average (RMA).	GeneChip HTA 2.0	53
EGAD00010002325	high-risk localized ccRCC	Affymetrix HTA 2.0	236
EGAD00010002327	This study includes 1146 samples of host genotyping data (imputed) from Illumina Omni arrays, using https://imputation.sanger.ac.uk/ with the Haplotype Reference Consortium v1.1. Samples were collected from adult (>16 yrs) patients with CSF confirmed bacterial meningitis in the Netherlands between 2006 and 2015. Metadata includes patient outcome, species of bacteria, and for 467 samples a link to an ENA run with the associated bacterial genome (S. pneumoniae only).	Illumina Human Omni1-Quad beadchip.	1149
EGAD00010002328	This study includes 1146 samples of host genotyping data (genotyped) from Illumina Omni arrays. Samples were collected from adult (>16 yrs) patients with CSF confirmed bacterial meningitis in the Netherlands between 2006 and 2015. Metadata includes patient outcome, species of bacteria, and for 467 samples a link to an ENA run with the associated bacterial genome (S. pneumoniae only).	Illumina Human Omni1-Quad beadchip.	1149
EGAD00010002330	We performed a proteomic serum profiling of patients with non-metastasized breast cancer (BC) who received neoadjuvant chemotherapy (NACT). Samples were collected at three timepoints during NACT. Furthermore, we compared serum samples of BC patients pre-NACT to a control group of healthy volunteers.	Qexactive Plus	84
EGAD00010002336	K562 cells were treated with different HSP90 inhibitors (PuH71 and Coumermycin A1) and the CNV profile was compared to the parental K562 (untreated). In addition, the CNV profile of HSP90AB1 knockout K562 cells was analyzed.	Illumina NextSeq 550	4
EGAD00010002338	Chordoma tumors DNA methylation profiling by genome tiling array	Illumina	68
EGAD00010002340	PLINK file of Japanese controls	Infinium Asian Screening Array (Illumina, USA)	2380
EGAD00010002342	Diagnostic yield of affymetrix optima microarray in patients with non-syndromic autism spectrum disorders in India.	Affymetrix CytoScan Optima	99
EGAD00010002344	Blood DNA samples from 1,433 contemporary ni-Vanuatu were genotyped on the Illumina Infinium Omni 2.5-8 array. Genotype calling was performed using the Illumina GenomeStudio software.	Infinium Omni2.5-8 BeadChip	1433
EGAD00010002346	Human islet samples genotype data	NA	128
EGAD00010002350	Shotgun Proteomics; Glioblastoma samples from 11 patients were obtained at initial and recurrent tumor stages. Proteins were extracted, identified and quantified via tandem mass spectrometry based on a TMT isobaric labelling approach. Quantitative proteomics reveals 146 differentially abundant proteins using patient-matched statistical modelling. Analysis of proteolytic processing reveals differential proteolytic patterns in recurrent tumors. Proteogenomics reveals the presence of 30 single-amino acid variants in glioblastoma tumor, 1 of which is increased in recurrent tumor.	Qexactive Plus	22
EGAD00010002352	GeneChip HTA 2.0 data of primary renal cell carcinoma (RCC) related to Reustle et al., Clin Transl Med 12:2022 e883. Microarrays were normalized individually using the SCAN method from the R package SCAN.UPC (version 2.26.0, R version 3.6.1). Probe sets were summarized on the Entrez GeneID level using the annotation provided by BrainArray (version 23).	GeneChip HTA 2.0	124
EGAD00010002353	GeneChip HTA 2.0 data of primary renal cell carcinoma (RCC) related to Buettner et al, Genome Med 2022. Microarrays were normalized individually using the SCAN method from the R package SCAN.UPC (version 2.26.0, R version 3.6.1). Probe sets were summarized on the Entrez GeneID level using the annotation provided by BrainArray (version 23).	GeneChip HTA 2.0	306
EGAD00010002355	Methylation microarray data (Illumina 850K) of 52 thymic epithelial tumors. 13 patients with thymoma A and B, 32 thymic carcinoma (TC) and 7 neuroendocrine tumors of the thymus (NET).	Illumina 850k	52
EGAD00010002357	Methylation files for Roussel-ATRT-TM paper titled "Atypical teratoid/ rhabdoid tumoroids reveal subgroup-specific drug vulnerabilities"	Illumina Infinium MethylationEPIC	12
EGAD00010002359	Tumor and matched normal DNA profiling by SNP array	Illumina Infinium OmniExpress-24 BeadChip array	111
EGAD00010002361	Samples from the Mexican Biobank	Illumina MEGA Array	6057
EGAD00010002363		Affymetrix	46
EGAD00010002365	The 6431 samples were genotyped on the H3Africa array.	Illumina	6431
EGAD00010002367	CONTROL_SAMPLES using platform …..		155
EGAD00010002368	CASE_SAMPLES using platform …..		113
EGAD00010002370	450k methylation arrays of primary and relapse tumor of a single case of sonic hedgehog medulloblastoma with Li-Fraumeni syndrome	HumanMethylation450	2
EGAD00010002372	This dataset includes IDAT files from 160 samples (57 primary prostate cancers, 95 prostate-derived brain metastases, and 7 normal tissues). The samples were profiled using the Illumina Infinium MethylationEPIC BeadChips (850K)	Illumina Infinium MethylationEPIC 850K	160
EGAD00010002374	DNA was extracted from saliva samples and genotyping was performed on Illumina Infinium Global Screening Array.	Global Screening Array	1880
EGAD00010002375	DNA was extracted from saliva samples and genotyping was performed on Illumina Infinium HumanCoreExome beadchips.	HumanCoreExome	3295
EGAD00010002377	Gene transcript data from ALI-cultured airway cells, acquired using microarrays.	Affymetrix Genetitan	19
EGAD00010002379	This dataset contains the raw sequencing data (Runs) from all of the 10x Genomics single-cell Visium Experiments, as well as the corresponding imaging data (Analyses).	10x Genomics spatial transcriptomics (Visium)	8
EGAD00010002381	This dataset includes raw label-free mass spectrometry proteomics data of different sinonasal tumor entities as well as normal sinonasal tissue. 72 samples were processed on a Q Exactive HF-X instrument coupled to an easy nanoLC 1200 system using one microgram of peptides and a 110-minute gradient.	Q Exactive HF-X instrument	72
EGAD00010002383	Data from 59 whole blood samples from pregnant mothers, unexposed and exposed to the Rwandan genocide, was generated using Infinium MethylationEPIC BeadChip Kit.	IlluminaEpic	59
EGAD00010002386	DNA methylation of PDAC precursors and normal pancreas cell population	Illumina EPIC Array	108
EGAD00010002388	Shotgun Proteomics, Proteomic characterization of the residual PDAC tumor mass after neoadjuvant chemo or combined chemo-radiation therapy	Qexactive Plus	79
EGAD00010002390	Paediatric tumour cell models DNA methylation EPIC array	EPIC	151
EGAD00010002392	GeneChip HTA 2.0 data of primary renal cell carcinoma (RCC) and RCC metastases related to Guergen et al, Front Oncol 12:2022 889789. Microarrays were normalized individually using the SCAN method from the R package SCAN.UPC (version 2.34.0). Probe sets were summarized on the Entrez GeneID level using the annotation provided by BrainArray (version 25).	GeneChip HTA 2.0	24
EGAD00010002394	Genome-wide SNP from 221 individuals from Northwestern Amazonia genotyped on the Affymetrix Human Origins Array	Affymetrix Axiom Genome-Wide Human Origins array	221
EGAD00010002396	Analysis of cocaine use disorder (CUD) associated epigenome-wide DNA methylation (DNAm) alterations in human postmortem brain tissue of Brodmann Area 9. Tissue samples from N=21 CUD cases and N=21 individuals without CUD originating from the Douglas Bell Canada Brain Bank (DBCBB) were included. Epigenome-wide DNAm was investigated using the Illumina Infinium MethylationEPIC array.	Infinium MethylationEPIC array	84
EGAD00010002398	Tumor and matched normal DNA profiling by SNP array	Illumina Infinium OmniExpress-24 BeadChip array	82
EGAD00010002400	We demonstrate that ATRT tumoroids retain subgroup-specific epigenetic and gene expression profiles	Illumina Infinium EPIC	8
EGAD00010002402	Longitudinal whole-genome DNA methylation profiling of PBL obtained from IBD patients (CD and UC) at two different time points	Illumina Infinium HumanMethylation EPIC BeadChip	92
EGAD00010002404	Methylation array	iScan	6
EGAD00010002406	Tertiary lymphoid structure signatures are associated with immune checkpoint inhibitor related acute interstitial nephritis	Nanostring	22
EGAD00010002408	Two primary tumor-derived PDAC organoids were subjected to SNP array, RNA-seq, and single-cell WGS	Illumina Infinium Global Screening Array-24	2
EGAD00010002410	Average methylation difference 12 months vs 0 months at Roadmap Epigenomics chromatin state annotations from different cell types using nanopolish. Data from 8 individuals.	Oxford Nanopore	1
EGAD00010002411	Average hypermethylation on transcription factor binding sites based on nanopolish calls; only positions showing higher methylation than sample’s average methylation at enhancers were included when defining the average methylation level. Data from 6 individuals at different time points.	Oxford Nanopore	1
EGAD00010002412	Average genome-wide methylation levels per sample at different time points using nanopolish calls. Data from 8 individuals.	Oxford Nanopore	1
EGAD00010002413	Average methylation levels based on nanopolish calls from Roadmap Epigenomics chromatin state annotations using different cell types. Data from 8 individuals at different time points.	Oxford Nanopore	1
EGAD00010002414	Average hydroxymethylation levels based on megalodon calls from Roadmap Epigenomics chromatin state annotations using different cell types. Data from 8 individuals at different time points.	Oxford Nanopore	1
EGAD00010002415	Average hydroxymethylation difference 12 months vs 0 months at Roadmap Epigenomics chromatin state annotations from different cell types. Data from 8 individuals.	Oxford Nanopore	1
EGAD00010002416	Proportion of hyper- and hypomethylated positions at Roadmap annotations. Data from 8 individuals.	Oxford Nanopore	1
EGAD00010002417	Average hydroxymethylation levels based on megalodon calls from Roadmap Epigenomics histone mark annotations using different cell types. Data from 8 individuals at different time points.	Oxford Nanopore	1
EGAD00010002418	CpG hydroxymethylation. Software: minimap2 v.2.16; Megalodon.	Oxford Nanopore	24
EGAD00010002419	Average genome-wide hydroxymethylation levels per sample at different time points using megalodon calls. Data from 8 individuals.	Oxford Nanopore	1
EGAD00010002420	CpG methylation. Software: minimap2 v2.16; Nanopolish.	Oxford Nanopore	24
EGAD00010002421	Average hydroxymethylation levels on transcription factor binding sites obtained from ENCODE (ChIP-sequencing of GM12878 lymphoblastoid cell line). Data from 6 individuals at different time points.	Oxford Nanopore	1
EGAD00010002422	Average methylation levels based on nanopolish calls from Roadmap Epigenomics histone mark annotations using different cell types. Data from 8 individuals at different time points.	Oxford Nanopore	1
EGAD00010002424	The compressed file contains plink format file for the Affymetrix Human Origins SNP array data of 55 individuals generated and analyzed in Liu et al 2023 study of Taiwanese groups.	Affymetrix Axiom Genome-Wide Human Origins array	55
EGAD00010002427	PLINK file of the Japanese population	Infinium Asian Screening Array (Illumina, USA)	142
EGAD00010002431	RCC files of 17 Cartridges' Panel Standards	NanoString nCounter® PanCancer IO 360™	17
EGAD00010002432	RCC files of 17 Cartridges from metastatic melanoma	NanoString nCounter® PanCancer IO 360™	185
EGAD00010002434	51 Ashaninka individuals from Peru (Pasco) genotyped with Axiom Human Origins (Affymetrix)	Axiom Human Origins (Affymetrix)	51
EGAD00010002436	postQC genotype data from the Affymetrix AxiomTM HGCoV2 1 array in plink binary format. QC was carried out using PLINK v1.9	Affymetrix AxiomTM HGCoV2 1	1192
EGAD00010002437	preQC genotype data from the Affymetrix AxiomTM HGCoV2 1 array in plink ped/map format		1226
EGAD00010002441	Methylation data on tumor (n=102) and normal nerve (n=7) DNA samples	Infinium HumanMethylationEPIC beadchip array	109
EGAD00010002443	Microarray data of 14 patient-derived PDAC cultures	Affymetrix Human Clariom S	14
EGAD00010002445	Genotyping data for ACE2 (rs2285666), MX1 (rs469390) and TMPRSS2 (rs2070788) variants. Patients are classified as mild (n=34) and severe (n=32). DNA genotyping was performed using the TaqMan® Genotyping Master Mix (Applied Biosystems). Allelic discrimination assays were performed on a 7900HT Fast Real-Time PCR System (Applied Biosystems).	7900HT Fast Real-Time PCR System	66
EGAD00010002447	SomaLogic data	SomaLogic	1188
EGAD00010002449	Genotype data for 343 Japanese subjects obtained with Infinium Asian Screening Array.	Infinium Asian Screening Array	1
EGAD00010002451	Methylation profiling of 345 sarcoma and TFCP2-rearranged rhabdomyosarcoma samples, using the approach described in "Genomic, transcriptomic, functional, and mechanistic characterization of rhabdomyosarcoma with FUS-TFCP2 or EWSR1-TFCP2 fusions"	Infinium Methylation EPIC BeadChip	345
EGAD00010002453	Synthetic dataset containing genome-wide genotypes of 500.000 individuals was generated using a hybrid approach combining coalescent approach and resampling based methods		500000
EGAD00010002456	This dataset included 110 samples with high hyperdiploid acute lymphoblastic leukemia that were genotyped using Affymetrix SNP Array or Illumina's BeadArray platform.	Affymetrix CytoScan HD, Illumina Human1M-Duo v3.0, Illumina HumanOmni1-Quad v1.0 and Illumina HumanOmni5-4v1	110
EGAD00010002458	The compressed file contains plink format files for the Affymetrix Human Origins SNP array data of 208 Angolan individuals	Affymetrix Axiom Genome-Wide Human Origins array	209
EGAD00010002461	Methylation of peripheral blood leukocytes from patients with Li-Fraumeni syndrome	Illumina HumanMethylation450 BeadChip/Illumina HumanMethylationEPIC BeadChip	400
EGAD00010002463	EPIC Array data from human lung fibroblasts isolated from fresh and cryopreserved lung tissue (16 samples, 3 donors)	Illumina_EPIC	16
EGAD00010002465	The dataset includes IDAT raw files for 10 samples and the analyzed DMP file which describes the differential methylation positions based on Illumina Infinium MethylationEPIC BeadChip. All samples (5 lung cancer cases vs. 5 benign lung disease controls) were obtained from bronchial washings at the site of the lesion under bronchoscopy manipulation. The histological type of the five lung cancer cases is adenocarcinoma and squamous cell carcinoma.	Illumina Infinium MethylationEPIC BeadChip (850 K)	10
EGAD00010002467	Individuals genotyped on the Illumina Omni2.5. Autosome and X chromosome.	Illumina SNP Array, Omni2.5-8 v1.3	2
EGAD00010002468	Individuals genotyped on the Illumina GSA v2. Autosome and X chromosome.	Illumina SNP Array, Global Screening Array v2	30
EGAD00010002470	Raw methylation data from blood samples in breast cancer cases.	Illumina MethylationEPIC Array	105
EGAD00010002471	Raw methylation data from blood samples in controls.	Illumina MethylationEPIC Array	211
EGAD00010002473	84 Indigenous and admixed individuals from Panama genotyped with Axiom Human Origins (Affymetrix)	Axiom Human Origins (Affymetrix)	84
EGAD00010002475	HC genotyping data for lead SNPs using Illumina Global Array V2.0		1
EGAD00010002476	AS genotyping data for lead SNPs using Illumina Global Array V2.0	Illumina Global Array V2.0	40
EGAD00010002478	RNA-seq (Illumina HiSeq 2500) of 142 Human Breast Cancer samples	Illumina	142
EGAD00010002482	SNP array genotyping of multi-site HGSOC samples	InfiniumOmniExpress-24v1-2_A1	305
EGAD00010002484	Genotype and phenotype data on 301 MS patients from Germany, Mainz. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score.	Illumina Global Screening Array	301
EGAD00010002485	Genotype and phenotype data on 575 MS patients from Netherlands. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score.	Illumina Global Screening Array	575
EGAD00010002486	Genotype and phenotype data on 538 MS patients from Italy, OSR. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score.	Illumina Global Screening Array	538
EGAD00010002487	Genotype and phenotype data on 246 MS patients from Austria. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score.	Illumina Global Screening Array	246
EGAD00010002488	Genotype and phenotype data on 209 MS patients from Netherlands. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score.	Illumina Global Screening Array	209
EGAD00010002489	Genotype and phenotype data on 683 MS patients from Germany, TUM. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score.	Illumina Global Screening Array	683
EGAD00010002490	Genotype and phenotype data on 1067 MS patients from Italy, Piedmont. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score.	Illumina Global Screening Array	1067
EGAD00010002491	Genotype and phenotype data on 943 MS patients from UK. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score.	Illumina Global Screening Array	943
EGAD00010002492	Genotype and phenotype data on 151 MS patients from Spain. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score.	Illumina Global Screening Array	151
EGAD00010002493	Genotype and phenotype data on 140 MS patients from Greece. All individuals were whole-genome genotyped on the Illumina Global Screening Array and files are provided in plink2 format (pgen / pvar / psam files). Provided phenotypes are sex, year of birth, age and Age Related Multiple Sclerosis Severity (ARMSS) score.	Illumina Global Screening Array	140
EGAD00010002495	Additional Methylation files for Roussel-ATRT-TM paper titled "Atypical teratoid/ rhabdoid tumoroids reveal subgroup-specific drug vulnerabilities"	Illumina Infinium MethylationEPIC	3
EGAD00010002497	methylation array data of cfDNA from plasma samples of individuals after running a marathon, a 40 min run and resting	Illumina 850K EPIC methylation array	6
EGAD00010002499	Individuals of Native American ancestry from Southern Chile genotyped with the Human Origins SNP Chip	Human Origins Axiom	64
EGAD00010002501	Raw methylation data from technical replicates processed on EPIC v1.0.	Illumina MethylationEPIC Array	48
EGAD00010002503	TANDEM samples genotyped on the Illumina H3A array at the CGPR, South Africa.	Illumina	107
EGAD00010002505	The sys4MS cohort comprises 350 patients with Multiple Sclerosis (MS) and 9 controls, with 2 years of follow-up. Baseline data includes demographics, clinical scales, disease duration and subtype and use of disease-modifying drugs, brain MRI (volumetry and lesion load), retinal thickness by OCT, genomics (GWAS), cytomics, and phosphoproteomics. Data at the end of follow-up includes clinical scales, brain MRI and OCT.	Illumina HumanOmniExpress-24 v1.2 array	400
EGAD00010002507	Active TB patients (sputum smear-positive and GeneXpert-positive) recruited at the Temeke District Hospital in Dar es Salaam, Tanzania, as part of a prospective study that ran between November 2013 and June 2022.	Illumina Infinium H3Africa (V2) with custom add-ons	1409
EGAD00010002509	SNP Genotyping for Lassa Fever cases and population controls from Nigeria and Sierra Leone using Illumina Omni 2.5M and 5M	Illumina Omni 2.5M, Illumina Omni 5M	2667
EGAD00010002510	SNP Genotyping for Lassa Fever cases and population controls from Nigeria and Sierra Leone using Illumina H3Africa array version 1	Illumina H3Africa array version 1	1345
EGAD00010002512	Access to proteomic files (DIA) of MIBC patient-derived xenografts (N=8)	Orbitrap Lumos	12
EGAD00010002513	Access to proteomic files (DIA) of patients with treatment-naive MIBC (N=51), treatment-naive NMIBC (N=17) and neoadjuvant MIBC (N=11)	Orbitrap Lumos	86
EGAD00010002515	51 DNA methylation arrays of human samples initially diagnosed as mesenchymal chondrosarcoma. Microdissection of the cartilage and/or the small round cell component from the same sample may have occurred. As mentioned in the sample descriptions, the diagnoses of four samples have been revised. Additional molecular investigations were conducted for a subset of samples as described in the related publication.	EPIC array (Illumina)	51
EGAD00010002517	Chromosome 12 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002518	Chromosome 22 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002519	Chromosome 8 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002520	Chromosome 18 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002521	Chromosome 3 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002522	Chromosome X imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002523	Chromosome 2 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002524	Chromosome 19 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002525	Chromosome 1 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002526	Chromosome 17 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002527	Chromosome 14 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002528	Chromosome 7 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002529	Chromosome 4 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002530	Chromosome 10 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002531	Chromosome 13 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002532	Chromosome 21 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002533	Chromosome 15 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002534	Chromosome 16 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002535	Chromosome 9 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002536	Chromosome 5 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002537	Chromosome 20 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002538	Chromosome 11 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002539	Chromosome 6 imputed genotypes of samples genotyped on the Axiom Human Genotyping SARS-CoV-2 array (GRCh38)	Axiom Array	1195
EGAD00010002541	TMT-labelled, SCX fractionation, global protein methylation profiled	Thermo orbitrap fusion lumos	30
EGAD00010002543	Gene Expression Profiles measured using Affymetrix HGU133plus2.0 Array	Affymetrix HGU133plus2.0	83
EGAD00010002544	Copy Number profiles measured using Affymetrix SNP Array 6.0	Affymetrix SNP Array 6.0	83
EGAD00010002546	Bulk TCR-seq data from IMCISION, generated with the immunoSEQ platform (Adaptive Biotechnologies) on PBMCs of responding patients, pre- and post-treatment.	NextSeq 550	18
EGAD00010002548	TMT-labelled, phosphopeptides enriched	Thermo orbitrap fusion lumos	30
EGAD00010002549	TMT-labelled, total proteome profiled	Thermo orbitrap fusion lumos	30
EGAD00010002551	3421 Samples from Nigeria and Ghana, sequenced with the Illumina NextSeq 500	Illumina NextSeq 500	3421
EGAD00010002553	Expression of immune related genes in 12 familial adenomatous polyposis patients. Expression assessed by analyzing whole blood-derived RNA samples using a Nanostring nCounter Immunology V2 panel (579 genes)	Nanostring nCounter	12
EGAD00010002554	Expression of immune related genes in 12 healthy donors. Expression assessed by analyzing whole blood-derived RNA samples using a Nanostring nCounter Immunology V2 panel (579 genes)	Nanostring nCounter	12
EGAD00010002556	SNP array ARID1B patients	Illumina Infinium PsychArray-24 BeadChip v1.3	5
EGAD00010002559	H5 files generated for each sample with Tapestri Pipeline	Tapestri	28
EGAD00010002560	Bed files as whitelist for Tapestri Insights analysis	Tapestri	2
EGAD00010002561	Loom files generated for each sample with Tapestri Pipeline	Tapestri	28
EGAD00010002562	Tap files from Tapestri Insights	Tapestri	28
EGAD00010002564	Array CGH derived copy number variations from dicentric chromosome dic(9;20) positive pediatric Acute lymphocytic leukemia B-lymphocyte samples, by utilizing an Agilent 400K SurePrint G3 Custom CGH Human Genome Microarray (e-Array design 84704)	SurePrint G3 CGH	58
EGAD00010002567	Unfiltered genotype data for a pilot study (Batch 1) of 1,140 DDD Study participants (and 12 "Empty" samples). Samples include 380 mothers, 382 fathers and 378 probands, and form 376 trios. Most of the probands have been previously genotyped on the Illumina HumanCoreExome BeadChip (EGAD00010001598) or the Illumina InfiniumCoreExome Beadchip (EGAD00010001600). All samples were genotyped on the Illumina Global Screening Array.	Illumina Global Screening Array	1140
EGAD00010002568	QC-ed data of 9,534 DDD Study participants, including 8,879 individuals with inferred GBR ancestry. Details of genotype QC can be found in https://www.medrxiv.org/content/10.1101/2023.04.20.23288860v1.full.pdf. Genome builds are indicated in the file name. Related individuals have not been removed. Of the 9,534 samples there are 3,148 mothers, 3,138 fathers and 3,248 probands, which form 3,099 trios. Of the 8,879 GBR samples, there are 2,931 mothers, 2,937 fathers and 3,011 probands, which form 2,788 trios. Most of the probands have been previously genotyped on the Illumina HumanCoreExome BeadChip (EGAD00010001598) or the Illumina InfiniumCoreExome Beadchip (EGAD00010001600). All samples were genotyped on the Illumina Global Screening Array.	Illumina Global Screening Array	9534
EGAD00010002569	Unfiltered genotype data for a larger batch (Batch 2) of 8,697 DDD Study participants (and 1 "Blank" sample). Samples include 2,858 mothers, 2,857 fathers and 2,982 probands, and form 2,918 trios. Most of the probands have been previously genotyped on the Illumina HumanCoreExome BeadChip (EGAD00010001598) or the Illumina InfiniumCoreExome Beadchip (EGAD00010001600). All samples were genotyped on the Illumina Global Screening Array.	Illumina Global Screening Array	9846
EGAD00010002571	Methylation profile (array data using EPIC_850K) from tumour samples (epithelioid sarcoma)	EPIC_850K	32
EGAD00010002575	This dataset contains the cleaned genotype data from 2173 African esophageal squamous cell cancer cases and population controls. The genotype data was generated using the H3Africa Illumina Custom microarray.	Illumina HiScan	2173
EGAD00010002577	VaccGene HLA imputation panel variants and HLA allele calls for all individuals across all tested sequence platforms and countries, and the high quality direct genotypes for these individuals (with genotype data available) across the MHC in PLINK format	Illumina HumanOmni25M-8v1-1	2499
EGAD00010002578	Genotype data (in binary PLINK format) and imputed data (with merged 1000Gp3 and AGVP reference panel in GEMMA BIMBAM dosage format) for 1391 individuals from EMaBS in Entebbe, Uganda.	Illumina HumanOmni25-8v1-1	1391
EGAD00010002579	Genotype data (in binary PLINK format) and imputed data (with merged 1000Gp3 and AGVP reference panel in GEMMA BIMBAM dosage format) for 355 individuals from the VAC050 trial performed in Banfora, Burkina Faso - X chromosome.	Illumina HumanOmni25M-8v1-1	353
EGAD00010002580	Genotype data (in binary PLINK format) and imputed data (with merged 1000Gp3 and AGVP reference panel in GEMMA BIMBAM dosage format) for 750 individuals from Respiratory and Meningeal Pathogens Unit in Soweto, South Africa - X chromosome.	Illumina HumanOmni25M-8v1-1	755
EGAD00010002581	Genotype data (in binary PLINK format) and imputed data (with merged 1000Gp3 and AGVP reference panel in GEMMA BIMBAM dosage format) for 355 individuals from the VAC050 trial performed in Banfora, Burkina Faso - autosomes.	Illumina HumanOmni25M-8v1-1	353
EGAD00010002582	Genotype data (in binary PLINK format) and imputed data (with merged 1000Gp3 and AGVP reference panel in GEMMA BIMBAM dosage format) for 750 individuals from Respiratory and Meningeal Pathogens Unit in Soweto, South Africa - autosomes.	Illumina HumanOmni25M-8v1-1	755
EGAD00010002583	Genotype data (in binary PLINK format) and imputed data (with merged 1000Gp3 and AGVP reference panel in GEMMA BIMBAM dosage format) for 1391 individuals from EMaBS in Entebbe, Uganda - X chromosome.	Illumina HumanOmni25-8v1-1	1391
EGAD00010002585	Genome-wide CpG methylation information of cell-free DNA samples from healthy controls	NovaSeq 6000	93
EGAD00010002586	Genome-wide CpG methylation information of genomic DNA samples from white blood cells	NovaSeq 6000	12
EGAD00010002587	Genome-wide CpG methylation information of cell-free DNA samples from cancer patients	NovaSeq 6000	16
EGAD00010002588	Genome-wide CpG methylation information of genomic DNA samples from tumor tissue	NovaSeq 6000	20
EGAD00010002590	Longitudinal whole-genome DNA methylation profiling of PBL obtained from UC patients categorized as responders and non-responders	Illumina Infinium HumanMethylation EPIC BeadChip	56
EGAD00010002592	122 unpaired initial tumor samples (122 FFPE samples) of sFL patients measured with OncoScan SNP microarrays, Affymetrix CEL intensity data file types (Thermo Fisher Scientific, Waltham, Massachusetts, USA)	Affymetrix	122
EGAD00010002593	149 unpaired initial tumor samples (133/149 FFPE and 16/149 fresh frozen samples) of lFL patients measured with OncoScan SNP microarrays, Affymetrix CEL intensity data file types (Thermo Fisher Scientific, Waltham, Massachusetts, USA)	Affymetrix	149
EGAD00010002596	The whole study comprises two patient cohorts. Screening cohort: 40 patients from Germany; validation cohort: 40 patients from Asia. Further, bile duct and CCA cell lines have been analyzed. This dataset contains expression array data for 32 patients of the screening cohort with 8 of them having paired normal tissue plus an additional relapse tumor/normal pair of one of those 8 patients and a patient only with normal tissue. Data was generated on a HumanHT-12 v4 Bead Array (Illumina) and is stored in IDAT file format.	HumanHT-12	41
EGAD00010002597	The whole study comprises two patient cohorts. Screening cohort: 40 patients from Germany; validation cohort: 40 patients from Asia. Further, bile duct and CCA cell lines have been analyzed. This dataset contains SNP array data for tumor/normal pairs for a subset of 36 patients from the screening cohort plus an additional relapse tumor of one of those 36 patients. Data was generated on an OmniExpress-24 v1.1 Bead Array (Illumina) and is stored in IDAT file format.	OmniExpress-24 v1.1	73
EGAD00010002599	In this study nanopore sequencing was applied to obtain sparse DNA methylation profiles from pediatric CNS tumor samples. A neural network was used to classify the tumor based on the obtained methylation profile.	Illumina Infinium EPIC	94
EGAD00010002608	DNA methylation arrays (850K, Illumina)	EPIC BeadChips (Illumina)	10
EGAD00010002610	Peripheral blood DNA methylome in adalimumab-treated patients with rheumatoid arthritis	Illumina Infinium HumanMethylation EPIC BeadChip	93
EGAD00010002612	Raw .RCC files for NanoString. nCounter PanCancer IO 360 Panel was used	NanoString nCounter PanCancer IO 360 Panel	60
EGAD00010002613	Raw idat files for DNA methylation profiling for 12 CCAs and 7 normal bile duct tissues. DNA methylation profiling was performed using Infinium MethylationEPIC v2.0 Kit.	Infinium MethylationEPIC v2.0 Kit	19
EGAD00010002617	S3 genotype data wave 5 (QC+ SNPs)	Illumina OmniExpress	4411
EGAD00010002618	S3 genotype data wave 6 (phenotypes)		2287
EGAD00010002619	S3 genotype data wave 6 (all SNPs)	Illumina OmniExpress	2287
EGAD00010002620	S3 genotype data wave 6 (QC+ SNPs)	Illumina OmniExpress	2287
EGAD00010002621	S3 genotype data wave 2-4 (phenotypes)		4412
EGAD00010002622	S3 genotype data wave 5 (all SNPs)	Illumina OmniExpress	4411
EGAD00010002623	S3 genotype data wave 1 (phenotypes)		435
EGAD00010002624	S3 genotype data wave 2-4 (all SNPs)	Affymetrix 6.0	4412
EGAD00010002626	S3 genotype data wave 1 (all SNPs)	Affymetrix 5.0	435
EGAD00010002627	S3 genotype data wave 5 (phenotypes)		4411
EGAD00010002628	S3 genotype data wave 2-4 (QC+ SNPs)	Affymetrix 6.0	4412
EGAD00010002633	Genetic characterisation of primary sclerosing cholangitis	Illumina Omni2.5-8Exome BeadChip	1
EGAD00010002635	Buccal sample methylation from breast cancer cases	Illumina HumanMethylationEPIC v1	94
EGAD00010002636	Buccal sample methylation data was generated from healthy controls	Illumina HumanMethylationEPIC v1	93
EGAD00010002638	CONTROL SAMPLES methylation data using Illumina EPIC technology	Illumina EPIC	791
EGAD00010002639	CASE SAMPLES methylation data using Illumina EPIC technology	Illumina EPIC	320
EGAD00010002645	Samples genotyped using Illumina Infinium Global Screening Array v3 for assessing pharmacogenomic genes	Illumina Global Screening Array v3	74
EGAD00010002647	DNA-methylation data of samples included in the GLASS-NL cohort	Illumina Infinium MethylationEpic BeadChip array	231
EGAD00010002649	Longitudinal DNA methylation discovery data as obtained using the Illumina HumanMethylation EPIC BeadChip array (V1) on peripheral blood from CD patients at the AmsterdamUMC prior to and during ustekinumab treatment	Illumina Infinium HumanMethylation EPIC BeadChip	117
EGAD00010002650	DNA methylation validation data as obtained using the Illumina HumanMethylation EPIC BeadChip array (V1) on peripheral blood from CD patients at the John Radcliffe Hospital, Oxford, UK prior to ustekinumab treatment	Illumina Infinium HumanMethylation EPIC BeadChip	34
EGAD00010002651	Longitudinal DNA methylation discovery data as obtained using the Illumina HumanMethylation EPIC BeadChip array (V1) on peripheral blood from CD patients at the AmsterdamUMC prior to and during vedolizumab treatment	Illumina Infinium HumanMethylation EPIC BeadChip	124
EGAD00010002652	DNA methylation validation data as obtained using the Illumina HumanMethylation EPIC BeadChip array (V1) on peripheral blood from CD patients at the John Radcliffe Hospital, Oxford, UK prior to vedolizumab treatment	Illumina Infinium HumanMethylation EPIC BeadChip	25
EGAD00010002654	Two to four sections of 10 µm thickness each (depending on sample size) from the respective formalin-fixed paraffin-embedded (FFPE) tissue sample were used for RNA isolation on 46 resected tumors as well as 17 paired biopsies. Digital gene expression analysis was performed on the NanoString nCounter platform, utilizing the NanoString MAX/FLEX system, with the PanCancer Immune Profiling panel as well as the PanCancer Pathway panel provided by NanoString.	NanoString nCounter MAX/FLEX	93
EGAD00010002656	EPIC methylation data of pleural mesothelioma samples and healthy pleura samples	Illumina Infinium HumanMethylation EPIC BeadChip	29
EGAD00010002657	EPIC methylation data of pleural mesothelioma samples and healthy pleura samples	Illumina Infinium HumanMethylation EPIC BeadChip	11
EGAD00010002660	PREGO individuals' birthplaces in epsg.io/2154 (RGF93 v1 / Lambert-93 -- France) geographic coordinates.		3234
EGAD00010002661	Core set of the PREGO biobank containing experimental genotypes at 209,706 autosomal sites measured on Affymetrix PMRA Axiom array plates for a group of 3,234 individuals from Western France.	Affymetrix PMRA Axiom array plates	3234
EGAD00010002663	86 samples from four human populations: two from Central Asia and two from Southeast Asia	Illumina	86
EGAD00010002667	Illumina Infinium MethylationEPIC Array profiling of 93 pheochromocytoma and paraganglioma tumours with and a germline SDHB mutation	Illumina Infinium MethylationEPIC Array	93
EGAD00010002669	Medulloblastoma (MB) is the most common malignant brain tumor in children. It is a neuroectodermal tumor located in the cerebellum. International consensus recognizes four distinct subgroups of MBs including Wingless (WNT), Sonic Hedgehog (SHH), Group 3 (G3), and Group 4 (G4) with distinct molecular characteristics, prognoses, and mortality rates in patients. Here, we have compiled and evaluated a large international MB cohort for which we generated methylome data for 369 primary MB patient samples.	Illumina	369
EGAD00010002671	Whole-skin DNA Methylation profiled using Illumina Infinium HumanMethylation450 BeadChip Arrays. Methylation was quantified in beta values, which were normalised using the regRCPqn algorithm.	Illumina Infinium HumanMethylation450 BeadChip Arrays	414
EGAD00010002674		Illumina HumanMethylationEPIC version 1	421
EGAD00010002675		Illumina HumanMethylationEPIC version 1	423
EGAD00010002676		Illumina HumanMethylationEPIC version 1	422
EGAD00010002678	Comprehensive data on lifestyle-related modifiable factors, sociodemographic, anthropometric, economic, biochemical, and genetic markers related to the occurrence of cardiometabolic diseases as part of the observational cross-sectional survey 2015 Health Survey of Sao Paulo with Focus on Nutrition (2015 ISA-Nutrition), a population-based study. Data for 805218 SNPs in 841 individuals were genotyped using the Axiom 2.0 Precision Medicine Research Array in the Thermo Fisher Scientific laboratory (Affymetrix Inc, Santa Clara, CA).	Affymetrix Axiom SNP Array 2.0	841
EGAD00010002682	Array data for oesophageal and related samples - Ganguli et al (methylation array)	Illumina	327
EGAD00010002684	GWAS data of the AlpeDPD trial cohort	Illumina GSA V1.0	1146
EGAD00010002686	Single-cell proteomics of human bone marrow, FACS panel 2	Eclipse Tribrid	1
EGAD00010002687	Single-cell proteomics of human bone marrow	Eclipse Tribrid	8
EGAD00010002689	This dataset contains 24 methylation array samples of patients with desmoplastic small round cell tumor measured on Illumina 850k.	Illumina 850k	24
EGAD00010002691	GWAS data. GRCh37	Illumina H3Africa Custom Array	2018
EGAD00010002695	Description of phenotype and data dictionary		718
EGAD00010002696	PD samples genotyped by NeuroChip SNP array	Illumina iScan	698
EGAD00010002698		Illumina HumanMethylationEPIC version 1	135
EGAD00010002699		Illumina HumanMethylationEPIC version 1	136
EGAD00010002700		Illumina HumanMethylationEPIC version 1	136
EGAD00010002702	Ctrl samples: TMT-proteomics using Orbitrap Fusion™ Lumos™ Tribrid™ Mass Spectrometer, TMT10plex (N=8 sample batch) or TMT16plex (2x N=16 sample batches) Isobaric Label Reagent (Thermo Fisher Scientific, Waltham, MA, USA), experiment type: bottom-up proteomics; data-dependent acquisition.	Orbitrap Fusion	40
EGAD00010002703	CUD samples: TMT-proteomics using Orbitrap Fusion™ Lumos™ Tribrid™ Mass Spectrometer, TMT10plex (N=8 sample batch) or TMT16plex (2x N=16 sample batches) Isobaric Label Reagent (Thermo Fisher Scientific, Waltham, MA, USA), experiment type: bottom-up proteomics; data-dependent acquisition.		1
EGAD00010002705	Genotyping of 45 individuals using the Illumina Global Screening Array-24 v3	Illumina GSA v3	74
EGAD00010002706	Genotyping of 65 individuals using the Illumina Global Screening Array-24 v2	Illumina GSA v2	11
EGAD00010002707	Genotyping of 45 individuals using the Illumina Global Screening Array-24 v3	Illumina GSA v3	50
EGAD00010002709	This dataset is related to the EMBARCAM BC360 PROJECT. A total of 106 breast cancer samples, from Pregnancy-Associated Breast Cancer (PABC) (n=57) and non-PABC (n=49) patients, were analysed using the nCounter Breast Cancer 360 V2 panel on the NanoString platform. All samples were FFPE tumor samples from breast cancer patients, derived from GEICAM/2012-03, GEICAM/2017-07, and EpiGEICAM studies, sponsored by GEICAM.	Nanostring Technology	108
EGAD00010002711	Expression analysis of FACS-sorted human cells from primary tumors and distant-organ micrometastases in xenografts. Patient-derived xenografts were generated by transplanting tumor cells from metastatic breast cancer patients into the mammary fat pad of immunocompromised mice.	Affymetrix Human Genome U133 Plus 2.0 Array	50
EGAD00010002714	Dataset contains imputed SNP genetics data for human phenotype project https://humanphenotypeproject.org/. The samples are genotypes of 8958 individuals that underwent genetic sequencing based on buccal swab. Sequencing is performed at Neogen GeneSeek Laboratories using Illumina NovaSeq at low-pass, targeting coverage levels of 0.5× and 1×. Imputation was done using loimpute-v0.1.0 over the aligned reads. Imputation is done for the autosomal chromosomes (1-22) using The 1000 Genomes phase 3 haplotypes (1KGP3) as a reference panel. The samples are provided in merged PLINK 1 binaries (.bed/.bim/.fam).	Gencove	8958
EGAD00010002718	Illumina MethylationEPIC BeadChip data generated from tissue samples collected from head and neck cancer patients refractory to anti-PD-1 therapy, collected before and after the 5AZA treatment	Illumina MethylationEPIC BeadChip arrays	16
EGAD00010002724	Raw methylation array data (Illumina Infinium EPIC v2.0) from patients with rare neurodevelopmental disorders. IDAT files were generated from DNA extracted from whole blood at the ASGARD platform, Inserm U1245, Univ Rouen, CHU Rouen, to investigate new episignatures and provide independent validation datasets for existing episignatures.	EPIC Illumina arrays	63
EGAD00010002726	Three different types of samples were used: 19 normal adjacent, 17 adenoma and 19 colorectal tumor tissue samples. This included 10 pairs of colorectal cancer and normal samples and 1 pair of adenoma-normal samples of the same patient. Tissue specimens were formalin-fixed paraffin-embedded (FFPE). DNA was isolated using the QIAamp FFPE Tissue kit (Qiagen, Hilden, DE) according to the manufacturer's instructions. The processed DNA was run on the Illumina Human MethylEPIC® v1.0 BeadChip array.	Illumina EPIC 450K BeadChip assay	50
EGAD00010002730	Pan-cancer phosphoproteomics profiling data for 1000 retrospective and prospective samples of the MASTER/INFORM cohort. Shotgun proteomics.	Thermo Orbitrap Tribrid Mass spectrometer	1364
EGAD00010002732	South African breast cancer GWAS genotype data for 2823 female African breast cancer cases. The data was generated using the H3Africa Custom microarray and genotyped on the Illumina HiScan instrument. The dataset is in VCF format.	Illumina HiScan	2823
EGAD00010002734	Tumor and matched normal DNA profiling by SNP array	Illumina Infinium OmniExpress-24 BeadChip array	48
EGAD00010002736	Intensity files (.idat) derived from the Infinium CytoSNP-850K BeadChip microarray (Illumina) were used to analyze a possible Homologous-Recombination-Repair Deficiency (Molecular Rationale for a PARP-Inhibitor therapy) in n = 39 tumor patients (n = 39 .idat files each for the green and red channels) presented in the Molecular Tumorboard at the University Hospital Erlangen. Using these files, a Genomic Instability Score was derived bioinformatically (R-based scripts, e.g. ASCAT, scarHRD) based upon the presence of three parameters: "Loss of Heterozygosity, LOH", "Large Scale Transitions, LST" and "Telomeric Allelic Imbalances, TAIs". The associated manuscript is published in International Journal of Cancer, IJC.	Infinium CytoSNP-850K v1.4 Bead Chips array, Illumina	78
EGAD00010002738	This dataset contains the DNA methylation data of 904 samples from Dutch healthy individuals.	Illumina EPIC	904
EGAD00010002740	This dataset contains DNA methylation data from 400 individuals (200 Type 2 Diabetes cases and 200 controls) from the GCAT (Genomes for Life) cohort. The methylation profiling was performed using the Illumina Infinium MethylationEPIC v2.0 BeadChip, which offers comprehensive coverage of over 850,000 CpG sites. Data are provided in IDAT file format, enabling raw signal-level analysis and downstream processing.	Infinium MethylationEPIC v2.0	400
EGAD00010002742	This dataset contains WGS data of 234 Japanese, including 88 COVID-19 patients and 146 healthy subjects. Libraries were sequenced on the HiSeqX or NovaSeq 6000 system. Sequenced reads were aligned against the Genome Reference Consortium human genome build 38 using BWA-MEM.	HiSeqX or NovaSeq 6000	234
EGAD00010002744	4 sets of matched patient-derived-organoids (PDOs), -neurospheres (PDNs), and parent glioblastoma tissue (12 samples total) underwent SNP array analysis. 200 ng of each sample DNA was extracted via a commercial kit (Qiagen) and loaded onto the Global Screening Array v2.0 array. Data is provided as raw red and green .idat files.	Illumina Global Screening Array v2.0	12
EGAD00010002745	3 sets of matched patient-derived-organoids (PDOs), -neurospheres (PDNs), and parent glioblastoma tissue (9 samples total) underwent methylation array analysis. 200 ng of each sample DNA was extracted via a commercial kit (Qiagen) and loaded onto the Illumina EPIC v2.0 array. Data is provided as raw red and green .idat files.	Illumina EPIC v2.0	9
EGAD00010002747	Methylation profiling of 228 samples, using the approach described in "Molecularly matched targeted therapies plus radiotherapy in patients with newly diagnosed glioblastoma without MGMT promoter hypermethylation (N2M2/NOA-20 phase I/IIa umbrella trial)"	Infinium Methylation EPIC BeadChip	228
EGAD00010002749	4988 samples from the GCAT cohort, genotyped with MEGAex-Infinium Array, with data for Cr1-23. PLINK files with QC and imputed using TOPMed r2.	Illumina-Genotyping Array	4988
EGAD00010002752	Rounded Log2 CPM well-expressed miRNA levels in plasma; determined using the HTG EdgeSeq miRNA Whole Transcriptome Assay	Illumina NextSeq 500	754
EGAD00010002753	Rounded chronological age of the participants		1
EGAD00010002754	Rounded Log2 CPM well-expressed miRNA levels in plasma; determined using the HTG EdgeSeq miRNA Whole Transcriptome Assay	Illumina NextSeq 500	1930
EGAD00010002755	Rounded chronological age of the participants		1
EGAD00010002758	2746 samples from the GCAT cohort, genotyped with GSA Array, with data for Cr1-23. PLINK files with QC and imputed using TOPMed.	Illumina-Genotyping Array	2746
EGAD00010002760	Methylation profiling of 9 CDS and ES tumor samples, using the approach described in "Patient-derived tumoroids from CIC::DUX4 rearranged sarcoma identify MCL1 as a therapeutic target"	Infinium Methylation EPIC BeadChip	9
EGAD00010002762	Blood-derived DNA was obtained from samples within the NICOLA cohort. Genetic data was generated using the Illumina Infinium CoreExome-24 array (Illumina, USA) and is available for 2,978 participants (551,839 directly genotyped and 18,148,478 imputed single nucleotide polymorphisms following initial quality control). Genetic .bed .bim and .fam files are available.	Illumina Infinium CoreExome-24	2978
EGAD00010002765	This dataset contains the DNA methylation data of 439 samples from iMED cohort	Illumina EPIC	439
EGAD00010002767	This dataset contains the DNA methylation data of 384 samples from individuals with one or more comorbidities	Illumina EPIC	384
EGAD00010002769	This dataset contains the DNA methylation data of 260 samples from Dutch healthy individuals.	Illumina EPIC	260
EGAD00010002771	20 genotyped Kiritimati individuals	Illumina	20
EGAD00010002773	This dataset contains ovarian endometrioid and ovarian, stomach, pancreatic and colorectal mucinous tumor tissue samples	Illumina Infinium Methylation EPIC array	1
EGAD00010002784	Primary pancreatic neuroendocrine tumors mutated in Menin, DAXX or ATRX	Illumina Infinium MethylationEPIC	4
EGAD00010002788	Processed miRNA expression profiles and clinical phenotype data from patients with cardiovascular diseases (dilated cardiomyopathy, acute coronary syndrome, ischemic cardiomyopathy, and coronary artery disease) and healthy controls. MicroRNA expression was assessed using Agilent human miRNA microarrays with RMA normalization and log2 transformation. Dataset includes expression values for all detected case/control miRNAs from miRBase v21, along with clinical metadata including disease status, age, sex, and study center.	Agilent Human miRNA Microarray	4762
EGAD00010002790	Genotype dataset from individuals diagnosed with phenytoin-induced Stevens-Johnson syndrome (SJS) or toxic epidermal necrolysis (TEN) and control individuals from the Thai population. DNA was extracted from whole blood, and SNP genotyping was performed using the Illumina HumanOmniExpressExome-8 v1.2 array.	Illumina HumanOmniExpressExome-8 v1.2 array	1965
EGAD00010002795	Genome-wide data for 432 Admixed individuals from Peru	Illumina 2.5M Human Omni array	432
EGAD00010002801	Whole blood methylation data measured using Illumina MethylationEPIC v2.0 array	Illumina MethylationEPIC v2.0	88
EGAD00010002806	The PacBio control dataset for differential methylation analysis included 39 control samples without methylation disorders. All samples were sequenced using PacBio Revio® HiFi reads at a matched coverage of 15X, and methylation was called using pb-cpg-tools on aligned BAM files.	Pacbio	39
EGAD00010002807	The PacBio cases dataset for differential methylation analysis included 10 case samples with known KMT2A loss-of-function variants and 2 VUS samples. All samples were sequenced using PacBio Revio® HiFi reads at a matched coverage of 15X, and methylation was called using pb-cpg-tools on aligned BAM files.	Pacbio	14
EGAD00010002818	Raw Axiom v2.0 UK Biobank Array CEL files for PCA Atlas donor buffy-coat DNA, used to generate SNP genotypes for demultiplexing single-cell and spatial transcriptomics libraries.	Applied Biosystems Axiom v2.0 UK Biobank Array	24
EGAD50000000001	Targeted sequencing with the Myeloid Solutions™ Panel (MYS) of SOPHiA Genetics for COVID-19 patients (n=241 deceased, n=239 survivors).	NextSeq 500	480
EGAD50000000005	We used novel processing techniques to obtain whole genome data together with 3D anatomic and histomorphologic analysis in two men (GP5 and GP12) with high risk PrCa undergoing radical prostatectomy. A total of 22 whole genome-sequenced sites (16 primary cancer foci and 6 lymph node metastatic) were analyzed using evolutionary reconstruction tools and spatio-evolutionary models. Probability models were used to trace spatial and chronological origins of the primary tumor and metastases, chart their genetic drivers, and distinguish metastatic and non-metastatic subclones.	Illumina NovaSeq 6000	24
EGAD50000000006	Dataset: AfricanNeo_B; genotyping batch: SE-2209_191110; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2017_20021485_A2). Dataset: AfricanNeo_B; genotyping batch: TE-2567_201023_2019; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2019_20037295_B1).		156
EGAD50000000007	Dataset: AfricanNeo_F; genotyping batch: OE-0808_150625; array: Illumina Omni2.5-Octo BeadChip.		29
EGAD50000000008	Dataset: AfricanNeo_A; genotyping batch: SE-2209_191110; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2017_20021485_A2). Dataset: AfricanNeo_A; genotyping batch: TI-2658_201112; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2019_20037295_B1).		1027
EGAD50000000009	Dataset: AfricanNeo_C; genotyping batch: TE-2567_201023_2017; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2017_20021485_A2). Dataset: AfricanNeo_C; genotyping batch: TE-2567_201023_2019; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2019_20037295_B1).		300
EGAD50000000010	Dataset: AfricanNeo_D; genotyping batch: RK-2011_190308; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2017_20021485_A2). Dataset: AfricanNeo_D; genotyping batch: SE-2209_191110; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2017_20021485_A2).		151
EGAD50000000011	Dataset: AfricanNeo_E; genotyping batch: TC-2508_200401_A; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2017_20021485_A2). Dataset: AfricanNeo_E; genotyping batch: TC-2508_200401_B; array: Illumina H3Africa Consortium array (BeadChip type: H3Africa_2019_20037295_B1).		100
EGAD50000000012	QuantSeq 3'-mRNAseq. 5 donors, 4 stimuli, 2 time points	NextSeq 500	60
EGAD50000000013	Dataset contains text files that describe the chromosome, position, and read count for an amplicon using the RealSeqS assay.	Illumina HiSeq 4000	150
EGAD50000000014	ChIP-seq has been performed on 5 tumor fresh-frozen primary breast cancer tissues from female patients. Immunoprecipitation has been performed for ERa (SC-543, Santa Cruz). Raw single-end fastq data have been aligned with bwa-mem using the Hg19 genome assembly as reference. Unfiltered alignment bam files are provided.	Illumina HiSeq 2500	6
EGAD50000000015	An oligo-captured STARR-seq library was generated using as targets ERa binding regions identified by ChIP-seq in Ishikawa (Endometrial cancer) and T47D (breast cancer) cell lines, and breast cancer primary tissues. STARR-seq assays have been performed in MCF-7 and Ishikawa cell lines. Cells have been cultured for 3 days in hormone-deprived (phenol-red free) media and stimulated for 6 hours by 10nM estradiol (E2) or DMSO (negative control).	Illumina NovaSeq 6000	15
EGAD50000000016	We evaluate the analytical performance of the PGDx™ elio™ tissue complete assay, a 505 gene next-generation sequencing (NGS) tissue-based assay, that has now been FDA-cleared for use by physicians to help guide treatment decisions for cancer patients, using a NSCLC cohort of 38 patients.	NextSeq 500	38
EGAD50000000017	Results of sequencing the targeted RNA-seq panel of breast cancer patients and controls. This dataset comprises FASTQ files of 295 samples (both breast cancer patients and controls) and the results of mapping the FASTQ files to the reference human genome (hg38), available in the form of a raw gene expression matrix.	HiSeq X Ten	295
EGAD50000000018	SLE serum stimulation induced a unique expression profile in human colon organoids marked by a reduction in goblet cell marker expression and mucus composition. Transcriptomic analysis of SLE human colon biopsies displayed a downregulation of epithelial secretory markers. Collection of raw sequencing data used in this publication https://doi.org/10.1101/2023.07.04.547690	Illumina NovaSeq 6000	64
EGAD50000000019	Dataset includes fastq files for RNA-Seq experiments for tumor samples of PPGL patients. Single end reads fastq files are available for 81 different samples	Illumina NovaSeq 6000	81
EGAD50000000020	Here we performed single-cell RNA sequencing to characterize NK cell subsets. Sorted NK cell subsets representing discrete stages of NK cell differentiation as well as bulk NK cells were sequenced.	Illumina HiSeq 4000	17
EGAD50000000021	Cholangiocarcinoma (CCA) is a type of cancer with few effective systemic therapies. Elucidation of the molecular landscape of the disease from genomic studies based on next generation sequencing (NGS) has contributed to the introduction of new targeted therapies. One of these treatments consists of a class of small molecules that target members of the FGFR family of receptor tyrosine kinases. These drugs are effective and have been approved for cholangiocarcinomas with fusions or rearrangements of FGFR genes. In contrast, the role of these inhibitors in cholangiocarcinomas with mutations in FGFR genes is less well defined. We report here a patient with a cholangiocarcinoma bearing an FGFR2 p.Ser252Trp mutation. The patient was treated with two different FGFR inhibitors, as the first caused ocular toxicity. She obtained clinical benefit from both. This case illustrates the efficacy of FGFR inhibitors on cholangiocarcinoma with specific point mutations. This is the first case to report the clinical benefit of these drugs in FGFR2 p.Ser252Trp mutation. Clinical benefit can be sustained, as seen in our patient. Our case also shows that FGFR inhibitor-induced adverse effects, such as ocular toxicities, may not recur after re-challenge with an alternative drug of the same class.	Ion Torrent S5	1
EGAD50000000022	7 Isoform sequencing or Long-read RNAseq mapped bam files. Reads were aligned to HG38 reference using Minimap2.	PacBio RS II	7
EGAD50000000023	19 H3K27ac HICHIP from T-ALL patient samples and one healthy normal control sample. Reads were aligned to HG38 reference.	Illumina NovaSeq 6000	20
EGAD50000000024	19 ATAC-seq from T-ALL patient samples and one healthy normal control sample. Reads were aligned to HG38 reference	Illumina NovaSeq 6000	20
EGAD50000000027	RNA-seq data for 85 patients with acute lymphoblastic leukemia expressing the gamma delta T cell receptor (γδ T-ALL)	Illumina HiSeq 3000	85
EGAD50000000028	Whole genome sequencing data for 61 samples with acute lymphoblastic leukemia expressing the gamma delta T cell receptor (γδ T-ALL) and 29 germline samples	Illumina HiSeq 3000	90
EGAD50000000029	Single-cells from primary cB-ALL samples were isolated using an inverted microscope coupled to a micromanipulator equipped with glass capillary for cell collection. A minimum of 20 cells were isolated per sample in microdrops of 2.5 μl of phosphate buffered saline (PBS) with 0.5% polyvinylpyrrolidone. Cell lysis and DNA amplification were performed using the SurePlex DNA Amplification System (Illumina). Genomic DNA was subsequently fragmented and tagged with the VeriSeq PGS transposome and the TruSeq Index adapters by PCR for library preparation (VeriSeq PGS Library Prep Kit, Illumina). Equal volumes of normalized libraries were pooled and sequenced on an Illumina MiSeq platform with 1×75-bp single-end sequencing. Reads were subsequently aligned to the human reference genome (GRCh38/hg38) using Bowtie2 (version 2.2.4).	Illumina MiSeq	9
EGAD50000000030	This dataset contains 224 paired fastq files (112 single cells) from the following samples: brains from two Multiple System Atrophy patients and one control, and non-brain controls (fibroblasts, NA12878)	unspecified	112
EGAD50000000031	This dataset comprises a genomic and a phenotypic excel sheet displaying 137 cases classified as suspected Lynch syndrome. Included are genotypic data such as tumour mutational burden, tumour mutational signatures and germline/somatic variant calls for each colorectal, endometrial or sebaceous skin tumour screened. Genomic data is derived from targeted multigene panel sequencing including ~300 hereditary cancer genes.	Illumina NovaSeq 6000	137
EGAD50000000032	RNAseq for #1049, #111, #1217, #206, COV362	Illumina NovaSeq 6000	5
EGAD50000000033	WGS of tumour PDX and matched patient blood	Illumina NovaSeq 6000	5
EGAD50000000034	Exome sequencing of PDX and matched patient blood	NextSeq 500	4
EGAD50000000035	BROCA panel sequencing using the BROCA-HR v8 and BROCA-GO v1 versions of the gene panel	Illumina HiSeq 2500 Illumina NovaSeq 6000	15
EGAD50000000036	Deposited here are whole-genome sequencing bam or fastq files from the experimental isogenic cell lines used in the study "The chemotherapeutic CX-5461 is extremely mutagenic and may increase cancer risk, Koh, Gene (2023)". The bam files were aligned to GRCh38/hg38 using BWA-MEM. The corresponding variant call data have been deposited on Mendeley Data, V2, doi: 10.17632/d58cv549v6.2	Illumina NovaSeq 6000	65
EGAD50000000037	Genotype measured by Illumina GSA		1063
EGAD50000000038	Solitary fibrous tumor/Hemangiopericytoma (SFT/HPC) is a rare subtype of soft tissue sarcoma associated with NAB2-STAT6 gene fusions. In this study, a novel SFT/HPC was characterized using whole genome sequencing.	Illumina NovaSeq 6000	2
EGAD50000000039	Solitary fibrous tumor/Hemangiopericytoma (SFT/HPC) is a rare subtype of soft tissue sarcoma associated with NAB2-STAT6 gene fusions. This study established and characterized a novel SFT/HPC patient-derived cell line called SFT-S1. Potential drug candidates that could be repurposed for the treatment of SFT/HPC were screened. Screening was performed through RNA-Seq	Illumina NovaSeq 6000	9
EGAD50000000040	Solitary fibrous tumor/Hemangiopericytoma (SFT/HPC) is a rare subtype of soft tissue sarcoma associated with NAB2-STAT6 gene fusions. This study established and characterized a novel SFT/HPC patient-derived cell line called SFT-S1 using the twist human methylome panel.	Illumina NovaSeq 6000	3
EGAD50000000042	sEVs were isolated from postmortem human brain tissue. RNA was extracted from the source brain tissue and the sEVs and prepared for RNA sequencing. cDNA was sequenced on the Illumina HiSeq using a 2x150 paired-end read configuration.	Illumina HiSeq 2500	48
EGAD50000000043	Human brain sEVs were isolated from postmortem human brain tissue. RNA was extracted from both the source brain tissue and sEVs and prepared for long-read sequencing. cDNA was sequenced on the PacBio Sequel II. The data provided have undergone processing using the tool ccs (v6.0.0) to generate circular consensus reads. For further processing, it is recommended to follow the guidelines at https://isoseq.how/, starting with using lima to remove cDNA primers.	Sequel II	48
EGAD50000000044	Single-nucleus mRNA Sequencing of prenatal and postnatal samples from the brain and its border regions. Most samples were multiplexed with several samples run in one 10X reaction. A separate immune cell dataset, combined with published data from Braun et al 2023 and Yang et al 2021 and integrated using harmony, is included.	NextSeq 1000 NextSeq 550	11
EGAD50000000045	Single-cell RNA-sequencing. Some samples were analyzed using CITE-Seq and samples were multiplexed and run on the same 10X reaction	Illumina HiSeq 1000 NextSeq 1000 NextSeq 550	6
EGAD50000000046	mCEL-Seq2 analysis with reference mapping of human brain tissues and border regions	Illumina HiSeq 4000	17
EGAD50000000047	Single-nucleus fixed mRNA profiling of FFPE samples from patients after sex-mismatched blood stem cell transplantation and matched controls.	NextSeq 1000	5
EGAD50000000048	CGMH-OCCC-WES data (tumor-normal paired) from 104 patients with ovarian clear cell carcinoma.	Illumina NovaSeq 6000	208
EGAD50000000049	This dataset contains exome sequencing data, phenotypic information, and somatic mutation analysis results for 44 diagnosis-relapse DLBCL pairs. Exon sequencing data were captured using the Agilent HaloPlex exon kit. BAM files were obtained from Illumina sequencing followed by BWA alignment.	unspecified	108
EGAD50000000050	The Papua New Guinean Lowlanders dataset includes 41 whole genome sequences for Papua New Guinean individuals sampled in Daru. DNA was extracted from saliva samples (Oragene kit). Sequencing libraries were prepared using the TruSeq DNA PCR-Free HT kit. 150 bp paired-end sequencing was performed on the Illumina HiSeq X5 sequencer. The PGAP dataset provides Fastq, mapped cram files (GRCh38) and phenotype measurements.	HiSeq X Five	41
EGAD50000000051	This dataset contains 84 paired fastq files of Bulk RNAseq collected from 14 patients with Extramedullary multiple myeloma, 14 patients with newly diagnosed multiple myeloma and 14 patients with Relapsed/refractory multiple myeloma	Illumina NovaSeq 6000	42
EGAD50000000052	This dataset contains 72 paired fastq files of raw Whole exome sequencing data from 14 patients with Extramedullary myeloma, 8 paired samples from the same patients at the time of Newly diagnosed myeloma, and the corresponding normal samples	Illumina NovaSeq 6000	36
EGAD50000000053	This dataset contains 16 fastq files of raw single-cell RNAsequencing data from 6 patients with Extramedullary multiple myeloma used for the study "Longitudinal Multi-Omics Study Reveals Molecular Drivers and Tumor Microenvironment in Extramedullary Multiple Myeloma"	Illumina NovaSeq 6000	6
EGAD50000000054	We performed 10X Chromium 5' scRNA and scTCR sequencing on 4 on-treatment HNSCC PBMC samples.	Illumina NovaSeq 6000	4
EGAD50000000055	We performed 10X Chromium 3' scRNA sequencing of 23 pre- and on-treatment HNSCC biopsy samples.	Illumina NovaSeq 6000	23
EGAD50000000056	We performed 10X Chromium 5' scRNA (n=50), scTCR (n=48) and scBCR (n=49) sequencing of pre- and on-treatment HNSCC biopsy samples.	Illumina NovaSeq 6000	50
EGAD50000000057	Single-cell RNA-seq data of prospectively collected normal ovarian tissue sample from 1 healthy individual and tumor tissue samples with or without chemotherapy from 19 HGSOC patients.	Illumina NovaSeq 6000	29
EGAD50000000058	This dataset was generated for EpiHK. It contains histone PTM ChIP-seq, WGBS, and RNA-seq of 7 human hepatocellular carcinoma and 7 tumor-adjacent normal tissue samples.	Illumina HiSeq 1500 Illumina HiSeq 4000 NextSeq 500	14
EGAD50000000059	Skeletal muscle of Inuit homozygous carriers of the common Greenlandic TBC1D4 p.Arg684Ter variant is severely insulin resistant but has normal metabolic responses during exercise	Illumina HiSeq 4000	58
EGAD50000000060	We aimed to identify somatic mutations and transcriptional differences that could explain the resistance to Doxorubicin. This dataset includes RNA-Seq of HCC biopsies and Organoids and WES of Organoids.	Illumina HiSeq 2500 Illumina NovaSeq 6000 unspecified	52
EGAD50000000062	This dataset includes combined single-cell RNA-Seq and T cell receptor profiling data for SARS-CoV-2 spike protein-reactive CD4+ T cells and NK cells from blood, liver, lungs, and bone marrow of human donors.	Illumina NovaSeq 6000	16
EGAD50000000063	The data set consists of fastq raw files from RNA-seq of seven mucosal biopsies of the colon from seven patients, among them three patients with irritable bowel syndrome with diarrhea-predominant symptoms. Paired end sequencing on Illumina NovaSeq 6000 was used.	Illumina NovaSeq 6000	8
EGAD50000000064	These data consist of transcriptome and chromatin accessibility data (RNAseq and ATACseq respectively) derived from five High-Grade Serous Ovarian Carcinoma (HGSC) cell lines. HGSC lines included PEO1, PEO4, PEA2, OVCAR5 and OVCAR8. Samples were taken pre- and post-treatment with a novel epigenetic compound, HKMTi-1-005, which targets the activity of two histone methyltransferases, EZH2 and G9a.	Illumina NovaSeq 6000	36
EGAD50000000065	Fastq files of 18 pediatric sarcoma PDX WXS samples. Each sample was sequenced on 2 Lanes.	unspecified	36
EGAD50000000066	Heterozygous (HET) truncating mutations in the TTN gene (TTNtv) encoding the giant titin protein are the most common genetic cause of dilated cardiomyopathy (DCM). We investigated 127 clinically identified DCM human cardiac samples with targeted sequencing using the TruSight Cardio panel on an Illumina MiSeq system with a special focus on TTNtvs. This dataset belongs to the publication of Kellermayer, D et al. Truncated titin is integrated into the human dilated cardiomyopathic sarcomere	Illumina MiSeq	127
EGAD50000000067	single-cell RNAseq dataset	Illumina NovaSeq 6000	53
EGAD50000000068	Bulk RNAseq PBMC	Illumina NovaSeq 6000	10
EGAD50000000069	Bulk CD14 RNAseq	Illumina NovaSeq 6000	10
EGAD50000000070	PBMC RNAseq drug in vitro	Illumina NovaSeq 6000	47
EGAD50000000072	Agilent CNS cohort	NextSeq 500	97
EGAD50000000073	Agilent Sarcoma cohort	NextSeq 500	52
EGAD50000000074	Illumina cohort	NextSeq 500	40
EGAD50000000075	Enzymatic conversion-based methylation sequencing data (EM-seq) for colon cancer used in the MESA study	Illumina NovaSeq 6000	228
EGAD50000000076	This dataset contains long read transcriptome sequencing of 19 chronic lymphocytic leukemia (CLL) patients with or without mutation in SF3B1, as well as 6 B cell samples from healthy individuals. The BAM files are unaligned CCS reads.	Sequel II	25
EGAD50000000077	This dataset contains long read transcriptome sequencing of 25 myelodysplastic syndrome (MDS) patients with or without mutation in SF3B1. The BAM files are unaligned CCS reads.	Sequel II	25
EGAD50000000078	This dataset contains Illumina stranded RNA-seq of 19 chronic lymphocytic leukemia (CLL) patients with or without mutation in SF3B1, as well as 6 B cell samples from healthy individuals. RNA was extracted from CLL cells.	Illumina HiSeq 2500	27
EGAD50000000079	387 swab samples were sequenced. The bam files contain consensus reads.	DNBSEQ-G400	387
EGAD50000000080	40 buccal mucosa samples and a paired blood sample were whole-exome sequenced.	DNBSEQ-G400	41
EGAD50000000081	This dataset contains 10x scRNA sequencing of glioblastoma samples. Sequencing was performed on an Illumina HiSeq 4000. The sequencing was always paired.	Illumina HiSeq 4000	22
EGAD50000000082	Recordings of elevated calcium levels allowed selection of cells from heterogeneous populations for transcriptomic analysis. Paired RNA-Seq from S24 cells using the Takara SMARTer Ultra Low Input RNA v4 kit and sequencing on an Illumina NovaSeq 6000.	Illumina NovaSeq 6000	18
EGAD50000000083	H3K27ac ChIP-seq and RNA-seq of lung neuroendocrine tumors	Illumina HiSeq X Illumina NovaSeq 6000 NextSeq 500	76
EGAD50000000084	This dataset consists of amplicon targeted NGS paired-end raw data (FASTQ R1 and R2) obtained from 148 colon cancer patients. Specifically, we have 148 primary tumor tissue samples, 148 white blood cell samples, and 118 plasma samples collected at baseline.	NextSeq 550	414
EGAD50000000085	This dataset contains VCF files for 124 central nervous system glioma samples for the study "Cerebrospinal fluid cfDNA sequencing for classification of central nervous system glioma"		124
EGAD50000000086	Medullary thyroid cancer (MTC) is a rare malignant tumor that arises from parafollicular cells. Approximately 8% of thyroid cancer cases are MTC, and about 25% of these have a hereditary component. Incorporating molecular parameters into tumor classification is important. Besides, the presence of pathogenic germline variants can impact directly on cancer prevention. Thus, the aim of this study was to perform whole exome sequencing (WES) on a consecutive series of hereditary RET wild-type MTC patients to identify genetic variants that may be involved in the carcinogenesis of this tumor. WES was performed on 28 patients negative for germline RET pathogenic variants using the NovaSeq 6000 platform. Variant classification followed American College of Medical Genetics and Genomics guidelines. Our study represents a significant advancement in gene discovery for MTC genetics.	Illumina NovaSeq 6000	28
EGAD50000000087	The project concerns whole genome sequencing (short-read and long-read) and RNA-sequencing of lipomatous tumors with 12q-amplification. A subset of lipomatous tumors is driven by amplification of genes mapping to chromosome arm 12q, including the MDM2 gene. The goals of the study were to compare expression levels of genes mapping to 12q in tumors with amplification in rod-shaped or circularized chromosomes as well as to assess and compare the structural variants in those tumors. In total, 20 samples were analyzed, and the data were correlated with genomic data on bulk and single cell DNA from the same tumors. The fastq files from the tumors were uploaded to EGA.	BGISEQ-500 NextSeq 500 Sequel	21
EGAD50000000088	Human engineered CRC organoids (APC KO; KRAS G12D; TP53 KO) were grown in glucose, lactate or with DCA. Samples were collected and processed for bulk ATAC-seq.	NextSeq 500	9
EGAD50000000089	Human engineered CRC organoids (APC KO; KRAS G12D; TP53 KO) were grown in either glucose or lactate, in the presence and absence of DCA or BRD4 inhibition (JQ1). Samples were collected and processed for bulk RNA-seq.	NextSeq 2000 NextSeq 500	21
EGAD50000000090	Human engineered CRC organoids (APC KO; KRAS G12D; TP53 KO) were grown in glucose, lactate or with DCA. Samples were collected 7 hours after the treatments and processed for bulk CHIC-seq.	NextSeq 500	3
EGAD50000000091	This dataset contains 2 BAM files sequenced with Illumina NextSeq 500 and NovaSeq 6000 and 20 files with variant calling sequenced with Illumina NextSeq 500 and NovaSeq 6000.	Illumina NovaSeq 6000 NextSeq 500	22
EGAD50000000092	This dataset contains scRNA sequencing data for CD8-positive lymphocyte samples. Sequencing was performed on an Illumina NextSeq 550. The sequencing was always paired.	NextSeq 550	1
EGAD50000000093	This dataset contains WGS and RNA sequencing data for 4 melanoma samples. Sequencing was performed on Illumina Novaseq 6000 and Illumina HiSeq X. The sequencing was always paired.	HiSeq X Ten Illumina NovaSeq 6000	3
EGAD50000000094	BAM file from Whole Exome Sequencing data of SARS-CoV-2 patients	Illumina NovaSeq 6000	392
EGAD50000000095	RNA-seq data for 101 samples with B-cell acute lymphoblastic leukemia	Illumina HiSeq 2500	101
EGAD50000000096	Whole genome sequencing data for 45 samples with B-cell acute lymphoblastic leukemia	Illumina HiSeq 2500	45
EGAD50000000097	Whole exome sequencing data for 69 samples with B-cell acute lymphoblastic leukemia	Illumina HiSeq 2500	69
EGAD50000000098	CD45- Single Cell RNA Sequencing data on 8 High Grade Serous Carcinoma Primary Tumors	Illumina NovaSeq 6000	2
EGAD50000000099	Whole genome bisulfite sequencing of prostate cancer samples upon oral pimonidazole administration	Illumina HiSeq 2000	24
EGAD50000000100	Long-read (PacBio) sequencing of two retinal organoid samples. One sample was treated with cycloheximide. Files are ccs BAM format files generated by a Sequel IIe machine.	Sequel IIe	2
EGAD50000000101	Long-read (PacBio) RNA sequencing dataset of three neural retinal samples. Files are raw BAM format files generated by a Sequel II machine. Additionally, the ccs3 BAM format files are included.	Sequel II	4
EGAD50000000102	Whole genome sequencing data for 423 samples, including diagnosis and germline, across multiple cancer types for cfDNA cohort	Illumina HiSeq 2500	423
EGAD50000000103	Whole exome sequencing data for 457 samples, including diagnosis and germline sample, across multiple cancer types for cfDNA cohort	Illumina HiSeq 2500	457
EGAD50000000104	Cell free DNA sequencing data for 233 samples across multiple cancer types for cfDNA cohort	Illumina NovaSeq 6000	273
EGAD50000000105	In-patient comparison of single-cell RNA-sequencing (scRNA-seq) and single-nucleus (snRNA-seq) technologies and accompanying tissue processing protocols on transjugular liver biopsy from decompensated cirrhosis patients (n = 3).	Illumina NovaSeq 6000	6
EGAD50000000107	The dataset includes DNA targeted paired-wise (2 FASTQ per sample) sequencing of a manually curated panel of 44 genes with a documented role in homologous recombination (HR), a pathway of DNA damage response. Sequencing was conducted in 69 tumors from patients with metastatic colorectal cancer after serial passaging in mice (patient-derived xenografts, PDXs). For each PDX model profiled for HR gene mutations, therapeutic annotation of response to FOLFIRI (a chemotherapeutic regimen consisting of the combination of 5-fluorouracil and irinotecan) is available.	Illumina HiSeq 4000	69
EGAD50000000111	This dataset contains 184 paired fastq files sequenced with Illumina NovaSeq 6000 from 38 participants. Relevant clinical and demographic data are also included.	Illumina NovaSeq 6000	38
EGAD50000000112	WES data of 17 tumors from 9 individuals that have biallelic germline CHEK2 pathogenic variants. Shallow whole genome sequencing files from 16 tumor samples from 9 individuals with germline biallelic pathogenic variants in CHEK2.	Illumina NovaSeq 6000	17
EGAD50000000113	Whole exome sequencing data from the tumors of individuals with constitutional mismatch repair deficiency. These are 16 tumors from the bigger study of 41 tumors in total (from 17 individuals in total).	Illumina NovaSeq 6000	16
EGAD50000000114	This study includes an open-label, phase 2 study to determine the activity of the anti-VEGF receptor tyrosine-kinase inhibitor, pazopanib, combined with the anti-PD-L1 immune checkpoint inhibitor, durvalumab, in unselected advanced sarcomas. We conducted whole exome and transcriptomic sequencing with pre-treatment tissue biopsy to correlate clinical outcomes with molecular and genomic biomarkers to identify patients who would most likely benefit from the combination treatment.	Illumina HiSeq 2500	83
EGAD50000000115	Intestinal organoids treated with interferon-gamma 1 ng/mL either for 6h or 18h.	NextSeq 500	9
EGAD50000000116	RNAseq for #111, #1177, #206, #201, #29, #931, WO-19, WO-2	Illumina NovaSeq 6000	8
EGAD50000000117	This study explores the evolution of tumor and blood immune microenvironment and related mechanisms that shape breast cancer progression.	HiSeq X Ten Illumina NovaSeq 6000	24
EGAD50000000119	Synthetic - This submission contains a subset of a synthetic dataset derived from the project Heilsa Tryggvedottir - a Nordic collaboration on sharing sensitive human data. Heilsa Tryggvedottir is funded by the Nordic e-Infrastructure Collaboration (NeIC), the ELIXIR nodes of Finland, Norway, and Sweden, Computerome in Denmark, and the Estonian Scientific Computing Infrastructure (ETAIS). In the synthetic data creation process, it was attempted to strike a fine balance between the usability of the datasets (e.g. technical FEGA development, testing, user training, and basic bioinformatics) and compliance with GDPR. File names and file content (e.g. headers in fastq) are anonymized. Moreover, the X, Y, and mitochondrial sequences have been discarded from the original data since these data can be used for maternal, paternal, or ethnic origin tracing. The dataset does not follow natural haplotype distribution (inherent to imputation panels). The only inputs derived from real sequence data are variant distribution density per chromosome and learning sequencing error models. The synthetic dataset consists of two fastq files, a cram file, a vcf file, and two index files.	unspecified	1
EGAD50000000120	We used RNA sequencing of 81 cervical cancer specimens to develop an immune-based gene expression signature to predict distant metastasis in cervical cancer patients treated with RT/cisplatin. Our 55-gene risk score, validated across independent cohorts, was strongly linked to higher rates of metastasis and lower survival. The score also correlated with higher copy-number alteration and a less immune-responsive tumor microenvironment, indicating its potential in identifying high-risk patients and informing targeted therapies.	Illumina HiSeq 2500 NextSeq 500	81
EGAD50000000123	ATAC-seq dataset for the paper. Title: Multi-omics analysis of human population variation in immune function and in vivo response to BCG vaccination. Abstract: Immune responses are tightly regulated, yet highly variable between individuals. To investigate human population variation of trained immunity, we immunized healthy individuals with Bacillus Calmette-Guérin (BCG). This live attenuated vaccine induces not only an adaptive immune response against tuberculosis, but also triggers innate immune activation and memory. We established personal immune profiles and chromatin accessibility maps over a time course of BCG vaccination in 323 individuals. This large resource uncovered genetic and epigenetic predictors of baseline immunity and BCG vaccine response. We found that BCG vaccination enhances the innate immune response only in individuals with dormant immune states at baseline, suggesting that exogenous induction of trained immunity is not a universal booster of innate immunity, but specifically elevates weak innate immune responses. This study advances our understanding of BCG’s heterologous immune-stimulatory effects and trained immunity in humans. Moreover, our results highlight the value of epigenetic cell states as an “endophenotype” that connects immune function with genotype and the environment.	Illumina HiSeq 3000	861
EGAD50000000124	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 10 samples (5 female, 5 male of Opole Voivodeship, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation. Reference Genome: GRCh37.		10
EGAD50000000125	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 12 samples (6 female, 6 male of Podlaskie Voivodeship, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation. Reference Genome: GRCh37.		12
EGAD50000000127	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 19 samples (8 female, 11 male of Lubusz Voivodeship, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation. Reference Genome: GRCh37.		19
EGAD50000000128	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 15 samples (8 female, 7 male of Warmian-Mazurian Voivodeship, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation. Reference Genome: GRCh37.		15
EGAD50000000129	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 17 samples (7 female, 10 male of West Pomeranian Voivodeship, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation. Reference Genome: GRCh37.		17
EGAD50000000130	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Exome Sequencing of 450 samples (227 female, 223 male, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA Prep with Enrichment. Reference Genome: GRCh37.		450
EGAD50000000131	Buccal epithelial cells of chimeric twins were isolated using laser-capture microdissection. Cells were pooled (13-60 cells) per batch to create a genomic DNA library using an NEB low-input kit.	Illumina NovaSeq 6000	2
EGAD50000000132	scRNAseq data generated with 10x genomics	Illumina NovaSeq 6000	2
EGAD50000000134	ChIPseq using anti-EZH2 with a sheared input DNA control to assess EZH2 genomic binding in one longitudinal pair of samples (pre- and post-treatment) from one UP and one DOWN responder GBM patient	NextSeq 500	4
EGAD50000000135	Bulk RNA-seq	Illumina NovaSeq 6000	5
EGAD50000000136	Spatial Transcriptomic	Illumina NovaSeq 6000	2
EGAD50000000137	Single cell RNA and TCR sequencing of γδ T cells FACS sorted from peripheral blood of two Merkel cell carcinoma patients	Illumina NovaSeq 6000	4
EGAD50000000138	The AVENIO ctDNA Expanded Kit is a next-generation sequencing (NGS) liquid biopsy assay with a 77 gene panel (192 kb) containing genes in U.S. National Comprehensive Cancer Network (NCCN) Guidelines and emerging cancer biomarkers. This pan-cancer assay was applied to 100 plasma samples from patients with lung cancer undergoing treatment in the OSCILLATE trial. After 150 bp paired-end sequencing, reads were aligned to the human genome reference with the AVENIO Oncology Analysis Software. These files are the sorted non-deduplicated alignments generated by the analysis software used for subsequent variant, indel and CNV calling.	NextSeq 500	100
EGAD50000000139	Source data for Figure.2A heatmap showing the gene expression across samples.		1213
EGAD50000000140	Cell annotation for single T and NK cells of PBMC collected from 113 patients, including UMAP coordinates and annotation.		1213
EGAD50000000141	Raw count matrix for CITE-seq data of PBMC collected from 113 patients.		1213
EGAD50000000142	Raw count matrix for scRNA-seq data of PBMC collected from 113 patients.		1213
EGAD50000000143	Clinical data for IMvigor130 cohort of patients, including treatment, response, survival time, and tumor PDL1-IC staining score.		1213
EGAD50000000144	Cell annotation for single CD8 T cells of PBMC collected from 113 patients, including UMAP coordinates and annotation		1213
EGAD50000000145	Cell annotation for single cells of PBMC collected from 113 patients, including UMAP coordinates and annotation.		1213
EGAD50000000146	A list of ctDNA samples included in the dataset.		171
EGAD50000000147	ctDNA sample data (one sample per line) for BFAST Cohort D including sample-level summaries like tumor fraction (cTF).		171
EGAD50000000148	ctDNA mutation calls (one mutation per sample per line) for BFAST Cohort D including mutation-level data like coding change, allele frequency, etc.		171
EGAD50000000149	Clinical data for BFAST Cohort D, including time-to-event, tumor response, and baseline prognostics data.		231
EGAD50000000150	Whole exome sequencing data for 425 samples with B-cell acute lymphoblastic leukemia	Illumina HiSeq 2500	425
EGAD50000000151	RNA-seq data for 14 samples with B-cell acute lymphoblastic leukemia	Illumina HiSeq 2500	14
EGAD50000000152	We obtained different clones from early passage CRC tumoroids to study mutational signatures specific for truncal or private somatic alterations. These whole genome sequences are part of a larger study on the heterogeneity and evolution of DNA mutation rates in microsatellite-stable colorectal cancer.	unspecified	15
EGAD50000000153	Dataset of 46 mCRC tumoroids from different passages (23 early-passage 3, 23 late-passage 8-12). FASTQ files of paired sequencing of PolyA-enriched total RNA	Illumina NovaSeq 6000	46
EGAD50000000154	3 datasets included: - H3K27ac ChIP in ETS2-edited and unedited TPP macrophages from 3 biological replicates - H3K27ac ChIP in ETS2-overexpressing and control M0 (resting) macrophages from 3 biological replicates - ATAC-seq (deep sequencing) in ETS2-edited and unedited TPP macrophages from 3 biological replicates	Illumina NovaSeq 6000	30
EGAD50000000155	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 24 samples (14 female, 10 male of Lublin Voivodeship, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation. Reference Genome: GRCh37.		24
EGAD50000000156	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 30 samples (15 female, 15 male of Lower Silesian Voivodeship, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation. Reference Genome: GRCh37.		30
EGAD50000000157	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 22 samples (11 female, 11 male of Subcarpathian Voivodeship, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation. Reference Genome: GRCh37.		22
EGAD50000000158	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 22 samples (11 female, 11 male of Kuyavian-Pomeranian Voivodeship, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation. Reference Genome: GRCh37.		22
EGAD50000000159	Whole genome/exome sequencing to detect spontaneous acquired mutations in mismatch repair-deficient human colon organoids.	Illumina NovaSeq 6000 Illumina NovaSeq X	9
EGAD50000000160	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 25 samples (14 female, 11 male of Pomeranian Voivodeship, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation. Reference Genome: GRCh37.		25
EGAD50000000161	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 37 samples (19 female, 18 male of Greater Voivodeship, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation. Reference Genome: GRCh37.		37
EGAD50000000162	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 6 samples (6 female of Holy Cross Voivodeship, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation. Reference Genome: GRCh37.		6
EGAD50000000163	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 47 samples (23 female, 24 male of Silesian Voivodeship, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation. Reference Genome: GRCh37.		47
EGAD50000000164	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 43 samples (16 female, 27 male of Mazovia Voivodeship, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation. Reference Genome: GRCh37.		43
EGAD50000000165	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 41 samples (20 female, 21 male of Lodz Voivodeship, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation. Reference Genome: GRCh37.		41
EGAD50000000166	This project used NGS (next generation sequencing), using Illumina NOVASEQ 6000 and Illumina DRAGEN aligner. The dataset includes BAM and BAM.BAI files from Whole Genome Sequencing of 32 samples (14 female, 18 male of Lesser Voivodeship, Poland from POPULOUS collection). Library Construction Protocol: Illumina DNA PCR-Free Prep, Tagmentation. Reference Genome: GRCh37.	Illumina NovaSeq 6000	32
EGAD50000000167	Fragle is a deep learning based two stage model that quantifies ctDNA from blood plasma derived cfDNA bam files. Fragle was developed using some previously published datasets and some newly generated data. Fragle was evaluated using some validation cohorts and some unseen cohorts. Some of these cohorts are newly created consisting of total 365 low pass (2-3X) whole genome sequencing bam files mapped to hg19/GRCh37. This dataset contains these newly generated bam files.	Illumina NovaSeq 6000	365
EGAD50000000168	Arcagen is an EORTC/SPECTA pan-European project that aims to recruit 1000 rare cancer patients from different tumour domains of EURACAN. This study collected samples from advanced or metastatic rare cancer from patients older than 12, and analysed them using Foundation Medicine next-generation sequencing (NGS) panels (FoundationOne CDx for FFPE samples or FoundationOne Liquid CDx for blood samples). Here we are submitting the dataset that contains NGS files from rare thoracic malignancies (n=102)	Illumina HiSeq 4000	102
EGAD50000000169	This dataset consists of RNA-seq CRAM raw-data files of 1063 primary colorectal cancer and 120 adjacent normal tissue samples. The expression profiles for these samples can be found at the ArrayExpress with accession number E-MTAB-12862.	BGISEQ-500	1183
EGAD50000000171	This study investigates translocation renal cell carcinoma (tRCC) to identify mutations contributing to tRCC progression. The analysis relies on Whole Exome Sequencing (WES) data obtained from 11 patients treated at University of Texas Southwestern Medical Center (UTSW) affiliated hospitals, including Parkland Hospital and Children’s Medical Center. Tumor samples and their corresponding normal DNA were sequenced using the Illumina platform.	Illumina NovaSeq 6000	34
EGAD50000000172	This study investigates differentially expressed genes associated with the progression of translocation renal cell carcinoma (tRCC) by analyzing RNA-Seq data from 23 tumor samples obtained from 12 patients. The samples were collected from affiliated hospitals of the University of Texas Southwestern Medical Center (UTSW), including Parkland Hospital and Children’s Medical Center. RNA isolated from tumor samples was sequenced using the Illumina platform.	Illumina NovaSeq 6000	23
EGAD50000000173	This dataset encompasses single-cell RNA sequencing data derived from nasal swabs of a paired cohort of school-aged children with cystic fibrosis. The study includes samples both before (n=13) and after initiation (n=13) of elexacaftor/tezacaftor/ivacaftor (ETI) treatment. Additionally, age- and sex-matched controls were included (n=12). Detailed information about the study design and methodology can be found in the manuscript: “Pharmacological improvement of CFTR function rescues airway epithelial homeostasis and host defense in children with cystic fibrosis”.	Illumina NovaSeq 6000	38
EGAD50000000174	T-ALL relapse usually occurs early but can occur much later, which has been suggested to represent a de novo leukemia. However, we conclusively demonstrate late relapse can evolve from a pre-leukemic subclone harbouring a non-coding mutation that evades initial chemotherapy. Data include 19 WGS samples: - 5 cases with presentation, relapse and remission (germline) samples - 2 cases with presentation and relapse samples, but not remission (germline)	Illumina NovaSeq 6000	19
EGAD50000000175	This dataset contains RNA, ATAC and WGS sequencing data of 9 acute myeloid leukemia samples. Sequencing was performed on Illumina HiSeq 2000, HiSeq 4000, NovaSeq 6000 and HiSeq X Ten. The sequencing was always paired.	Illumina HiSeq 2000 Illumina HiSeq 4000 Illumina HiSeq X Illumina NovaSeq 6000	9
EGAD50000000176	This dataset includes FASTQ files of low coverage whole genome sequencing of normal tissue (n = 8), tumor tissue (n = 55) and cell free DNA from plasma samples (n = 101) from patients with metastatic colorectal cancer treated with bevacizumab. A total of 164 samples are present from two different cohorts. The first batch of samples (n = 139, sample names sAPD302T until sAPD502_P0) were collected from the AC-Angiopredict Phase 2 trial (NCT01822444), the second batch of samples (n = 25, sample names sAPD503_P0 until sAPD527_P0) were collected from the UMM cohort.	Illumina HiSeq 4000	164
EGAD50000000177	Whole genome sequencing data of 19 high-grade serous carcinoma (HGSC) patients (48 samples) sequenced with MGISEQ-2000	unspecified	48
EGAD50000000178	Nuclei were isolated from postmortem human brain tissue, and single-nucleus barcoded cDNA libraries were generated using the 10X Genomics 3' v3.1 kit. Samples were sequenced on an Illumina NovaSeq 6000 to an average of 343 million reads per sample and 52,000 reads per cell.	Illumina NovaSeq 6000	25
EGAD50000000179	Nuclei were extracted from postmortem human brain, and cDNA libraries were generated using the 10X Genomics 3' v3.1 kit. Prior to fragmentation, 75% of the library was reserved for long-read sequencing. This unfragmented library was used for target enrichment with a custom 50 gene probe panel from Twist Biosciences. The resulting amplified cDNA libraries were sequenced on a Sequel II. The files provided have undergone CCS generation, and we recommend following the guidelines in https://isoseq.how/ for further analysis starting with removal of primers using lima. The primer sequences that should be supplied are the Read 1 and TSO primers from the 10X Genomics kit.	Sequel II	14
EGAD50000000180	16S v3-v4 amplicon sequencing of milk, fecal and oral cavity samples, along with sequencing controls, of a subset of newborn infants from Lifelines NEXT cohort	Illumina MiSeq	98
EGAD50000000181	16S-ITS-23S amplicon long read PacBio sequencing of milk, fecal and oral cavity samples, along with sequencing controls, of a subset of newborn infants from Lifelines NEXT cohort	Sequel II	66
EGAD50000000183	This dataset consists of raw unimputed genotype data for 81 individuals with multiple sclerosis (n=33) and other neurological disease (n=48).		81
EGAD50000000184	This dataset contains fastq files from single-cell RNA sequencing of cerebrospinal fluid of 81 patients with multiple sclerosis (n=33) or other neurological disease (n=48) using the 10x Genomics Chromium single cell 3’ v2 chemistry. Sequencing was performed using an Illumina HiSeq4000 sequencer.	Illumina HiSeq 4000	71
EGAD50000000185	This dataset contains 10x scRNA sequencing data of 16 NSCLC samples. Sequencing was performed on Illumina HiSeq 4000. The sequencing was always paired.	Illumina HiSeq 4000	16
EGAD50000000186	COVID-19 whole blood bulk transcriptomics single-center samples (all Dexamethasone)	Illumina NovaSeq 6000	92
EGAD50000000187	COVID-19 whole blood bulk transcriptomics multi-center	Illumina NovaSeq 6000	90
EGAD50000000188	COVID-19 PBMC single-cell transcriptomics, multiplexed	Illumina NovaSeq 6000	16
EGAD50000000189	exon 11 mutated UWB1.289 and COV362 cell lines	Illumina NovaSeq 6000	2
EGAD50000000190	Spatial transcriptomic data of HGSOC patients before and after treatment	Illumina NovaSeq 6000	10
EGAD50000000191	scRNAseq data of HGSOC patients before and after treatment	Illumina NovaSeq 6000	12
EGAD50000000192	Bulk RNAseq of cultured fibroblasts	Illumina NovaSeq 6000	6
EGAD50000000193	Paired RNA-Seq data from 16 samples of different tumors CD8+ T cells added to the study "Proteogenomic analysis reveals RNA as a source for tumor-agnostic neoantigen identification (H021)". Sequencing was performed on Illumina NextSeq 500. The sequencing was always paired	NextSeq 500	16
EGAD50000000194	Human RNA-seq data for TRIM24-MET fusion tumor primary samples and patient/PDOX-derived cell lines. Cells were treated with 0.1% DMSO or EC90 of MET inhibitors (capmatinib, cabozantinib, crizotinib) for 4 hours. BAM files containing aligned (and unaligned) reads to hg19 are provided.	Illumina NovaSeq 6000	32
EGAD50000000195	Relevant clinical data for IMpower133 (GO30081) including treatment arm, overall survival, progression-free survival, best overall response, PD-L1 IHC, molecular subtype, and baseline clinical features.		271
EGAD50000000196	This dataset contains log2(TPM + 1) for 271 tumor samples profiled by RNA-seq for the entire transcriptome for samples originating from IMpower133 (GO30081).		271
EGAD50000000197	We generated single-cell transcriptomics, single-nuclei chromatin accessibility, CITE-seq and Multiomics of patients with PML. Pools are genetically multiplexed across donors, genotypic variation is included in this dataset to enable demultiplexing.	Illumina NovaSeq 6000	14
EGAD50000000198	We performed multi-omics profiling of 38 Crohn's disease and Ulcerative colitis patients across several stimulations (RPMI, LPS, Salmonella, in total 80 samples). Nuclei were profiled using the 10X Multiome protocol which offers paired RNA+ATAC from the same nucleus (e.g. shared barcodes). Per library, the ATAC (I1, R1, R2, R3 reads) and RNA (I1, I2, R1, R2 reads) are provided. Pools are genetically multiplexed across donors. Genotype files are provided to allow genetic demultiplexing.	Illumina NovaSeq 6000	20
EGAD50000000199	In this study, we analyzed 28 primary cutaneous follicle center lymphoma samples from 20 patients and compared the copy number profiles to a cohort of diagnostic samples of 64 nodal follicular lymphoma patients using low-coverage whole genome sequencing (lcWGS).	NextSeq 2000	92
EGAD50000000200	RNA was extracted with the RNeasy Mini Kit from around 30 mg of flash frozen tissue derived from 3 healthy and 3 tumor endometrial tissues of post-menopausal patients. Raw paired-end fastq.gz files are provided.	Illumina NovaSeq 6000	6
EGAD50000000201	Single cell full transcriptome sequencing of CD19 CAR T-cell infusion products used for standard of care treatment for relapsed/refractory large B-cell lymphoma.	Illumina HiSeq 4000	59
EGAD50000000202	We generated 10X, droplet-based paired snRNAseq+snATACseq (Multiome) of patients suffering from Long Covid. In total, we included 31 patients. Single nuclei libraries are genetically multiplexed across donors, genotype files are available to enable demultiplexing. Phenotype sheets provide information of pools/donors as well as donor phenotype.	Illumina NovaSeq 6000	25
EGAD50000000203	We generated 10X, droplet-based paired snRNAseq+snATACseq (Multiome) of the response to P. Aeruginosa or RPMI in patients suffering from Long Covid. In total, we included 15 patients. Single nuclei libraries are genetically multiplexed across donors, genotype files are available to enable demultiplexing. Phenotype sheets provide information of pools/donors as well as donor phenotype.	Illumina NovaSeq 6000	31
EGAD50000000205	78 bulk-RNAseq from HGSOC patients from SCANDARE MACARON	Illumina NovaSeq 6000	78
EGAD50000000206	These are 28 targeted exome sequencing fastq files (tumour and germline) generated from ovarian cancer samples which had been exposed to PARP inhibitors, for the described project.	NextSeq 500	28
EGAD50000000207	snRNA-sequencing data from 3 FSHD and 1 control multinucleated myotube cell cultures	Illumina NovaSeq 6000	4
EGAD50000000208	Dataset comprising 27 pairs of high-depth (300x) WES results obtained from 54 PDXs derived from primary CRCs resected synchronously with their corresponding metastases. These exome sequences are part of a larger study on the heterogeneity and evolution of DNA mutation rates in microsatellite-stable colorectal cancer.	Illumina NovaSeq X	81
EGAD50000000209	Deposited here are whole-genome sequencing bam files from the experimental isogenic cell lines used in the study "Redefined indel taxonomy reveals insights into mutational signatures, Koh, Gene (2023)".	Illumina NovaSeq 6000	47
EGAD50000000210	Ultra-deep next-generation panel-sequencing on 59 FFPE-samples (20 LTR, 26 relapsed (rHL: 11 initial-diagnosis, 15 relapse) and 13 primary-refractory (prHL: 8 initial-diagnosis, 5 progression)) from 44 cHL-patients applying a hybrid-capture approach. Data were processed as described in the publication and duplicate-marked BAM files were uploaded.	Illumina NovaSeq 6000	84
EGAD50000000211	Whole genome sequencing data of 25 high-grade serous carcinoma (HGSC) patients (65 samples) sequenced with Illumina NovaSeq 6000	Illumina NovaSeq 6000	65
EGAD50000000212	Dataset contains raw fastq-files from RNA-sequencing analysis of total RNA extracted from Brodmann Area 9 human postmortem brain tissue samples.	Illumina NovaSeq 6000	25
EGAD50000000213	The dataset contains 20 lung cancer and 20 healthy control cfDNA samples from plasma sequenced using 150bp PE sequencing.	Illumina NovaSeq 6000	40
EGAD50000000214	Whole Exome Sequencing of baseline tumor and matched normal samples from patients in the GO29781 mosunetuzumab monotherapy trial.	unspecified	306
EGAD50000000215	Exome sequencing of tumour PDX and matched patient blood	Illumina NovaSeq 6000	10
EGAD50000000216	Examination of Sample Multiplexing Reagents for Single Cell RNA-Seq. Nine techniques applied to samples from four PDX models: #105, #177, #233, #264	NextSeq 2000	9
EGAD50000000217	Whole genome sequencing of patient tumour and blood	Illumina NovaSeq 6000	2
EGAD50000000218	Genome-wide NanoRCS of cfDNA from plasma of Granulosa cell tumor patients	Illumina NovaSeq 6000 MinION PromethION	8
EGAD50000000219	Genome-wide NanoRCS of cfDNA from ascites of Ovarian cancer patients	Illumina NovaSeq 6000 MinION PromethION	18
EGAD50000000220	Genome-wide NanoRCS of cfDNA from plasma of healthy individuals	Illumina NovaSeq 6000 MinION PromethION	9
EGAD50000000221	Genome-wide NanoRCS of cfDNA from plasma of Esophageal cancer patients	Illumina NovaSeq 6000 MinION	14
EGAD50000000222	Single-cell Cut&Tag of three histone modifications and chromatin accessibility over a timecourse of brain organoid development from pluripotency. We profiled 5 time points of brain organoid development and 2 time points of retinal organoid development. At each time point used scCut&Tag to profile histone modifications (H3K27me3, H3K27ac, H3K4me3) as well as multiome-seq (10X) to jointly profile transcriptome and chromatin accessibility from the same cell suspension. Library preparation was carried out with the 10x genomics platform.	unspecified	8
EGAD50000000223	Single-cell Cut&Tag of three histone modifications and chromatin accessibility over a timecourse of brain organoid development from pluripotency. We profiled 5 time points of brain organoid development and 2 time points of retinal organoid development. We used scRNA-seq to profile transcriptome. Library preparation was carried out with the 10x genomics platform.	unspecified	8
EGAD50000000224	Transcriptomics and epigenomics after chemical inhibition of EED during early human brain organoid development. To investigate the role of H3K27me3 inhibition during neuroectoderm induction, we treated brain organoids with 3 concentrations of EED inhibitor (and control) during the transition of pluripotency to neuroepithelium. At the neuroepithelium stage, we profiled the organoids using scRNA-seq (transcriptome) and bulk Cut&Tag (H3K27me3, H3K27ac). Library preparation for scRNA-seq was carried out with the 10x genomics platform.	unspecified	2
EGAD50000000225	scRNA-seq and Cut&Tag from a sample of the human fetal brain. To compare neurogenesis in the primary human brain with that in organoids, we profiled transcriptome and chromatin accessibility (10X multiome) as well as bulk Cut&Tag of histone modifications H3K27me3, H3K27ac and H3K4me3. We used a human brain sample from GW 19 for all experiments.	unspecified	1
EGAD50000000226	Patients included in this study were over 18 years of age and had a histology-confirmed diagnosis of glioblastoma multiforme (GBM). Exclusion criteria were the previous administration of any anti-tumor therapy including radiation therapy. All patients gave written informed consent. The study was approved by the local ethics committee (TUM Medical school) and conducted following the Declaration of Helsinki. During resection of the tumors, tumor tissue and tissue from normal appearing brain within the operative channel was collected. Blood was drawn during the surgical procedure. Single cell suspensions were prepared from the tumor tissue, the normal appearing brain, and the blood. CD4+ T cells and CD8+ T cells were sorted by flow cytometry. Only patients with a complete set of specimens (CD4+ tumor infiltrating lymphocytes (TIL), CD8+ TIL, CD4+ T cells from normal appearing brain, CD8+ T cells from normal appearing brain, blood-derived CD4+ and CD8+ T cells) containing a minimum of 1000 cells in each sorted sample were further analyzed (n=9). Total RNA was isolated from sorted cell populations using the RNeasy Plus Micro Kit (Qiagen, 74034). Quality and integrity of total RNA was controlled on a Bioanalyzer 2100 (Agilent Technologies). Library preparation for bulk-sequencing of poly(A)-RNA was done as described previously (Parekh et al., 2016). Briefly, barcoded cDNA of each sample was generated with a Maxima RT polymerase (ThermoFisher Scientific, EP0742) using oligo-dT primer containing barcodes, unique molecular identifiers (UMIs) and an adaptor. Ends of the cDNAs were extended by a template switch oligo (TSO) and full-length cDNA was amplified with primers binding to the TSO-site and the adaptor. NEB UltraII FS kit was used to fragment cDNA. After end repair and A-tailing, a TruSeq adapter was ligated and 3'-end-fragments were finally amplified using primers with Illumina P5 and P7 overhangs. 
In comparison to previous descriptions (Parekh et al., 2016), the P5 and P7 sites were exchanged to allow sequencing of the cDNA in read 1 and barcodes and UMIs in read 2 to achieve a better cluster recognition. The library was sequenced on a NextSeq 500 (Illumina) with 59 cycles for the cDNA in read 1 and 16 cycles for the barcodes and UMIs in read 2. Data were processed using the published Drop-seq pipeline (v1.0) to generate sample- and gene-wise UMI tables (Macosko et al., 2015). Reference genome (GRCh38) was used for alignment. Transcript and gene definitions were used according to the GENCODE annotation version 35.	NextSeq 500	60
EGAD50000000227	This dataset contains single-end fastq files for human oocytes (n=12), zygotes (n=5), and early embryos (n=10) sequenced with Illumina NextSeq500. The sequencing libraries were prepared from single oocytes, zygotes, and embryos in two batches.	NextSeq 500	27
EGAD50000000228	Deposited here are time series bulk RNA-seq data generated from isogenic wild-type (WT), XPA and XPG gene knockouts hiPSC cells during cortical neuronal differentiation.	Illumina NovaSeq 6000	61
EGAD50000000229	ANCA-associated glomerulonephritis (AGN) is associated with a high risk of end-stage kidney disease. The role of kidney immune cells in local inflammation remains unclear. This study investigates kidney immune cell diversity and function. Kidney tissue from AGN patients (n=5) and a lupus nephritis (LN) patient (n=1) was acquired during a biopsy procedure for a clinical indication. Needle-core biopsies were obtained for histopathological examination, and an additional pass was performed to retrieve kidney tissue for scRNA-seq. Healthy kidney tissue (n=1) was obtained from a kidney that was surgically removed due to a (non-invasive) papillary urothelial carcinoma. Immediately after collection, kidney tissue was processed into a single-cell suspension and sorted using a 4-color flow cytometry panel to isolate living, CD45+ immune cells. To aid in the multi-omic characterization, surface markers and T and B cell repertoires were sequenced in 2 samples (1 AGN patient and the nephrectomy control). These samples were incubated with an oligo-antibody TotalSeq-C cocktail containing 130 unique cell surface antigens.	Illumina NovaSeq 6000	7
EGAD50000000230	Deposited here are time series bulk RNA-seq data generated from XP patients and controls as well as rescued patient hiPSC cells during cortical neuronal differentiation.	Illumina NovaSeq 6000	87
EGAD50000000231	Human datasets associated with paper 'Clonally heritable gene expression imparts a layer of diversity within cell types' published in Cell Systems	Illumina HiSeq 2500 Illumina NovaSeq 6000	7092
EGAD50000000232	Deposited here are WGS data generated from differentiated neural stem cells (NSCs) of NXPG-32 patient (XPG patient with neurodegeneration) and a matched healthy control CTRL-33.	Illumina NovaSeq 6000	9
EGAD50000000233	This dataset contains WGS sequencing data of head and neck cancer samples as well as blood plasma controls. Sequencing was performed on Illumina NovaSeq 6000 using KAPA HyperPrep Kit. The sequencing was always paired.	Illumina NovaSeq 6000	191
EGAD50000000234	To explore how JAK2V617F disrupts cell-fate and the regulatory chromatin landscape of HSCs, we applied GoT-ChA to CD34+ sorted progenitor cells from 21 human primary samples, comprising 18 patients with JAK2V617F-mutated MF (no additional mutations, except for Pt-08), who either had no treatment (n = 12; including three longitudinal samples from a PV patient who progressed to MF) or who were treated with ruxolitinib (n = 6), a JAK1/2 inhibitor. We included a JAK2V617F CH sample (Pt-19) to explore early epigenetic changes, before the onset of overt hematological abnormality.	Illumina NovaSeq 6000	21
EGAD50000000235	Data set for the scRaCH-seq manuscript. The fast5 files were base called using Guppy.	PromethION	21
EGAD50000000236	This dataset contains data from nanopore amplicon sequencing of FLG exon 3 in 22 patients and Nanopore adaptive sampling sequencing of the EDC region in two patients.	GridION	22
EGAD50000000237	Whole Genome Sequencing (120X) of the two tumors and blood from 4 pediatric cases with second malignancies (12 WGS)	Illumina NovaSeq X	12
EGAD50000000238	Whole Genome Sequencing of normal tissues from case 3 (9 WGS 120X), including Whole Genome Sequencing of the parents' blood (2 WGS 30X).	Illumina NovaSeq X	11
EGAD50000000239	Duplex Sequencing of normal and tumor tissues from case 2 and case 3 (10 DS)	Illumina NovaSeq X	10
EGAD50000000240	Whole Genome Sequencing of 2 expanded clones from a cell line derived from the rhabdoid tumor case 3 (2 WGS 20X)	Illumina NovaSeq X	2
EGAD50000000241	This set contains a total of 78 CRAM files with RNA sequencing data from 20 patients included in the PANDA study treated with 1 cycle of monotherapy atezolizumab and 4 cycles atezolizumab plus chemotherapy (docetaxel, oxaliplatin and capecitabine). RNA was isolated from fresh frozen material and sequenced at 4 timepoints: baseline, after monotherapy atezolizumab, after combination atezolizumab plus chemotherapy, and at resection (due to 2 missing samples, there is a total of 78 samples).	Illumina NovaSeq 6000	78
EGAD50000000242	This set contains 40 bam files with whole exome sequencing data from 20 patients included in the PANDA study treated with 1 cycle of monotherapy atezolizumab and 4 cycles atezolizumab plus chemotherapy (docetaxel, oxaliplatin and capecitabine). Tumor DNA was isolated from tumor samples and germline DNA was isolated from PBMCs, which was used for whole exome sequencing.	Illumina NovaSeq 6000	40
EGAD50000000243	We performed whole exome sequencing, whole genome sequencing and transcriptome sequencing of multiple tumor regions from 65 patients with SCLC.	unspecified	423
EGAD50000000244	1034 shotgun metagenomes from baseline FIT CRCbiome samples	Illumina NovaSeq X	1034
EGAD50000000245	Exome sequencing of FFPE and patient-derived cultures from patients enrolled in clinical study NCT03860376	DNBSEQ-G400 Illumina NovaSeq 6000	13
EGAD50000000246	Transcriptome sequencing of FFPE and patient-derived cultures from clinical study NCT03860376	DNBSEQ-G400 Illumina NovaSeq 6000	15
EGAD50000000247	The data published here contains single-cell RNA-sequencing (scRNAseq) data obtained by 3' scRNAseq using the Chromium Single Cell 3’ Reagent from 10X Genomics on peripheral blood mononuclear cells (PBMC) from patients with colorectal cancer (CRC) and peritoneal metastases (PM). Paired-end sequencing was performed on the NovaSeq6000.	Illumina HiSeq 4000 Illumina NovaSeq 6000	13
EGAD50000000248	The data published here contains single-cell RNA-sequencing (scRNAseq) data obtained by 3' scRNAseq using the Chromium Single Cell 3’ Reagent from 10X Genomics on peritoneal fluid (PF) from patients with colorectal cancer (CRC) and peritoneal metastases (PM). Paired-end sequencing was performed on the NovaSeq6000.	Illumina HiSeq 4000 Illumina NovaSeq 6000	13
EGAD50000000249	The data published here contains single-cell RNA-sequencing (scRNAseq) data obtained by 3' scRNAseq using the Chromium Single Cell 3’ Reagent from 10X Genomics on peritoneal metastases (PM) from patients with colorectal cancer (CRC) and PM. Paired-end sequencing was performed on the NovaSeq6000.	Illumina HiSeq 4000 Illumina NovaSeq 6000	8
EGAD50000000250	The data published here contains single-cell RNA-sequencing (scRNAseq) data obtained by 3' scRNAseq using the Chromium Single Cell 3’ Reagent from 10X Genomics on peripheral blood mononuclear cells (PBMC) from patients with achalasia. Paired-end sequencing was performed on the NovaSeq6000.	Illumina NovaSeq 6000	5
EGAD50000000251	The data published here contains single-cell RNA-sequencing (scRNAseq) data obtained by 3' scRNAseq using the Chromium Single Cell 3’ Reagent from 10X Genomics on peritoneal fluid (PF) from patients with achalasia. Paired-end sequencing was performed on the NovaSeq6000.	Illumina NovaSeq 6000	5
EGAD50000000252	The data published here contains cellular indexing of transcriptomes and epitopes sequencing (CITEseq) using the oligo-tagged TotalSeq™-B Human Universal Cocktail, V1.0 from BioLegend on peritoneal fluid (PF) from patients with achalasia. Sequencing was performed in a paired-ended fashion on the NovaSeq6000. Below an overview of the epitopes to oligotags as obtained from BioLegend: DNA_ID Description clone Sequence Ensembl ID B0006 anti-human CD86 IT2.2 GTCTTTGTCAGTGCA ENSG00000114013 B0007 anti-human CD274 (B7-H1, PD-L1) 29E.2A3 GTTGTCCGACAATAC ENSG00000120217 B0020 anti-human CD270 (HVEM, TR2) 122 TGATAGAAACAGACC ENSG00000157873 B0023 anti-human CD155 (PVR) SKII.4 ATCACATCGTTGCCA ENSG00000073008 B0024 anti-human CD112 (Nectin-2) TX31 AACCTTCCGTCTAAG ENSG00000130202 B0026 anti-human CD47 CC2C6 GCATTCTGTCACCTA ENSG00000196776 B0029 anti-human CD48 BJ40 CTACGACGTAGAAGA ENSG00000117091 B0031 anti-human CD40 5C3 CTCAGATGGAGTATG ENSG00000101017 B0032 anti-human CD154 24-31 GCTAGATAGATGCAA ENSG00000102245 B0033 anti-human CD52 HI186 CTTTGTACGAGCAAA ENSG00000169442 B0034 anti-human CD3 UCHT1 CTCATTGTAACTCCT ENSG00000167286 B0046 anti-human CD8 SK1 GCGCAACTTGATGAT ENSG00000153563 B0047 anti-human CD56 (NCAM) 5.1H11 TCCTTTCCTGATAGG ENSG00000149294 B0050 anti-human CD19 HIB19 CTGGGCAATTACTCG ENSG00000177455 B0052 anti-human CD33 P67.6 TAACTCAGGGCCTAT ENSG00000105383 B0053 anti-human CD11c S-HCL-3 TACGCCTATAACTTG ENSG00000140678 B0058 anti-human HLA-A,B,C W6/32 TATGCGAGGCTTATC ENSG00000206503 B0063 anti-human CD45RA HI100 TCAATCCTTCCGCTT ENSG00000081237 B0064 anti-human CD123 6H6 CTTCACTCTGTCAGG ENSG00000185291 B0066 anti-human CD7 CD7-6B7 TGGATTCCCGGACTT ENSG00000173762 B0068 anti-human CD105 43A3 ATCGTCGAGAGCTAG ENSG00000106991 B0070 anti-human/mouse CD49f GoH3 TTCCGAGGATGATCT ENSG00000091409 B0071 anti-human CD194 (CCR4) L291H4 AGCTTACCTGCACGA ENSG00000183813 B0072 anti-human CD4 RPA-T4 TGTTCCCGCTCAACT ENSG00000010610 B0073 anti-mouse/human CD44 IM7 
TGGCTTCAGGTCCTA ENSG00000026508 B0081 anti-human CD14 M5E2 TCTCAGACCTCCGTA ENSG00000170458 B0083 anti-human CD16 3G8 AAGTTCACTCTTTGC ENSG00000203747 B0085 anti-human CD25 BC96 TTTGTCCTGTACGCC ENSG00000134460 B0087 anti-human CD45RO UCHL1 CTCCGAATCATGTTG ENSG00000081237 B0088 anti-human CD279 (PD-1) EH12.2H7 ACAGCGCCGTATTTA ENSG00000188389 B0089 anti-human TIGIT (VSTM3) A15153G TTGCTTACCGCCAGA ENSG00000181847 B0090 Mouse IgG1, κ isotype MOPC-21 GCCGGACGACATTAA B0091 Mouse IgG2a, κ isotype MOPC-173 CTCCTACCTAAACTG B0092 Mouse IgG2b, κ isotype MPC-11 ATATGTATCACGCGA B0095 Rat IgG2b, κ isotype RTK4530 GATTCTTGACGACCT B0100 anti-human CD20 2H7 TTCTGGGTCCCTAGA ENSG00000156738 B0101 anti-human CD335 (NKp46) 9E2 ACAATTTGAACAGCG ENSG00000189430 B0124 anti-human CD31 WM59 ACCTTTATGCCACGG ENSG00000261371 B0127 anti-Human Podoplanin NC-08 GGTTACTCGTTGTGT ENSG00000162493 B0134 anti-human CD146 P1H12 CCTTGGATAACATCA ENSG00000076706 B0136 anti-human IgM MHM-88 TAGCGAGCCCGTATA ENSG00000211899 B0138 anti-human CD5 UCHT2 CATTAACGGGATGCC ENSG00000110448 B0141 anti-human CD195 (CCR5) J418F1 CCAAAGTAAGAGCCA ENSG00000160791 B0142 anti-human CD32 FUN-2 GCTTCCGAATTACCG ENSG00000143226 B0143 anti-human CD196 (CCR6) G034E3 GATCCCTTTGTCACT ENSG00000112486 B0144 anti-human CD185 (CXCR5) J252D4 AATTCAACCGTCGCC ENSG00000160683 B0145 anti-human CD103 (Integrin αE) Ber-ACT8 GACCTCATTGTGAAT ENSG00000083457 B0146 anti-human CD69 FN50 GTCTCTTGGCTTAAA ENSG00000110848 B0147 anti-human CD62L DREG-56 GTCCCTGCAACTTGA ENSG00000188404 B0149 anti-human CD161 HP-3G10 GTACGCAGTCCTTCT ENSG00000111796 B0151 anti-human CD152 (CTLA-4) BNI3 ATGGTTCACGTAATC ENSG00000163599 B0152 anti-human CD223 (LAG-3) 11C3C65 CATTTGTCTGCCGGT ENSG00000089692 B0153 anti-human KLRG1 (MAFA) SA231A2 CTTATTTCCTGCCCT ENSG00000139187 B0154 anti-human CD27 O323 GCACTCCTGCATGTA ENSG00000139193 B0155 anti-human CD107a (LAMP-1) H4A3 CAGCCCACTGCAATA ENSG00000185896 B0156 anti-human CD95 (Fas) DX2 CCAGCTCATTAGAGC ENSG00000026103 B0158 
anti-human CD134 (OX40) Ber-ACT35 (ACT35) AACCCACCGTTGTTA ENSG00000186827 B0159 anti-human HLA-DR L243 AATAGCGAGCAAGTA ENSG00000204287 B0160 anti-human CD1c L161 GAGCTACTTCACTCG ENSG00000158481 B0161 anti-human CD11b ICRF44 GACAAGTGATCTGCA ENSG00000169896 B0162 anti-human CD64 10.1 AAGTATGCCCTACGA ENSG00000150337 B0163 anti-human CD141 (Thrombomodulin) M80 GGATAACCGCGCTTT ENSG00000178726 B0164 anti-human CD1d 51.1 TCGAGTCGCTTATCA ENSG00000158473 B0165 anti-human CD314 (NKG2D) 1D11 CGTGTTTGTTCCTCA ENSG00000213809 B0167 anti-human CD35 E11 ACTTCCGTCGATCTT ENSG00000203710 B0168 anti-human CD57 Recombinant QA17A04 AACTCCCTATGGAGG ENSG00000109956 B0170 anti-human CD272 (BTLA) MIH26 GTTATTGGACTAAGG ENSG00000186265 B0171 anti-human/mouse/rat CD278 (ICOS) C398.4A CGCGCACCCATTAAA ENSG00000163600 B0174 anti-human CD58 (LFA-3) TS2/9 GTTCCTATGGACGAC ENSG00000116815 B0176 anti-human CD39 A1 TTACCTGGTATCCGT ENSG00000138185 B0179 anti-human CX3CR1 K0124E1 AGTATCGTCTCTGGG ENSG00000168329 B0180 anti-human CD24 ML5 AGATTCCTTCGTGTT ENSG00000272398 B0181 anti-human CD21 Bu32 AACCTAGTAGTTCGG ENSG00000117322 B0185 anti-human CD11a TS2/4 TATATCCTTGTGAGC ENSG00000005844 B0187 anti-human CD79b (Igβ) CB3-1 ATTCTTCAACCGAAG ENSG00000007312 B0189 anti-human CD244 (2B4) C1.7 TCGCTTGGATGGTAG ENSG00000122223 B0206 anti-human CD169 7-239 TACTCAGCGTGTTTG ENSG00000088827 B0214 anti-human/mouse integrin β7 FIB504 TCCTTGGATGTACCG ENSG00000139626 B0215 anti-human CD268 (BAFF-R) 11C1 CGAAGTCGATCCGTA ENSG00000159958 B0216 anti-human CD42b HIP1 TCCTAGTACCGAAGT ENSG00000203618 B0217 anti-human CD54 HA58 CTGATAGACTTGAGT ENSG00000090339 B0218 anti-human CD62P (P-Selectin) AK4 CCTTCCGTATCCCTT ENSG00000174175 B0219 anti-human CD119 (IFN-γ R α chain) GIR-208 TGTGTATTCCCTTGT ENSG00000027697 B0224 anti-human TCR α/β IP26 CGTAACGTAGAGCGA B0236 Rat IgG1, κ isotype RTK2071 ATCAGATGCCCTCAT B0238 Rat IgG2a, κ Isotype RTK2758 AAGTCAGGTTCGTTT B0242 anti-human CD192 (CCR2) K036C2 GAGTTCCCTTACCTG ENSG00000121807 B0246 
anti-human CD122 (IL-2Rβ) TU27 TCATTTCCTCCGATT ENSG00000100385 B0352 anti-human FcεRIα AER-37 (CRA-1) CTCGTTTCCGTATCG ENSG00000179639 B0353 anti-human CD41 HIP8 ACGTTGTGGCCTTGT ENSG00000005961 B0355 anti-human CD137 (4-1BB) 4B4-1 CAGTAAGTTCGGGAC ENSG00000049249 B0358 anti-human CD163 GHI/61 GCTTCTCCTTCCTTA ENSG00000177575 B0359 anti-human CD83 HB15e CCACTCATTTCCGGT ENSG00000112149 B0363 anti-human CD124 (IL-4Rα) G077F6 CCGTCCTGATAGATG ENSG00000077238 B0364 anti-human CD13 WM15 TTTCAACGCCCTTTC ENSG00000166825 B0367 anti-human CD2 TS1/8 TACGATTTGTCAGGG ENSG00000116824 B0368 anti-human CD226 (DNAM-1) 11A8 TCTCAGTGTTTGTGG ENSG00000150637 B0369 anti-human CD29 TS2/16 GTATTCCCTCAGTCA ENSG00000150093 B0370 anti-human CD303 (BDCA-2) 201A GAGATGTCCGAATTT ENSG00000198178 B0371 anti-human CD49b P1E6-C5 GCTTTCTTCAGTATG ENSG00000164171 B0373 anti-human CD81 (TAPA-1) 5A6 GTATCCTTCCTTGGC ENSG00000110651 B0384 anti-human IgD IA6-2 CAGTCTCCGTAGAGT ENSG00000211898 B0385 anti-human CD18 TS1/18 TATTGGGACACTTCT ENSG00000160255 B0386 anti-human CD28 CD28.2 TGAGAACGACCCTAA ENSG00000178562 B0389 anti-human CD38 HIT2 TGTACCCGCTTGTGA ENSG00000004468 B0390 anti-human CD127 (IL-7Rα) A019D5 GTGTGTTGTCCTATG ENSG00000168685 B0391 anti-human CD45 HI30 TGCAATTACCCGGAT ENSG00000081237 B0393 anti-human CD22 S-HCL-1 GGGTTGTTGTCTTTG ENSG00000012124 B0394 anti-human CD71 CY1G4 CCGTGTTCCTCATTA ENSG00000072274 B0396 anti-human CD26 BA5b GGTGGCTAGATAATG ENSG00000197635 B0398 anti-human CD115 (CSF-1R) 9-4D2-1E4 AATCACGGTCCTTGT ENSG00000182578 B0404 anti-human CD63 H5C6 GAGATGTCTGCAACT ENSG00000135404 B0406 anti-human CD304 (Neuropilin-1) 12C2 GGACTAAGTTTCGTT ENSG00000099250 B0407 anti-human CD36 5-271 TTCTTTGCCTTGCCA ENSG00000135218 B0408 anti-human CD172a (SIRPα) 15-414 CGTGTTTAACTTGAG ENSG00000198053 B0419 anti-human CD72 3F3 CAGTCGTGGTAGATA ENSG00000137101 B0420 anti-human CD158 (KIR2DL1/S1/S3/S5) HP-MA4 TATCAACCAACGCTT ENSG00000125498 B0446 anti-human CD93 VIMD2 GCGCTACTTCCTTGA ENSG00000125810 B0575 
anti-human CD49a TS2/7 ACTGATGGACTCAGA ENSG00000213949 B0576 anti-human CD49d 9F10 CCATTCAACTTCCGG ENSG00000115232 B0577 anti-human CD73 (Ecto-5'-nucleotidase) AD2 CAGTTCCTCAGTTCG ENSG00000135318 B0579 anti-human CD9 HI9a GAGTCACCAATCTGC ENSG00000010278 B0581 anti-human TCR Vα7.2 3C10 TACGAGCAGTATTCA B0582 anti-human TCR Vδ2 B6 TCAGTCAGATGGTAT B0591 anti-human LOX-1 15C4 ACCCTTTACCGAATA ENSG00000173391 B0592 anti-human CD158b (KIR2DL2/L3, NKAT2) DX27 GACCCGTAGTTTGAT ENSG00000243772 B0599 anti-human CD158e1 (KIR3DL1, NKB1) DX9 GGACGCTTTCCTTGA ENSG00000167633 B0822 anti-human CD142 NY2 CACTGCCGTCGATTA ENSG00000117525 B0830 anti-human CD319 (CRACC) 162.1 AGTATGCCATGTCTT ENSG00000026751 B0864 anti-human CD352 (NTB-A) NT-7 AGTTTCCACTCAGGC ENSG00000162739 B0867 anti-human CD94 DX22 CTTTCCGGTCCTACA ENSG00000134539 B0871 anti-human CD162 KPL-1 ATATGTCAGAGCACC ENSG00000110876 B0896 anti-human CD85j (ILT2) GHI/75 CCTTGTGAGGCTATG ENSG00000104972 B0897 anti-human CD23 EBVCS-5 TCTGTATAACCGTCT ENSG00000104921 B0902 anti-human CD328 (Siglec-7) 6-434 CTTAGCATTTCACTG ENSG00000168995 B0918 anti-human HLA-E 3D12 GAGTCGAGAAATCAT ENSG00000204592 B0920 anti-human CD82 ASL-24 TCCCACTTCCGCTTT ENSG00000085117 B0944 anti-human CD101 (BB27) BB27 CTACTTCCCTGTCAA ENSG00000134256 B1046 anti-human CD88 (C5aR) S5/1 GCCGCATGAGAAACA ENSG00000197405 B1052 anti-human CD224 KF29 CTGATGAGATGTCAG ENSG00000100031	Illumina NovaSeq 6000	5
EGAD50000000253	This dataset is the second batch of WGS uploaded from FL GenomeCanada data. The other batch is in EGAD00001011343	Illumina HiSeq 2500	78
EGAD50000000255	Bulk RNA sequencing of iPSCs and iPSC derived pericytes from three cell lines MNZTASi019-A, MNZTASi021-A and MNZTASi022-A. Data includes one iPSC data set per cell line and iPSC derived pericyte differentiations per cell line (3 differentiations from MNZTASi019-A and MNZTASi021-A, 2 from MNZTASi022-A). RNA-seq data was generated using an Illumina Stranded mRNA 150bp paired-end library preparation.	NextSeq 2000	11
EGAD50000000257	Raw paired-end whole-genome sequencing data of plasma cell free DNA on the NovaSeq 6000.	Illumina NovaSeq 6000	810
EGAD50000000258	Metagenomic sequencing of human fecal samples	NextSeq 2000	239
EGAD50000000260	Single cell sequencing of expanded regulatory T cells (Tregs) in 9 APS-1 patients and 9 age and gender matched controls. Gene expression (GEX) libraries were generated by using the Library Construction Kit from 10x genomics.	Illumina NovaSeq 6000	18
EGAD50000000261	Single cell TCR sequencing of expanded regulatory T cells (Tregs) in 9 APS-1 patients and 9 ange and gender matched controls. T-cell receptor (TCR) libraries were generated using the Single Cell Human TCR Amplification Kit and the Library Construction Kit from 10x genomics.	Illumina NovaSeq 6000	18
EGAD50000000262	Single cell sequencing of expanded regulatory T cells (Tregs) in 8 APS-1 patients and 8 age and gender matched controls (same patients and controls as for global gene expression and TCR sequencing, excluding one for each group (control sample and patient sample #13)). Each sample has two technical repeats. The 10x Genomics Target Hybridization Kit and Human Immunology Panel were used with GEX libraries.	Illumina NovaSeq 6000	16
EGAD50000000263	Skeletal muscle carries a unique ability for adaptation as well as regeneration that highly depends on a supportive cellular microenvironment. Here we show that spatial and single-cell transcriptomic analysis in human skeletal muscle is a unique approach to illuminate cellular interactions in the microenvironment.	DNBSEQ-G400	12
EGAD50000000264	Despite major advances in linking single genetic variants to single causal genes, the significance of genetic variation on transcript-level regulation of expression, transcript-specific functions, and relevance to human disease has been poorly investigated. Strawberry notch homolog 2 (SBNO2) is a candidate gene in a susceptibility locus with different variants associated with Crohn’s disease and bone mineral density. The SBNO2 locus is also differentially methylated in Crohn’s disease but the functional mechanisms are unknown. Here we show that the isoforms of SBNO2 are differentially regulated by lipopolysaccharide and IL-10. We identify Crohn’s disease associated isoform quantitative trait loci that negatively regulate the expression of the noncanonical isoform 2 corresponding with the methylation signals at the isoform 2 promoter in IBD and CD. The two isoforms of SBNO2 drive differential gene networks with isoform 2 dominantly impacting antimicrobial activity in macrophages. Our data highlight the role of isoform quantitative trait loci to understand disease susceptibility and resolve underlying mechanisms of disease. This dataset contains RNAseq raw data from CD14+ monocyte-derived macrophages and siRNA-mediated knockdown experiments, as well as RNAseq raw data from THP-1 monocytes-derived macrophages following ectopic expression of SBNO2 isoforms.	Illumina NovaSeq 6000	36
EGAD50000000265	Twist Bioscience probes were designed to LINE1 and HERV sequences for targeted amplification. Enriched cDNA libraries were then sequenced on the PacBio Sequel II, 4 samples/SMRTcell. The available files have undergone ccs (using the default parameters except for --min-rq 0.9, which was a recommended parameter for Iso-Seq) and barcode demultiplexing. Recommended analysis follows https://isoseq.how/, starting at the primer removal and demultiplexing step with lima.	Sequel II	31
EGAD50000000266	The breadth and depth at which cancer models are interrogated contribute to successful translation of drug discovery efforts to the clinic. In colorectal cancer (CRC), model availability is limited by a dearth of large-scale collections of patient-derived xenografts (PDXs) and paired tumoroids from metastatic disease, the setting where experimental therapies are typically tested. XENTURION is a unique open-science resource that combines a platform of 129 PDX models and a sister platform of 129 matched PDX-derived tumoroids (PDXTs) from patients with metastatic CRC, with accompanying multidimensional molecular and therapeutic characterization. In this specific dataset we focused our attention on early (passage 3) and late (passage 8-12) PDXTs with their matched PDXs and normal liver	Illumina NovaSeq X	92
EGAD50000000267	Amplicon sequencing	Illumina MiSeq	589
EGAD50000000268	We performed WGS on AutoMACS-separated BM leukemic blasts (CD34+ CD33+) to characterize the breakpoint and genes involved in the translocation. Additionally, we performed RNA-seq of the bone marrow blast cells to explore the functional consequences of the t(7;12) translocation.	Illumina NovaSeq 6000	3
EGAD50000000269	This dataset contains 21 pediatric brain tumor cases which were obtained during surgery using cavitating ultrasonic aspiration and submitted to nanopore whole genome sequencing.	MinION	21
EGAD50000000270	Targeted sequencing using the TruSight Oncology 500 DNA probes panel, on samples 162 and 521	NextSeq 500	4
EGAD50000000271	TruSight Oncology 500 RNA probes panel on case 521	NextSeq 500	2
EGAD50000000272	RNA extracted from FFPE tumour samples 368, 455, 503, and 521, sequenced using the TruSight RNA Fusion Panel	NextSeq 500	4
EGAD50000000273	311 genes		1
EGAD50000000274	This data set contains 16 paired fastq files (WGS) and 4 paired fastq files (WES).	Illumina HiSeq 2500	4
EGAD50000000275	Whole genome sequencing data of 56 high-grade serous carcinoma (HGSC) patients (208 samples) sequenced with NovaSeq 6000	Illumina NovaSeq 6000	208
EGAD50000000276	The synthetic genomes have been created trying to mimic real cancer data of 4 patients (named 185, 186, 187 and 188). Mutations are based on real CRC patients from the PCAWG dataset. For each patient, two tumor samples at different time points and one healthy sample have been simulated. The cancer intra-tumor heterogeneity and evolution in the patients is depicted by simulating reads from tumor subclones separately and then mixing them according to their clonal proportions in each sample. For rapid use and transfer only selected chromosomes have been generated for each patient. Chromosomes per patient: -185: chr4, chr5, chr7, chr17 -186: chr1, chr7, chr12, chr17 -187: chr1, chr2, chr5, chr12, chr17 -188: chr2, chr5, chr12, chr13, chr17 Workflows used to create BAM/BAI, VCF and MAF files from FASTQ (alignment with GRCh38): - https://usegalaxy.eu/published/workflow?id=2c3d05023c02113e - https://usegalaxy.eu/published/workflow?id=1da86d74f8535f4e	unspecified	8
EGAD50000000277	Chemotherapy is the standard-of-care treatment for metastatic colorectal cancer (mCRC) and benefits some patients, but what distinguishes responders from non-responders is unclear. In this study, we leveraged a comprehensive collection of 27 molecularly annotated patient-derived xenografts to uncover functional predictors of response to 5-FU and irinotecan combination therapy (FOLFIRI) in mCRC. Genetic analyses revealed that treatment sensitivity was marked by genomic scars indicative of BRCAness, suggesting homologous recombination (HR) deficiency as a key determinant. Accordingly, we surveyed a manually curated panel of 44 genes with a documented role in HR for the potential presence of pathogenic mutations. We did not observe a specific enrichment of HR gene mutations based on response to FOLFIRI. This result, combined with the absence of widespread biallelic inactivation of the analyzed genes and the predominance of mutations categorized as variants of unknown significance, suggests that FOLFIRI sensitivity is not primarily governed by underlying mutations in HR genes responsible for mitigating the genotoxic effects of this therapeutic regimen.	unspecified	27
EGAD50000000279	Whole genome sequencing profiling of 4 primary PDAC tissue samples - WGS unmapped reads, sequenced using NovaSeq 6000, 15x coverage 160 million reads per sample.	Illumina NovaSeq 6000	4
EGAD50000000280	Whole genome sequencing profiling of 41 PDAC patient-derived organoids (PDO) - WGS unmapped reads, sequenced using NovaSeq 6000, 15x coverage 160 million reads per sample.	Illumina NovaSeq 6000	41
EGAD50000000281	CIRCLE-seq data of PDAC 1 patient-derived organoid (PDO) - Unmapped reads, libraries were sequenced using the Illumina NextSeq500 with the NextSeq 500/550 Mid Output Kit v2.5 (300 Cycles), generating around 10M paired end 150bp reads per sample	NextSeq 500	1
EGAD50000000282	Whole genome sequencing profiling of 7 PDAC patient-derived organoids (PDO) grown in a culture medium lacking both WNT3A and RSPO1- WGS unmapped reads, sequenced using NovaSeq 6000, 15x coverage 160 million reads per sample.	Illumina NovaSeq 6000	7
EGAD50000000283	The dataset comprises data for n=329 participants who underwent saliva sampling. Shotgun metagenomic paired-end sequencing was conducted using the Illumina NovaSeq 6000 platform, and the resulting files are in FASTQ format.	Illumina NovaSeq 6000	331
EGAD50000000285	DM1 patient blood transcriptome samples, mix of 6 patients per sample	Sequel II	3
EGAD50000000286	The capture panel targets the functional methylome in human whole blood. Regions incorporated in the panel design included hypomethylated windows generated from merged WGBS data.	Illumina HiSeq 2000	527
EGAD50000000287	Gut microbiome 16S rRNA raw data for N=7174 FINRISK 2002 participants. FINRISK fecal samples were mailed to the Knight laboratory at the University of California (San Diego, CA), for microbiota sequencing using the standard Earth Microbiome Project protocols (https://earthmicrobiome.org/protocols-and-standards/). DNA was extracted using a magnetic bead-based DNA extraction protocol. Amplicon sequence data for the V4 region of the 16S rRNA gene was generated using 515F (Parada) and 806R (Apprill) primers. For a total of 25μl reaction volume, 13 µl PCR-grade water was combined with 10 µl PCR master mix (Platinum Hot Start PCR Master Mix, 2x, ThermoFisher), 0.5 µl forward primer (10 µM), 0.5 µl reverse primer (10 µM), and 1 µl template DNA. Amplification was performed in triplicate reactions, and triplicate PCR reactions were pooled afterwards. Expected products were visualized on agarose gels (300–350 bp) and quantified with Quant-iT PicoGreen dsDNA kit (Invitrogen). Equal amounts (240 ng) of amplicon were combined for each sample and cleaned (MoBio UltraClean PCR Clean-Up Kit). Cleaned amplicon pools were sequenced with 515F and 806R primers.	Illumina MiSeq	7174
EGAD50000000289	In this study, we characterize premalignant lesions of the fallopian tube (serous tubal intraepithelial carcinomas) to explore the earliest events of tumorigenesis following mutation of the TP53 tumor suppressor gene. We conduct laser capture microdissection to isolate premalignant cells from adjacent normal cells, and subject isolated tissues to RNA-seq. Our findings reveal the earliest transcriptional changes established during premalignancy within the fallopian tube.	Illumina NovaSeq 6000	16
EGAD50000000291	The starting point is one patient from whom iPSCs were created; some of these cells received a genome correction and are called iCtrl, giving two batches (DJ1 and iCtrl). Three independent differentiations to microglia cells were done, named R1, R2 and R3. In addition, growing conditions were either untreated or LPS, leading to the 12 samples.	NextSeq 500	12
EGAD50000000292	Single-cell RNA-sequencing of bronchoalveolar lavage (BAL) samples from patients with severe COVID-19 with or without Dexamethasone treatment and for responders and non-responders was performed using 10x Genomics technology.	Illumina NovaSeq 6000	12
EGAD50000000293	This dataset consists of WES NGS paired-end raw data (FASTQ R1, R2 and UMI sequence) obtained from 25 localised colon cancer patients. Specifically, we have 25 primary tumor tissue samples, 17 metastatic tissues, 25 white blood cell samples, 25 plasma samples collected at relapse, 12 baseline plasma samples and 15 plasma samples post-surgery.	Illumina NovaSeq X	119
EGAD50000000294	Ultra high-resolution chromatin capture data in CACO2, CL11, HT29, SW403, SW480, SW948 MSS CRC cell lines	Illumina NovaSeq 6000	6
EGAD50000000295	ChIP-Seq data for CTCF, H3K4me1, H3K4me3, H3K27ac, H3K27me3, H3K36me3 in C32, CL11, HT29, SW403, SW480, SW948 MSS CRC cell lines	Illumina NovaSeq 6000	6
EGAD50000000296	ATAC-Seq data for C32, CACO2, CL11, HT29, SW403, SW480, SW948 MSS CRC cell lines, and HCEC-1CT normal colon cell line	Illumina NovaSeq 6000	8
EGAD50000000297	RNA-Seq data for C32, CACO2, CL11, HT29, SW403, SW948 MSS CRC cell lines and HCEC-1CT normal colon cell line	Illumina NovaSeq 6000	7
EGAD50000000298	The dataset represents a total of 58 DNA samples from 16 male and 12 female pediatric patients affected with embryonal central nervous system tumors. The samples were subject to whole genome sequencing, WGS, [48 samples, (representing 12 male and 11 female individuals)] and whole exome sequencing, WES, [10 samples, (representing 4 male and 1 female individuals)]. One tumor tissue sample and one peripheral blood sample were analyzed from each of 26 patients, whereas two tumor tissue samples and one peripheral blood sample were analyzed from two patients. The WGS samples were sequenced 2x150 bp paired-end on an Illumina HiSeqX v2.5 instrument, and the WES samples were sequenced 2x100 bp paired-end on an Illumina HiSeq 2500 instrument. The FASTQ files generated were aligned to the human reference genome sequence GRCh38/hg38 using bwa-mem, with the ALT-aware option turned on. Sorting of reads and marking of PCR duplicates was performed with GATK. Base quality score recalibration and joint realignment of reads around insertions and deletions (indels) were conducted using GATK tools. The dataset consists of 58 files in the CRAM format (lossless compression) with a total file size of ~8,8 TB. All CRAM files but one, are derived from one sequence run and one sample. P4551_227N_P4552_112N is a CRAM file where 2 sequence runs (P4551_227N and P4552_112N) from peripheral blood samples from the same individual, P019, were aligned into one single CRAM file. Additional genomic and molecular data (FASTQ, BAM, IDAT, and VCF files) and limited clinical data can be requested by ethically approved projects conducting research in the field of pediatric cancer.	HiSeq X Ten Illumina HiSeq 2500	58
EGAD50000000299	The dataset represents a total of 18 DNA samples from 6 male and 3 female pediatric patients affected with central or peripheral nervous system tumors not classified as embryonal central nervous system tumors, nor gliomas, glioneuronal, or neuronal tumors. One tumor tissue sample and one peripheral blood sample from each patient were subject to whole genome sequencing (WGS) and were sequenced 2x150 bp paired-end on an Illumina HiSeqX v2.5 instrument. The FASTQ files generated were aligned to the human reference genome sequence GRCh38/hg38 using bwa-mem, with the ALT-aware option turned on. Sorting of reads and marking of PCR duplicates was performed with GATK. Base quality score recalibration and joint realignment of reads around insertions and deletions (indels) were conducted using GATK tools. The dataset consists of 18 files in the CRAM format (lossless compression) with a total file size of ~3,4 TB. Additional genomic and molecular data (FASTQ, BAM, IDAT, and VCF files) and limited clinical data can be requested by ethically approved projects conducting research in the field of pediatric cancer.	HiSeq X Ten	18
EGAD50000000300	The dataset represents a total of 85 DNA samples from 22 male and 20 female pediatric patients affected with gliomas, glioneuronal, and neuronal tumors. The samples were subject to whole genome sequencing, WGS, [71 samples, (representing 18 male and 17 female individuals)] and whole exome sequencing, WES, [14 samples, (representing 4 male and 3 female individuals)]. One tumor tissue sample and one peripheral blood sample were analyzed from each of 41 patients, whereas two tumor tissue samples and one peripheral blood sample were analyzed from one patient. The WGS samples were sequenced 2x150 bp paired-end on an Illumina HiSeqX v2.5 instrument, and the WES samples were sequenced 2x100 bp paired-end on an Illumina HiSeq 2500 instrument. The FASTQ files generated were aligned to the human reference genome sequence GRCh38/hg38 using bwa-mem, with the ALT-aware option turned on. Sorting of reads and marking of PCR duplicates was performed with GATK. Base quality score recalibration and joint realignment of reads around insertions and deletions (indels) were conducted using GATK tools. The dataset consists of 85 files in the CRAM format (lossless compression) with a total file size of ~13,3 TB. Additional genomic and molecular data (FASTQ, BAM, IDAT, and VCF files) and limited clinical data can be requested by ethically approved projects conducting research in the field of pediatric cancer.	HiSeq X Ten Illumina HiSeq 2500	85
EGAD50000000301	EBNA2 ChIP-Re-ChIP in primary bulk B cells 4 days post EBV-infection with B95-8. DNA binding elements of the viral master regulator EBNA2, EBNA2-CBF transcription factor complex or EBNA2-EBF1 transcription factor complex were analyzed to identify virally targeted genes.	Illumina NovaSeq 6000	20
EGAD50000000302	PBMCs from six kidney transplant recipients receiving autologous Tregs and donor bone marrow as part of the Trex001 study, and from six control patients receiving neither of the two treatments, were collected pre-transplant and at one, three and six months post-transplant. Donor reactive T-cells were identified by mixed lymphocyte reactions (MLR), and lineage specific T-cell receptor (TCR) repertoires of native T-cells and of proliferating and non-proliferating T-cells from MLRs were determined by next generation sequencing based profiling of the TCR.	NextSeq 2000	181
EGAD50000000303	Targeted cfDNA and WBC sequencing data from patients profiled in the study "Prediction of plasma ctDNA fraction and prognostic implications of liquid biopsy in advanced prostate cancer". Note that 'Exome sequencing' is listed under Dataset Type since there is no option for targeted panel sequencing.	Illumina HiSeq 2500	1228
EGAD50000000304	Organoid cultures were exposed to different E. coli strains and a dye control. In total 25 organoid cultures were whole-genome sequenced using the NovaSeq 6000 platform. The data is deposited in .bam format.	Illumina NovaSeq 6000	25
EGAD50000000305	Naïve (CD27-IgD+) B cells were isolated from buffy coat preparations of healthy donors using CD19 magnetic beads, followed by release of CD19 beads and incubation with IgD-biotin and anti-biotin magnetic beads. B cells were infected with EBV by spinoculation or stimulated with heat-inactivated EBV, and control cells were left uninfected. RNA was extracted immediately after isolation in uninfected B cells. From EBV-infected B cells and B cells stimulated with heat-inactivated virus, RNA was extracted 24 and 96 hours after infection / stimulation.	Illumina NovaSeq 6000	31
EGAD50000000306	B cells were isolated from buffy coat preparations of healthy donors using CD19 magnetic beads. B cells were then infected with EBV using spinoculation or activated with heat-inactivated EBV, respectively. Additionally cells were treated with 5ng/ml CpG or a BCR-crosslinking mixture in presence or absence of 10 µM Linrodostat (IDO1 inhibitor). RNA samples were isolated at 2 days post-infection / post-activation and one day 0 control with non-infected B cells was included.	Illumina NovaSeq 6000	36
EGAD50000000307	This research project was a collaboration between the Karolinska Institute and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 5,876 Bipolar case/control samples from collaborators in Sweden. Genomic DNA from each sample was sequenced to a mean depth of 20x.	HiSeq X Ten Illumina Genome Analyzer IIx	5876
EGAD50000000308	The data set consists of unprocessed RNA-Seq data from 225 patients diagnosed with T cell acute lymphoblastic leukemia in fastq file format. Samples from bone marrow or peripheral blood were subjected to mRNA library prep using Poly-A selection and sequencing on a NovaSeq 6000 system yielding approximately 30 million reads per sample.	Illumina NovaSeq 6000	225
EGAD50000000309	Whole exome sequencing data was mapped to GRCh38 using bwa-mem2 as implemented in the nfcore sarek workflow.	Illumina NovaSeq 6000	13
EGAD50000000310	Genomic data from diffuse large B-cell lymphomas at diagnosis and during treatment. The processed samples were: - Circulating tumor DNA: at diagnosis and after 2 cycles of treatment - Tumour lymph node at diagnosis - Genomic DNA	Illumina NovaSeq 6000 NextSeq 500	154
EGAD50000000311	We identified 2 germline mutations in the DCLRE1B gene encoding the Apollo protein by Whole Exome Sequencing (WES) in two families with inherited clear-cell Renal Cell Carcinoma. The raw data submitted here corresponds to WES performed on Genomic Platform at Gustave Roussy Institute.	Illumina NovaSeq 6000	5
EGAD50000000312	RNAsequencing from 271 samples.	Illumina NovaSeq 6000	271
EGAD50000000313	The dataset contains whole genome sequencing data of 23 high-grade serous carcinoma (HGSC) patients sequenced with NovaSeq 6000. The 89 samples are either fresh frozen tumour samples or blood samples. The files provided are paired fastq files.	Illumina NovaSeq 6000	89
EGAD50000000317	We included 16 fresh tumor samples of biopsy-proven invasive penile squamous cell carcinomas and 6 adult non-malignant inner prepuce samples. Single-cell RNA sequencing was performed (10x genomics). An HPV reference genome of 15 high-risk HPV types was generated and we mapped all single cell reads to the GRCh38 human and HPV reference genome using CellRanger. Targeted next-generation sequencing (tNGS) was performed for the detection of TP53 loss-of-function mutations.	Illumina NovaSeq 6000	22
EGAD50000000318	Shallow whole genome sequencing of 170 samples from 24 esophageal adenocarcinomas. DNA was obtained from FFPE stored material. Illumina HiSeq 4000 was used for sequencing.	Illumina HiSeq 4000	170
EGAD50000000319	This project used NGS (next generation sequencing) in mismatch repair deficient colorectal cancer samples. The project investigated the role of secondary MMR (mismatch repair) gene mutations in tumor evolution. This dataset includes BAM files from multi-region Whole Exome Sequencing of 49 samples from 22 patient tumors.	Illumina NovaSeq 6000	71
EGAD50000000320	Fastq files of bulk RNAseq data from DCIS, invasive and microinvasive breast cancer at diagnosis. This dataset covers 18 DCIS cases, 17 microinvasive and 20 primary invasive breast cancer.	Illumina HiSeq 2500	55
EGAD50000000321	Fastq files from scRNAseq data of cancer associated pericytes-like from 3 patients. CAP were isolated from a total of 3 primary BC (surgical residues prior to any treatment) by using BDFACS ARIA III sorter (BD Biosciences). BC were collected directly from the operating room after surgical specimen macroscopic examination and selection of areas of interest by a pathologist. Samples were cut into small pieces (around 1 mm3) and digested in CO2-independent medium (Gibco #18045-054) supplemented with 150 μg/mL liberase (Roche #05401020001) and Dnase I (Roche #11284932001) for 40 min at 37°C with shaking (180 rpm). After digestion, cells were processed and stained as described above (#Flow Cytometry analysis of BC samples). CAP fibroblasts were then gated on the Live/Dead negative fraction and defined as EPCAM- CD45- CD31- CD235a- FAPMed CD29High. CAP scRNA-seq: Upon isolation, CAP cells were directly collected into RNase-free tubes (Thermo Fisher Scientific, #AM12450) precoated with DMEM (GE Life Sciences, #SH30243.01) supplemented with 10% FBS (Biosera, #1003/500). Single-cell capture, lysis, and cDNA library construction were performed using Chromium system from 10X Genomics, with the following kits: Chromium Single Cell 3′ Library & Gel Bead Kit v2 kit (10X Genomics, #120237) and Chromium Single Cell A Chip Kits (10X Genomics, #1000009). Generation of gel beads in Emulsion (GEM), barcoding, post GEM-reverse transcription cleanup and cDNA amplification were performed according to the manufacturer’s instructions. Cells were loaded accordingly on the Chromium Single cell A chips, and 12 cycles were performed for cDNA amplification. cDNA quality and quantity were checked on Agilent 2100 Bioanalyzer using Agilent High Sensitivity DNA Kit (Agilent, #5067-4626) and library construction followed according to 10X Genomics protocol. Libraries were next run on the Illumina HiSeq (for patient P1) and NovaSeq (for patients P2–3) with a depth of sequencing of 50,000 reads per cell.	Illumina NovaSeq 6000	3
EGAD50000000322	Fastq files from spatial transcriptomic of breast cancer coming from 8 Breast cancer sections. Sample preparation: frozen BC samples were chosen based on tissue structure and RNA quality (RIN > 8). The “Visium Spatial Tissue Optimization Slide and Reagent Kit” (10X Genomics; #PN-1000193) was then used to optimize permeabilization conditions for BC tissues. Briefly, sections were fixed, stained and then permeabilized at different time points to capture mRNA, and the reverse transcription was performed to generate fluorescently labeled cDNA. The permeabilization time that resulted in the highest fluorescence signal with the lowest background diffusion was chosen. The best permeabilization time for BC tissue was 18 min. Cryostat sections of 10 μm of thickness were cut and placed on Visium Spatial Gene Expression slides (10X Genomics, PN-1000184). The slide was incubated for 1 min at 37°C, then fixed with methanol for 30 min at -20°C followed by Hematoxylin and Eosin (H&E) staining and images were taken under a high-resolution microscope. After imaging, the coverslip was detached by holding the slide in water and the slide was mounted in a plastic slide cassette. The spatial gene expression process, including tissue permeabilization, second strand synthesis and cDNA amplification, was performed according to the manufacturer’s instructions (10X Genomics; #CG000239). cDNA quality was next assessed using Agilent High sensitivity DNA Kit (Agilent, #5067-4626). The spatial gene libraries were constructed using Visium Spatial Library Construction Kit (10X Genomics, PN-1000184).	Illumina NovaSeq 6000	8
EGAD50000000323	Fastq files from bulk RNAseq of fibroblasts after culture and facs sorting (N=9). Sorted FAP+ CAF cells RNAs were extracted using Qiagen miRNeasy Kit (Qiagen, #217004) according to the manufacturer's instructions. Verification of RNA integrity and quality was performed using the Agilent RNA 6000 nano Kit (Agilent Technologies, #5067-1511). cDNA libraries were prepared using the TruSeq Stranded mRNA Kit (Illumina, #20020594) followed by sequencing on NovaSeq (Illumina).	Illumina NovaSeq 6000	9
EGAD50000000324	Paired RNA sequencing of additional samples of Thymic epithelial tumors. Uploaded are the paired fastq files, sequencers were Illumina HiSeq 2500, HiSeq 4000, NovaSeq 6000 and HiSeq X Ten. The kits used were Illumina Truseq RNA and Illumina Truseq stranded mRNA.	Illumina HiSeq 4000 Illumina NovaSeq 6000	21
EGAD50000000325	Paired whole genome and exome sequencing of tumor-control pairs of Thymic epithelial tumors, additional samples with paired fastq files, sequenced on Illumina-HiSeq_X_Ten and using the Illumina_TruSeq_Nano_DNA kit.	HiSeq X Ten Illumina HiSeq 4000 Illumina NovaSeq 6000	64
EGAD50000000326	Targeted glioma panel sequencing of pediatric hemispheric high-grade gliomas and diffuse midline gliomas. File type is paired-read fastq files (2 per sample). Sequencing was performed on Illumina NextSeq instruments. Genes covered: TP53, H3F3A, HIST1H3B, HIST1H3C, IDH1, IDH2, KRAS, PIK3CA, TERT promoter, PTPN11, FGFR1/2/3, MYB, BRAF, MYBL1, EGFR, PDGFRA, MYCN, MYC, CDKN2A.	NextSeq 500	140
EGAD50000000327	This dataset contains aligned BAM files from whole-exome sequencing of 29 patients from the Oxel pilot study. Alignment to GRCh38 reference genome was performed using the BWA aligner.	Illumina NovaSeq 6000	29
EGAD50000000328	Single-cell, single-nucleus and CITE-sequencing of neuroblastoma tumors with 10X Genomics	Illumina NovaSeq 6000	22
EGAD50000000329	CITEseq was performed as outlined in Biolegend ‘TotalSeq™-A Antibodies and Cell Hashing with 10x Single Cell 3' Reagent Kit v3 3.1 Protocol’ with minor modifications, using Biolegend oligo-conjugated antibodies and streptavidin TotalSeq reagents. Briefly, ADAPT-NK cells were stained with CD56-biotin mAb (Miltenyi, clone REA196), followed by TotalSeq antibodies and streptavidin-PE and Live/Dead Aqua (Invitrogen). Cells were subsequently sorted for viable CD56+ cells by flow cytometry.	Illumina HiSeq 4000	2
EGAD50000000330	Repeated Sampling Experiment containing 135 fastq files of RNA sequencing. These represent time series of different organ areas sampled during autopsies.	Illumina NovaSeq 6000	135
EGAD50000000331	20 fastq files of RNA-sequencing of the 2 main histopathological growth patterns observed in liver metastases.	unspecified	20
EGAD50000000332	Dataset containing scRNA and scTCR sequencing of 7 patients with cutaneous T cell lymphoma. Sequencing was performed on Illumina NextSeq 550, HiSeq 4000 and NovaSeq 6000. The sequencing was always paired.	Illumina HiSeq 4000 Illumina NovaSeq 6000 NextSeq 550	14
EGAD50000000333	RNA-seq data from nasal and bronchial tissues in 649 subjects, many with lung cancer. Lung cancer is the leading cause of cancer-related death in the world. In contrast to many other cancers, a direct connection to lifestyle risk in the form of cigarette smoke has long been established. More than 50% of all smoking-related lung cancers occur in former smokers, often many years after smoking cessation. Despite extensive research, the molecular processes for persistent lung cancer risk are unclear. CT screening of current and former smokers has been shown to reduce lung cancer mortality by up to 26%. To examine whether clinical risk stratification can be improved upon by the addition of genetic data, and to explore the mechanisms of the persisting risk in former smokers, we have analyzed transcriptomic data from accessible airway tissues of 487 subjects. We developed a model to assess smoking associated gene expression changes and their reversibility after smoking is stopped, in both healthy subjects and clinic patients. We find persistent smoking associated immune alterations to be a hallmark of the clinic patients. Integrating previous GWAS data using a transcriptional network approach, we demonstrate that the same immune and interferon related pathways are strongly enriched for genes linked to known genetic risk factors, demonstrating a causal relationship between immune alteration and lung cancer risk. Finally, we used accessible airway transcriptomic data to derive a non-invasive lung cancer risk classifier. Our results provide initial evidence for germline-mediated personalised smoke injury response and risk in the general population, with potential implications for managing long-term lung cancer incidence and mortality.	Illumina HiSeq 2500 Illumina HiSeq 4000	649
EGAD50000000334	This data set contains the raw FASTQ files and processed files of scRNA-seq experiments which were conducted on 3 EMD samples. The libraries were generated using 10x Genomics Dual Index Single Cell 3' v3.1. The processed files are TSV files of features, barcodes, and gene expression matrix.	NextSeq 2000	3
EGAD50000000335	This object contains raw FASTQ files and processed files of the sections of entire EMD lesions of patients PT01A and PT10. The raw files represent bulk RNA sequencing of each of the sections. The processed file is a CSV file that contains gene expression.	NextSeq 2000	2
EGAD50000000336	This data set contains raw FASTQ files and processed files of spatial transcriptomics of 6 EMD samples collected from MM patients. The processed files contain the h5 expression matrices, the Image of the Visium slide, and a TSV of spatial coordinates. Patients included in this data set are PT01A, PT01B, PT02, PT03, PT07, PT08, PT09, PT10, and PT11.	NextSeq 2000	11
EGAD50000000337	This dataset includes WES and RNAseq for 7 patients with metastatic melanoma (4), non-small cell lung adenocarcinoma (1), cervix adenocarcinoma (1), and epidermoid nasopharyngeal carcinoma (1), enrolled in phase I clinical trials (NCT03475134, NCT04643574 and NCT05195619). WES was performed on matched cancer and healthy tissues, whereas RNAseq was performed on cancer tissues, using Illumina HiSeq 4000/6000 and Illumina NextSeq 500/550 systems.	Illumina HiSeq 4000 Illumina NovaSeq 6000 NextSeq 500 NextSeq 550	33
EGAD50000000338	The dataset contains scRNA-seq gene expression matrix (csv file) and metadata (csv file) after quality control filtering for data generated with the Smart-seq2 protocol. Transcript expression was quantified with Salmon v0.11.3 using cDNA sequences from GRCh38.94 and k-mer length 25, and was aggregated to gene level and transcript-length-corrected using tximport v1.8.0. The dataset comprises 355 cells, both IgA+ transglutaminase 2-specific and other IgA+ B cells, from the peripheral blood of two untreated celiac disease patients.	NextSeq 500	2
EGAD50000000339	The dataset contains processed sequencing data from Chromium Single Cell 5’ gene expression, human B cell VDJ and feature barcode (CSP) sequencing from transglutaminase 2-specific and other small intestinal plasma cells isolated from four untreated celiac disease patients. The raw sequencing data has been processed with Cell Ranger v.6.0.2 with the multi and aggr functions using the pre-built Cell Ranger references GRCh38 version 2020-A for gene expression and GRCh38-alts-ensembl-5.0.0 for V(D)J analysis. The dataset consists of a gene expression and antibody capture expression matrix (cell barcodes and feature names in tsv.gz file, expression matrix in mtx.gz file) and VDJ sequences in AIRR format (csv file). A metadata file (csv file) details cells passing our custom quality control based on number of detected genes, UMIs, mitochondrial genes, immunoglobulin genes and a productively rearranged immunoglobulin heavy chain of the IgA isotype.	Illumina NovaSeq 6000	4
EGAD50000000340	Raw scRNA-seq data from 355 IgA+ peripheral blood B-lineage cells of two untreated celiac disease patients. The data was generated with the Smart-seq2 protocol and sequenced on a NextSeq500 instrument (Illumina) with 75 bp paired-end reads in high-output mode. The dataset contains R1 and R2 reads for each single cell (fastq.gz files) for cells passing quality control based on number of detected genes, reads, mitochondrial genes, reads mapping to the reference transcriptome and a productively rearranged immunoglobulin heavy chain IgA isotype reconstructed by the computational tool BraCeR. Metadata for the cells is provided in a csv file.	NextSeq 500	2
EGAD50000000341	The dataset contains reconstructed VDJ sequences (fasta files) and accompanying metadata for each cell (csv file) from scRNA-seq data generated with the Smart-seq2 protocol. The VDJ sequences were reconstructed with the computational tool BraCeR using raw fastq files as input. The dataset contains sequences from 355 IgA+ peripheral blood B-lineage cells of two untreated celiac disease patients. The sequences comprise both IgA+ transglutaminase 2-specific and other IgA+ B cells.	NextSeq 500	2
EGAD50000000342	The dataset contains raw fastq files (fastq.gz) for Chromium Single Cell 5’ gene expression (GEX), human B cell VDJ and feature barcode (CSP) sequencing from transglutaminase 2-specific and other small intestinal plasma cells isolated from four untreated celiac disease patients. Single cell 5’ gene expression, V(D)J-enriched and cell surface protein libraries were generated using Chromium single cell kits, and barcoded cDNA from a total of 5,000-10,000 cells per sample was generated using the 10x Genomics Chromium Controller. The libraries were pooled prior to sequencing on a NovaSeq 6000 instrument (Illumina) using the following configuration: read 1: 26 cycles, read 2: 89 cycles, index read 1: 8 cycles.	Illumina NovaSeq 6000	4
EGAD50000000343	The dataset contains VDJ sequences in FASTA format, the same sequences run through IMGT/HighV-QUEST (tsv file) and accompanying metadata (csv file) including antigen specificity (transglutaminase 2-specific or other). The data was generated from cultured single B-cell clones from the peripheral blood of four untreated celiac disease patients. Sequences were obtained by a nested RT-PCR approach targeting the immunoglobulin chains followed by Sanger sequencing.		4
EGAD50000000344	The data published here contains bulk RNA-sequencing (RNAseq) data obtained from monocyte-derived dendritic cells treated with/without LPS and with/without CESi (WWL113). Sequencing was performed in a paired-end fashion on the NovaSeq 6000.	Illumina NovaSeq 6000	24
EGAD50000000345	Type 1 diabetes mellitus (T1DM) is a prototypic endocrine autoimmune disease resulting from an immune-mediated destruction of pancreatic insulin-secreting beta-cells. A comprehensive immune cell phenotype evaluation in T1DM has not been performed thus far at the single-cell level. In this cross-sectional analysis, we generated a single-cell transcriptomic dataset of peripheral blood mononuclear cells (PBMCs) from 46 manifest T1DM (Stage 3) cases and 31 matched controls. Our study reveals a surprisingly strong systemic dimension at the level of the immune cell network in T1DM, defines disease-relevant molecular subtypes and has the potential to guide non-invasive test development and patient stratification.	Illumina HiSeq 4000 unspecified	22
EGAD50000000346	We sequenced the genomes of 141 Korean never-smoker lung adenocarcinoma patients, excluding EGFR and ALK alterations. We prepared libraries with the TruSeq DNA Library Prep Kit and sequenced them on the Illumina NovaSeq 6000 instrument, yielding paired reads of approximately 150 bp in size. Afterward, we processed the FASTQ files using the GATK Best Practice pipeline.	Illumina HiSeq 4000	281
EGAD50000000347	RNA-seq data from CD20+ sorted cells obtained from peripheral blood and lymph node Follicular Lymphoma patients. These samples were unstimulated after thawing and additionally, peripheral blood samples were stimulated during 7 days in culture as described in Dobaño-López C et al. Stranded RNA-seq libraries were performed using the TruSeq library kit (Illumina, San Diego, CA, USA). Libraries were sequenced on a NextSeq 2000 (Illumina) in a 2x50bp length.	NextSeq 2000	12
EGAD50000000348	Sixty-eight patients with advanced prostate cancer in castration-resistant or castration-sensitive settings undergoing treatment at the University Hospital Basel or the St. Claraspital Basel (Switzerland) were selected for targeted parallel sequencing analysis on liquid biopsy (plasma cfDNA) and matched formalin-fixed, paraffin-embedded (FFPE) tumor tissue samples. The liquid biopsy sequencing (plasma cfDNA) was performed using a custom-designed targeted AmpliSeq HD Prostate Cancer cfDNA panel on all 68 patients. The sequencing on 42 matched FFPE tumor biopsy samples was performed using a custom-designed or an alternative commercial panel (ThermoFisher). Raw data underwent automated processing on the Ion Torrent Server v5.16.1 (ThermoFisher) and were aligned to the hg19 reference genome using the Torrent Alignment Software (ThermoFisher).	Ion GeneStudio S5 Prime	129
EGAD50000000349	We treated patient-derived cell lines with effective drug inhibitors that we discovered. To understand the mechanisms behind how the inhibitor can effectively inhibit tumor growth, we performed a series of RNA-Seq, Histone ChIP-Seq, TF ChIP-Seq and ATAC-Seq profiling for samples with and without treatment with the inhibitor. There are 18 RNA-Seq, 20 ATAC-Seq and 182 ChIP-Seq experiments performed. This includes 2 biological replicates. Novaseq 6000 was used to sequence the samples. Library preparation was performed in house for ChIP-Seq and ATAC-Seq, using NEB indexes, while libraries for RNA-Seq were constructed by the sequencing company hired. Total RNA was sequenced. Sequencing parameters used are PE150, bi-directional sequencing. For ATAC-Seq, sequencing was performed up to 40M reads, for ChIP-Seq, sequencing was performed up to 20M reads and for RNA-Seq, sequencing was performed up to 20M to 40M reads. Fastq files are uploaded here.	Illumina HiSeq 4000 Illumina NovaSeq 6000 Illumina NovaSeq X	220
EGAD50000000350	Patients with endocrine-resistant breast cancer in Stockholm, Sweden. DNA was obtained from patients' primary and relapse tumors, with tumor-free lymph nodes used as germline control. DNA was extracted from formalin-fixed paraffin-embedded tissue and sequenced by 370-gene panel-based sequencing with Kapa HyperPlus library preparation and Twist Bioscience hybrid capture, using custom bait sets (panels) from Twist Bioscience. Sequencing was paired-end 2x150 bp using NovaSeq X. The data is presented as fastq files.	Illumina NovaSeq X	54
EGAD50000000351	Fresh peripheral blood mononuclear cells of four human donors were cultured together with either lung adenocarcinoma A549 cancer cells or A549-expressing H1N1 Sialidase cancer cells. These treatments induced the differentiation of donor cells into immunosuppressive MDSC-like cells, which were further subjected to single-cell RNA sequencing.	Illumina NovaSeq 6000	8
EGAD50000000352	This dataset contains 241 samples sequenced with immunogene panel (2533 genes). The samples are sorted CD4+ or CD8+ T cells, skin, or fibroblast samples from patients with various hematological disorders (n=90) and healthy blood donors (n=21). The detailed description of sample processing, sequencing, and read alignment can be found in the publication (Somatic mutations associate with clonal expansion of CD8+ T cells, PMID: [will be updated])	Illumina HiSeq 2500	241
EGAD50000000353	Sequencing libraries were prepared and barcoded using the unique molecular identifier and index tagging following the VariantPlex Somatic Protocol (ArcherDx). Pool-library was loaded at 1.2 pM concentration with 20% PhiX and paired-end sequencing was performed using the NextSeq 500 Illumina sequencer using 300 cycle high output reagent kit.	NextSeq 500	28
EGAD50000000354	Multi-region tumor samples were cut from frozen sections or FFPEs and reviewed from microdissection and H&E staining in order to select ones with high cellularity. DNA extraction was done using DNeasy Blood & Tissue Kits for frozen samples following the manufacturer’s guideline. Extracted DNAs were processed on an Illumina HiSeq 2500 in a paired end mode (100x100) using a custom targeted panel based on the list of all unique somatic mutations from the original WES data by the Integrated Genomics Operation (IGO) at Memorial Sloan Kettering Cancer Center (New York, NY).	Illumina HiSeq 2500	244
EGAD50000000355	70 bam files generated from deep whole exome sequencing from samples from oesophageal adenocarcinoma from 17 patients. 17 bam files generated from deep whole exome sequencing from matching blood (germline control) from patients with oesophageal adenocarcinoma. Samples were collected within the clinical MEMORI trial.	Illumina NovaSeq X	97
EGAD50000000356	Bulk B Cell Receptor high-throughput sequencing data across 25 serial breast tumour biopsies obtained from 10 patients during neoadjuvant therapy. The samples were sequenced on an Illumina MiSeq instrument and their raw FastQ files deposited here.	Illumina MiSeq	25
EGAD50000000357	80 bam files generated from 3'RNAseq of tumour biopsies from oesophageal adenocarcinoma. Samples were collected within the clinical MEMORI trial.	Illumina NovaSeq X	80
EGAD50000000358	Single patient case of HER2-Positive Metastatic Extramammary Paget’s Disease	Illumina NovaSeq 6000	2
EGAD50000000359	Pig-to-human xenotransplantation is rapidly approaching the clinical arena; however, it is unclear which immunomodulatory regimens will effectively control human immune responses to pig xenografts. We transplanted a gene-edited pig kidney into a brain-dead human recipient on pharmacologic immunosuppression and studied the human immune response to the xenograft using spatial transcriptomics and single-cell RNA sequencing. Human immune cells were uncommon in the porcine kidney cortex early after xenotransplantation and consisted of primarily myeloid cells. Both the porcine resident macrophages and human infiltrating macrophages expressed genes consistent with an alternatively activated, anti-inflammatory phenotype. No significant infiltration of human B or T cells into the porcine kidney xenograft was detected. Altogether, these findings provide proof of concept that conventional pharmacologic immunosuppression is sufficient to restrict infiltration of human immune cells into the xenograft early after compatible pig-to-human kidney xenotransplantation.	Illumina NovaSeq 6000	11
EGAD50000000360	We analyzed 264 plasma samples collected between June 2016 and September 2021 from 63 epithelial ovarian cancer patients using tumor-guided plasma cell-free DNA analysis to detect residual disease after treatment.	Illumina NovaSeq 6000	1
EGAD50000000361	RNA sequencing was performed on 108 NSCLC tumor samples and their paired adjacent normal tissues (n=21) to identify associations with clinical and immune characteristics.	Illumina HiSeq 2000	129
EGAD50000000362	CYLD cutaneous syndrome (CCS) is a rare autosomal dominant disorder characterized by germline CYLD mutations and by multiple benign skin tumors dependent on the NF-kB pathway. We assembled a large cohort of rare CCS skin tumors that was profiled with whole exome or genome sequencing, RNA sequencing and methylation arrays to better understand the genetic mechanisms of CCS tumorigenesis.	BGISEQ-500	39
EGAD50000000363	We performed whole transcriptome RNA sequencing on serial tumour biopsies collected at baseline, Day 14, and after completion of all neoadjuvant therapy from the NA-PHER2 trial. Patients received neoadjuvant treatment with the combined regimen of trastuzumab, pertuzumab, palbociclib with or without addition of fulvestrant. Transcriptomic profiles were generated from 143 samples (Baseline n = 53, Day 14 n = 49, Surgery n = 41) corresponding to 53 of the 58 patients enrolled in the NA-PHER2 trial (91.4%). RNA sequencing was performed on total RNA samples derived from formalin-fixed, paraffin-embedded (FFPE) tissue sections. RNA-Seq libraries were produced using the NEBNext® Ultra™ II Directional RNA Library Prep Kit. The capture was then performed on cDNA libraries with the Twist Human Core Exome Enrichment System according to supplier recommendations (Twist Bioscience). The eluted, enriched DNA samples were then sequenced on an Illumina NovaSeq as paired-end 100 bp reads. FASTQ files are provided.	unspecified	147
EGAD50000000364	Single nuclei RNAseq data from 14 HGSOC primary tumour samples	Illumina NovaSeq 6000	14
EGAD50000000366	Source data of clinical study data corresponding to figures reported in the paper titled: Anti-TIGIT antibody improves PD-L1 blockade through myeloid and Treg cells. PMID: 38418879 DOI: 10.1038/s41586-024-07121-9		293
EGAD50000000367	Source data of clinical study data corresponding to figures reported in the paper titled: Anti-TIGIT antibody improves PD-L1 blockade through myeloid and Treg cells. PMID: 38418879 DOI: 10.1038/s41586-024-07121-9		293
EGAD50000000368	Source data of clinical study data corresponding to figures reported in the paper titled: Anti-TIGIT antibody improves PD-L1 blockade through myeloid and Treg cells. PMID: 38418879 DOI: 10.1038/s41586-024-07121-9		293
EGAD50000000369	Matrix of counts from the serum peptide mass spec data from patients enrolled in the CITYSCAPE trial, and the sample annotation. Specifically, serum samples at C1D1, C2D1, C3D1, SCRN were collected from a total of 132 patients and subject to Mass Spec at Biognosys. The current study focused on patients who have samples at both C1D1 and C2D1 (n = 64 pairs). Associated metadata also included.		293
EGAD50000000370	Matrices of counts from single-cell RNA-seq data and single-cell CITE-seq data collected from 16 patients enrolled in GO30103 trial, and the cell level annotation. Specifically, PBMCs at C1D1, C1D15 (2 weeks after treatment), C2D1 (3 weeks after treatment) and C4D1 (9 weeks after treatment) were collected and subject to 10x Genomics protocol. Associated metadata also included.		293
EGAD50000000371	This dataset is related to the NeoBCC trial and includes 12 human tissue samples	Illumina NovaSeq 6000	12
EGAD50000000373	Genomic alterations accumulate in the somatic cells throughout an individual’s lifetime. Recent sequencing studies have documented widespread mutations in the nuclear genome and the frequent clonal competition of normal cells carrying mutations. However, the landscape of mitochondrial DNA (mtDNA) heteroplasmy in normal human tissues is poorly understood. This study investigated the whole genome sequences (WGSs) of 2,096 clones established from non-neoplastic healthy single cells obtained from 31 donors. In addition, we analyzed 31 WGSs of neoplastic cells, including 12 clones established from adenomatous polyps from one individual with MUTYH-associated polyposis and 19 matched colorectal carcinomas from individuals who donated normal colorectal clones.	Illumina NovaSeq 6000	823
EGAD50000000375	We performed a systematic, genome-wide investigation of enhancer regions in colorectal cancer (CRC). We identified 12,117 putative enhancer regions using H3K27ac and H3Kme1 ChIP-seq and ATAC-seq. We performed scRNA-seq in HT29 and SW480 (MSS CRC cell lines) using the Parse Biosciences WT-mega kit with CRISPRi/dCas9 inhibition of these regions (Perturb-seq). The Parse split-pipe pipeline was used to demultiplex the raw fastq files into the processed files (mtx files for genes and gRNA) for each cell line.	Illumina NovaSeq 6000	2
EGAD50000000376	Single-cell RNA-seq profiling of effector (KLRG1+ PD1-), transitional (KLRG1+ PD1-intermediate), dysfunctional (CD39+ PD1+) and memory (IL7Rα+) CD8 T cells isolated from 4 human tumors (1 melanoma, 1 renal carcinoma and 2 ovarian carcinoma). Respective cell populations were identified and isolated using FACS. FASTQ files contain gene expression data, feature barcode antibody capture or TCR data on individual CD8 T cell subsets across the 4 different samples.	Illumina NovaSeq 6000	12
EGAD50000000377	Single-cell RNA-seq profiling of immune cells from human ovarian carcinoma, renal carcinoma and melanoma samples after 48h of ex vivo culture using the patient-derived tumor fragment platform (5 samples total). Samples were cultured in various conditions, including: untreated, CD8-IL2v-treated, CD8-IL2v + LCKi -treated, aPD1-treated, CD8-IL2v + aPD1 -treated, Untargeted IL2-treated, aCD3-treated and aCD3 + CD8-IL2-treated . FASTQ files contain gene expression data, feature barcode antibody capture or TCR data on immune cells from all conditions combined per patient.	Illumina NovaSeq 6000	15
EGAD50000000378	FFPE tissue and liquid biopsy (blood, pleural and peritoneal effusions) samples, processed with the cfRRBS (cell-free reduced representation bisulfite sequencing) protocol.	unspecified	181
EGAD50000000379	TCRseq data from Lauss et al Nat Comm 2024: Molecular patterns of resistance to immune checkpoint blockade in melanoma.	NextSeq 500	31
EGAD50000000380	Whole Exome Sequencing data from Lauss et al Nat Comm 2024. Molecular patterns of resistance to immune checkpoint blockade in melanoma.		74
EGAD50000000382	The dataset consists of DNA and RNA sequencing results and metadata of the samples. All sample numbers starting with 6716 are tumor samples which have been sequenced using WES (see BAM files). The samples are biopsies of metastatic lesions from patients with BRAFV600 mutated melanoma, obtained before, during and after the study treatment (see samples metadata), and in some cases blood for germline mutation analysis. Sequencing was performed using the Illumina NovaSeq 6000 system.	Illumina NovaSeq 6000	41
EGAD50000000383	For RNA sequencing: All sample numbers starting with 6717 are tumor samples which have been sequenced using transcriptomics (see BAM files). The samples are biopsies of metastatic lesions from patients with BRAFV600 mutated melanoma, obtained before, during and after the study treatment (see samples metadata). RNA sequencing was performed using the Illumina NovaSeq 6000 system.	Illumina NovaSeq 6000	30
EGAD50000000384	This dataset contains a tumor + normal DNA sequence data and tumor RNA seq data for a medulloblastoma patient.	unspecified	2
EGAD50000000385	The data pertain to longitudinal transcriptomic data measured from blood obtained from patients with Crohn's disease who were starting treatment with vedolizumab. Samples were obtained prior to treatment and approximately 26 weeks into treatment during response assessment. At response assessment, patients were classified as responders (R) or non-responders (NR) based on a strict combination of endoscopic, biochemical and clinical criteria: ≥50% reduction in the endoscopic SES-CD score, corticosteroid-free clinical remission (≥3 point drop in HBI or HBI ≤4 and no systemic steroids) and/or biochemical response (C-reactive protein (CRP) and fecal calprotectin reduction ≥50% or ≤5 mg/L and fecal calprotectin ≤250 µg/g). Modified response was defined as a combination of corticosteroid-free clinical (HBI ≤4) and biochemical (CRP ≤5 mg/L and/or fecal calprotectin ≤250 µg/g) remission between week 26-52 without treatment change through week 52. Transcriptomic analysis was conducted through RNA sequencing, wherein mRNA was extracted utilizing the QIAsymphony system, converted into cDNA and sequenced in a paired-end format on the Illumina NovaSeq 6000 at the Amsterdam UMC Core Facility Genomics, generating a dataset comprising 40 million 150 bp reads.	Illumina NovaSeq 6000	38
EGAD50000000386	The data pertain to longitudinal transcriptomic data measured from blood obtained from patients with Crohn's disease who were starting treatment with ustekinumab. Samples were obtained prior to treatment and approximately 26 weeks into treatment during response assessment. At response assessment, patients were classified as responders (R) or non-responders (NR) based on a strict combination of endoscopic, biochemical and clinical criteria: ≥50% reduction in the endoscopic SES-CD score, corticosteroid-free clinical remission (≥3 point drop in HBI or HBI ≤4 and no systemic steroids) and/or biochemical response (C-reactive protein (CRP) and fecal calprotectin reduction ≥50% or ≤5 mg/L and fecal calprotectin ≤250 µg/g). Modified response was defined as a combination of corticosteroid-free clinical (HBI ≤4) and biochemical (CRP ≤5 mg/L and/or fecal calprotectin ≤250 µg/g) remission between week 26-52 without treatment change through week 52. Transcriptomic analysis was conducted through RNA sequencing, wherein mRNA was extracted utilizing the QIAsymphony system, converted into cDNA and sequenced in a paired-end format on the Illumina NovaSeq 6000 at the Amsterdam UMC Core Facility Genomics, generating a dataset comprising 40 million 150 bp reads.	Illumina NovaSeq 6000	47
EGAD50000000387	The current data pertain to RNA-sequencing reads obtained from thyroid samples acquired from fetuses with Down syndrome and fetuses with no genetic/developmental abnormality. Total RNA was isolated from the left lobe of thyroid samples using a hand-held homogenizer and the Promega ReliaPrep RNA Miniprep System (Thermo Fisher Scientific). RNA yield was determined with the NanoDrop Microvolume Spectrophotometer (Thermo Fisher Scientific). Fragmentation and mRNA library preparation were performed using the Kapa mRNA Hyperprep Kit (Roche, Basel, Switzerland). Libraries were pooled equimolarly and quality was checked on a TapeStation system using the DNA1000 ScreenTape (Agilent Technologies, Santa Clara, CA, USA). Libraries were sequenced with poly(A) selection to sequence all messenger RNA for gene expression analysis on the NovaSeq6000 PE150 (Illumina, San Diego, CA, USA), producing at least 40M 150-bp paired-end reads per library.	Illumina NovaSeq 6000	12
EGAD50000000388	These are genomic and transcriptomic high risk molecular segments for 167 samples collected at baseline from deidentified patients enrolled in cohorts A,B or D in the CC-220-MM-001 clinical study. Data are provided in a patient call table in csv format. Calls are generated from WGS data with mutations called by mutect2 best practices pipeline, CNV calls from Battenberg; and RNA-seq aligned with star aligner and quantified by Salmon.	Illumina Genome Analyzer	167
EGAD50000000389	This dataset contains 228 paired fastq files sequenced with Illumina Novaseq 6000, and the file with the sample and clinical data.	Illumina NovaSeq 6000	114
EGAD50000000391	This dataset contains unpaired FASTQ files of 316 endometrial cancer cases and 316 matched controls representing circulating RNAs. RNA count files are provided as the output of sncRNA pipeline and represent 12 RNA types (isomiR, lncRNA, precursor miRNA, miRNA, miscRNA, mRNA, piRNA, scaRNA, snoRNA, snRNA, tRF, and tRNA). We also provide metadata of the samples.	Illumina NovaSeq 6000	632
EGAD50000000392	This research project was a collaboration between Trinity College Dublin, Ireland and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 191 Bipolar case/control samples from collaborators in Ireland. Genomic DNA from each sample was sequenced to a mean depth of 20x. The project used Illumina WXS sequencing of DNA and the file type is cram.	HiSeq X Ten	191
EGAD50000000393	The dataset includes WES sequencing data on pre-treatment biopsies of lymph node metastasis (n=79). The technology used for sequencing is Illumina HiSeq. The PRADO trial tested a personalized response-directed treatment approach based on the pathologic response after neoadjuvant ipilimumab plus nivolumab in stage III melanoma patients. In patients achieving a major pathologic response in their index lymph node (the largest lymph node metastasis at baseline), therapeutic lymph node dissection (TLND) and adjuvant therapy were omitted. Patients with partial response underwent TLND only, whereas patients with pathologic non-response underwent TLND and adjuvant systemic therapy ± synchronous radiotherapy.	Illumina NovaSeq 6000	158
EGAD50000000394	The data set is composed of single cell transcriptomics data for adrenal glands from 11 deceased organ donors (10x 3') and spatial transcriptomic data for adrenal glands from 4 deceased organ donors (10x Visium).	Complete Genomics Illumina HiSeq 4000 NextSeq 2000	17
EGAD50000000395	RNA sequencing of 168 pulmonary samples including lung preneoplasia atypical adenomatous hyperplasia (AAH, N=38), adenocarcinoma in situ (AIS, N=22), minimally invasive adenocarcinoma (MIA, N=19) and invasive lung adenocarcinoma (ADC, N=38) and adjacent lung tissues (Normal, N=62).	Illumina NovaSeq 6000	168
EGAD50000000396	Whole genome sequencing of 42 pulmonary samples including lung preneoplasia atypical adenomatous hyperplasia (AAH, N=5), adenocarcinoma in situ (AIS, N=7), minimally invasive adenocarcinoma (MIA, N=6) and invasive lung adenocarcinoma (ADC, N=8) and adjacent lung tissues (Normal, N=16).	Illumina NovaSeq 6000	42
EGAD50000000397	Multi-region exome sequencing of 271 pulmonary nodules and matched adjacent lung tissue including lung preneoplasia atypical adenomatous hyperplasia (AAH, N=49), adenocarcinoma in situ (AIS, N=42), minimally invasive adenocarcinoma (MIA, N=37), invasive lung adenocarcinoma (ADC, N=86) and adjacent lung tissues (Normal, N=57).	Illumina NovaSeq 6000	271
EGAD50000000398	Nasal swabs were collected from COVID-19 patients during the acute phase of infection as well as 3 and 12 months post-infection. Cells isolated from nasal swabs were subjected to whole-genome enzymatic DNA methylation sequencing (n=33) and scRNA-seq (n=24). For n=12 former COVID-19 patients, follow-up samples were collected at 3 and 12 months post-infection and subjected to scRNA-seq.	Illumina NovaSeq 6000	45
EGAD50000000400	This dataset contains single-cell BCR sequencing data generated by the Cellranger pipeline (v3.1.0, 10X Genomics) from 5 LN and 1 PB samples of 5 patients with T follicular helper cell lymphomas (TFHLs) as well as 7 homeostatic LN (HLN) samples.		13
EGAD50000000401	This dataset contains WES data analyzed using the Genomon2 pipeline (v.2.6.2, https://github.com/Genomon-Project) from 14 patients with T follicular helper cell lymphomas (TFHLs). A list of somatic mutations called by the Genomon2 pipeline for each sample is provided as a txt file.		32
EGAD50000000402	This dataset contains single-cell count data generated by the Cellranger pipeline (v3.1.0, 10X Genomics) from 9 LN and 16 PB samples of 14 patients with T follicular helper cell lymphomas (TFHLs) as well as 7 homeostatic LN (HLN) samples.		32
EGAD50000000403	This dataset contains single-cell TCR sequencing data generated by the Cellranger pipeline (v3.1.0, 10X Genomics) from 9 LN and 16 PB samples of 14 patients with T follicular helper cell lymphomas (TFHLs) as well as 7 homeostatic LN (HLN) samples.		32
EGAD50000000404	The abscopal effects of radiation may sensitize immunologically “cold” tumors to immune checkpoint inhibition (ICI). We investigated the immunostimulatory effects of radiotherapy leveraging multi-omic analyses of serial tissue and blood biospecimens (n=293) from a phase 2 clinical trial of stereotactic body radiation therapy (SBRT) followed by pembrolizumab in metastatic non-small cell lung cancer (NSCLC; NCT02492568). Patients with immunologically-cold tumors (low tumor mutation burden, null PD-L1 expression, WNT-pathway mutated) in the SBRT arm had significantly longer progression-free survival compared to ICI alone (P<0.05). Induction of interferon-gamma, interferon-alpha, and antigen processing and presentation gene sets was significantly enriched post SBRT in non-irradiated tumor sites (FDR adjusted P<0.01). Significant on-therapy expansions of new and pre-existing TCR clones in both the tumor and blood compartments were noted in the SBRT arm (P<0.05). These findings support the systemic anti-tumor effects of immuno-radiotherapy and may open a therapeutic window of opportunity to overcome resistance to ICI.	Illumina HiSeq 2000 Illumina NovaSeq 6000	218
EGAD50000000405	Sequencing results of single-cell transcriptome and antibody libraries from two biological experiments of cord blood progenitor cells. Four samples from different time points were processed using 10X Genomics Chromium Next GEM Single Cell 3’ Reagent Kits v3.1. Gene expression and antibody-derived libraries were sequenced separately.	NextSeq 2000	16
EGAD50000000406	scRNA sequencing and scTCR sequencing on three tumor lesions derived from one patient receiving adoptive tumor-infiltrating lymphocyte (TIL) therapy. Expanded metastasis was used for TIL expansion and yielded the TIL product which also was used for scTCR sequencing. 10 weeks after transfer of the TIL product the regressing metastasis was resected. After the patient progressed the progressing metastasis was removed (61 weeks).	Illumina NovaSeq 6000	4
EGAD50000000407	Sequencing files for TOPARP-B patients as described in manuscript	Illumina NovaSeq 6000 NextSeq 500	255
EGAD50000000408	This dataset contains RNA-sequencing data of 169 IDH-mutant astrocytoma samples included in the GLASS-NL cohort	Illumina NovaSeq 6000	169
EGAD50000000410	We profile the adult human CNS from distinct regions, sexes and ages for chromatin accessibility at the single-nucleus level. Tissue was collected from 20 different donors (10 male, 10 female) between 34 and 74 years of age. Additionally, we performed single-cell nanoCUT&Tag H3K27ac and H3K27me3 profiling for 4 of the donors and a multiome single-cell assay (RNA-seq+ATAC-seq) for 2 of the donors. Each donor donated fresh frozen white matter from the following three tissue regions: primary motor cortex (Brodmann area 4, BA4), arbor vitae cerebelli (CB) and fasciculi cuneatus and gracilis from the cervical spinal cord (CSC). Tissue was processed semi-randomly, ensuring that each batch of experiments had a representation of both sexes and all three tissue regions. Completed libraries were again randomly multiplexed during sequencing to minimize batch effects. We also performed micro-C (2 replicates) on human iPS-derived cell cultures of human oligodendrocyte precursor cells (hOPC).	Illumina NovaSeq 6000	60
EGAD50000000411	A mutation accumulation experiment in colorectal cancer (CRC) derived tumoroids. A sequential single-cell cloning approach was adopted to measure the mutation rate in eight tumoroids obtained from five patients. WGS was also performed on their matched normal tissue and on standard tumoroids cultures without any cloning step.	unspecified	188
EGAD50000000413	The dataset consists of three samples: two controls and one case of juvenile Parkinsonism. Whole Exome Sequencing (WXS) was performed on these samples using hybrid selection for library preparation. Sequencing was carried out on the Illumina NextSeq 500 platform. For each sample, two FASTQ files containing paired-end reads (R1 and R2) were generated. The data deposited consists of the corresponding FASTQ files.	NextSeq 500	3
EGAD50000000414	RNA sequencing (fastq files) of white blood cells (WBCs) from healthy donors (n=376) and cancer patients (n=421) with different diagnoses, stages of disease and previously administered treatments, was performed. Samples from cancer patients were collected from the BostonGene clinical program; all patients provided written consent per IRB-approved protocols. Blood samples from healthy donors were purchased from multiple collection centers throughout the United States. Whole blood samples (3 ml) in K2-EDTA tubes received within 24 hours of collection at RT underwent red blood cell (RBC) lysis to isolate WBCs. Isolated WBCs for RNA sequencing were centrifuged at 300 x g for 5 minutes with a maximum of 10^6 cells per vial. The supernatant was removed, and the cells were resuspended in cold Homogenization Buffer (2% 1-Thioglycerol, Promega). Samples were then frozen at -80°C until extraction. RNA extraction was performed from frozen samples with Maxwell RSC simplyRNA Cells Kit (Promega) using the benchtop automated Maxwell RSC Instrument (Promega). Libraries were prepared with Illumina TruSeq® Stranded mRNA Library Prep (Poly-A mRNA; stranded). Libraries were sequenced on NovaSeq 6000 as Paired-End Reads (2x150) with targeted coverage of 50 mln reads.	Illumina NovaSeq 6000	797
EGAD50000000415	The ChRCC study WES dataset contains raw whole exome sequencing data of 17 tumor and 7 adjacent normal samples from 7 UTSW patients, who have consented to depositing their genomic data to public repository. WES was performed using 75bp paired-end fragments at an average read depth > 100x on a HiSeq2500 platform (Illumina, San Diego, CA, USA). The raw data is in fastq format.	Illumina HiSeq 2500	24
EGAD50000000416	The ChRCC study RNA-Seq dataset contains raw whole transcriptome sequencing data of 16 tumor and 6 adjacent normal samples from 7 UTSW patients, who have consented to depositing their genomic data to public repository. RNA-Seq was performed using 50bp single-end on a HiSeq2500 platform (Illumina, San Diego, CA, USA). 50M reads per sample on average. The raw data is in fastq format.	Illumina HiSeq 2500	22
EGAD50000000417	This dataset contains 125 BAM files of cord blood hematopoietic stem and progenitor cell DNA, sequenced with Illumina Novaseq 6000, and 125 files with the variant calling performed to the sequencing data.	Illumina NovaSeq X	125
EGAD50000000418	In order to perform comprehensive SNV and CNA analyses of a cohort of BCP-LBL patients, we performed whole exome sequencing of 41 tissue samples from BCP-LBL patients. Because the material was available as FFPE, the target coverage of the samples was >200x.	Illumina NovaSeq 6000	41
EGAD50000000419	In order to perform comprehensive transcriptomics and gene fusion analyses of a cohort of BCP-LBL patients, we performed RNA sequencing of 49 tissue samples from BCP-LBL patients. Because the material was available as FFPE, and had a relative low quality, we used a capture-based approach, where NGS libraries obtained from total RNA were captured in 4-plex, using a whole exome capture panel.	Illumina NovaSeq 6000	49
EGAD50000000420	Cancer samples for neo-open reading frame peptides that comprise the tumor framome are a rich source of neoantigens for cancer immunotherapy.	HiSeq X Ten PromethION	61
EGAD50000000421	Clinicopathologic features of the ten patients with ovarian immature teratomas studied by whole-exome sequencing	Illumina HiSeq 4000	70
EGAD50000000422	This is a meta-analysis of myeloma datasets, both with and without the UK Biobank cohort included.		1
EGAD50000000424	This dataset contains fastq files for paired blood-tumor scRNA-seq samples from 5 NSCLC patients and paired blood-tumor scATAC-seq samples from 2 NSCLC patients, both sequenced with NovaSeq or Illumina HiSeq - Rapid Run.	Illumina HiSeq X Illumina NovaSeq X	14
EGAD50000000425	To characterize clear cell Renal Cell Carcinoma (ccRCC), we set up a comprehensive multi-omics pilot study on a VHL-ccRCC-patient. Whole Genome Sequencing (WGS) was applied on six ccRCCs, four cysts and one blood DNA sample as germinal control. Raw-data were analyzed with GENomics-DRAGEN platform. We will upload WGS-Data for 11 samples (6 ccRCCs, 4 Cysts, 1 blood).	Illumina NovaSeq 6000	11
EGAD50000000426	All files for TIX individuals	Illumina HiSeq 4000	67
EGAD50000000427	The dataset includes bam files from WGS of 26 monoclonal patient derived organoid (PDO) lines isolated from 6 independent tumor subclones of a dMMR colorectal tumor. These organoids were grown as part of an in vitro timecourse experiment for the duration of 9 weeks; bam files represent the mutational load at the start of the timecourse (t0) and the end of the timecourse (t1).	Illumina NovaSeq 6000	53
EGAD50000000428	RNAseq aligned to hg38.p14 using nfcore/RNA-seq	Illumina NovaSeq X	4
EGAD50000000429	Research study of genetic susceptibility and analyses of polygenic risk scores in allergic diseases. This dataset consists of variant data derived from whole-genome sequencing in reference samples from one of the parental populations (i.e., North Africans). The dataset consists of 15 VCF files (and corresponding index files) with a total of 1.58 million variants called using GATK v4 and following the Broad Variant Calling Best Practices, from a set of unrelated individuals sequenced in paired-read mode 2x150 bp using an Illumina HiSeq 4000 sequencer.	Illumina HiSeq 4000	15
EGAD50000000430	Bulk RNA seq dataset from prefrontal cortex for a total of N=44 samples. Tissue samples correspond to patients with different alpha synucleinopathies, specifically idiopathic Parkinson's disease (PD, N=20), monogenic PD caused by LRRK2 mutations (LRRK2-PD, N=7) and multiple system atrophy (MSA, N=6), and to neurologically healthy controls (N=11). RNA seq was carried out using ribosomal depletion. All of these samples were also sequenced using single nucleus RNA seq and are available within the same study as a separate dataset.	Illumina HiSeq 2500	90
EGAD50000000431	Single nucleus RNA seq dataset from prefrontal cortex of a total of N=46 individuals with alpha synucleinopathies and healthy controls. Tissue samples correspond to patients with Parkinson's disease (PD, N=20), monogenic PD caused by LRRK2 mutations (LRRK2-PD, N=7) and multiple system atrophy (MSA, N=6), and to neurologically healthy controls (N=13). RNA seq was carried out employing 10X Genomics Chromium. These samples were also sequenced using conventional bulk tissue RNA seq and are available within the same study as a separate dataset.	Illumina NovaSeq 6000	90
EGAD50000000432	Single nucleus RNA seq dataset for CI stratification of Parkinson's disease. Single nucleus RNA seq dataset of prefrontal cortex tissue from a total of N=18 samples. Tissue samples correspond to patients with idiopathic Parkinson's disease (PD) with varying levels of Complex-I activity (N=12) and neurologically healthy controls (N=6). Single nucleus RNA seq was carried out using 10X Genomics Chromium. This dataset corresponds to a subset of the samples sequenced using conventional bulk tissue RNA seq and available within this Study as a separate Dataset.	Illumina NovaSeq 6000	116
EGAD50000000433	Bulk RNA seq dataset from prefrontal cortex for a total of N=98 samples. Tissue samples correspond to patients with idiopathic Parkinson's disease (PD) with varying levels of Complex-I activity (N=79) and neurologically healthy controls (N=19). RNA seq was carried out using ribosomal depletion. A subset of these samples was additionally sequenced using single nucleus RNA seq and is available in the same Study as an additional Dataset.	Illumina HiSeq 2500	115
EGAD50000000434	Raw sequencing files from scRNA-seq dataset used in Schmassmann et al. 2023. Single-cell characterization of human GBM reveals regional differences in tumor-infiltrating leukocyte activation. Elife 12 (https://elifesciences.org/articles/92678) Dataset content: 14 samples from 5 donors File type: paired-end fastq files Technology: Illumina sequencing Experimentation used: scRNA-seq using the 10X technology	Illumina HiSeq 4000	14
EGAD50000000435	Long-read transcriptomic data of control and FECD exp positive CECs	Sequel	9
EGAD50000000436	Short-read transcriptomic data of unaffected control, expansion negative, and expansion positive FECD CECs	Illumina HiSeq 4000	10
EGAD50000000437	Dataset containing WGS data from 4 patients with 17 samples with multiple myeloma. The sequencing was always paired and performed using Illumina NovaSeq 6000	Illumina NovaSeq 6000	17
EGAD50000000438	Dataset containing 36 samples of multiple myeloma. The sequencing was always paired and performed on Illumina NextSeq 550 and Illumina NovaSeq 6000.	Illumina NovaSeq 6000 NextSeq 550	36
EGAD50000000439	18 patients were biopsied at resistance to an FGFR inhibitor with tumor WES and/or WTS performed to identify resistance mechanisms to selective FGFR2 inhibitors across FGFR2-driven malignancies.	Illumina HiSeq 4000 Illumina NovaSeq 6000 NextSeq 500	63
EGAD50000000441	The dataset contains whole genome sequencing data of 17 high-grade serous carcinoma (HGSC) patients sequenced with NovaSeq 6000. The 46 samples are either fresh frozen tumour samples or blood samples. The files provided are paired fastq files.	Illumina NovaSeq 6000	46
EGAD50000000442	The dataset contains whole genome sequencing data of 32 high-grade serous carcinoma (HGSC) patients sequenced with NovaSeq 6000. The 82 samples are either fresh frozen tumour samples or blood samples. The files provided are paired fastq files.	Illumina NovaSeq 6000	82
EGAD50000000443	This dataset comprises RNAseq of paired primary colorectal cancers and liver metastases extracted from 113 patients in Paris and Besançon University Hospitals. 216 FASTQ files from single-end 3' PolyA (QuantSeq 3′mRNA-Seq Kit FWD for Illumina (Lexogen)) RNAseq samples were generated on the NovaSeq 6000 (Illumina).	Illumina NovaSeq 6000	216
EGAD50000000444	This dataset includes all protein biomarker measurements from the DETECT-A study. All protein biomarkers were evaluated with Luminex bead-based immunoassays obtained from Millipore.		9911
EGAD50000000445	To gain insight into immune cells infiltrating the CSF of RRMS patients (n=5), we performed 10X scRNAseq of paired CSF and circulating blood cells (about 20000 cells each) sampled at diagnosis.	Illumina NovaSeq 6000	10
EGAD50000000446	The current dataset represents bulk RNA-Seq gene expression profiling (with an exome library preparation kit) of muscle-invasive bladder cancer tissue samples obtained before and after platinum-based chemotherapy. 89 samples are pre-treatment transurethral resection of the bladder tumor (TUR-BT) tissue at diagnosis (baseline), 86 are post-treatment cystectomy tissue (resected tumor bulk), comprising 76 pairs of samples from the same patients. FASTQ files contain gene expression data.	Illumina HiSeq 2500 Illumina NovaSeq 6000	175
EGAD50000000447	Sanger sequencing data of cell lines derived from bulk MV4-11 cells, targeting the R248 locus of TP53, consisting of either wildtype or homozygous mutant lines.	AB 3730xL Genetic Analyzer	4
EGAD50000000448	Targeted-capture sequencing data from 168 ENKTCL patients	unspecified	168
EGAD50000000449	snRNA-seq and spatial transcriptomic data and analysis of healthy (CTRL) and inflamed (immune-mediated necrotizing myopathy (IMNM) and inclusion body myositis (IBM)) human quadriceps muscle. This data set includes 19 snRNA-seq (7 CTRL, 4 IMNM, 8 IBM) and 8 ST samples (3 CTRL, 2 IMNM, 3 IBM).	Illumina NovaSeq 6000	27
EGAD50000000450	To study monocyte and macrophage activation in ANCA-associated vasculitis (AAV), we performed bulk RNA sequencing of bead-selected monocytes and in vitro cultured monocyte-derived macrophages from AAV patients and healthy controls. Overview patients included for sequencing monocytes: - AAV active disease, n=4, MPO-AAV=4 - AAV remission, n=10, PR3-AAV=5, MPO-AAV=5 - Healthy controls, n=6 Overview patients included for sequencing monocyte-derived macrophages: - AAV active, n=1, PR3-AAV=1 - AAV remission, n=3, PR3-AAV=3 - Healthy controls, n=3	Illumina NovaSeq 6000	48
EGAD50000000451	This dataset contains whole exome sequencing (WES) of 29 HIV- EBV- primary central nervous system lymphoma (PCNSL) tumors and 5 activated B-cell-like PCNSL (ABC-PCNSL) and activated B-cell-like diffuse large B-cell lymphoma (ABC-DLBCL) cell lines (TK, HKBML, OCI-Ly3, HBL-1, TMD-8). Cases with secondary involvement of the CNS were excluded. DNA was extracted with AllPrep DNA/RNA FFPE kit and libraries were prepared using the Agilent SureSelect Human All Exon v6 + UTR kit. Paired end reads were aligned to GRCh37 using bwa mem v.0.7.17, and reads were sorted and duplicates were marked in the final BAM files. Access to this data is controlled. There are a number of steps that a researcher must take to obtain access to this data, including execution of a Data Access Agreement between the institutions. The process is overseen by the Technology Development Office; please contact our general email address TDOadmin@phsa.ca. Please only click the "request data" button on the EGA website after a Data Access Agreement is fully executed.	Illumina HiSeq 2500	34
EGAD50000000452	Data used in "An ultra-sensitive and specific ctDNA assay provides novel pre-operative disease stratification in early stage lung adenocarcinoma"	Illumina NovaSeq 6000	1004
EGAD50000000453	Pulmonary pleomorphic carcinoma (PPC) is an aggressive and highly heterogeneous non-small-cell lung carcinoma whose underlying biology is still poorly understood. Forty-two tumor areas including 39 primary tumors and 3 metastases from 20 PPC patients were microdissected and the histologically distinct components were subjected to whole exome sequencing (WES) separately. Twist Human Core Exome + RefSeq + Mito-Panel kit (Twist Bioscience) was used for the whole exome capturing according to manufacturer’s guidelines. Paired-end 100-bp reads were generated on the Illumina NovaSeq 6000. After the sequencing, reads were aligned against the reference human genome GRCh38 using Burrows-Wheeler Aligner (BWA, v0.7.12)	Illumina NovaSeq 6000	62
EGAD50000000454	ctDNA biomarker and relevant clinical data for divarasib phase I GO42144 study, including tumor type, KRAS G12C mutation detectability in plasma at baseline, baseline SLD, sites of metastasis, lines of therapy, best response, confirmed best response, PFS, ctDNA tumor fraction and KRAS G12C VAF at baseline/C1D15/C3D1.		308
EGAD50000000457	We performed a dietary intervention study comparing western diet, traditional diet and the supplementation of fermented banana beverage in a cohort of 66 individuals (22 in each study arm).	Illumina NovaSeq 6000	219
EGAD50000000458	Universal targeted haplotyping by droplet digital PCR sequencing (amplicon sequencing)	NextSeq 500	22
EGAD50000000459	Universal targeted haplotyping by droplet digital PCR sequencing (Target Capture sequencing)	NextSeq 500	25
EGAD50000000460	Our study sought to resolve, with single-molecule fidelity, the mismatches and damage events that precede DNA mutations. Using a novel single-molecule, long-read sequencing method (HiDEF-seq) we detect base substitutions when present in either one or both DNA strands. We also detect cytosine deamination, a common type of DNA damage, with single-molecule fidelity. This study profiled 134 samples from diverse tissues, including from individuals with cancer predisposition syndromes. These samples revealed single-strand mismatch and damage signatures. Since double-strand DNA mutations are only the endpoint of the mutation process, our approach enables new studies of how mutations arise in a variety of contexts, especially in cancer and aging.	Illumina NovaSeq 6000 Illumina NovaSeq X	21
EGAD50000000461	H3K27ac ChIP-seq datasets in human insulinoma samples	unspecified	12
EGAD50000000462	RNA-seq datasets in human insulinoma samples	unspecified	11
EGAD50000000463	H3K27me3 Cut&Tag datasets in human pancreatic islets and the EndoC-bH1 cell line	unspecified	2
EGAD50000000464	Whole-Genome Sequencing datasets of insulinoma samples and paired blood controls	unspecified	26
EGAD50000000465	This dataset contains Ribo-seq BAM files of Epstein-Barr virus-transformed B-lymphoblastoid cell lines from six individuals (HG00114, HG00282, NA12005, NA12044, NA12717 and NA12751) of the GEUVADIS Project. For each EBV-LCL, 20 million cells were treated with 2 μg/ml harringtonine and 100 μg/ml cycloheximide. After lysing cells in lysis buffer supplemented with 100 μg/ml cycloheximide, ribosome complexes were purified by density purification. Small RNA molecules were isolated using the NucleoSpin miRNA Kit (Bioke, Leiden, Netherlands), followed by RNA PAGE gel separation. Universal linkers were added after dephosphorylation. After reverse transcription, the cDNA was circularized and, after ribosomal RNA depletion, barcoded and sequenced on Illumina NextSeq 500.	NextSeq 500	6
EGAD50000000466	Transcriptome data of affected and controls.	Illumina NovaSeq 6000 Illumina NovaSeq X	6
EGAD50000000467	Fastq files and BAM files	Illumina HiSeq 2500 Illumina NovaSeq 6000	101
EGAD50000000468	For mature miRNAs and hairpins, BAM and FASTQ files	Illumina HiSeq 2500	101
EGAD50000000469	Raw bam file for bulk RNA-seq of checkpoint-blockade treated lung cancer cohorts	unspecified	355
EGAD50000000470	Whole Exome Sequencing of 924 Bipolar cases and matched controls performed at the Broad Institute on a cohort from Umea, Sweden. The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing cram files	HiSeq X Ten	924
EGAD50000000471	The impact of MSC(UC) on peripheral B cells from Systemic Lupus Erythematosus (SLE) patients was studied by 10X scRNAseq. This scRNAseq study encompassed 3 SLE patients at 3 time points: before or after (1 month, and 3 months) MSC injection in order to analyze B cell subsets and their DEG. The aim of this study was to observe the potential changes of B cell subsets after MSC(UC) injection in SLE patients.	Illumina NovaSeq X	3
EGAD50000000473	This project contains the 16 WGS tumor samples and the corresponding 15 WGS normal tissue samples (germline control) not yet deposited in the public domain (e.g., EGA, dbGaP) at the time of submission of the manuscript (Zhu et al.)	unspecified	31
EGAD50000000474	Whole genome sequencing (WGS) data of 12 tumours was generated on an Illumina NovaSeq or HiSeqX instrument. For frozen samples, libraries were constructed using a PCR-free library construction method. For tumours preserved by formalin fixation and paraffin embedding (FFPE), libraries were constructed with a method that included S1 nuclease treatment. Reads were aligned to the GRCh37 reference with bwa-mem 0.7.17.	Illumina NovaSeq X	12
EGAD50000000475	Osteosarcoma is a primary bone tumor that exhibits a complex genome characterized by gross chromosomal abnormalities. Osteosarcoma patients often develop metastatic disease, resulting in limited therapeutic options and poor survival rates. To gain knowledge on the mechanisms underlying osteosarcoma heterogeneity and metastatic process, it is important to obtain a detailed profile of the genomic alterations that accompany osteosarcoma progression. Therefore, in this study we performed WGS on multiple tissue samples from six patients with osteosarcoma, including the treatment naïve biopsy of the primary tumor, resection of the primary tumor after neoadjuvant chemotherapy, local recurrence and distant metastases.	Illumina NovaSeq 6000	30
EGAD50000000476	Data corresponding to the fastq of the 654 individuals included in the study.	NextSeq 500	654
EGAD50000000477	The cohort included 98 mild and 75 severe cases with a median age of 53 years. We amplified and sequenced the T Cell Receptor (TCR β) chain complementary determining region 3 (CDR3b) and performed bioinformatic analyses to assess repertoire diversity, clonality, allelic usage, and epitope affinity CDR3b clustering. CDR3b sequences were amplified by multiplex PCR and sequenced by Illumina. The resulting raw files were processed for clonotype assembly, filtering for coding clonotypes, removal of non-TRB alleles, and downsampling.	NextSeq 500	173
EGAD50000000478	This dataset contains paired-end fastq files for single-cell RNA-seq of 16 NPM1 mutated AML patients. 19 fastq pairs are included, with two samples split into two lanes and a second (technical) replicate for one patient. Libraries were constructed using 10x Genomics Chromium Next GEM Single cell 3’ v3.1 and sequencing was performed using Illumina NovaSeq6000.	Illumina NovaSeq 6000	17
EGAD50000000479	Mixture of 5 unrelated individuals sequenced by 10x as a scATAC-seq with low cell count. The dataset was then processed by Cell Ranger and deconvoluted to yield each individual's genetic profile using the De-goulash pipeline. The separation file is submitted as the processed file. The BAM files are submitted as unprocessed files.	Illumina NovaSeq 6000	1
EGAD50000000480	Mixture of 5 unrelated individuals sequenced by 10x as a scATAC-seq. The dataset was then processed by Cell Ranger and deconvoluted to yield each individual's genetic profile using the De-goulash pipeline. The separation file is submitted as the processed file. The BAM files are submitted as unprocessed files.	Illumina NovaSeq 6000	1
EGAD50000000481	Reference whole exome sequence serving as a reference of individuals. Includes the raw GSa files and the called variants in VCF format, merged for all individuals (S1-S5).	unspecified	1
EGAD50000000482	The dataset contains RNAseq profiles of 1040 patients from the ROBUST clinical trial (NCT02285062). The Allprep DNA/RNA FFPE kit was used to simultaneously purify genomic DNA and total RNA from formalin-fixed, paraffin embedded (FFPE) tissue sections. RNAseq libraries (75PE, 50M) were constructed using Illumina TruSeq RNA Access method. Fastq files are included. Also includes processed WGS mutation calls output.	Illumina HiSeq 2500	1095
EGAD50000000483	Metadata associated with 16SV4 ribosomal RNA gene sequencing and Long read microbiome whole metagenome sequencing of pediatric GI samples.		20
EGAD50000000484	Sample sheet for aligning samples with patients.		20
EGAD50000000485	16SV4 ribosomal RNA gene sequencing data of GI samples and associated mixed community cultures collected from pediatric patients at risk for IBD.	Illumina MiSeq	11
EGAD50000000486	Long read microbiome whole metagenome sequencing of pediatric GI samples.	unspecified	9
EGAD50000000487	Baseline and on-treatment tumor samples from human papillomavirus-positive oropharynx cancer patients receiving anti-CTLA-4 and anti-PD-1 immune checkpoint blockade were analyzed by single-cell RNA and TCR sequencing.	Illumina NovaSeq 6000	65
EGAD50000000488	ScRNA-seq FASTQ files from myocarditis and control cardiac muscle samples; and myositis and control skeletal muscle samples.	Illumina NovaSeq 6000	25
EGAD50000000489	Targeted capture sequencing of 364 samples representing a mix of Burkitt lymphoma, diffuse large B-cell lymphoma, follicular lymphoma, and high-grade B-cell lymphomas with MYC and BCL2 or BCL6 rearrangements. Capture space covers 3.5 Mb of the human genome including regions around MYC, BCL2, BCL6, PAX5, and IGH/K/L.	Illumina HiSeq X	364
EGAD50000000490	This dataset contains 190 fastq files sequenced with Illumina HiSeq500.	NextSeq 500	190
EGAD50000000491	Samples, in the form of PAXgene fixed and paraffin-embedded biopsies, were collected from the multi-site, double-blind, randomized, placebo-controlled trial, aimed at dose-finding and assessing the efficacy and tolerability of a 6-week treatment with ZED1227 capsules vs. placebo in subjects with well-controlled celiac disease undergoing gluten challenge. Total RNA was extracted from the PaxFPE biopsy specimens (n = 116) using additional cuttings from the samples on which histomorphometry was previously assessed. For the extraction, an RNeasy Kit (Qiagen, Hilden, Germany) was used according to the manufacturer’s instructions. Library preparation and next-generation sequencing (NGS) were performed by the Qiagen NGS Service. A total of 10 ng of purified RNA was converted into cDNA NGS libraries. Library preparation was quality controlled using capillary electrophoresis. Based on the quality of the inserts and the concentration measurements, the libraries were pooled in equimolar ratios and then sequenced on a NextSeq (Illumina Inc., San Diego, USA) sequencing instrument according to the manufacturer’s instructions, with 100 bp read length for read 1 and 27 bp for read 2. The raw data were de-multiplexed, and FASTQ files for each sample were generated using bcl2fastq2 software (Illumina Inc., San Diego, USA).	NextSeq 550	116
EGAD50000000492	Human duodenal tissues for establishing organoid cultures used in this study were sourced from de-identified surgical specimens (n = 3) of the duodenum obtained from patients who had undergone biopsy procedures unrelated to CeD at Tampere University Hospital. The protocol was approved by the Ethics Committee of Tampere University Hospital, Tampere, Finland (ETL code R18082). RNA from the duodenal organoids was isolated using an RNeasy Kit (Qiagen, Hilden, Germany) following the manufacturer’s instructions. RNA purity and concentration were measured using a NanoDrop One spectrophotometer (NanoDrop Technologies, Wilmington, Delaware, USA). Preparation of the RNA library and transcriptome sequencing was conducted by Novogene Co., LTD (Cambridge, UK). Messenger RNA was purified from total RNA using polyA selection and subjected to library construction. Sequencing was performed on an Illumina platform, and 150 bp paired-end reads were generated.	Illumina NovaSeq 6000	11
EGAD50000000493	Single cell RNA libraries were prepared using 10x Genomics Chromium Next GEM Single Cell 3’ v2 reagents. The samples were barcoded and each library was pooled with two samples at equimolar concentrations. The pooled libraries (n=4) were sequenced on the NextSeq 500 machine (Illumina) with paired-end sequencing and dual indexing as recommended in the manufacturer’s protocol; 26 and 98 cycles for Read 1 and Read 2, respectively, and 8 cycles for the i7 index.	NextSeq 500	4
EGAD50000000494	This dataset includes 19 bamfiles of tumor, relapsed tumor and normal samples derived from 7 patients.	Illumina NovaSeq 6000	19
EGAD50000000495	Immune memory is key to effective antimicrobial responses, but the impact of mRNA vaccines on this process is not fully understood. Our research shows that SARS-CoV-2 mRNA vaccines alter the epigenetic profile of human macrophages, specifically enhancing histone acetylation, which is linked to immune training. Significant epigenetic changes, along with increased cytokine release, require two vaccine doses. However, these effects diminish over time but can be restored with a booster dose six months later, maintaining a strong pro-inflammatory response.	Illumina NovaSeq 6000	40
EGAD50000000496	Paired-end ribodepletion RNAseq performed on 257 samples of Burkitt lymphoma, diffuse large B cell lymphoma, follicular lymphoma, and high-grade B cell lymphoma with MYC and BCL2 or BCL6 rearrangements. RNA was derived from either fresh frozen or formalin fixed/paraffin embedded (FFPE) tissues. Sequencing was performed on Illumina HiSeqX or NovaSeqX instruments.	Illumina NovaSeq X	257
EGAD50000000497	CITE-Seq (cellular indexing of transcriptomes and epitopes) of 51 lymph node (LN) samples, including mantle cell lymphoma (MCL, n = 8), follicular lymphoma (FL, n = 12), germinal center (GCB, n = 5) or activated B-cell (non-GCB/ABC, n = 7) diffuse large B-cell lymphoma (DLBCL), and marginal zone lymphomas (MZL, n = 11), in addition to non-malignant reactive lymph nodes (rLN, n = 8). Of the malignant LN samples, 20 were collected at the time of initial diagnosis and 23 were from patients who had previously undergone one or more lines of systemic treatment. Relapse samples were collected at least 3 months after cessation of systemic treatment. This dataset also includes 5 prime RNA sequencing and immune receptor sequencing (10X Genomics) for 11 of these samples (2 rLN, 2 MCL, 3 FL, 2 DLBCL (GCB) and 2 MZL).	NextSeq 2000	51
EGAD50000000498	This dataset contains scRNAseq data of human telencephalic organoids, at day 120, from 4 different sequencing runs (libraries), as follows (please see cell line clone nomenclature in original publication): 1. Library 177136: Pat.2 ARID1B+/+ clone 2c (2 organoids, E11rep_1, E11rep_2), Pat.2 ARID1B+/- clone 2a (2 organoids: B002orig_1, B002orig_2) 2. Library 178119: Pat.2 ARID1B+/+ clone 2c (3 organoids: E11rep_1, E11rep_2, E11rep_3), Pat.2 ARID1B+/- clone 2a (3 organoids: B002orig_1, B002orig_2, B002orig_3) 3. Library 178120: Pat.2 ARID1B+/+ clone 2d (3 organoids: A3rep_1, A3rep_2, A3rep_3), Pat.2 ARID1B+/- clone 2b (3 organoids: B002_F7_1, B002_F7_2, B002_F7_3) 4. Library 184337: Pat.1 ARID1B+/- clone 1a (3 organoids: B001orig_1, B001orig_2, B001orig_3), HD.1 ARID1B+/+ clone 3a (3 organoids: 176_1, 176_2, 176_3), HD.1 ARID1B+/- clone 3b (3 organoids: F10_1, F10_2, F10_3)	Illumina NovaSeq X	25
EGAD50000000499	This dataset contains scMultiomics (scRNAseq + scATACseq) data of human telencephalic organoids, at day 120, from 3 different sequencing runs (libraries), as follows (please see cell line clone nomenclature in original publication): 1. Library 233269(RNA)/235293(ATAC): Pat.1 ARID1B+/- clone 1a (9 pooled organoids). 2. Library 233270(RNA)/235294(ATAC): HD.1 ARID1B+/+ clone 3a (9 pooled organoids). 3. Library 233271(RNA)/235295(ATAC): HD.1 ARID1B+/- clone 3b (8 pooled organoids).	Illumina NovaSeq 6000	6
EGAD50000000500	This whole-section GeoMx Digital Spatial Profiler (DSP) dataset was a part of the discovery cohort in the study. The dataset contains raw FASTQ files from Regions of Interest (ROIs) across complete tumor tissue sections, enabling detection of 18,000 protein-coding genes. GeoMx DSP data from 1063 ROIs (after QC) from 15 GCs were included in the datasets.	Illumina NovaSeq X	1233
EGAD50000000501	Single-end 75 basepair small-RNA sequencing of 90 SDHB-deficient non-metastatic and metastatic pheochromocytoma and paraganglioma. Samples were prepared using the NEXTFLEX® Small RNA-Seq Kit v3 (Bioo Scientific). Samples were sequenced at the molecular genomics core (Peter MacCallum Cancer Centre) using 50bp single end sequencing on the Illumina NextSeq 500 (Illumina, USA). This dataset contains raw sequencing reads in FASTQ format.	NextSeq 500	90
EGAD50000000502	Paired-end 150 basepair whole genome sequencing of 94 SDHB-deficient non-metastatic and metastatic pheochromocytoma and paraganglioma. Libraries were prepared using the Illumina® TruSeq™ DNA Nano library preparation method according to the manufacturer’s instructions at The University of Melbourne Centre for Cancer Research (UMCCR) using 200ng input DNA and a 550 base pair insert size. Samples were sequenced in separate batches on the Illumina® Nova-Seq 6000 according to manufacturer’s instructions (Illumina, USA). Included in this dataset are FASTQ format files containing raw read data, CRAM format files containing reads aligned to GRCh38, and VCF format files containing germline variants calls.	Illumina NovaSeq 6000	173
EGAD50000000503	10x genomics single-nuclei ATAC sequencing of 7 SDHB-deficient non-metastatic and metastatic pheochromocytoma and paraganglioma, and 1 normal adrenal medulla sample. snATAC-seq was conducted using the "Van Helsing" protocol (dx.doi.org/10.17504/protocols.io.bw52pg8e). Once processed, snATAC-seq libraries were sequenced on the Illumina NextSeq 500 (Illumina, USA) using 50bp paired-end sequencing. Raw sequencing data in BCL format was demultiplexed using cellranger-atac mkfastq (V2.0.0). This dataset contains sequencing reads in FASTQ format. R1/R3 files contain the forward and reverse sequencing reads, respectively, R2 contains the 10x Barcode.	Illumina NovaSeq 6000	8
EGAD50000000504	Paired-end 150 basepair whole transcriptome sequencing of 91 SDHB-deficient non-metastatic and metastatic pheochromocytoma and paraganglioma. Samples were prepared using the NEB-Next directional RNA-Seq kit (NEB, USA) and underwent 150bp paired-end sequencing on the Illumina NovaSeq 6000 (Illumina, USA) according to manufacturer’s instructions. This dataset contains raw sequencing reads in FASTQ format.	Illumina NovaSeq 6000	91
EGAD50000000505	10x genomics single-nuclei RNA sequencing of 9 SDHB-deficient non-metastatic and metastatic pheochromocytoma and paraganglioma. snRNA-seq was performed using the ‘Frankenstein’ protocol (dx.doi.org/10.17504/protocols.io.bqxymxpw). Briefly, nuclei were extracted from frozen tissues and subjected to fluorescence-activated nuclei sorting (FANS) using 4′,6-diamidino-2-phenylindole (DAPI). Sorting was performed using a BD FACSaria 2 instrument, sorting between 3000 and 10,000 nuclei per sample, capturing both diploid and tetraploid populations. FAN-sorted nuclei were immediately processed using either the 10x Chromium Single Cell 5’ (PN-1000006, 4 samples) or 3’ (PN-1000075, 4 samples) Library & Gel Bead Kit (10x Genomics, USA). Once processed, snRNA-seq libraries were sequenced on the Illumina Nova-Seq 6000 (Illumina, USA) using 150bp paired-end sequencing. This dataset contains raw sequencing reads in FASTQ format.	Illumina NovaSeq 6000	9
EGAD50000000506	Raw NGS data of primary Acute Myeloid Leukemia samples. The libraries were obtained with Sophia Genetics Myeloid Solution kit, targeting 30 genes involved in AML.	Illumina MiSeq	33
EGAD50000000507	The dataset contains whole genome sequencing data of 42 high-grade serous carcinoma (HGSC) patients sequenced with NovaSeq 6000. The 100 samples are either fresh frozen tumour samples or blood samples. The files provided are paired fastq files.	Illumina NovaSeq 6000	100
EGAD50000000508	Shallow WGS long-read sequencing of primary neuroblastoma samples.	MinION	13
EGAD50000000509	Shallow long-read nanopore WGS of ecDNA containing neuroblastoma cell lines.	MinION	5
EGAD50000000510	scATAC-seq by 10xGenomics Droplet Sequencing from two human donors. scATAC sequences are available for FACS-sorted CD45+ cells from blood, skin and VAT (visceral adipose tissue).	NextSeq 500	12
EGAD50000000512	To investigate the influence of lifelong exercise training on the response of skeletal muscle to a bout of acute exercise we generated targeted epigenomic data from long-term endurance (8 men) and strength (8 men) trained individuals and healthy age-matched untrained controls (8 men). Skeletal muscle biopsies were taken from M. vastus lateralis before, directly after, and 3hrs following acute exercise. Control subjects completed one bout of acute endurance exercise and one bout of acute resistance exercise, separated by 4-8 weeks; athletes completed one bout in their respective form of sports. All 96 samples were used for DNA extraction and targeted library construction using a custom Twist Biosciences panel and following EM-methylation transformation were sequenced (2x150bp paired end) on the Illumina NovaSeq 6000.	Illumina NovaSeq 6000	96
EGAD50000000513	Genotype array data from 152 South African Coloured individuals. Typed on Illumina H3Africa array.		152
EGAD50000000514	For two donors, we collected CD34 plus and CD34 minus hematopoietic cells for the single-cell RNA-seq analysis.	NextSeq 500	4
EGAD50000000515	Genome-wide studies have uncovered multiple independent signals at the RREB1 locus associated with altered type 2 diabetes risk and related glycaemic traits. However, little is known about the function of the zinc finger transcription factor Ras-responsive element binding protein 1 (RREB1) in glucose homeostasis or how changes in its expression and/or function influence diabetes risk.	NextSeq 500	62
EGAD50000000516	The coding variant (p.Arg192His) in the transcription factor PAX4 is associated with an altered risk for type 2 diabetes (T2D) in East Asian populations. In mice, Pax4 is essential for beta cell formation but its role on human beta cell development and/or function is unknown. Participants carrying the PAX4 p.His192 allele exhibited decreased pancreatic beta cell function compared to homozygotes for the p.192Arg allele in a cross-sectional study in which we carried out an intravenous glucose tolerance test and an oral glucose tolerance test. In a pedigree of a patient with young onset diabetes, several members carry a newly identified p.Tyr186X allele. In the human beta cell model, EndoC-βH1, PAX4 knockdown led to impaired insulin secretion, reduced total insulin content, and altered hormone gene expression. Deletion of PAX4 in human induced pluripotent stem cell (hiPSC)-derived islet-like cells resulted in derepression of alpha cell gene expression. In vitro differentiation of hiPSCs carrying PAX4 p.His192 and p.X186 risk alleles exhibited increased polyhormonal endocrine cell formation and reduced insulin content that can be reversed with gene correction. Together, we demonstrate the role of PAX4 in human endocrine cell development, beta cell function, and its contribution to T2D-risk.	Illumina NovaSeq 6000	64
EGAD50000000517	Resolving causal genes for type 2 diabetes at loci implicated by genome-wide association studies (GWAS) requires integrating functional genomic data from relevant cell types. Chromatin features in endocrine cells of the pancreatic islet are particularly informative and recent studies leveraging chromosome conformation capture (3C) with Hi-C based methods have elucidated regulatory mechanisms in human islets. However, these genome-wide approaches are less sensitive and afford lower resolution than methods that target specific loci. Methods: To gauge the extent to which targeted 3C further resolves chromatin-mediated regulatory mechanisms at GWAS loci, we generated interaction profiles at 23 loci using next-generation (NG) capture-C in a human beta cell model (EndoC-βH1) and contrasted these maps with Hi-C maps in EndoC-βH1 cells and human islets and a promoter capture Hi-C map in human islets. Results: We found improvements in assay sensitivity of up to 33-fold and resolved ~3.6X more chromatin interactions. At a subset of 18 loci with 25 co-localised GWAS and eQTL signals, NG Capture-C interactions implicated effector transcripts at five additional genetic signals relative to promoter capture Hi-C through physical contact with gene promoters. Conclusions: High resolution chromatin interaction profiles at selectively targeted loci can complement genome- and promoter-wide maps.	NextSeq 500	15
EGAD50000000518	Identification of the genes and processes mediating genetic association signals for complex diseases represents a major challenge. As many of the genetic signals for type 2 diabetes (T2D) exert their effects through pancreatic islet-cell dysfunction, we performed a genome-wide pooled CRISPR loss-of-function screen in a human pancreatic beta cell line. We assessed the regulation of insulin content as a disease-relevant readout of beta cell function and identified 580 genes influencing this phenotype. Integration with genetic and genomic data provided experimental support for 20 candidate T2D effector transcripts including the autophagy receptor CALCOCO2. Loss of CALCOCO2 was associated with distorted mitochondria, less proinsulin-containing immature granules and accumulation of autophagosomes upon inhibition of late-stage autophagy. Carriers of T2D-associated variants at the CALCOCO2 locus further displayed altered insulin secretion. Our study highlights how cellular screens can augment existing multi-omic efforts to support mechanistic understanding and provide evidence for causal effects at genome-wide association studies loci.	NextSeq 500	6
EGAD50000000519	Population level variation and molecular mechanisms behind insulin secretion in response to carbohydrate, protein, and fat remain uncharacterized despite ramifications for personalized nutrition. We now define prototypical insulin secretion dynamics in response to the three macronutrients in islets from 140 cadaveric donors, including those diagnosed with type 2 diabetes. We leverage the insulin response heterogeneity and use transcriptomics and proteomics to identify molecular pathways of specific nutrient responsiveness. Surprisingly, we find robust insulin secretion to fatty acid stimulus in ~8% of donors, challenging the idea that fat has negligible effects on insulin release. Distinct islet proteomes with differences in metabolic signalling networks convey this hyper-responsiveness to fat relative to carbohydrate. By comparing human islets to human embryonic stem cell-derived islet clusters, we show that, unlike glucose-responsiveness, fat hyper-responsiveness is equivalent and may be a hallmark of functionally immature cells. Our study represents the first comparison of dynamic responses to nutrients and multi-omics analysis in human insulin secreting cells. Responses of different people’s islets to carbohydrate, protein, and fat lay the groundwork for personalized nutrition.	Illumina NovaSeq 6000	96
EGAD50000000520	This dataset contains the FASTQ files, the corresponding H&E images with the fiducial frames and the JSON files that were used for the spatial transcriptomic analysis in our paper (n=19). The JSON file names include the slide name (V11M111-111) and the capture area (A1) for each sample.	Illumina NovaSeq 6000	20
EGAD50000000521	Dataset including FASTQ files for all snRNA-seq samples from subcortical Multiple Sclerosis (MS) lesions and controls (n=15), together with the curated and annotated snRNA-seq atlas.	Illumina NovaSeq 6000	16
EGAD50000000522	Curated subsetting of cell subtypes derived from the 9 principal cell types within our subcortical Multiple Sclerosis (MS) lesions snRNA-seq atlas.	Illumina NovaSeq 6000	16
EGAD50000000523	The dataset contains samples of 6 organotypic co-cultures, assembled with patient-derived material from ovarian cancer (OC) patients. Tumor cells, both as bulk and as cancer stem cell-enriched (OCSC) populations, are cultured with or without in vitro peritoneal TME (for details see Battistini C et al, Tumor microenvironment-induced FOXM1 regulates ovarian cancer stemness, CDDis 2024). The dataset consists of paired-end FASTQ files from bulk RNA-Seq.	Illumina NovaSeq 6000	24
EGAD50000000524	BAM and VCF files from WES of an affected proband with a novel immunodeficiency phenotype caused by homozygous mutation of SLC19A1.	Ion Torrent Proton	1
EGAD50000000525	This dataset contains count data (from 10X CellRanger) and metadata from 7 acute myeloid leukemia subjects treated with chemotherapy, as well as data from 3 healthy donors. Additionally, the dataset includes VDJ data for all subjects except for 31_base and the 3 healthy donors (TCRseq not performed).	Illumina NovaSeq X	15
EGAD50000000526	The dataset contains whole genome sequencing data of 8 high-grade serous carcinoma (HGSC) patients sequenced with NovaSeq 6000. The 16 samples are either fresh frozen tumour samples or blood samples. The files provided are paired fastq files.	Illumina NovaSeq 6000	16
EGAD50000000527	Raw 16SV4-sequence data from bronchial brushing DNA obtained from healthy volunteers prior to and four weeks after ICS treatment.	Illumina MiSeq	1
EGAD50000000528	Associated metadata for sequencing data.		63
EGAD50000000529	Barcodes associated with 16SV4 sequencing.		63
EGAD50000000530	Raw data of ctDNA profiling using the PredicineWES+ or PredicineBEACON assay in patients enrolled in divarasib phase I GO42144 study.	Illumina NovaSeq 6000	303
EGAD50000000531	The dataset contains HMO measurements for 1542 samples from Lifelines NEXT cohort	Illumina MiSeq	1502
EGAD50000000532	This dataset contains 16S sequencing of fecal and milk samples for HMO-microbiome Lifelines NEXT study	Illumina MiSeq	1501
EGAD50000000533	10x Genomics immune profiling including sequenced libraries of: 5'-end mRNA transcriptomics; beta and alpha chain VDJ transcripts; cell surface expression by 204 DNA-barcoded antibodies.	Illumina NovaSeq 6000	143
EGAD50000000535	This dataset comes from shallow whole genome sequencing data of STIC project	Illumina HiSeq 4000	38
EGAD50000000536	This is the dataset for Ampliseq sequencing	Illumina HiSeq 4000	11
EGAD50000000537	Bulk RNAseq analysis of 84 PDAC samples: normalized read counts	Illumina NovaSeq 6000	1
EGAD50000000538	Mononuclear cells were isolated as cell suspensions by dissociation in a BD Horizon™ Dri Tumor & Tissue Dissociation Reagent (BD Biosciences) using a gentleMACS dissociator (Miltenyi Biotec). A totalseq-C antibody cocktail (Biolegend) was used, and the cell suspensions were diluted to a concentration of 1,000 cells/μL. Library sequencing was performed using the NovaSeq6000 System (Illumina). The FASTQ files were aligned to the 10x provided reference genome using 10x Genomics Cell Ranger software v7.1.0 to create unique molecular identifier count tables of gene expression of the samples.	Illumina NovaSeq 6000	1
EGAD50000000540	Smart-seq2 single cell RNA sequencing of human BCC, SCC, melanoma (ALM) and healthy control skin samples.	Illumina HiSeq 2000	15
EGAD50000000541	This dataset was generated using single-molecule real-time sequencing for both training and testing purposes. The DNA samples isolated from human buffy coats or placenta of healthy individuals (n=4) were prepared in a way that four types of base modifications were incorporated, including unmodified controls, 5mC, 5hmC, and 6mA. The sequence data for each sample consists of subreads produced by single-molecule real-time sequencing and stored in BAM format. Details of the specific experiments conducted in this study are outlined in our manuscript.	Sequel II	7
EGAD50000000542	RNA-sequencing, ATAC-sequencing, and CUT&Tag-sequencing data from CMML patient and age-matched healthy donor CD34+ cells. RNA-sequencing data from patient CD34+ cells after treatment for 4 days with decitabine or UNC0638 inhibitors, alone or together, or left untreated.	Illumina NovaSeq 6000	71
EGAD50000000543	Longitudinal analysis of single-cell RNAseq datasets of PBMCs from COVID-19 CVID patients and controls.	Illumina NovaSeq 6000	111
EGAD50000000544	Fastq files + patients' characteristics	Illumina NovaSeq 6000	9
EGAD50000000545	This dataset includes DNA methylation profiles from 112 young individuals with Type 1 Diabetes (T1D) at T1D diagnosis, who were longitudinally monitored for hyperglycemia over an average duration of 3 years. These datasets were generated using whole-genome bisulfite sequencing. It includes 1872 fastq files (i.e. 936 paired-end fastq file pairs) generated through 150 bp paired-end sequencing on Illumina HiSeqX.	Illumina HiSeq X	224
EGAD50000000546	Whole exome sequencing was performed from 108 Diffuse Large B Cell Lymphoma cases. A library containing whole exome regions was used to isolate the DNA for sequencing (SureSelect XT Human All Exon V6 (Agilent technologies)). Sequencing on a NovaSeq 6000 instrument (Illumina, paired end, 2x100, mean 566Gb per FlowCell) was performed.	Illumina NovaSeq 6000	146
EGAD50000000547	This research project was a collaboration between Cardiff University, UK and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 2,458 Bipolar case/control samples from collaborators in UK. Genomic DNA from each sample was sequenced to a mean depth of 20x. The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing CRAM files.	HiSeq X Ten	2512
EGAD50000000548	DNA Sequencing of 864 genes from KiCS cancer panel. Multiple family members affected with multifocal GIST who underwent whole genome sequencing of the germline and tumor. Affected individuals with GIST harbored a germline variant found within exon 13 of the KIT gene (c.1965T>G; p.Asn655Lys, p.N655K) and a variant in the MSR1 gene (c.877C>T; p.Arg293*, p.R293X).	Illumina HiSeq 2500	7
EGAD50000000549	Whole genome sequencing of tumour and normal samples from familial GIST. Multiple family members were affected with multifocal GIST who underwent whole genome sequencing of the germline and tumor. Affected individuals with GIST harbored a germline variant found within exon 13 of the KIT gene (c.1965T>G; p.Asn655Lys, p.N655K) and a variant in the MSR1 gene (c.877C>T; p.Arg293*, p.R293X).	HiSeq X Ten	7
EGAD50000000551	SDR-seq_BCL data access. Two follicular lymphoma (FL) primary patient samples and one germinal center subtype diffuse large B-cell lymphoma (GCB) primary patient sample. FL1 and GCB1 are sequenced in Run1. FL2 is sequenced in Run2.	NextSeq 2000	3
EGAD50000000553	cfRRBS data produced from the cfDNA in plasma isolated from healthy, adult donors.	Illumina NovaSeq 6000	44
EGAD50000000554	In this study, we explore the potential of classifying pediatric brain tumors based on methylation profiling of the cell-free DNA in cerebrospinal fluid (CSF). For this proof-of-concept study, we collected 20 cerebrospinal fluid samples of pediatric brain cancer patients via a ventricular drain placed for reasons of increased intracranial pressure. For 11 patients in this study we collected matched tumor DNA. This cohort contains fastQ files of cfRRBS data of these samples.	Illumina NovaSeq 6000	43
EGAD50000000555	BaTwa genotype data using the H3Africa array.		80
EGAD50000000556	Whole exome sequencing of a collection of 6 Ewing sarcoma and 3 CIC-DUX4 sarcoma tumoroids plus 4 matched patient tumors and normal tissue. Tumoroids were established from digested patient tumor material as 3D culture in Matrigel. After 2-3 months of growth, genomic DNA was isolated from individual tumoroid models using the DNeasy Blood & Tissue Kit (Qiagen) and used for whole exome sequencing. Genomic DNA from corresponding patient tumors and matched normal tissue was tested in parallel, if available.	Illumina NovaSeq 6000	18
EGAD50000000557	RNA sequencing of a collection of 6 Ewing sarcoma and 3 CIC-DUX4 sarcoma tumoroids plus 3 matched patient tumors. Tumoroids were established from digested patient tumor material as 3D culture in Matrigel. After 2-3 months of growth, RNA was isolated from individual tumoroid models using the RNeasy Kit (Qiagen) and used for RNA sequencing. RNA from corresponding patient tumors was tested in parallel, if available.	Illumina NovaSeq 6000	11
EGAD50000000558	Raw fastq files obtained by RNA sequencing 138 IDH-mutant astrocytomas included in the CATNON trial. RNA was extracted from formalin-fixed paraffin-embedded (FFPE) tissue blocks using the RNeasy FFPE kit. RNA sequencing was performed on an Illumina NovaSeq 6000 (GenomeScan BV, Leiden, The Netherlands) with 150bp paired-end reads including UMI tags.	Illumina NovaSeq 6000	138
EGAD50000000559	This dataset contains a gene-cell matrix derived from single-cell RNA sequencing (scRNA-seq) data of ileal tissue from Crohn's disease (CD) patients and colorectal cancer (CRC) patients. It includes: Crohn's Disease Patients: A trio of transmural lesions (stenotic, inflamed, and non-inflamed) from each patient. Colorectal Cancer Patients: Unaffected ileal tissue used as external non-inflamed control. Cell Level Metadata: The dataset includes relevant cell-level metadata such as cell type annotations used in the study. Experimental Details: Platform: 10x Genomics Chromium Single Cell 3' GEX Sequencing: Illumina NovaSeq Processing: Data processed with Cell Ranger software. Resulting count matrices were merged for downstream analysis, including integration and dimensionality reduction. Dataset Composition: Crohn's Disease Patients: 10 patients with 3 samples each (non-inflamed, inflamed, stenotic), totaling 30 samples. Colorectal Cancer Patients: 5 patients with 1 sample each of unaffected tissue, totaling 5 samples. Data Provided: Merged Raw Count Matrix: The final merged raw count matrix used for downstream analysis. Cell Metadata File: Contains details of sample, tissue, and patient for each cell in the count matrix. Barcodes File: Indicates each cell barcode, which also encodes the sample, tissue, and patient details for each cell. CD.S_Infl: Stenotic Crohn's disease inflamed samples CD.S_Sten: Stenotic CD patient stenosis sample CD.S_Prox: Stenotic CD Patient - proximal non-inflamed sample CC.C_Prox: CRC Patient proximal unaffected sample e.g., a barcode 'CC.C_1_Prox_AAGTCGTAGACCCTTA' indicates the unaffected proximal sample from CRC Patient no. 1, and the nucleic acid sequence indicates a unique cell from this sample.
Total Samples: Crohn's Disease (CD) Patients: 30 samples Colorectal Cancer (CRC) Patients: 5 samples Patient_no Sample Sample_type 1 CC.C_1 CC.C_1_Prox CC.C_Prox 2 CD.S_1 CD.S_1_Prox CD.S_Prox 3 CD.S_1 CD.S_1_Infl CD.S_Infl 4 CD.S_1 CD.S_1_Sten CD.S_Sten 5 CC.C_2 CC.C_2_Prox CC.C_Prox 6 CD.S_2 CD.S_2_Prox CD.S_Prox 7 CD.S_2 CD.S_2_Infl CD.S_Infl 8 CD.S_2 CD.S_2_Sten CD.S_Sten 9 CC.C_3 CC.C_3_Prox CC.C_Prox 10 CC.C_4 CC.C_4_Prox CC.C_Prox 11 CD.S_3 CD.S_3_Prox CD.S_Prox 12 CD.S_3 CD.S_3_Infl CD.S_Infl 13 CD.S_3 CD.S_3_Sten CD.S_Sten 14 CD.S_4 CD.S_4_Prox CD.S_Prox 15 CD.S_4 CD.S_4_Infl CD.S_Infl 16 CD.S_4 CD.S_4_Sten CD.S_Sten 17 CC.C_5 CC.C_5_Prox CC.C_Prox 18 CD.S_5 CD.S_5_Prox CD.S_Prox 19 CD.S_5 CD.S_5_Infl CD.S_Infl 20 CD.S_5 CD.S_5_Sten CD.S_Sten 21 CD.S_6 CD.S_6_Prox CD.S_Prox 22 CD.S_6 CD.S_6_Infl CD.S_Infl 23 CD.S_6 CD.S_6_Sten CD.S_Sten 24 CD.S_7 CD.S_7_Prox CD.S_Prox 25 CD.S_7 CD.S_7_Infl CD.S_Infl 26 CD.S_7 CD.S_7_Sten CD.S_Sten 27 CD.S_8 CD.S_8_Prox CD.S_Prox 28 CD.S_8 CD.S_8_Infl CD.S_Infl 29 CD.S_8 CD.S_8_Sten CD.S_Sten 30 CD.S_9 CD.S_9_Prox CD.S_Prox 31 CD.S_9 CD.S_9_Infl CD.S_Infl 32 CD.S_9 CD.S_9_Sten CD.S_Sten 33 CD.S_10 CD.S_10_Prox CD.S_Prox 34 CD.S_10 CD.S_10_Infl CD.S_Infl 35 CD.S_10 CD.S_10_Sten CD.S_Sten	Illumina NovaSeq 6000	35
EGAD50000000560	Low pass WGS and targeted glioma panel sequencing of low and high-grade gliomas in children, adolescent, and young adult patients. File type is paired-read fastq files (2 per sample). Sequencing was performed on Illumina NextSeq instruments. Genes covered on panel: TP53, H3F3A, HIST1H3B, HIST1H3C, IDH1, IDH2, KRAS, PIK3CA, TERT promoter, PTPN11, FGFR1/2/3, MYB, BRAF, MYBL1, EGFR, PDGFRA, MYCN, MYC, CDKN2A.	NextSeq 500	240
EGAD50000000561	Metagenomic characterization of tracheal aspirates from non-pulmonary sepsis patients. This dataset consists of non-human read data from shotgun sequencing in sepsis case samples from lung aspirates. The dataset consists of 32 paired FASTQ files sequenced in paired-read mode 2x150 bp using an Illumina HiSeq 4000 sequencer.	Illumina HiSeq 4000	32
EGAD50000000562	This dataset contains the FASTQ and BAM files for the TOTHER3 study. Targeted DNA-Seq experiments were performed on 49 samples. Since these data were obtained with a panel of genes protected by Intellectual Property (IP), these files only contain the information needed to validate the published results and, in consequence, sensitive data had to be specifically removed.	Illumina NovaSeq 6000	49
EGAD50000000563	This research project was a collaboration between University Hospital Frankfurt and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 823 Bipolar case/control samples from collaborators in Germany. Genomic DNA from each sample was sequenced to a mean depth of 20x. The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing cram files.	HiSeq X Ten	823
EGAD50000000564	This dataset contains 10 tumor and normal pairs synthetic WGS data of colorectal cancer that were simulated in a standard format of Illumina paired-end reads. The NEAT read simulator (version 3.0, https://github.com/zstephens/neat-genreads) was utilized to synthesize these 10 pairs of tumor and normal WGS data. In the procedure of data generation, simulated parameters (i.e., sequencing error statistics, read fragment length distribution and GC% coverage bias) were learned from data models provided by NEAT. The average sequencing depth for tumor and normal samples aimed to reach around 110X and 60X, respectively. For generation of synthetic normal WGS data per each sample, a germline variant profile from a real patient was down-sampled randomly, representing 50% germline variants of a given patient. These were mixed with the other 50% in silico germline variants that were modelled randomly using an average mutation rate (0.001), finally constituting a full germline profile for normal synthetic WGS data. For generation of synthetic tumor WGS data per each sample, a pre-defined somatic short variant profile (SNVs+Indels) learnt from a real CRC patient was added to the germline variant profile used for creating the normal synthetic WGS data of the same patient, consisting of the variants for tumor sample. Neither copy number profile nor structural variation profile was introduced into the tumor synthetic WGS data. Tumor content and ploidy were assumed to be 100% and 2, respectively. For mapping/variant detection, the Sarek pipeline v3.1.2 (https://nf-co.re/sarek/3.1.2) was used, specifically: 1. BWA v0.7.17-r1188 for read mapping 2. GATK v4.3.0.0 for pre-processing BAM file (including markduplicates and recalibration). 3. Mutect2 (GATK v4.3.0.0) for somatic variant calling 4. 
Strelka2 v2.9.10 for germline and somatic variant calling Metadata information of 10 CRC patients used for the generation of synthetic normal and tumor WGS data: Patient_id Tumor_barcode Normal_barcode Age Sex Tissue Cancer SIM007 SIM007_T SIM007_N 71 F Rectal Primary CRC SIM008 SIM008_T SIM008_N 45 F Colon Neuroendocrine Metastasis CRC SIM010 SIM010_T SIM010_N 62 M Colon Metastasis CRC SIM011 SIM011_T SIM011_N 55 M Colon Neuroendocrine Metastasis CRC SIM012 SIM012_T SIM012_N 57 M Rectal Metastasis CRC SIM013 SIM013_T SIM013_N 69 M Colon Metastasis CRC SIM014 SIM014_T SIM014_N 68 M Colon Neuroendocrine primary CRC SIM015 SIM015_T SIM015_N 58 F Colon Primary CRC SIM016 SIM016_T SIM016_N 49 M Colon/Rectal Primary CRC SIM017 SIM017_T SIM017_N 78 M Colon Neuroendocrine primary CRC	unspecified	20
EGAD50000000565	Dataset contains bulk RNA sequencing reads from 44 NERD patients under dupilumab therapy before and after aspirin provocation at baseline and after 24 weeks of treatment. Dataset is a multiplexed .bam file containing all sequencing reads with tagged identifiers.	NextSeq 550	1
EGAD50000000566	scRNA-seq dataset for the study "Surgery in combination with immune checkpoint therapy as an effective treatment for patients with metastatic cancer." The dataset comprises 38 samples from 19 patients, with each pair including a baseline (BL) sample and a post-biopsy or post-surgery sample. Each sample corresponds to an R1 (read 1) and an R2 (read 2) file (.fastq.gz) for paired reads. Besides procedure (biopsy or surgery), another important phenotype is response (PR and PD in this case, no SD for these patients), included in the description of the samples. The sequencing was performed using the Illumina NovaSeq 6000 platform, as seen in the experiment description.	Illumina NovaSeq 6000	38
EGAD50000000567	Musculoskeletal diseases affect up to 20% of adults worldwide. The gut microbiome has been implicated in inflammatory conditions, but large-scale metagenomic evaluations have not yet traced the routes by which immunity in the gut affects inflammatory arthritis. To characterize the community structure and associated functional processes driving gut microbial involvement in arthritis, the Inflammatory Arthritis Microbiome Consortium investigated 440 stool shotgun metagenomes comprising 221 adults diagnosed with rheumatoid arthritis, ankylosing spondylitis, or psoriatic arthritis and 219 healthy controls and individuals with joint pain without an underlying inflammatory cause. Diagnosis explained about 2% of gut taxonomic variability, which is comparable in magnitude to inflammatory bowel disease. We identified several candidate microbes with differential carriage patterns in patients with elevated blood markers for inflammation. Our results confirm and extend previous findings of increased carriage of typically oral and inflammatory taxa and decreased abundance and prevalence of typical gut clades, indicating that distal inflammatory conditions, as well as local conditions, correspond to alterations to the gut microbial composition. We identified several differentially encoded pathways in the gut microbiome of patients with inflammatory arthritis, including changes in vitamin B salvage and biosynthesis and enrichment of iron sequestration. Although several of these changes characteristic of inflammation could have causal roles, we hypothesize that they are mainly positive feedback responses to changes in host physiology and immune homeostasis. By connecting taxonomic alterations to functional alterations, this work expands our understanding of the shifts in the gut ecosystem that occur in response to systemic inflammation during arthritis.	Illumina HiSeq 4000	440
EGAD50000000568	Bulk RNAseq data for IMbrave151 (N = 96), including raw counts, transcript per million (TPM), and voom transformed values.		96
EGAD50000000569	Patient demographic characteristics, treatment, and survival data for IMbrave151 (N = 162), with one patient per line. Mapping file to map anonymized subject IDs to RNAseq IDs		96
EGAD50000000573	A total of 20 individuals were analyzed using Oxford Nanopore Technologies long-read sequencing. Each individual was sequenced on one PromethION flow-cell, providing approximately 25-30X coverage of the genome. The data is deposited as nanopore BAM files with methylation information included.	PromethION	20
EGAD50000000574	This dataset contains the raw sequencing data (fastq-files) of 10 sample libraries. These are 10x genomics single-cell transcriptome libraries of human peripheral blood mononuclear cells (PBMCs). The samples are taken from healthy, acute decompensated (AD) and acute-on-chronic liver failure (ACLF) patients. The sequencing was performed on an Illumina NovaSeq6000. These data are related to the paper: "Distinct immunometabolic signatures in circulating immune cells define disease outcome in acute-on-chronic liver failure"	Illumina NovaSeq 6000	10
EGAD50000000575	mRNA-Sequencing of 73 primary multiple myeloma (MM) samples and human MM cell lines before and after siRNA-mediated knockdown of ADAM8 (n=5 cell lines), ADAM9 (n=7 cell lines) and ADAM15 (n=5 cell lines). Paired-end sequencing was performed on a NovaSeq (Illumina). Fastq and bam files are provided for each sample.	Illumina NovaSeq 6000	107
EGAD50000000578	Here, we provide mapped cram files from 35 multiple myeloma patients. Sequencing was performed employing Illumina's short read technology. Resulting sequencing data was mapped against GRCh38 using nfcore/sarek v3.2.3.	Illumina NovaSeq 6000	35
EGAD50000000580	This dataset included human plasma EBV DNA target-capture sequencing data. The capture probes were designed to cover the entire EBV genome and selected human autosomal regions. After enrichment, the products were sequenced on the NextSeq500 System (Illumina) and aligned to the EBV genome (AJ507799.2) and the human genome (hg19) in BAM format.	NextSeq 500	360
EGAD50000000581	Shallow whole genome sequencing (sWGS) of 227 initial and recurrent tumor samples from 105 patients in the GLASS-NL cohort.	Illumina NovaSeq 6000	227
EGAD50000000582	Whole exome sequencing (WES) data from 45 normal samples of 45 patients in the GLASS-NL trial.	Illumina NovaSeq 6000	45
EGAD50000000583	Whole exome sequencing (WES) data from 219 initial and recurrent tumor samples from 103 patients in the GLASS-NL trial.	Illumina NovaSeq 6000	219
EGAD50000000584	scRNA-seq of Patient-derived tumor fragments (PDTFs) which were treated with either Isotype (DP47), Isotype-IL2v (DP47-IL2v), anti-PD1 (PembroPGLALA) or PD1-IL2v (PembroPGLALA-IL2v) for 48 hours. Dataset content: 10 samples from 3 donors File type: paired-end fastq files Protocol: Chromium Single Cell 3' Reagent Kits User Guide (Dual Index) with Feature Barcoding technology for Cell Surface Protein (CG000317) Technology: Illumina sequencing Experimentation: scRNA-seq using the 10X technology	Illumina NovaSeq 6000	10
EGAD50000000585	The dataset comprises RNAseq data of human testicular tissue of fertile men with full spermatogenesis and infertile men with biallelic variants in genes of the piRNA pathway.	Illumina NovaSeq 6000	7
EGAD50000000591	We sequenced 34 prDLBCL samples using whole exome sequencing (WES) to evaluate possible mutational signatures and driver mutations associated with the patients' clinical and cytogenetic characteristics.	Illumina NovaSeq 6000	34
EGAD50000000592	We sequenced 30 prDLBCL samples using RNA-seq to evaluate possible fusion events and to obtain expression profiles.	Illumina NovaSeq 6000	30
EGAD50000000593	Patients with peritoneal metastasis from colorectal cancer (PM-CRC) have inferior prognosis and respond particularly poorly to chemotherapy. This study aims to identify the molecular explanation for the observed clinical behavior and suggest novel treatment strategies in PM-CRC. Tumor samples (230) from a Norwegian national cohort undergoing surgery and hyperthermic intraperitoneal chemotherapy (HIPEC) with mitomycin C (MMC) for PM-CRC were subjected to targeted DNA sequencing, and associations with clinical data were analyzed. mRNA sequencing was conducted on a subset of 30 samples to compare gene expression in tumors harboring BRAF or KRAS mutations and wild-type tumors.	Ion GeneStudio S5 NextSeq 500	231
EGAD50000000594	Our cohort represents infants with ALL in whom KMT2Ar was not detected by FISH or by standard cytogenetics. Whole-genome sequencing (WGS) was performed using Illumina’s HiSeqX to a depth of >30X.	HiSeq X Ten	23
EGAD50000000595	Our cohort represents infants with ALL in whom KMT2Ar was not detected by FISH or by standard cytogenetics. Whole-transcriptome sequencing (WTS) libraries were prepared using the NEBNext Ultra II RNA Directional library kit for samples with at least 100ng RNA (n=19), the Clontech double stranded cDNA conversion kit plus the Nextera XT library protocol for samples with less than 100ng RNA (n=4), or ribodepletion using NEBNext rRNA Depletion Kit v2 (Human/Mouse/Rat) for Total RNA (n=1). Libraries were paired-end sequenced using the Illumina HiSeq 2500.	Illumina HiSeq 2500	28
EGAD50000000596	Massively Parallel Reporter Assays (MPRA) of colorectal cell lines HCEC-1CT (normal colon) and HT29 and SW403 (MSS cancer). Probes identified using the CRC GWAS.	Illumina NovaSeq 6000	3
EGAD50000000597	This dataset includes all data produced in the study describing "Lam-ESC&Lam-Recombination", a method for sequencing of signal joints and coding joints generated from human light chain antigen receptor loci during V(D)J recombination in B-cell precursor acute lymphoblastic leukaemia (BCP-ALL) patients. This dataset includes: - Amplicon Lam-ESC for a total of 86 BCP-ALL patients - fastq files - Amplicon Lam-Recombination for a total of 43 BCP-ALL patients - fastq files	unspecified	129
EGAD50000000598	RNA sequencing of air-liquid interface (ALI) cultures of olfactory mucosa (OM) cells derived from control (n=3) and AD (n=3) individuals exposed to SARS-CoV-2.	Illumina NovaSeq 6000	24
EGAD50000000599	These are the metagenomic datasets of 29 FMT triads with corresponding plasma metabolomics. Data for each triad consist of metagenomics for baseline, post-FMT and the corresponding donor. Clinical data concerning the glucose rate of disappearance and blood pressure are available as phenotypes.	Illumina NovaSeq 6000	69
EGAD50000000600	This dataset contains 208 paired fastq files sequenced with Illumina HiSeq 2500.	Illumina HiSeq 2500	208
EGAD50000000601	Data generated through single nuclei ATAC sequencing on whole ganglionic eminences from 3 human fetuses (two of 16 and one of 17 gestational weeks). Tissue was acquired from the MRC-Wellcome Trust Human Developmental Biology Resource with ethical approval. snATAC-Seq libraries were prepared from ~8,000 nuclei per sample using Chromium Next GEM Single Cell ATAC (v1.1) reagents (10X Genomics). Quality control of libraries was performed using the Agilent 5200 Fragment Analyzer before sequencing on an Illumina NovaSeq 6000 to a depth of at least 617 million read pairs per library. Raw sequencing data were converted into FASTQ files. For a full description of data generation, please see Cameron et al, Schizophrenia Bulletin 2024, https://doi.org/10.1093/schbul/sbae083. Please note that 10X generated BAM files, rather than FASTQ files, have been uploaded. FASTQ files can be regenerated using the 10X Genomics bamtofastq tool. https://support.10xgenomics.com/docs/bamtofastq	Illumina NovaSeq 6000	3
EGAD50000000602	Sequencing dataset of 290 DLBCL and HGBL cfDNA samples of paired end 150bp sequencing runs.	Illumina NovaSeq 6000	290
EGAD50000000603	Spatial transcriptomics data (ST) from 32 human prostate tissue samples originating from 8 prostate cancer patients (5 patients with post-surgery relapse). The ST data was acquired using the Visium Spatial Gene Expression kit which resulted in over 20 000 spatially defined spots for the 32 tissue samples. The raw transcriptomics data is RNA-seq. The individual samples have information on patient origin and sample type (cancer, cancer-adjacent field-effect normal or normal sample far from cancer). Each ST spot has metadata such as sample origin, histology class (stroma, normal epithelium, cancer of various grading etc), number of cells and estimated cell type fractions. Patient metadata include information of age at surgery, time (months) until reported relapse, total follow-up time, pre-surgery PSA, post-surgery T-stage and metastasis status.	NextSeq 500	32
EGAD50000000604	Bulk transcriptomics data from 176 prostate tissue samples (37 patients, 27 with post-surgery relapse) that were acquired using the SENSE mRNA-Seq Library Prep Kit V2 and Illumina NextSeq 500 instrument. The raw transcriptomics data is single-read RNA-seq. The individual samples have information on patient origin and sample type (cancer, cancer-adjacent field-effect normal or normal sample far from cancer). Patient metadata include information of age at surgery, time (months) until reported relapse, pre-surgery PSA and post-surgery T-stage.	NextSeq 500	176
EGAD50000000605	Bulk methylation array data from 64 prostate tissue samples from 16 patients (5 with post-surgery relapse). Methylation data were acquired using the microarray kit Illumina Infinium MethylationEPIC BeadChip (n=64). The individual samples have information on patient origin and sample type (cancer, cancer-adjacent field-effect normal or normal sample far from cancer). Patient metadata include information of age at surgery, time (months) until reported relapse, pre-surgery PSA and post-surgery T-stage.	Infinium MethylationEPIC BeadChip	64
EGAD50000000606	Bulk methylation array data from 32 prostate tissue samples from 8 patients (3 with post-surgery relapse). Methylation data were acquired using the microarray assay Illumina Infinium MethylationEPIC v2.0 Kit. The individual samples have information on patient origin and sample type (cancer, cancer-adjacent field-effect normal or normal sample far from cancer). Patient metadata include information of age at surgery, time (months) until reported relapse, pre-surgery PSA and post-surgery T-stage.	Infinium MethylationEPIC v2.0 BeadChip	32
EGAD50000000607	Tumours with high and ultrahigh rates of somatic retrotransposition studied in the publication by Zumalave et al. titled Synchronous L1 retrotransposition events promote chromosomal crossover early in human tumorigenesis	Illumina NovaSeq 6000 MinION Sequel II	170
EGAD50000000608	This dataset contains all fecal shotgun metagenomics and metabolomics of the study which investigated the effects of the probiotic strain Anaerobutyricum soehngenii. Shotgun data is from three different time points, baseline and two follow up. Metabolomics sets are pre and post intervention, both fasted and post prandially. Data on randomization and metformin use can be found in the analysis.	NextSeq 500	74
EGAD50000000609	The dataset contains RNAseq profiles of 57 patients from the CheckMate-142 clinical trial. The Allprep DNA/RNA FFPE kit was used to simultaneously purify genomic DNA and RNA from formalin-fixed, paraffin embedded (FFPE) tissue sections. Baseline tumor tissue samples were processed using the Illumina TruSeq RNA Access in-solution hybrid capture panel and underwent subsequent NGS on the Illumina NovaSeq platform. Fastq files are included.	Illumina NovaSeq X	57
EGAD50000000610	The dataset contains WES profiles of 59 patients from the CheckMate-142 clinical trial. The Allprep DNA/RNA FFPE kit was used to simultaneously purify genomic DNA and RNA from formalin-fixed, paraffin embedded (FFPE) tissue sections. Baseline tumor tissue and matched whole-blood samples were processed using the Agilent SureSelect Human All Exon V5 in-solution hybrid capture panel and underwent subsequent next-generation sequencing (NGS) on the Illumina NovaSeq platform. Fastq files are included.	Illumina NovaSeq X	59
EGAD50000000611	MeD-seq data (fastq files) from gynecological cancers and associated healthy tissues. In total 292 fastq files generated by MeD-seq are deposited consisting of: healthy tissues (vulva n=11, cervix n=15, endometrium n=13, fallopian tube n=18 and ovary n=13), precursor lesions of cancer (vulva n=23 and cervix n=46) and cancer (vulva n=21, cervix n=45, endometrium n=26, fallopian tube n=8 and ovary n=33)	Illumina HiSeq 2500	292
EGAD50000000612	These datasets contain RNA-seq data of human skeletal muscle biopsies before or after exercise. This contains data of three projects on skeletal muscle. Paired Fastq files are uploaded. - HeteroFiber: Individual muscle fibers from human biopsies, collected at rest from m. vastus lateralis. (n = 1044) - ExerFiber: Individual muscle fibers from human biopsies (m. vastus lateralis), collected immediately after or 3 hours after HIIT exercise session. (n = 2096) - Bulk muscle exercise: bulk muscle biopsies at rest, immediately after or 3h after HIIT exercise session. (n = 126)	Illumina NovaSeq 6000	3426
EGAD50000000613	Tissue samples were collected from 18 ovarian cancer patients. After mechanical and enzymatic dissociation, a part of these tissues underwent growth factoromics analysis. For four cases, single-cell sequencing analysis was conducted under conditions with or without estradiol. Additionally, the remains of the tissues were frozen and subsequently subjected to bulk RNA-sequencing on 17 of these samples. Sixteen cases were diagnosed as HGSOC, while two samples were clear cell or mucinous type (Extended Data Table 1). All cases were grade 3, and none had received neoadjuvant chemotherapy. - Group C1, Epithelial Ovarian Cancer (EOC) patients with estrogen responsiveness: 6 samples - Group C2, Epithelial Ovarian Cancer (EOC) patients with estrogen non-responsiveness: 10 samples - Non-classified: 1 sample	Illumina HiSeq 4000 Illumina NovaSeq X	17
EGAD50000000614	VCF file with autosomal genotypes from 22 and 20 Eivissan and Menorcan samples. Affymetrix Human Origins array was used to genotype the samples and the variants were lifted over to the hg38 human genome reference.		42
EGAD50000000615	We report single-cell RNA sequencing data from myeloid cells of the human visceral adipose tissue in non-alcoholic fatty liver disease patients	Illumina NovaSeq X	16
EGAD50000000616	A mutation accumulation experiment in colorectal cancer (CRC) derived tumoroids. A sequential single-cell cloning approach was adopted to measure the mutation rate in eight tumoroids obtained from five patients. WGS was also performed on their matched normal tissue and on standard tumoroids cultures without any cloning step. This is a 150x depth sequencing for 7 samples.	unspecified	7
EGAD50000000617	Whole Genome Sequencing raw paired-ends reads for 20 matched colorectal patient-derived-organoids (normal and tumor). Tumor samples were obtained from patients treated at Niguarda Cancer Center, (Milano, Italy) and Candiolo Cancer Institute (Candiolo, Turin, Italy).	Illumina NovaSeq 6000	40
EGAD50000000618	In this study, formalin-fixed paraffin-embedded targeted locus capture (FFPE-TLC) sequencing is used as a novel technology for targeted detection of tumor-specific genomic structural variants (SVs) in the primary tumor of 29 colorectal cancer patients with metastatic disease. The tumor region was macrodissected and sequenced for 29 patients, and the "normal" region of the same slide was macrodissected and sequenced for 8 of these patients. SVs were found in the common fragile site (CFS)-associated genes MACROD2, PRKN, FHIT and WWOX as well as SVs caused by three LINE transposable elements. Tumor-specificity of selected SVs was independently verified by droplet digital PCR of tumor tissue DNA and their applicability as plasma circulating tumor DNA biomarkers was demonstrated. This dataset contains the hg19 reference alignment of the FFPE-TLC sequences for all 29 colorectal cancers and 8 adjacent normals.	Illumina NovaSeq 6000	37
EGAD50000000619	This research project was a collaboration between VU and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 943 Bipolar case/control samples from collaborators in the Netherlands. Genomic DNA from each sample was sequenced to a mean depth of 20x. The exome used Twist capture and the samples were sequenced on Illumina HiSeqX machines producing cram files	HiSeq X Ten	947
EGAD50000000620	30 and 21 whole exome sequences for Eivissan and Menorcan healthy unrelated volunteers, respectively. The exomes were captured with the Agilent SureSelect Human All Exon V6 capture kit and paired-end sequenced on Illumina platforms. For each individual, a pair of ".fastq" files can be found containing the raw reads.	unspecified	51
EGAD50000000621	This dataset contains the bam files of 60 paired samples of AML and remission samples.	Illumina NovaSeq 6000	120
EGAD50000000622	We performed whole exome sequencing of paired fresh frozen tumor from 4 ER+ advanced breast cancer patients. Samples were obtained for each patient prior to combined CDK4/6 inhibitor and endocrine therapy and at disease progression (N=8). DNA was extracted using the Maxwell® RSC FFPE and Tissue DNA Kit (Promega). WES was performed at the Department of Molecular Medicine, Aarhus University Hospital on matched tumor DNA (derived from primary fresh frozen and FFPE tissue) and buffy coat DNA. Libraries of tumors and matching germline DNA were prepared using 50 ng DNA and captured by Twist Comprehensive Exome with custom spike-ins, sequenced on the Illumina NovaSeq 6000 platform to an average coverage of 413x (range: 148-515x). The dataset includes the analysis data files (vcf files).		8
EGAD50000000623	Whole Exome Sequencing of 1260 Bipolar cases and matched controls performed at the Broad Institute on a cohort from Edinburgh, Scotland, UK. The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing cram files	HiSeq X Ten	1242
EGAD50000000624	This dataset contains 29 case/control samples of WES and WGS sequencing data. Sequencing was performed on Illumina HiSeq X using TruSeq Stranded DNA Kit. The sequencing was always paired.	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina HiSeq X	40
EGAD50000000625	This dataset contains 19 case/control samples of RNA sequencing data. Sequencing was performed on Illumina HiSeq 4000 and NovaSeq 6000 using TruSeq Stranded mRNA Kit. The sequencing was always paired.	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000	18
EGAD50000000626	Here, we applied single-cell RNA-sequencing (scRNA-seq) on isolated HRS cells and the immune cells from the same pediatric classical Hodgkin Lymphoma (cHL) tumors. Specifically, 13 cHL patients and 3 reactive lymph node control samples were included in this cohort. This allowed us to identify genes of cell surface proteins that are consistently overexpressed in HRS cells and can potentially be used as targets for antibody-drug conjugates or CAR T cells. Finally, we identify potential interactions by which HRS cells inhibit T cells, among which the Galectin-1/CD69 and HLA-DRA/LAG3 interactions. However, high levels of inter-patient heterogeneity of the interaction strength were observed. In conclusion, this study identifies new potential therapeutic targets for cHL and highlights the importance of studying heterogeneity when identifying therapy targets.	NextSeq 2000 NextSeq 500	15
EGAD50000000627	This research project was a collaboration between Cambridge University, UK and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 2,873 Bipolar case/control samples from collaborators in UK. Genomic DNA from each sample was sequenced to a mean depth of 20x. The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing cram files	HiSeq X Ten	2851
EGAD50000000629	The cohort consisted of individuals with low-count MBL, high-count MBL as well as patients with CLL. Whole genome sequencing was initially performed in a smaller cohort of selected, representative samples, while targeted re-sequencing of CLL putative driver genes was performed in all samples of the cohort. Besides CLL cell samples, the sequencing process was performed in paired control samples including both buccal and polymorphonuclear cell samples.	Illumina HiSeq 2000	52
EGAD50000000630	The dataset "DELFI low-coverage WGS of plasma cfDNA" includes paired FASTQ files produced by WGS of 689 plasma cfDNA samples from 153 patients with colorectal cancer. WGS (100bp PE) was performed on a NovaSeq 6000 with a target depth of coverage of 8x.	Illumina NovaSeq 6000	689
EGAD50000000631	A total of 99 bulk RNA-seq lymphoma samples.	Illumina HiSeq X	99
EGAD50000000632	A total of 203 targeted DNA sequencing lymphoma samples.	Illumina HiSeq 1000	203
EGAD50000000633	We performed cellular indexing of transcriptomes and epitopes (CITE-seq) of six primary leukemia samples from CK-AML patients. The dataset contains BAM files for each individual sample (D1922 and R0836) or for the hashed pool (HIAML47-HIAML85, P9D-P9R)	Illumina NovaSeq 6000	4
EGAD50000000634	We performed strand-specific single-cell sequencing of six primary leukemia samples from four CK-AML patients. We also performed strand-specific single-cell sequencing of three matching patient-derived xenografts (PDXs). The dataset contains BAM files for each cell.	NextSeq 500	491
EGAD50000000635	PacBio HiFi whole genome sequencing data from six individuals carrying cytogenetically visible inversions. The HiFi data was produced on the PacBio Revio machine, using 1 flowcell per sample. The resulting data was aligned to hg19 using minimap2, and converted into bam using samtools.	unspecified	6
EGAD50000000636	Bulk RNA sequencing of AML patients. The dataset contains 18 samples from primary AML patients. Half of the samples are control samples and half are FLT3-ITD knock-out samples (edited for the FLT3-ITD mutation). Sequencing used Illumina high-throughput technology.	Illumina NovaSeq 6000	18
EGAD50000000637	The prevalence of lung adenocarcinoma (LUAD) has increased sharply in East Asia. Early diagnosis leads to better survival rates, but this requires an improved understanding of the molecular changes during early tumorigenesis, particularly in non-smokers. We performed whole exome sequencing and RNA sequencing of samples from 94 East Asian patients with precancerous lesions (25 with atypical adenomatous hyperplasia [AAH]; 69 with adenocarcinoma in situ [AIS]) and 73 patients with early invasive lesions (minimally invasive adenocarcinoma [MIA]). This dataset contains the raw sequencing data (FASTQ format).	Illumina NovaSeq 6000	721
EGAD50000000638	Analysis of skewed X inactivation for X-linked disorders	PromethION	46
EGAD50000000639	Paired-end WGS data from 118 non-small cell lung cancer samples from 49 patients. WGS was performed on Illumina NovaSeq 6000.	Illumina NovaSeq 6000	118
EGAD50000000640	In our single-center study, we have launched a pilot program for pediatric patients with undiagnosed diseases in the second-largest university hospital in the Czech Republic. WES was implemented as a first-line test after inclusion in the study as part of the diagnostic workflow. This study was prospectively conducted at the Department of Pediatrics at University Hospital Brno between 2020 and 2023.	NextSeq 500	58
EGAD50000000641	An additional 320 swab samples were sequenced. The bam files contain consensus reads.	DNBSEQ-G400	320
EGAD50000000642	ChIP-seq has been performed on fresh-frozen tissue derived from healthy breast (HB), primary breast cancer (PB), and BCa-derived liver metastasis (LM). Immunoprecipitation has been performed for ERa (sc-542, SantaCruz). Raw paired-end fastq.gz files are provided for both immunoprecipitated DNA and input samples.	Illumina NovaSeq 6000	22
EGAD50000000643	Hi-C libraries have been prepared by enzymatic digestion with MboI restriction enzyme and sonication (Covaris). Illumina single-indexing primers have been used to amplify ligated fragments. Hi-C experiments have been performed on 10 slices (50um thick) of fresh-frozen tissues or 10x10e6 Pleural Effusion cells. Raw paired-end fastq.gz files are provided.	NextSeq 500	39
EGAD50000000644	Raw WES sequencing data obtained from precancerous samples (adenomas and MMR-deficient crypts) and paired healthy tissue from MSI CRC patients, of whom 10 were diagnosed with Lynch syndrome and 3 had sporadic cancer. These data were analyzed to evaluate microsatellite instability through tumorigenesis and its association with splicing deregulation in MSI tumor RNA.	Illumina NovaSeq 6000	36
EGAD50000000646	The detection of circulating tumor DNA, which allows non-invasive tumor molecular profiling and disease follow-up, promises optimal and individualized management of patients with cancer. However, detecting small fractions of tumor DNA released when the tumor burden is reduced remains a challenge. We developed a PCR-based targeted bisulfite method coupled to deep sequencing to detect methylation patterns of L1PA elements, which we named DIAMOND (for Detection of Long Interspersed Nuclear Element Altered Methylation ON plasma DNA). We used sodium bisulfite chemical conversion to achieve base-pair resolution analysis and designed a multiplexed PCR based on 8 amplicons covering L1PAs.	Illumina HiSeq 2500 Illumina MiSeq Illumina NovaSeq X	729
EGAD50000000647	Low coverage whole genome sequences (1-3x) were collected to study human genetic history across the Wallacea archipelago and the West Papuan region of Indonesia. We collected 254 newly sequenced genomes, mostly from previously undocumented populations, including 9 communities in Wallacea and 3 in the West Papuan region. Of the 254, 8 high coverage whole genome sequences were produced in the study (L4).	Illumina NovaSeq 6000	254
EGAD50000000648	Total RNA sequencing of olfactory mucosa (OM) cells derived from individuals diagnosed with Alzheimer's disease exposed to traffic-related ultrafine particles (UFPs) for 24-h and 72-h in submerged cultures. The UFPs used for exposures were: A0, A20 and Euro6. Exposures were compared to the corresponding blank samples.	unspecified	59
EGAD50000000649	Bulk RNA-Sequencing of 18 primary breast cancers from Wu et al. (2021) study.	Illumina HiSeq 2000	24
EGAD50000000650	Tumors from 173 GBM patients were analysed for somatic mutations to generate a personalized peptide vaccine targeting tumor-specific neoantigens. Exome libraries for 173 glioblastoma tumors and matched normal DNA were sequenced on Illumina platform, alongside total RNA from the tumors.	Illumina NovaSeq 6000	346
EGAD50000000651	This dataset comprises 406 raw fastq files derived from 203 plasma cfDNA samples from 90 patients diagnosed with hepatocellular carcinoma (HCC). The samples include 90 baseline HCC (b-HCC) samples, collected during liver transplant or resection surgeries, and 113 postoperative follow-up (f-HCC) samples. For each sample, 10ng of cfDNA was used for library preparation utilizing cfMeDIP-seq technology based on the 5mC antibody-immunoprecipitation strategy. Libraries were validated via Bioanalyzer trace analysis and sequenced on Illumina NovaSeq 6000 or HiSeq 2500 platform with paired-end 150-bp (NovaSeq) or 125-bp (HiSeq) model for ~100 million reads per sample.	Illumina HiSeq 2500 Illumina NovaSeq 6000	203
EGAD50000000652	This dataset comprises 46 raw fastq files derived from 23 plasma cfDNA samples from 23 healthy (CTL) cancer-free donors. For each sample, 10ng of cfDNA was used for library preparation utilizing the cfMeDIP-seq technology based on the 5mC antibody-immunoprecipitation strategy. Libraries were validated via Bioanalyzer trace analysis and sequenced on the Illumina NovaSeq 6000 platform with paired-end 150-bp model for ~100 million reads per sample.	Illumina NovaSeq 6000	23
EGAD50000000653	In total, 90 DNA samples were genotyped on the Illumina Infinium H3Africa Consortium array (designed for 2,271,503 SNPs; using BeadChip type: H3Africa_2019_20037295_B1). Genotyping was performed at the SNP&SEQ Technology Platform (NGI/SciLifeLab Genomics, Sweden). Saliva Fulani dataset includes individuals from the following populations: 30 individuals from Senegal_FulaniLinguere, 30 individuals from Mauritania_FulaniAssaba, and 30 individuals from Mali_FulaniInnerDelta.		90
EGAD50000000654	In total, 329 DNA samples were genotyped on the Illumina Infinium H3Africa Consortium array (designed for 2,271,503 SNPs; using BeadChip type: H3Africa_2019_20037295_B1). Genotyping was performed at the SNP&SEQ Technology Platform (NGI/SciLifeLab Genomics, Sweden). WGA Fulani dataset includes individuals from the following populations: 33 BurkinaFaso_FulaniBanfora, 32 BurkinaFaso_FulaniTindangou, 33 Mali_FulaniDiafarabe, 35 Cameroon_FulaniTcheboua, 34 Chad_FulaniBongor, 24 Chad_FulaniLinia, 25 Niger_FulaniAbalak, 32 Niger_FulaniAder, 31 Niger_FulaniDiffa, 20 Niger_FulaniBalatungur, and 30 Niger_FulaniZinder.		329
EGAD50000000655	This database includes comprehensive cancer panel (CCP) on paired tumour and germline DNA from cancer of unknown primary samples, all of which are matched to WGS data under this study. This data was used to compare biomarker yield against what was achieved with WGS. This dataset contains n=34 tumour-germline paired samples. All samples are in BAM format aligned with GRCh38 reference genome.	Illumina NovaSeq 6000	68
EGAD50000000656	Whole genome and transcriptome sequencing of cancer of unknown primary tumours was used to determine yield of clinical biomarkers for a molecular guided trial or for resolving cancer type of origin. This study includes profiling of germline DNA and tumour DNA and RNA by whole genome and transcriptome sequencing. All samples are in BAM format aligned with GRCh38 reference genome. This dataset includes: 1. Whole genome sequencing (WGS) of 78 cancer of unknown primary tumour samples and 73 matched germline DNA. 2. Whole transcriptome sequencing (WTS) of 69 cancer of unknown primary tumour samples (matched to WGS cases) 3. Whole genome sequencing of 22 cell-free DNA samples from cancer of unknown primary patients and matched germline DNA (8 samples matched to tumour WGS)	Illumina NovaSeq 6000	264
EGAD50000000657	TruSight Oncology 500 (TSO500) cancer panel sequencing data on paired tumour and germline DNA from cancer of unknown primary samples. All of the samples in this dataset are matched to WGS data. This data was used to compare biomarker yield against what was achieved with WGS and contains n=51 tumour-only samples. All samples are in BAM format aligned with GRCh38 reference genome.	Illumina NovaSeq 6000	51
EGAD50000000659	Dataset content : Raw data from sequencing (fastq) 184 samples from 4 experiments : - Bulk mRNA seq - 24h and 48h, CTRL, CAL, IL6 and combination treatment - single-cell RNAseq - 7days CTRL, CAL, IL6 and combination treatment - ATACseq - 7days CTRL, CAL, IL6 and combination treatment To explore the effects of calprotectin (CAL) on early hematopoiesis, we incubated CD34+ cells from four healthy donors for 7 days in the presence of stem cell factor (SCF), FLT3-ligand and thrombopoietin (TPO) without and with CAL, interleukin-6 (IL6) and the IL6_CAL combination before collecting cells. We performed the same experiment with CD34+ cells collected from three patients with JAK2-V617F-mediated myelofibrosis (MF). A total of 135,545 cells from healthy donors and 111,725 cells from myelofibrosis patients were analyzed by single-cell RNA sequencing (scRNA-seq) using the 10X Chromium droplet-based platform.	Illumina NovaSeq 6000	184
EGAD50000000660	This dataset includes paired tumor-blood whole exome sequencing data for 209 gastric cancer (GC) patients, along with whole transcriptome sequencing data for 125 GC samples. Whole exome sequencing was conducted using Agilent SureSelect Human All Exon V6 kits. For RNA sequencing, total RNA was isolated using the RNeasy Mini Kit, and libraries were prepared with the TruSeq Stranded Total RNA with Ribo-Zero Gold protocol (Illumina). Aligned BAM files for both exome and RNAseq data are included in this dataset.	Illumina NovaSeq 6000	543
EGAD50000000661	Bulk RNA sequencing of flow cytometry sorted human CD4+ regulatory T (Treg), CD4+ conventional T (Tcon), CD8+ T, and CD19+ B cells from systemic lupus erythematosus patients collected at baseline (day 0, before interleukin-2 immunotherapy), day 5 (after 1 treatment cycle of interleukin-2 immunotherapy), and day 68 (after 4 treatment cycles of interleukin-2 immunotherapy). The dataset comprises files from the above-mentioned 4 immune cell subsets, collected at 3 time points (day 0, day 5, and day 68) of 12 systemic lupus erythematosus patients. Due to technical reasons, day 5 samples of patient SLE_012 could not be processed and are thus missing. The complete dataset totals 140 raw sequencing files (fastq format).	Illumina NovaSeq 6000	140
EGAD50000000662	Single-cell RNA sequencing of flow cytometry sorted human CD45+ immune cells from 3 systemic lupus erythematosus patients (SLE_002, SLE_004, and SLE_006) collected at baseline (day 0, before interleukin-2 immunotherapy) and day 5 (after 1 treatment cycle of interleukin-2 immunotherapy). Sorted immune cells were hash tagged and then stained with TotalSeq Human Universal Cocktail (Biolegend) before further processing for single-cell RNA sequencing using the 10X-Genomics platform. Samples from all 3 patients were pooled and loaded on 3 lanes of the chromium controller, resulting in 1 raw sequencing file for each lane.	Illumina NovaSeq 6000	1
EGAD50000000663	Single-cell RNA sequencing of flow cytometry enriched CD4+ CD127- CD25+ regulatory T cells from 8 systemic lupus erythematosus patients (SLE_001 to SLE_007, and SLE_009) collected at baseline (day 0, before interleukin-2 immunotherapy) and day 5 (after 1 treatment cycle of interleukin-2 immunotherapy). Sorted immune cells were hash tagged and then stained with TotalSeq Human Universal Cocktail (Biolegend) before further processing for single-cell RNA sequencing using the 10X-Genomics platform (incl. generation of TCR libraries). Samples from patients SLE_001, SLE_002, SLE_004, SLE_006 were collected and processed on the first day of experiment (Day 1), and samples from patients SLE_003, SLE_005, SLE_007, SLE_009 on the second day of experiment (Day 2). On each day, samples were pooled and distributed to four lanes on the chromium controller. Run ID 245269 and ID 245270 identify samples sequenced on separate flow cells. Provided are raw sequencing files.	Illumina NovaSeq 6000	1
EGAD50000000664	Assay for Transposase-Accessible Chromatin using sequencing (ATAC) sequencing of human CD38+, HLA-DR+, CD38+ HLA-DR+, and CD38- HLA-DR- regulatory T (Treg) cell subsets sorted from peripheral blood pooled from 4 healthy donors. After isolation and purification of Treg cell subsets with flow cytometry, samples were frozen and transferred to Active Motif for ATAC sequencing. The dataset contains one raw sequencing file (fastq) of accesible regions for each of the above-mentioned Treg cell subsets.	Illumina NovaSeq 6000	4
EGAD50000000666	Cell free DNA sequencing data for 13 samples with whole genome sequencing across multiple cancer types for cfDNA cohort	Illumina NovaSeq 6000	13
EGAD50000000667	Cell free DNA sequencing data for 9 samples with expanded panel across multiple cancer types for cfDNA cohort	Illumina NovaSeq 6000	9
EGAD50000000668	Simulated (based on real world data) tumor normal pair designed for benchmarking and optimisation of structural variation callers. The calls are derived from 12 patients in the Hartwig database requested by the EUCANCan project. Paired-end sequencing data was imputed into the normal reference data to simulate real world variation like purity as well as technical impact of sequencing technologies. The dataset contains a tumor sequenced to ±90x depth and a normal control sequenced to ±30x depth of these imputed events. The break junctions of all detected SVs in the original cases are imputed in this data from diagnostic WGS sequencing data.	Illumina NovaSeq 6000	1
EGAD50000000675	To determine whether the mechanisms we identified in mice, which demonstrate that macrophages are able to orchestrate the intestinal regenerative process, are conserved in humans, we developed a co-culture system with human intestinal organoids and human polarized macrophages. We co-cultured human organoids with pro-inflammatory or anti-inflammatory macrophages for 48h. On day 3 cells were collected to perform single cell RNA sequencing.	NextSeq 550	3
EGAD50000000676	PacBio HiFi Revio WGS data from 16 individuals sequenced in the Genomic Medicine Sweden (GMS) Long read project. Each individual was sequenced on one PacBio Revio SMRT cell. The DNA was extracted from blood. The HiFi data was aligned to GRCh38 using minimap2 or the SMRT-link software.	unspecified	16
EGAD50000000677	scRNA-seq analysis of human iPSC-derived microglia in a novel, human in vitro 3D cortical tissue model. Samples include three distinct conditions (each as biological duplicates): wild-type microglia extracted from cultures after 1 month (WT_1mo) and 3 months (WT_3mo) of culturing, as well as knock-in microglia extracted from cultures carrying three familial AD mutations in the APP gene (Swedish, Iberian, Arctic; introduced by CRISPR/Cas9-mediated knock-in) after 3 months of culturing (KI_3mo).	Illumina NovaSeq 6000	3
EGAD50000000678	Zephir trial gene expression	Illumina NovaSeq 6000	24
EGAD50000000679	Colorectal cancer (CRC) is the second leading cause of cancer death worldwide. Early detection of precursor lesions or early-stage cancer could hamper cancer development or improve survival rates. Liquid biopsy, which detects tumor biomarkers, such as mutations, in blood, is a promising avenue for cancer screening. This study assessed the presence of genetic variants in plasma cell-free tumor DNA from patients with precursor lesions and colorectal cancer using the commercial Oncomine Colon cfDNA Assay. Cell-free DNA (cfDNA) samples from the blood plasma of 52 Brazilian patients were analyzed. Eight patients did not have any significant lesions (five normal colonoscopies and three hyperplastic polyps), 24 exhibited precursor lesions (13 nonadvanced adenomas, ten advanced adenomas, and one sessile serrated lesion), and 20 had cancer (CRC). The mutation profile of 14 CRC-associated genes was determined by next-generation sequencing (NGS) using the Oncomine Colon cfDNA Assay in the Ion Torrent PGM/S5 sequencer.	Ion Torrent S5	52
EGAD50000000681	This dataset contains 518 case and control WGS sequencing samples of patients with multiple myeloma. Sequencing was performed on Illumina NovaSeq 6000 and HiSeq X using TruSeq Nano DNA Kits. The sequencing was always paired.	HiSeq X Ten Illumina HiSeq X Illumina NovaSeq 6000	518
EGAD50000000682	This dataset contains 221 case and control WGS sequencing samples of patients with multiple myeloma. Sequencing was performed on Illumina NovaSeq 6000 and HiSeq X using TruSeq Nano DNA Kits. The sequencing was always paired.	HiSeq X Ten Illumina HiSeq X Illumina NovaSeq 6000	221
EGAD50000000683	This dataset contains 371 case and control RNA sequencing samples of patients with multiple myeloma. Sequencing was performed on Illumina NovaSeq 6000 and HiSeq X, HiSeq 4000 and HiSeq 2000 using TruSeq Stranded RNA and TruSeq Stranded total mRNA Kits. The sequencing was always paired.	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 4000 Illumina NovaSeq 6000	370
EGAD50000000684	Little is known about the transcriptomic profile of individuals who are exposed to SARS-CoV-2 yet resist becoming PCR positive. To investigate this, longitudinal whole-blood samples were taken (0, 7, 14, and 28 days after enrolment) from PCR positive and PCR negative SARS-CoV-2-naïve household contacts who were recently exposed to a COVID-19 index. Samples were also taken from pre- and post-pandemic unexposed controls. Total RNA was extracted from PAXgene tubes before undergoing poly(A) selection followed by globin and rRNA depletion. DNA libraries were constructed using the NEBNext® Ultra™ II Directional RNA Library Prep Kit for Illumina. All samples were then sequenced across 2 flowcells of an Illumina HiSeq 4000.	Illumina HiSeq 4000	144
EGAD50000000685	Mononuclear cells were isolated as cell suspensions by dissociation in a BD Horizon™ Dri Tumor & Tissue Dissociation Reagent (BD Biosciences) using a gentleMACS dissociator (Miltenyi Biotec). A totalseq-C antibody cocktail (Biolegend) was used, and the cell suspensions were diluted to a concentration of 1,000 cells/μL. Library sequencing was performed using the DNBSEQ-G400 (MGI Tech), Illumina HiSeq 2000, or Illumina NovaSeq 6000 5 (Illumina). The FASTQ files were aligned to the 10x provided reference genome using 10x Genomics Cell Ranger software v7.1.0 to create unique molecular identifier count tables of gene expression of the samples.	DNBSEQ-G400 Illumina HiSeq 2000 Illumina NovaSeq 6000	7
EGAD50000000686	Spatial transcriptomics analysis of triple negative breast cancers. Both bulk sequencing and ST were performed. Counts, images etc. are available at https://zenodo.org/doi/10.5281/zenodo.8135721	Illumina NovaSeq 6000 NextSeq 500	94
EGAD50000000687	This is a Next Generation Sequencing approach based on whole Usher Syndrome genes sequencing with the aim of diagnosing USH patients and USH2A-associated RP patients	Illumina MiSeq NextSeq 500	44
EGAD50000000688	This dataset contains RNA sequencing data of 36 glioblastoma samples. Sequencing was performed on Illumina NovaSeq 6000 using TruSeq Stranded RNA Kit. The sequencing was always paired.	Illumina NovaSeq 6000	36
EGAD50000000689	The data set contains FASTQ files (filetype) of a NEXTSEQ550DX run (instrument). The FASTQ files are from DNA and RNA samples. The library prep is an NGS TSO500 library prep (Illumina) (technology). A Study to Examine the Clinical Value of Comprehensive Genomic Profiling Performed by Belgian NGS Laboratories: a Belgian Precision Study of the BSMO in Collaboration With the Cancer Centre (BALLETT). This 2-year study involves the consortium of 9 cooperating Belgian NGS laboratories and will enroll 936 metastatic or locally advanced cancer patients coming from 13 different Belgian hospitals and cancer centers. Upon inclusion, all cancer patients will be offered 'comprehensive genomic profiling' (CGP) using Illumina's TSO500 NGS panel. This targeted NGS panel of 523 genes allows for the detection of single nucleotide variants, small indels, copy number variations and fusions, as well as for the determination of the 'tumor mutational burden' (TMB) and the 'microsatellite-instability' status (MSI). Both the wet lab execution of the CGP and the biological and clinical classification of the variants will be performed in a fully standardized way among the 9 participating Belgian local NGS laboratories.	NextSeq 550	8
EGAD50000000691	The dataset contains raw data from 10x Multiome derived single-nuclei RNA and single-nuclei ATAC sequencing in the same cells	Illumina NovaSeq 6000	14
EGAD50000000692	This dataset includes FASTQ files from FGF14 alleles sequenced by targeted nanopore in 67 patients, 64 control individuals and three unaffected relatives.	MinION	134
EGAD50000000694	Tumours from MMRd EC patients treated with 2 cycles neoadjuvant ICI. TruSight Oncology 500 (TSO 500) panel for all 10 tumour samples from patients included in the trial.	Illumina NovaSeq 6000	10
EGAD50000000695	The dataset for “Early detection of ovarian cancer using cell-free DNA fragmentomes and protein biomarkers” includes 409 cram files from whole genome next-generation sequencing on the Illumina HiSeq2500. The samples analyzed include plasma samples from healthy individuals and patients with cancer.	Illumina HiSeq 2500	409
EGAD50000000696	We subjected to whole genome sequencing (WGS) one ILC case lacking CDH1 biallelic mutations. Both tumor and normal samples were sequenced.	Illumina HiSeq 2500	2
EGAD50000000697	Patients were biopsied at progression on molecular targeted agents and WES and/or WTS were performed to identify resistance mechanisms. 271 underwent one or sequential tissue biopsies.	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq X NextSeq 500	2071
EGAD50000000698	In this study we aim to gain insight into the mechanisms of immune evasion after initial pathologic response to neoadjuvant immune checkpoint inhibition in macroscopic stage III melanoma by pooling data of the neoadjuvant OpACIN, OpACIN-neo and PRADO trials. Here we used RNAseq data from 21 patients (paired end).	Illumina NovaSeq 6000	3
EGAD50000000699	In this study we aim to gain insight into the mechanisms of immune evasion after initial pathologic response to neoadjuvant immune checkpoint inhibition in macroscopic stage III melanoma by pooling data of the neoadjuvant OpACIN, OpACIN-neo and PRADO trials. Therefore we analyzed in-depth paired baseline and recurrent tumor samples through a.o. DNA sequencing analyses.	Illumina NovaSeq 6000	19
EGAD50000000700	In this study we aim to gain insight into the mechanisms of immune evasion after initial pathologic response to neoadjuvant immune checkpoint inhibition in macroscopic stage III melanoma by pooling data of the neoadjuvant OpACIN, OpACIN-neo and PRADO trials. Therefore we analyzed in-depth paired baseline (n=10) and recurrent tumor samples of the lymph nodes (n=10) and brain metastasis (n=2) through Whole Exome Sequencing (WES) analyses.	Illumina NovaSeq 6000	32
EGAD50000000701	Samples were from 34 mothers, 6 with gestational diabetes, 14 with type 1 diabetes and 14 without diabetes history with either vaginal (N=17) or cesarean section (N=17) delivery, comprising primary planned section (N=9) or secondary emergency section (N=8). Anesthesia methods during delivery included spinal (N=11), epidural (N=11), or local anesthesia (N=6), with unknown anesthesia information for six cases. Samples from each mother were taken from the villous and the decidual parts of the placenta.	Illumina NovaSeq 6000	68
EGAD50000000704	Ischemia reperfusion is an unavoidable step of organ transplantation. Development of therapeutics for lung injury during transplantation has proved challenging; understanding lung injury from human data at the single cell resolution is required to accelerate the development of therapeutics. Donor lung biopsies from six human lung transplant cases were collected at the end of cold preservation and 2-hour reperfusion and underwent single cell RNA sequencing.	Illumina NovaSeq 6000	12
EGAD50000000706	The aim of this project is to assess differences in intratumoral immune composition in pregnant melanoma patients versus non-pregnant melanoma controls. Samples (N=25) were obtained from a local patient database. From archived FFPE tissue we isolated RNA and performed NGS and bioinformatic data analysis.	Illumina NovaSeq 6000	25
EGAD50000000707	Further investigation and characterisation of 12q-amplified low- and high-grade osteosarcomas with MDM2 and/or CDK4 amplification focusing on SV, copy number and gene fusion analyses. In total, 25 cases (33 samples total due to multi-sampling) were included, with some form of sequencing data available for 27 samples. Mate-pair whole genome sequencing (Illumina) is available for 19 samples, long-read whole genome sequencing (PacBio HiFi) for 10 samples and RNA-sequencing (Illumina TruSeq) for 21 samples. Data is available as BAM files.	NextSeq 500 unspecified	25
EGAD50000000708	Mid-pass whole genome sequencing was performed for 264 Malagasy individuals across three geographic regions across the island of Madagascar (west coast, central highlands, and southern highlands). This dataset includes the VCF from joint variant calling with reference populations as specified in the associated publication.		264
EGAD50000000709	Binary calls for all available samples in 4 Atezo trials.		2803
EGAD50000000710	This dataset contains chromosome 3 alignments. WGS was performed on genomic DNA isolated from peripheral blood cells.	unspecified	6
EGAD50000000711	H3K27ac CUT&Tag performed in LCLs derived from constitutional MLH1 epimutation carriers and non-carrier relatives	unspecified	6
EGAD50000000712	RNA-seq in LCLs derived from constitutional MLH1 epimutation carriers and non-carrier relatives to investigate alterations in gene expression associated with MLH1 epimutation	unspecified	6
EGAD50000000713	UMI-4C sequencing data generated for MLH1 promoter to assess allele-specific interactions in LCLs derived from constitutional MLH1 epimutation carriers and non-carrier relatives	unspecified	6
EGAD50000000714	ATAC-seq performed in LCLs derived from constitutional MLH1 epimutation carriers and non-carrier relatives to profile alterations in chromatin accessibility associated with constitutional MLH1 epimutation.	unspecified	6
EGAD50000000715	Whole exome sequencing of 66 gastric samples and whole transcriptome sequencing of 191 gastric samples.	Illumina HiSeq 4000	257
EGAD50000000716	These data were generated as part of a collaboration between University of Bristol, UK and the Stanley Center at the Broad Institute. This project sequenced and analysed the whole exomes of 2969 samples from the Avon Longitudinal Study of Parents and Children (ALSPAC). ALSPAC is a population-based pregnancy cohort with participants acting as ‘controls’ for this project. Genomic DNA from each sample was sequenced to a mean depth of 20x. The exome used Twist capture and samples were sequenced on Illumina NovaSeq 6000 machines producing CRAM files. ALSPAC recruited pregnant women in the Avon County of south-west England between 1991 and 1992. Of the 15,447 pregnancies, 14,901 children were alive at 1 year of age. DNA was extracted from child blood samples taken between ages 7 and 24.	Illumina NovaSeq 6000	2969
EGAD50000000717	The dataset includes the RNA-seq on the 2D cell culture of primary myoblast cell lines derived from FSHD patients and healthy donors (n=3 in each condition) and on the 3D muscle bundles generated by the human iPSCs derived from healthy and FSHD-affected cells of mosaic FSHD patients (n=3 in each condition)	Illumina NovaSeq 6000	12
EGAD50000000718	The dataset includes the PacBio full-length isoform sequencing data of DUX4i myoblasts (Clone-5) with and without 16-hour DOX-treatment (referred to as DUX4- and DUX4+; n=1 in each condition).	Sequel II	2
EGAD50000000719	The dataset includes the short-read RNA sequencing data of DUX4i myoblasts (Clone-5 and Clone-7) with and without 16-hour DOX-treatment (n=1 in each condition).	Illumina NovaSeq 6000	4
EGAD50000000720	Long-read (PacBio) RNA sequencing dataset of three neural retinal samples. Three PacBio libraries were prepared according to the 'standard workflow' optimized for sequencing transcripts centered around 2kb. Additionally, one library (input HNR_S2) was prepared according to an optimized 'long workflow' to enrich for larger transcripts up to >10kb. To further enhance the capture of full-length transcripts for USH2A and ADGRV1 (USH2C) genes, we also employed a targeted enrichment approach using the Samplix Xdrop System, followed by PacBio long-read sequencing. USH2A and ADGRV1 enrichment was performed by targeting 5', mid and 3' targets of the respective genes. Finally, we also added ONT (Oxford Nanopore Technologies) long-read mRNA sequencing of three independent neural retina samples. These data were used for the validation of events observed from the Iso-Seq and Samplix data. The ONT datasets contain sequencing data of the Usher-associated genes. Files are raw BAM and CRAM format files generated by a Sequel II machine. Additionally, the ccs3 BAM format files are included.	Sequel II unspecified	8
EGAD50000000722	Exome sequencing data from fourteen phenotypically abnormal human fetal samples.	Illumina NovaSeq 6000	14
EGAD50000000723	Basal-like breast cancer originates in luminal progenitors, frequently with an altered PI3K pathway, and focally in close association with genetically altered myoepithelial cells at the site of tumor initiation. The exact trajectory behind this bi-lineage phenomenon remains poorly understood. Here we used a breast cancer relevant transduction protocol including hTERT, shp16, shp53, and PIK3CA(H1047R) to immortalize FACS isolated luminal cells, and we identified a candidate multipotent progenitor. We found that the apparent luminal phenotype of these oncogene transduced progenitors was metastable giving rise to basal-like cells dependent on culture conditions. After culturing the cells for more than 60 passages, cells were subjected to scRNA-seq as well as bulk RNA sequencing of two subpopulations (CD271+ and CD271-).	NextSeq 2000 unspecified	3
EGAD50000000724	Methylation profile of 33 patients with small cell lung cancer (SCLC) for both tumour and normal lung samples using MeDIP-seq.	Illumina HiSeq 2000	66
EGAD50000000726	Using DNA extracted from peripheral blood, Cas9-targeted nanopore DNA sequencing was used to analyze MAGEL2 gene, including its entire regulatory construct (chr15:23639316-23651466), for sequence variation and 5-methyl-cytosine (5mC) modification in a cohort of adults with HFA compared to sex- and age-matched NC.	PromethION	40
EGAD50000000727	Single-nuclei sequencing data from four neuroblastoma patients. Each patient was run on two lanes, resulting in two runs per patient. Data is provided in paired-end fastq files.	Illumina NovaSeq 6000	8
EGAD50000000728	Genomic and Transcriptomic single-cell sequencing of neuroblastoma patient. Data represents one 96-well plate that was processed with G&T sequencing, resulting in genomic and transcriptomic data from the same single cells. Dataset contains 95 bam files containing the DNA sequencing data and 95 bam files containing the RNA sequencing data.	Illumina NovaSeq 6000	190
EGAD50000000729	Genomic and Transcriptomic sequencing of neuroblastoma HSR and ecDNA cell lines. Data represents five 96-well plates that were processed with G&T sequencing, resulting in genomic and transcriptomic data from the same single cells. Dataset contains 95 bam files containing the DNA sequencing data and 95 bam files containing the RNA sequencing data of CHP212 cells, 380 bam files (190 DNA and 190 RNA) for TR14 cells, 188 bam files (94 DNA and 94 RNA) for Kelly cells and 192 bam files (96 DNA and 96 RNA) for IMR5/75 cells.	Illumina NovaSeq 6000	950
EGAD50000000730	A better understanding of the molecular landscape of non-muscle-invasive bladder cancer (NMIBC) is essential to improve risk assessment and identify potential therapeutic targets. Here, we perform a comprehensive genomic analysis of patients diagnosed with NMIBC based on whole-exome- (n=438), shallow whole-genome- (n=362), and total RNA-sequencing (n=414). This dataset contains 876 BAM files corresponding to the full WES dataset. Tumor and matched germline DNA were sequenced to a mean coverage of 132x (35x-338x) and 128x (31x-302x), respectively.	Illumina NovaSeq 6000	876
EGAD50000000731	A better understanding of the molecular landscape of non-muscle-invasive bladder cancer (NMIBC) is essential to improve risk assessment and identify potential therapeutic targets. Here, we perform a comprehensive genomic analysis of patients diagnosed with NMIBC based on whole-exome- (n=438), shallow whole-genome- (n=362), and total RNA-sequencing (n=414). Additionally, tumor/germline samples from 4 patients were sequenced using Oxford Nanopore technology. This dataset contains 8 BAM files corresponding to the Long read sequencing data.	PromethION	8
EGAD50000000732	A better understanding of the molecular landscape of non-muscle-invasive bladder cancer (NMIBC) is essential to improve risk assessment and identify potential therapeutic targets. Here, we perform a comprehensive genomic analysis of patients diagnosed with NMIBC based on whole-exome- (n=438), shallow whole-genome- (n=362), and total RNA-sequencing (n=414). This dataset contains 392 BAM files corresponding to the full sWGS dataset. Tumor samples were sequenced to median coverage of 2.14x (1.24-9.01).	Illumina NovaSeq 6000	392
EGAD50000000733	A better understanding of the molecular landscape of non-muscle-invasive bladder cancer (NMIBC) is essential to improve risk assessment and identify potential therapeutic targets. Here, we perform a comprehensive genomic analysis of patients diagnosed with NMIBC based on whole-exome- (n=438), shallow whole-genome- (n=362), and total RNA-sequencing (n=414). This dataset contains 414 BAM files corresponding to the full total RNA-sequencing dataset.	Illumina HiSeq 2000 Illumina NovaSeq 6000	414
EGAD50000000734	cell-free Reduced Representation Bisulfite Sequencing (cfRRBS) data from blood plasma and FFPE tissue samples from patients with esophageal adenocarcinoma, and blood plasma of healthy donors.	Illumina NovaSeq 6000	224
EGAD50000000735	We generated chromatin accessibility profiles for anti-Notch2 (n=3) and anti-gD (n=3) treated LIV78 tumors. Tumors were isolated, snap frozen in liquid nitrogen and stored at -80C. Tumor tissue was pulverized using a Covaris CP02 CryoPrep Pulverizer. Flash frozen pulverized tumor samples were sent to Active Motif to perform the ATAC-seq assay. Resulting material was quantified using the KAPA Library Quantification Kit for Illumina platforms (KAPA Biosystems), and sequenced with PE42 sequencing on the NextSeq 500 sequencer (Illumina) to produce FASTQ sequencing files.	NextSeq 500	6
EGAD50000000736	To establish the baseline transcriptional profiles, we performed bulk RNA-seq for 36 of our HCC PDX models. 2-6 samples were collected for each model resulting in a total of 68 samples. Isolated RNA was used as input for library preparation using TruSeq RNA Sample Preparation Kit v2 (Illumina). The libraries were multiplexed and sequenced on Illumina HiSeq 2500 (Illumina) to produce FASTQ files.	Illumina HiSeq 2500	68
EGAD50000000737	To study the effect of Notch inhibition on the transcriptional profiles of our PDX models, we generated bulk RNA-seq data for two Notch inhibition sensitive and three Notch inhibition insensitive models across multiple timepoints (8h, 24h, 48h, 7d, 17d) using anti-JAG1, anti-Notch2, anti-Notch1 or anti-gD antibodies. 3-5 samples were collected for each treatment condition, resulting in a total of 96 samples. Isolated RNA was used as input for library preparation using TruSeq RNA Sample Preparation Kit v2 (Illumina). The libraries were multiplexed and sequenced on Illumina HiSeq 2500 (Illumina) to produce FASTQ files.	Illumina HiSeq 2500	96
EGAD50000000738	We generated single-cell RNA-seq data for anti-Notch2 (n=3), anti-JAG1 (n=1) and anti-gD (n=4) treated LIV78 tumors. NCR nude mice bearing similarly sized LIV78 tumors (volumes between 300-900mm3) were injected intravenously with 30mg/kg body weight of either anti-Notch2, anti-Jag1 or anti-gD control antibody 72hrs prior to tumor harvest and tumor cell isolation. Following the depletion of CD45+ and Ter119+ cells, the samples were processed for single-cell RNA-seq (scRNAseq) as described previously (Long et al., 2019) using the Chromium Single Cell 3’ Library and Gel bead kit v2, following the manufacturer’s manual. cDNAs and libraries were prepared following the manufacturer’s manual (10X Genomics). Libraries were profiled by Bioanalyzer High Sensitivity DNA kit (Agilent Technologies) and quantified using Kapa Library Quantification Kit (Kapa Biosystems, Wilmington, MA). Each library was sequenced in one lane of HiSeq 2500 (Illumina) following the manufacturer’s sequencing specification (10X Genomics) to produce the resulting FASTQ files.	Illumina HiSeq 2500	8
EGAD50000000739	RNA editing analyses of BRCA-isogenic cell lines and patient-derived xenografts (PDX)	Illumina NovaSeq X	22
EGAD50000000740	Enrichments of human stool samples in culture medium containing succinate as the main carbon source and the corresponding stool samples.	Illumina MiSeq	204
EGAD50000000741	Structural variants assessed in 6 patient samples with varying phenotypes, through the use of sniffles2 with a support parameter of 1 and at a length of 30 bp	PromethION	6
EGAD50000000743	This dataset contains 138 .bam files sequenced with Illumina NovaSeq 6000. It also includes the files with the variant calling performed on the sequencing data, as well as the clinical data (phenotype) collected during sample extraction.	Illumina NovaSeq 6000	138
EGAD50000000744	The dataset for “Genomic landscapes of endometrioid and mucinous ovarian cancers and morphologically similar tumor types” includes 288 bam files from whole genome, whole exome and targeted next-generation sequencing on the Illumina HiSeq2500 and MiSeq. The samples analyzed include tumor and normal tissue samples from patients with cancer.	Illumina HiSeq 2500	288
EGAD50000000745	38 prostate cancer specimens derived from five human patients with ISUP ≥ 3 pathology who underwent radical prostatectomy and pelvic lymph node dissection at surgery as part of the International Cancer Genome Consortium (ICGC) late-onset cohort. Using single-cell multiome (expression and chromatin accessibility from RNA and ATAC, respectively) sequencing assay, we characterized 282,956 nuclei/cells, with a median of 6 primary tumor foci from the prostate gland and 3 locoregional lymph node metastases across all patients.	Illumina NovaSeq 6000	38
EGAD50000000746	Transcriptomics for the ALTTO study by high-throughput sequencing. 386 HER2+ breast cancers treated by trastuzumab were sequenced. Two fastq files are given for each sample.	Illumina NovaSeq 6000	386
EGAD50000000747	Transcriptomic analysis of K7M2 murine osteosarcoma cells stably modified to overexpress or repress CYR61	Illumina NovaSeq X	20
EGAD50000000748	Dataset of the Synergy study: "Tissue resident CD8+ T cell clonal expansion in advanced triple negative breast cancer is associated with response to chemoimmunotherapy". This dataset contains the Cell Ranger gene-expression matrices for the 5 prime single-cell RNA sequencing experiments.		40
EGAD50000000749	Single-cell RNA and TCR-sequencing of three patients with treatment-refractive immune-mediated arthritis. The dataset consists of synovial tissue CD4+ and CD8+ T cells and peripheral blood CD45+ leukocytes.	Illumina NovaSeq 6000	12
EGAD50000000750	This study is the first phase of the Moroccan Genome Project, which included the complete sequencing of 109 genomes from the Kingdom of Morocco. The sequencing was performed using the Illumina NovaSeq 6000 platform, with a mean coverage of 30X.	Illumina NovaSeq 6000	109
EGAD50000000751	This dataset contains single cell protein tag sequencing of EAC samples by 10x Genomics CITE-seq. Total number of files - 24. File format - FASTQ.	Illumina NovaSeq 6000	12
EGAD50000000752	This dataset contains single cell RNA-sequencing of EAC samples by 10x Genomics. Total number of files - 24. File format - FASTQ.	Illumina NovaSeq 6000	12
EGAD50000000753	This dataset contains single cell RNA-sequencing of BE samples by 10x Genomics. Total number of files - 28. File format - FASTQ.	Illumina NovaSeq 6000	14
EGAD50000000754	The NFKBIE gene, which encodes the NF-κB inhibitor IκBε, is mutated in 3-7% of patients with chronic lymphocytic leukemia (CLL). The most recurrent alteration is a 4-bp frameshift deletion associated with NF-κB activation in leukemic B cells and poor clinical outcome. To study the functional consequences of NFKBIE gene inactivation, both in vitro and in vivo, we engineered CLL B cells and CLL-prone mice to stably down-regulate NFKBIE expression and investigated its role in controlling NF-κB activity and disease expansion. We found that IκBε loss leads to NF-κB pathway activation and promotes both migration and proliferation of CLL cells in a dose-dependent manner. Importantly, NFKBIE inactivation was sufficient to induce a more rapid expansion of the CLL clone in lymphoid organs and contributed to the development of an aggressive disease with a shortened survival in both xenografts and genetically modified mice. IκBε deficiency was associated with an alteration of the MAPK pathway, also confirmed by RNA-sequencing in NFKBIE-mutated patient samples, and resistance to the BTK inhibitor ibrutinib. In summary, our work underscores the multimodal relevance of the NF-κB pathway in CLL and paves the way to translate these findings into novel therapeutic options.	Illumina HiSeq 2500	8
EGAD50000000755	Oxford Nanopore Technologies based long-read RNA sequencing data from 5 patients with stereotyped subset CLL. The BAM files were aligned against hg19 reference genome.	MinION	5
EGAD50000000756	The gut microbiome has been shown to be affected by the use of many human-targeted medications, and the interaction can be bidirectional. This has been clearly demonstrated for type 2 diabetes medications that have been in clinical use for several decades. However, the bidirectional effects between the novel type 2 diabetes drugs semaglutide and empagliflozin and the gut microbiome have yet to be clearly described. We investigated the effect of semaglutide and empagliflozin initiation on the gut microbiome of type 2 diabetes patients. In addition, we analyzed whether the pre-treatment gut microbiome can predict the treatment efficacy. Twenty subjects (10 women and 10 men) were enrolled in the study between November 2019 and January 2023 from the University of Tartu Hospital, Tartu, Estonia. All participants had been diagnosed with type 2 diabetes. The study group taking semaglutide included 6 women and 4 men, and the group prescribed empagliflozin included 4 women and 6 men. Gut microbiome fecal samples donated at four timepoints (Baseline, Month 1, Month 3, Month 12) were studied using 16S ribosomal RNA gene sequencing and analysis.	Illumina MiSeq	77
EGAD50000000757	This dataset consists of raw sequencing data from targeted next generation sequencing of samples from patients with chronic lymphocytic leukaemia. These samples were used to validate an approach for detection of sample mislabelling and swapping in clinical cohorts using single nucleotide polymorphisms (SNPs), as described in the associated article. The data are divided into two parts - data from the initial validation experiment (discovery cohort) and data from an additional independent cohort.	Illumina MiSeq	36
EGAD50000000758	Gene expression of MAC enriched by adhesion to Cyr61 in comparison to HUVEC. This dataset includes RNAseq Data from myeloid angiogenic cells (MAC) enriched from peripheral blood by adhesion to Cyr61 (4 samples prepared from different donors) and of human umbilical vein endothelial cells (HUVEC) (4 samples). Illumina NextSeq500 technology was used. FASTQ data are provided.	NextSeq 500	8
EGAD50000000760	9 DHG-H3G34 patient samples were sequenced by paired-end scRNA-sequencing using the Smart-Seq2 protocol on a NextSeq 500 sequencer (Illumina). Illumina bcl2fastq 1.5 was used for demultiplexing. This dataset contains the resulting 9,408 fastq files of 4,704 single cells/nuclei sequenced.	NextSeq 500	9
EGAD50000000763	Total RNA sequencing on 9 uveal melanomas. Libraries were prepared using the TruSeq Stranded Total RNA Library Prep Gold (Illumina, 20020599). Paired-end libraries (2 x 100 bp) were sequenced on a NovaSeq 6000 instrument (Illumina).	Illumina NovaSeq 6000	9
EGAD50000000764	Whole genome sequencing on 1 uveal melanoma and corresponding germline sample. Libraries were prepared using the Kapa HyperPrep kit (Roche, 07962363001). Paired-end libraries (2 x 150 bp) were sequenced on a NovaSeq 6000 instrument (Illumina).	Illumina NovaSeq 6000	2
EGAD50000000765	Whole genome sequencing on HAP-1 clones wild-type or knockout for MBD4 and/or TDG. Libraries were prepared using the Kapa HyperPrep kit (Roche, 07962363001). Paired-end libraries (2 x 100 bp) were sequenced on a NovaSeq 6000 instrument (Illumina).	Illumina NovaSeq 6000	20
EGAD50000000766	Whole Genome Bisulfite Sequencing on normal primary human uveal melanocytes. WGBS libraries were prepared using the Accel-NGS Methyl-Seq DNA Library Kit (Swift Biosciences 30024), the EZ DNA Methylation-Gold Kit (Zymo D5005), and DNA Clean & Concentrator-5 (Zymo D4013), following the instruction manual Accel-NGS Methyl-Seq DNA Library (Revision 160510). Paired-end (2 x 150 bp) libraries were sequenced on a DNBSEQ-T7 instrument (MGI) after library conversion.	DNBSEQ-T7	1
EGAD50000000767	Whole exome sequencing on 11 uveal melanoma samples and corresponding germline samples. Libraries were prepared using the SureSelectXT2 Clinical Research Exome V2 kit (Agilent, 5190-9500 and G9621B). Paired-end libraries (2 x 100 bp) were sequenced on HiSeq 2000/2500 or NovaSeq 6000 instruments (Illumina).	Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina NovaSeq 6000	21
EGAD50000000768	Dataset for the manuscript Scywalker: scalable end-to-end data analysis workflow for nanopore single-cell transcriptome sequencing. Contains fastq files for 4 brain samples obtained from short-read NovaSeq 6000 v1.5 Illumina and long-read Oxford Nanopore PromethION sequencing. Single-cell suspensions were generated using 10x Genomics Chromium Next GEM Single Cell 3' Kit v3.1.	Illumina NovaSeq 6000 PromethION	8
EGAD50000000769	Investigation of post-zygotic and germline variants using whole exome and ultra-deep duplex sequencing in paired uninvolved margin and primary tumor samples from 126 breast cancer patients with differing survival outcomes, with skin or blood samples as reference. Pairs of uninvolved margin and blood samples were also collected for 15 reduction mammoplasty patients without personal or familial history of cancer, serving as controls.	HiSeq X Ten	22
EGAD50000000770	Investigation of post-zygotic and germline variants using whole exome and ultra-deep duplex sequencing in paired uninvolved margin and primary tumor samples from 126 breast cancer patients with differing survival outcomes, with skin or blood samples as reference. Pairs of uninvolved margin and blood samples were also collected for 15 reduction mammoplasty patients without personal or familial history of cancer, serving as controls.	HiSeq X Ten	408
EGAD50000000771	29 Hi-C datasets [8 G3 (MB3667, MB3687, MB3690, MB3692, MB3693, MB4010, MB4037, MB4141), 13 G4 (MB0558, MB3510, MB3670, MB3689, MB3716, MB3760, MB3761, MB3807, MB4079, MB4132, MB4174, MBSF7, SNC_2_5), 7 SHH (MB3612, MB3661, MB3662, MB3695, MB3697, MB3724, MB4143), and 1 WNT (MB4036)]. Fresh tissue samples were obtained from The Hospital for Sick Children (Toronto, ON). In situ Hi-C libraries were generated using approximately 2.5 million dissociated cells as input. All Hi-C libraries were sequenced at 150 bp PE with a HiSeq X instrument (Illumina) at McGill Genome Centre (Montreal, QC).	HiSeq X Ten	29
EGAD50000000772	This dataset contains WGBS data from 20 samples, which are either from colorectal cancer or control colon tissue. Sequencing was performed on either Illumina HiSeq 2000 or HiSeq X and the sequences are paired.	Illumina HiSeq 2000 Illumina HiSeq X	20
EGAD50000000773	A phase II clinical trial (NCT04965766) of patritumab deruxtecan in 99 breast cancer patients. Whole-exome sequencing (WES) is available for 43 tumor samples and 43 blood samples collected at entry into the trial. RNA-sequencing is available for 44 samples comprising 22 samples collected at entry into the trial and 22 samples collected during treatment (cycle 1 day 3, cycle 1 day 19, or cycle 2 day 3) from 22 patients.	Illumina NovaSeq 6000	136
EGAD50000000774	A clinically and genomically annotated dataset of early-onset and late-onset colorectal cancer. This study aimed to identify genomic characteristics of early-onset colorectal cancer (EOCRC) compared to late-onset cases. Whole-genome sequencing (WGS) was performed on EOCRC and late-onset colorectal cancer (LOCRC) tissues and blood samples to analyze their genomic profiles.	DNBSEQ-T7	198
EGAD50000000775	Raw RNAseq data from blood plasma of patients diagnosed with liver disease. Total RNA libraries were prepared using the SMARTer Stranded Total RNA-Seq-Kit v3 - Pico Input Mammalian (Takara Bio). Libraries were paired-end sequenced (2x100) on a NovaSeq 6000 instrument using NovaSeq S2 or S1 kit (Illumina).	Illumina NovaSeq 6000	933
EGAD50000000776	Samples were collected at the Arnie Charbonneau Cancer Institute (University of Calgary). Whole Genome Sequencing was performed at the New York Genome Center, and libraries were prepared using the Truseq DNA Nano Library Preparation Kit. Libraries were sequenced on an Illumina NovaSeq 6000 sequencer using 2 x 150-bp cycles. The dataset consists of 85 BAM files from patients with Multiple Myeloma treated with bi-specific and CAR-T therapies. Reference Genome: GRCh38.	Illumina NovaSeq 6000	85
EGAD50000000777	The dataset contains whole genome sequencing data of 58 high-grade serous carcinoma (HGSC) patients sequenced with NovaSeq 6000. The 144 samples are either fresh frozen tumour samples or blood samples. The files provided are paired fastq files.	Illumina NovaSeq 6000	144
EGAD50000000778	This dataset contains snRNA-seq data of 11 regionally sampled GBM tissue (peritumoral region, tumor edge, and tumor core). Regionally sampled GBM patient tissue was dissociated and nuclei were processed in an unbiased manner without any sorting procedure. Nuclei were dissociated from frozen tissue using Chromium Nuclei Isolation Kit. Nuclei barcoding, cDNA preparation, and library construction were performed following the Evercode WT or WT mini User Manual, by combinatorial barcoding to assign a unique barcode to each cell.	Illumina NovaSeq 6000	11
EGAD50000000781	Deep sequencing of a targeted panel of 67 paediatric cancer related genes for cfDNA with UMIs at approximately 1500xUMI coverage.	Illumina NovaSeq 6000	404
EGAD50000000782	Low coverage whole genome sequencing (0.5-4X) for cfDNA collected from blood plasma.	Illumina NovaSeq 6000	477
EGAD50000000783	Deep sequencing of a targeted panel of 92 paediatric cancer related genes or 233 pan-cancer related genes for both relapse/archival diagnostic tumour tissue and germline DNA from whole blood cell pellet. Tumour tissue was sequenced to approximately 400X coverage and germline DNA was sequenced to approximately 100X coverage.	Illumina NovaSeq 6000	1149
EGAD50000000784	Low coverage whole genome sequencing (0.5-4X) for both relapse and archival diagnostic tumour tissue.	NextSeq 500	586
EGAD50000000785	This data includes paired WGS from tumor/normal and longitudinal panel-based sequencing data of plasma-derived cfDNA from melanoma patients being treated with ICI.	Illumina HiSeq 4000	224
EGAD50000000787	This dataset contains long-read whole-genome sequencing (lrWGS) data from 14 samples. Five lrWGS data are from single-cell (sc, sc_2, sc_3), multi-cell (mc, 10 cells), and bulk samples of HG002. The remaining nine lrWGS data are from two preimplantation genetic testing (PGT) families, including four from blood bulk DNA of the parental pairs and five from trophectoderm biopsies of two embryos from one family and three embryos from another family. The data are provided in raw FASTQ format and were generated using the PromethION device from Oxford Nanopore Technologies.	PromethION	14
EGAD50000000788	Tumor samples were microdissected from formalin-fixed, paraffin-embedded tissue and genomic DNA was extracted. DNA libraries were generated using xGen ssDNA & Low-Input DNA Library Preparation Kit (IDT). The libraries were enriched for the exome using xGen Exome Hyb Panel (IDT) and sequenced on DNBSEQ G400RS.	DNBSEQ-G400	6
EGAD50000000789	CXCL8 secreted by immature granulocytes inhibits wildtype hematopoiesis in chronic myelomonocytic leukemia [Abstract] In chronic myelomonocytic leukemia, blocking CXCL8 produced by clonal dysplastic granulocytes is a potential therapeutic strategy to slow disease progression. [Data provided] Exome-sequencing, bulk mRNA-seq and single-cell RNA-seq of iGRAN, monocytes and PBMC of CMML patients and controls.	Illumina HiSeq 2000 Illumina NovaSeq 6000	64
EGAD50000000790	Low-coverage whole genome sequencing and targeted (30 gene panel) deep sequencing of oral cancer. Note that the targeted deep sequencing is not actually amplicon sequencing but hybrid capture sequencing, which was not available as an option.	unspecified	554
EGAD50000000791	This dataset contains 207 samples sequenced using rapid-multiplexed barcoded library preparation on Nanopore PromethION R10 flow cells on P2 Solo and P24 systems. One unaligned BAM file per sample is provided.	PromethION	207
EGAD50000000792	The dataset contains WES profiles of 487 patients from the CA209-274 clinical trial. The Allprep DNA/RNA FFPE kit was used to simultaneously purify genomic DNA and total RNA from formalin-fixed, paraffin embedded (FFPE) tissue sections. Normalized WES libraries were pooled and sequenced on Illumina NovaSeq 6000 at a plex level appropriate to the coverage of tumor: 2 × 100 bp PE 100 M reads, germline: 2 × 100 bp PE 25 M reads.	Illumina NovaSeq 6000	487
EGAD50000000793	The dataset contains RNAseq profiles of 370 patients from the CA209-274 clinical trial. The Allprep DNA/RNA FFPE kit was used to simultaneously purify genomic DNA and total RNA from formalin-fixed, paraffin embedded (FFPE) tissue sections. RNAseq libraries (75PE, 50M) were sequenced on Illumina NovaSeq 6000 at a plex-level appropriate to the coverage of 2 × 50 base pair (bp) paired end (PE) 50M reads. Fastq files are included.	Illumina NovaSeq 6000	370
EGAD50000000794	10X Genomics single-cell transcriptomics of hepatoblastoma tissues using Chromium Next GEM Single-Cell 3’ Reagent Kits v3.1. Single-cell transcriptomics data (bam files) of PT9, post-chemotherapy tumor sample, and PT13, treatment naive tumor sample. Single-cell solutions were obtained from viably frozen tissue samples using enzymatic digestion.	Illumina NovaSeq 6000	2
EGAD50000000795	10X Genomics single-cell multiome profiling of hepatoblastoma tumor organoids using Chromium Next GEM Single-Cell Multiome ATAC + Gene Expression Kit. Single-cell transcriptomics data (bam file) of multiplexed sample of hepatoblastoma tumor organoids: 3E, 8F1, 10F2, 13F2, 13E, 17E and 96F1. Single-cell solutions were obtained from fresh organoid samples using enzymatic digestion.	Illumina NovaSeq 6000	7
EGAD50000000796	10X Genomics single-cell transcriptomic profiling of hepatoblastoma tumor organoids using Chromium Next GEM Single-Cell 3’ Reagent Kits v3.1 or 3’ CellPlex Multiplexing Kit. Single-cell transcriptomics data (bam files) of hepatoblastoma tumor organoids: 3E, 8F1, 10F2, 13F2, 13F2 late, 13E, 17E, 17F1, 22E, 27F1, 28F1, 31E, 96F1, 121E and 135. Single-cell solutions were obtained from fresh organoid samples using enzymatic digestion.	Illumina NovaSeq 6000	15
EGAD50000000797	10X Genomics Visium Spatial transcriptomics analysis of hepatoblastoma and adjacent normal liver. Spatial data of normal liver and PT2 (fastq files), and PT13, PT14 and PT16 (bam files) using the Visium Spatial Gene Expression Solution. All samples, except PT13, were collected post-chemotherapy.	Illumina NovaSeq 6000	5
EGAD50000000798	This dataset contains 16 samples sequenced in pools on SMRT Cells 25M on a PacBio Revio instrument. One unaligned BAM file per sample is provided.	unspecified	16
EGAD50000000799	Paired-end RNA sequencing data from 95 brain metastasis samples with different primary origins. Sequencing was performed on the Illumina HiSeq 3000.	Illumina HiSeq 3000	95
EGAD50000000800	This dataset contains BAM files from whole-genome sequencing (WGS) of 3 aggressive B-cell lymphoma tumour samples for cases 1, 6, and 7 as well as the corresponding non-tumor whole-genome sequencing BAM files.	unspecified	6
EGAD50000000801	This dataset contains BAM files from capture-based targeted sequencing of 12 aggressive B-cell lymphoma tumour samples for cases 1 to 6 and 8 to 13. The sequencing panel used is a MCL-oriented panel containing 159 genes.	NextSeq 2000	13
EGAD50000000802	This dataset contains BAM files from capture-based targeted sequencing of 4 aggressive B-cell lymphoma tumour samples for cases 2, 3, 4 and 13. The sequencing panel used is a DLBCL-oriented panel containing 136 genes, described in Mozas P. et al., Hematol Oncol 2023.	unspecified	4
EGAD50000000803	This dataset contains BAM files from whole-exome sequencing (WES) of 5 aggressive B-cell lymphoma tumour samples for cases 2, 3, 4 and 5. For case 2, non-tumor whole-exome sequencing is also available.	unspecified	6
EGAD50000000804	Plasma samples from healthy individuals were subjected to low-coverage whole-genome sequencing (less than 10x average depth). This dataset contains raw fastq files from 18 healthy control plasma samples.	Illumina NovaSeq 6000	18
EGAD50000000805	This dataset consists of genomic WES paired-end raw data (FASTQ R1, R2 and UMI sequence) obtained from plasma and HapMap control samples. Specifically, it consists of two HapMap samples (NA12877, NA12878) and 2 plasma standards (HD780 and HD816) with VAF mutations at 0.0%, 0.1%, 1.0%, and 5.0%, in total 21 plasma samples.	Illumina NovaSeq 6000	23
EGAD50000000806	This study explores the cell-free transcriptome in a humanized DLBCL patient-derived tumor xenograft (PDTX) model. Blood plasma samples (n=171) derived from a DLBCL PDTX model including 27 humanized (HIS) PDTX, 8 HIS non-PDTX and 21 non-HIS PDTX non-obese diabetic (NOD)-scid IL2Rgnull (NSG) mice were collected during humanization, xenografting, treatment, and sacrifice. The mice were treated with either rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone (R-CHOP), CD20-targeted human IFNα2-based AcTaferon combined with CHOP (huCD20-Fc-AFN-CHOP), or phosphate-buffered saline (PBS). RNA was extracted using the miRNeasy serum/plasma kit and sequenced on the NovaSeq 6000 platform using the SMARTer Stranded Total RNA-Seq Kit v3.	Illumina NovaSeq 6000	171
EGAD50000000807	Single-cell RNA-Seq data and TCR sequencing data (both by 10X Genomics) of 51 TNBC primary tumors obtained from 29 unique patients from BELLINI clinical trial. The data includes pre- and post-treatment samples. Patients in cohort A (16 patients, 28 samples) received nivolumab for 4 weeks, patients in cohort B (13 patients, 23 samples) received nivolumab + ipilimumab for 4 weeks. The included sequencing data was generated from frozen material for cohort A and from fresh material for cohort B.	Illumina HiSeq 4000 Illumina NovaSeq 6000	116
EGAD50000000808	RNA-Seq data of 78 breast cancer primary tumors obtained from 45 unique patients from BELLINI clinical trial. The data includes pre- and post-treatment samples. Patients in cohort A (15 patients, 25 samples) received nivolumab for 4 weeks, patients in cohort B (15 patients, 28 samples) received nivolumab + ipilimumab for 4 weeks, and patients in cohort C (15 patients, 25 samples) received nivolumab + ipilimumab for 6 weeks. The included raw transcriptome sequencing data in fastq format was generated using Illumina NovaSeq 6000.	Illumina NovaSeq 6000	78
EGAD50000000809	WES data of 30 breast cancer primary tumors obtained from 30 unique patients from BELLINI clinical trial (cohorts A & B) and 30 matched blood samples. The data includes pretreatment samples. The included raw WES sequencing data in fastq format was generated using Illumina NovaSeq 6000.	Illumina NovaSeq 6000	60
EGAD50000000810	Plasma samples from patients with breast cancer (stage IV) who had confirmed BRCA mutations were subjected to low-coverage whole-genome sequencing. This dataset contains raw fastq files from 6 breast cancer plasma samples.	Illumina NovaSeq 6000	6
EGAD50000000811	This data set contains the fastq files from whole-genome sequencing of temporally matched tumour (fresh frozen biopsies), blood germline and plasma samples collected from a BRCA1-mutant breast cancer patient to directly compare mutation signature analysis using gold-standard tumour-germline paired variant calling with a novel ctDNA-based method (MisMatchFinder).	Illumina NovaSeq 6000	3
EGAD50000000812	This dataset includes FASTQ files from MARCHF6 alleles sequenced by targeted nanopore in 8 patients.	MinION	8
EGAD50000000813	The human skin acute wound healing data contain 12 scRNA-seq samples and 16 spatial transcriptomics samples. The human skin chronic wound data contain 9 scRNA-seq samples.	Illumina NovaSeq 6000	46
EGAD50000000815	This study evaluated the effectiveness of long- and short-read sequencing as diagnostic tools for CA. We recruited 110 individuals (48 females, 62 males) with a clinical diagnosis of CA. Short-read genome sequencing (SR-GS) was performed to identify pathogenic RE and also non-RE variants in 356 genes associated with CA. Long-read adaptive sequencing (LR-AS) was performed to identify pathogenic RE in 67 genes.	MinION unspecified	110
EGAD50000000816	Paired-end SMARTseq2 data of EpCAM+ sorted cells from liquid biopsies from 3 patients. Sequencing was performed on Illumina HiSeq 2500.	Illumina HiSeq 2500	3
EGAD50000000817	Paired-end WGS (x25) and WES data (x2) from metastatic breast cancer organoids, tissues and germline controls. WGS was performed on Illumina NovaSeq 6000 and HiSeq X, using Illumina TruSeq Nano DNA kit. WES was performed on Illumina NovaSeq 6000 and HiSeq 2000, using Agilent SureSelect XT HS + Human All Exon V7 and SureSelect Human All Exon V6+UTR (hg19).	HiSeq X Ten Illumina HiSeq 2000 Illumina NovaSeq 6000	27
EGAD50000000818	Paired-end RNA sequencing data from 28 metastatic breast cancer samples. Sequencing was performed on the Illumina NovaSeq 6000 and HiSeq 4000 using TruSeq Stranded mRNA Kit.	Illumina HiSeq 4000 Illumina NovaSeq 6000	28
EGAD50000000820	Read counts, per gene, aligning to exons. Alignment performed with GSNAP. 20 samples.		20
EGAD50000000822	The dataset contains bulk transcriptomics data from 23 patients with Acute Myeloid Leukemia (AML). Samples were collected from bone marrow or blood. Sequencing was performed in paired-end mode and sequencing data is provided in fastq format.	Illumina NovaSeq 6000	23
EGAD50000000823	The dataset contains single-cell RNA seq data from 23 patients with Acute Myeloid Leukemia (AML). Samples were collected from bone marrow or blood. Sequencing was performed in paired-end mode and sequencing data is provided in fastq format.	Illumina NovaSeq 6000 NextSeq 500	23
EGAD50000000824	The dataset contains single-cell DNA seq data from 21 patients with Acute Myeloid Leukemia (AML). Samples were collected from bone marrow or blood. Sequencing was performed in paired-end mode and sequencing data is provided in fastq format.	Illumina NovaSeq 6000 NextSeq 500	21
EGAD50000000825	Whole exome sequencing of two human samples run on the Illumina HiSeq2500 platform. It contains two BAM files aligned to the reference genome GRCh38.		2
EGAD50000000826	RNA-seq data from ILC samples. This dataset includes raw, unnormalized counts per gene, per sample.		12
EGAD50000000827	Sample metadata for this experiment. Includes treatment and donor derivation information.		12
EGAD50000000828	RNA-seq data from ILC samples. This dataset includes counts normalized for sequencing depth and gene length.		12
EGAD50000000829	This dataset contains all sequencing data of the publication "Single-cell DNA and Surface Protein Characterization of High Hyperdiploid Acute Lymphoblastic Leukemia at Diagnosis and During Treatment". We provide both the raw fastq-files as well as the processed data in the form of .h5 files. There are 13 patients: XJ176, XH135, XJ180, XI145, XI148, XI167, XI150, XJ175, XJ178, XI162, XF98, XG111 and XG115. Of 9 patients, also follow-up samples during/after treatment were sequenced: XJ180, XI145, XI148, XI167, XI150, XJ175, XJ178, XI162 and XG111. To identify the cell types and distinguish between normal and leukemic cells, cell surface proteins were also captured for 6 samples: XF98, XG111 (at diagnosis and during treatment), XG115, XI162 (after treatment) and XJ175 (after treatment). All samples were processed on the MissionBio platform with a custom amplicon panel, and DaB-seq for the combined targeted DNA and protein sequencing.	Illumina NovaSeq X	26
EGAD50000000831	In this study, we conducted single cell full length total RNA sequencing (VASA-seq) on 13 matched pediatric T-ALL PDX samples at initial diagnosis and relapse, along with 5 non-relapsing PDX samples collected at initial diagnosis. We identified a subpopulation of T-ALL cells with stem-like cell features that expands substantially at relapse indicating resistance to first-line therapy. Chemotherapy resistance was further validated through functional testing: Two samples were subjected to in-vitro drug testing, treated with Cytarabine for three days, followed by single-cell transcriptomic analysis using 10x Genomics. Additionally, in-vivo drug testing was conducted on one sample, involving re-engraftment into mice and treatment with a combination of Vincristine, Doxorubicin, and Dexamethasone. Single-cell transcriptomic analysis using 10x Genomics was performed after 0, 15 and 30 days of treatment.	Illumina NovaSeq 6000 NextSeq 550	77
EGAD50000000832	This dataset contains 75 Nanopore sequencing experiments using a MinION sequencer and R9 flow cells from 51 patient biopsies. Gzipped tar files containing all fast5 files per sample are provided.	MinION	75
EGAD50000000833	RNA-seq data from lumbar spinal cord blocks from 10 ALS samples and Controls. Raw fastq files are provided in this dataset.	Illumina NovaSeq 6000	20
EGAD50000000834	The 3 tumor samples are PMLBM000JFR, PMLBM000JFT, and PMLBM000JFF. All samples are from different patients: Case 9 (PMLBM000JFR), Case 10 (PMLBM000JFT), and Case 12 (PMLBM000JFF) from the article by Kemps et al (Blood 2024 - In production). Methods: Total RNA was isolated and the generated libraries were sequenced on a NovaSeq 6000 (as described in Hehir-Kwa, et al. JCO Precis Oncol 2022). Data files: Provided are .cram and .crai files.	Illumina NovaSeq 6000	3
EGAD50000000835	BAM generated using Cell Ranger v3.1.0 and Space Ranger v1.3.1 and anonymized using BAMBoozle	NextSeq 550	41
EGAD50000000836	CellRanger counts processed with the Cell Ranger software v3.1.0.		41
EGAD50000000837	Short variants identified in 125 TOF probands enrolled through the CONCOR Biobank (Netherlands).	HiSeq X Ten	1
EGAD50000000838	Short variants identified in 731 CHD probands enrolled through the Heart Centre Biobank Registry (Canada).	HiSeq X Ten	1
EGAD50000000839	Short variants identified in 245 TOF probands enrolled through the Kids Heart BioBank (Australia).	HiSeq X Ten	1
EGAD50000000840	A total of 61 patients (33 mild and 28 severe) with a median age of 51 years who recovered from COVID-19 were included in this study (Table 1). All collected samples were confirmed in diagnosis of COVID-19 by RT-PCR and positive IgG serology. For the 61 patients, TCR-seq, CyTOF, and genotyping data were generated and analyzed in separate studies, as part of larger cohorts. In total, there are 143 variables, with 129 being quantitative (TCRseq and CyTOF metrics) and 13 being qualitative (symptomatology and genotyping). For more information about cohort characteristics, data acquisition and pre-processing, please visit PMID: 39227859, 36969980 and 37287057.	NextSeq 500	61
EGAD50000000841	This data contains scRNA-seq on 14 PBMC samples from colorectal cancer patients.	Illumina NovaSeq 6000 NextSeq 500	14
EGAD50000000842	Twenty-seven patients were included and subjected to germline WES using NovaSeq 6000 platform. After filtering out variants for sequencing quality, variant allele fraction frequency, and population frequency, variants were manually prioritized by the ACMG criteria.	Illumina NovaSeq 6000	27
EGAD50000000843	RNAseq pilot data of PDiamond Korean lung cancer data	HiSeq X Ten	35
EGAD50000000844	WES and RNA-seq data of PDiamond Korean lung cancer data	HiSeq X Ten	463
EGAD50000000845	15 scRNA-seq samples were multiplexed, and 5' 10X Chromium scRNA-seq was performed. This dataset provides the FASTQ files for the multiplexed data, as well as the BAM files for the demultiplexed samples.	Illumina NovaSeq 6000	23
EGAD50000000846	We characterized leiomyosarcoma by targeted gene panel sequencing (MSK-IMPACT) and transcriptome profiling to discover biological mechanisms under metastatic progression.	Illumina HiSeq 2500 Illumina NovaSeq 6000 Illumina NovaSeq X	307
EGAD50000000847	The dataset includes sequencing data from five inherited retinal dystrophies (IRD) patients using three different platforms: clinical exome sequencing (CES), long-read sequencing of the RPE65 locus, and whole genome sequencing.	Illumina NovaSeq X MinION NextSeq 2000	5
EGAD50000000849	We obtained bulk RNAseq data of CRC-PDX tumors after administration of drug or vehicle to clarify the mechanisms of action.	NextSeq 550	15
EGAD50000000850	We obtained bulk RNAseq data of CRC-PDX tumors and performed molecular subtype classification.	NextSeq 550	10
EGAD50000000851	The dataset contains bulk transcriptomics data from 80 samples from 77 different patients with melanoma. Samples were collected from the tumor. Sequencing was performed in paired-end mode and sequencing data is provided in fastq format.	Illumina NovaSeq 6000	80
EGAD50000000852	The dataset contains 10x Chromium single-cell DNA sequencing data from 63 samples from 60 different patients with melanoma. Samples were collected from the tumor. Sequencing was performed in paired-end mode and sequencing data is provided in fastq format.	Illumina NovaSeq 6000 NextSeq 500	63
EGAD50000000853	The dataset contains 10x Chromium single-cell transcriptomics data from 65 samples from 62 different patients with melanoma. Samples were collected from the tumor. Sequencing was performed in paired-end mode and sequencing data is provided in fastq format.	Illumina NovaSeq 6000 NextSeq 550	65
EGAD50000000854	Specific in vitro stimulation of patient-derived PBMC with the neoantigens SYTL4-S363F and KIF2C-P13L was followed by enriching for CD137+ activated T cells. Restimulated T cells as well as an unstimulated patient PBMC sample from the same time point were used for scRNA-Seq/scTCR-Seq.	Illumina NovaSeq 6000	4
EGAD50000000855	The 22q11.2 deletion syndrome (22q11.2DS) is the most common microdeletion disorder. Why the incidence of 22q11.2DS is much greater than that of other genomic disorders remains unknown. Short read sequencing cannot resolve the complex segmental duplications (SDs) to provide direct confirmation of the hypothesis that the rearrangements are caused by non-allelic homologous recombination between the low copy repeats on chromosome 22 (LCR22s). To enable haplotype-specific assembly and rearrangement mapping in LCR22 clusters, we used whole genome (ultra-)long read sequencing of 9 duos (patients and parent of origin).	PromethION	18
EGAD50000000856	A dataset of samples analyzed for the publication "Reconstructing oral cavity tumor evolution through brush biopsy", Springer Nature, DOI: 10.1038/s41598-024-72946-3	Illumina HiSeq 2000	7
EGAD50000000860	This dataset contains WGS data for 27 tumor, control and patient derived culture samples of patients with breast cancer. Sequencing was performed on Illumina NovaSeq 6000 and HiSeq X using TruSeq Nano DNA Kit. All sequencing was paired-end.	Illumina HiSeq X Illumina NovaSeq 6000	27
EGAD50000000861	A double primary colorectal cancer (CRC) in a familial setting signals a high risk of CRC. In order to find novel high/moderate penetrance CRC susceptibility genes, we performed whole-exome sequencing on germline blood samples of seven familial cases from Poland with a double primary CRC.	unspecified	7
EGAD50000000862	Gene expression profiles of single cells from 26 tumor and ascites samples from 17 patients	Illumina NovaSeq 6000	26
EGAD50000000866	In this study, we describe the role of whole genome sequencing (WGS) in providing a definitive diagnosis for a child with T cell deficiency, where targeted panel sequencing of SCID genes and whole exome sequencing had failed. A novel homozygous 8kb deletion in PTCRA, encoding pTCR, was identified. WGS sequence data from the proband, proband’s mother, and proband’s father are deposited.	Illumina NovaSeq 6000	3
EGAD50000000869	Pancreatic cancer (PC) is a leading cause of cancer-related deaths globally. Accurate PC detection at early or premalignant stage, when surgery is effective, would increase survival rates and prevent unnecessary surgery or surveillance. Molecular diagnostic attempts using PCy fluid (PCyF), as a liquid biopsy, to analyze common mutations associated with PC development are yet to contribute to early diagnosis. Whole genome sequencing (WGS) is utilized in various cancers, PC inclusive, for detecting genetic changes associated with carcinogenesis, but not in PCyF due to technical limitations, including isolated DNA purity. The goal of the study was to achieve high quality of WGS from PCyF from 9 patients to enable development of signatures of malignancy. The WGS was performed using Illumina sequencing. Experimental design was at a 50x depth.	Illumina NovaSeq 6000	9
EGAD50000000871	Raw data FASTQ files of RNAseq (Stranded mRNA prep Ligation-Illumina on NovaSeq 6000, 100bp paired-end) & WGS (PCR Free Roche KAPA Hyper Prep Library on Illumina Novaseq 6000, 150bp paired-end) for 40 samples corresponding to the matched tumor/normal of 10 patients. Raw data FASTQ files of WGS (PCR Free Roche KAPA Hyper Prep Library on Illumina Novaseq 6000, 150bp paired-end) of the tumor of two additional patients.	Illumina HiSeq 2000 Illumina NovaSeq 6000	22
EGAD50000000872	RNA sequencing of a Chromophobe renal cell carcinoma, Non-clear cell chromophobe renal cell carcinoma, Neuroendocrine PDAC and clear cell sarcoma	Illumina NovaSeq 6000	4
EGAD50000000874	snRNA-seq performed on patient tumours (n = 6 patients, one biopsy sample each sequenced) using the 10x technology. Single nuclei were acquired from six frozen mCRPC biopsies (4 lymph node, 2 liver metastases) obtained from consenting patients treated at the Royal Marsden Hospital between 2018 and 2023 under an institutional review board approved research protocol (Research Ethics Committee approval number: 04/Q0801/60). All six patients had previously received androgen-receptor signalling inhibitor(s) and five of six patients had previously received taxane(s).	Illumina NovaSeq 6000	6
EGAD50000000875	RNA-Seq raw data of PDX treated with acetalax or bisacodyl for 4h00 to 24h00.	Illumina NovaSeq 6000	24
EGAD50000000876	RNA-Seq raw data of untreated breast cancer xenografts known as good or bad acetalax responders.	Illumina NovaSeq 6000	17
EGAD50000000878	The dataset contains RNA sequencing data from 18 paired samples of 9 CLL patients treated with idelalisib in vivo (median time on therapy 4 weeks; range 2-5 weeks).	NextSeq 500	18
EGAD50000000879	The dataset contains RNA sequencing data from 22 paired samples of 11 patients before and during ibrutinib therapy (median time on therapy 2 weeks; range 1-12 weeks). The sequencing was performed in two batches.	Illumina HiSeq 1000 NextSeq 500	22
EGAD50000000880	This dataset contains bam files from the whole exome sequencing performed on granulosa cell tumors and 7 sequencing experiments performed on human-derived KGN granulosa cell tumor cells and isogenic cell lines. The latter include transcription factor ChIP-seq for Foxl2 as well as transcription factor ChIP-seq for Foxl2 and GR under dexamethasone treatment and untreated conditions, two ATAC-seq data sets, two RNA-seq data sets, and Histone marks ChIP-seq.	Illumina NovaSeq 6000 NextSeq 500	133
EGAD50000000881	The dataset contains raw sequencing data from miRNA-seq, RNA-seq and single-nuclei RNA-seq experiments to investigate transcriptomic changes in cocaine use disorder.	Illumina NovaSeq 6000	40
EGAD50000000882	Dataset containing tumor, normal and blood sample data, for various tissue and tumor types. Data is targeted methylation data, using smMIP probes to target informative CpG sites in the genome. Sample type can be inferred from the name, with 'W' being normal samples and 'TC' tumor samples. Dataset contains 259 samples in duplicate, with one half of the runs being cut by MSREs, and the other not. For every sample, paired fastq files are present.	NextSeq 550	518
EGAD50000000883	Single cell whole genome sequencing was done using DLP+ (https://doi.org/10.1016/j.cell.2019.10.026) on luminal breast epithelial cells from wildtype and BRCA mutation carriers. Each BAM file is a merged BAM from single cells from a single sample, and the cells can be uniquely identified by their @RG group. Only high quality cells were kept, thus there are large differences in BAM sizes. The same sample/patient can be found across multiple BAM files to accumulate more cells, sometimes the same patient will have multiple FACS sorted cell types also split by BAMs.	unspecified	104
EGAD50000000887	This research project was a collaboration between Cardiff University, UK and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of Bipolar case/control samples from collaborators in UK. Genomic DNA from each sample was sequenced to a mean depth of 20x. The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing CRAM files.	HiSeq X Ten	2947
EGAD50000000888	The dataset from Infinium Human MethylationEPIC v1.0 (IDAT files) includes 91 primary samples from patients with B-cell acute lymphoblastic leukemia (B-ALL) (n=34) and T-cell acute lymphoblastic leukemia (T-ALL) (n=57). This dataset was used in the article "A Comprehensive DNA Methylation Landscape of Human and Mouse Cell Lines Derived from Hematological Malignancies" to validate differentially methylated CpG sites, which were utilized to classify hematologic malignancies.	unspecified	91
EGAD50000000889	A total of 10 pre-menopausal patients were recruited for this study. Sequencing of both the fimbriae and ampullary region of the fallopian tubes was performed using 10x Genomics single-cell RNA-seq and ATAC-seq. For one patient, data were sequenced across 4 lanes, while the remaining patients' data were sequenced in 2 lanes each, resulting in a total of 44 sequencing datasets for each platform.	Illumina NovaSeq X	88
EGAD50000000891	We developed a high volume (100 mL) urine DNA collection kit and laboratory platform (UroScout) analyzing 25 commonly mutated UC genes and 8 copy number-altered loci for urine tumor DNA (utDNA) alterations. To assess accuracy of UC detection, we analyzed 498 urine samples from diagnostic and surveillance timepoints from 193 UC patients and 88 cancer-negative patients evaluated for UC.	Illumina NovaSeq 6000	488
EGAD50000000892	This is an RNA sequencing dataset of 101 mucosal melanoma tumors. The original corresponding study is titled "Diversity of the immune microenvironment and response to checkpoint inhibitor immunotherapy in mucosal melanoma" by J.L. Vos et al., published in JCI Insight.	Illumina HiSeq 2500	101
EGAD50000000893	DNA was extracted from whole blood, and sequenced on a PromethION flow cell. The target region covered chr15:22874354-55370932 (GRCh38). The resulting data was aligned to GRCh38 using minimap2.	PromethION	1
EGAD50000000894	In this study we established a comprehensive workflow to collect multi-omics single-cell data using a commercially available micro-well based platform. This included whole transcriptome, cell surface markers (targeted sequencing-based cell surface proteomics), T cell specificities, adaptive immune receptor repertoire (AIRR) profiles and sample multiplexing. With this technique we identified novel paired T cell receptor sequences for three prominent human CMV epitopes. In addition, we review the ability of dCODE dextramers to detect antigen-specific T cells at low frequencies by estimating sensitivities and specificities when used as reagents for single-cell multi-omics.	Illumina NovaSeq 6000	12
EGAD50000000897	DAC-2020-03-26-Lemola (DAC-039)), raw data in EGA, metadata in Harvard Dataverse		1
EGAD50000000898	HER2-positive gastric cancer (HER2+ GC) exhibits significant intra-tumoral heterogeneity and frequent development of resistance to HER2-targeted therapies. This study aimed to characterize the spatial tumor microenvironment (TME) and identify mechanisms of resistance to HER2 blockade including trastuzumab and trastuzumab deruxtecan (T-DXd) in HER2+ GC, with the goal of informing novel therapeutic strategies. We performed spatial transcriptomics on pre- and post-treatment samples from patients with HER2+ metastatic GC who received trastuzumab-based therapy.	unspecified	623
EGAD50000000900	We show dysregulated microRNA and tRNA fragment profiles related to FMS pathophysiology and opening new perspectives for FMS diagnostics and symptom monitoring. This dataset includes all sequencing raw files from whole blood (tempus tube) and keratinocyte sequencing.	NextSeq 500	94
EGAD50000000903	This Tissue Microarray (TMA) GeoMx Digital Spatial Profiler (DSP) dataset was a part of the validation cohort in the study. Tumor tissue blocks were first annotated as tumor core or tumor edge by pathologists, then Regions of Interest (ROIs) were selected on various blocks. The data is in FASTQ format. GeoMx DSP data from 840 ROIs (after QC) from 86 GCs were included in the datasets.	Illumina NovaSeq 6000	848
EGAD50000000904	Whole exome sequencing of high grade gliomas occurring in teenagers and young adults between the ages of 13 and 30. This study combines methylation array profiling, whole exome sequencing and fusion panel sequencing to provide an integrated molecular description of brain tumours in this age group. Dataset contains 96 BAM files aligned with BWA to GRCh37 from 88 tumours. Tumour DNA was isolated from FFPE material in most cases. 8 germline sequences from peripheral blood DNA are also included.	NextSeq 2000	96
EGAD50000000905	Genotyping arrays for 183 samples from patients with a neuroendocrine tumor of the small intestine. All samples have been hybridized on Illumina GSA-MD v3 arrays with standard automated protocols. Reading of the chips was performed on Illumina iScan+ scanners and this file is the result of primary analysis done using Illumina GenomeStudio software. It contains data for 147 tumors, 2 adenomas, 1 lymph node, 19 mesenteric nodules, 6 liver metastases and 8 normal ileum samples.	unspecified	183
EGAD50000000906	This dataset contains 206 RNASeq samples from a cohort of patients with neuroendocrine tumors of the small intestine. Libraries were prepared using an Illumina TruSeq Stranded mRNA kit following the manufacturer’s recommendations. Sequencing was performed on an Illumina NovaSeq 6000 machine in 2x75bp paired-end fashion. This dataset contains data for 169 tumors, 22 mesenteric nodules, 8 liver metastases, 2 adenomas, 1 lymph node and 4 normal ileum samples.	Illumina NovaSeq 6000	206
EGAD50000000907	This dataset contains whole genome sequences for 13 patients with neuroendocrine (unique/multiple) tumor(s) of the small intestine. There are in total 25 tumor and 12 matched normal genomes. Libraries were prepared using an Illumina DNA PCR Free kit following the manufacturer’s recommendations and sequencing was performed on an Illumina NovaSeq 6000 machine in 2x151bp paired-end fashion. It contains data for: 1 patient with 4 tumors and 1 matched normal; 1 patient with 3 tumors and 1 matched normal; 1 patient with 2 tumors, 1 liver metastasis and 1 matched normal; 2 patients with 1 tumor, 1 mesenteric nodule and 1 matched normal; 7 patients with 1 tumor and 1 matched normal; 1 patient with 3 tumors and 1 mesenteric nodule.	Illumina NovaSeq 6000	37
EGAD50000000908	We analyzed cfDNA plasma samples of subjects with non-small cell lung cancer: treatment naïve (n=24) and prior-treated (n=8). Driver variants in 11 genes, available with the Oncomine Lung cfDNA Assay, were assessed by next-generation sequencing. Sequencing data were pre-processed in Ion S5 Torrent Server and aligned to the hg19 human reference genome. Variant calls were performed by IonReporter 5.10 software.	Ion Torrent S5	32
EGAD50000000909	This dataset contains the results of sequencing 9 samples (7 human/PDX, 2 cell lines) of medulloblastoma tumours (or 2 Neural Stem Cell lines). The data was collected to characterize the intratumour heterogeneity of chromothripsis in this tumour type. The 7 tumour samples underwent single cell DNA and RNA sequencing using the 10x CNVkit and 10xRNA protocols, and 1 of the tumour samples as well as the Neural Stem Cell lines underwent StrandSeq sequencing. All experiments were sequenced on Illumina sequencers. The provided files are paired-end fastq files from the sequencing experiments.	Illumina NovaSeq 6000 NextSeq 550	9
EGAD50000000910	This dataset contains 45 WES sequencing samples of patients with desmoplastic small round cell tumor. Sequencing was performed on Illumina HiSeq 2500, Illumina HiSeq 4000 and NovaSeq 6000 using Agilent SureSelect Human All Exon V5 Kit. The sequencing was always paired.	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000	45
EGAD50000000911	This dataset contains 25 WGS sequencing samples of patients with desmoplastic small round cell tumor. Sequencing was performed on Illumina HiSeq X and NovaSeq 6000 using Illumina TruSeq Nano DNA Kit. The sequencing was always paired.	Illumina HiSeq X Illumina NovaSeq 6000	25
EGAD50000000912	This dataset contains 32 RNA sequencing samples of patients with desmoplastic small round cell tumor. Sequencing was performed on Illumina HiSeq 2500, Illumina HiSeq 4000, Illumina HiSeq X and NovaSeq 6000 using Illumina TruSeq RNA and TruSeq Stranded RNA Kit. The sequencing was always paired.	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina HiSeq X Illumina NovaSeq 6000	32
EGAD50000000913	Arcagen is an EORTC/SPECTA pan-European project that aims to recruit 1000 rare cancer patients from different tumour domains of EURACAN. This study collected samples from advanced or metastatic rare cancer from patients older than 12, and analysed them using Foundation Medicine next-generation sequencing (NGS) panels (FoundationOne CDx for FFPE samples or FoundationOne Liquid CDx for blood samples). Here we are submitting the dataset that contains NGS files in .BAM format from 85 patients with extra-pulmonary neuroendocrine tumour or cancer grade 3 (NET / NEC G3).	Illumina HiSeq 4000	85
EGAD50000000917	Whole genome sequencing raw data for fragile X associated unmethylated expansion carrier 1. DNA was sequenced using the Illumina NovaSeq 6000 system. 8x paired end FASTQ files from one DNA sample (UFM 1), 4x R1 files and 4x R2 files.	Illumina NovaSeq 6000	2
EGAD50000000918	Total RNA sequencing of fibroblasts from an individual with a fragile X syndrome related unmethylated full mutation (UFM1). Libraries were prepared with total RNA with Ribo-Zero ribosomal RNA depletion (Illumina), and sequenced paired-end, 75bp on a NextSeq 550 system. This dataset contains three RNA-sequencing replicates from the same cell line.	NextSeq 550	1
EGAD50000000919	Total RNA sequencing of fibroblasts from an individual with a fragile X syndrome related unmethylated full mutation (UFM2). Libraries were prepared with total RNA with Ribo-Zero ribosomal RNA depletion (Illumina), and sequenced paired-end, 75bp on a NextSeq 550 system. This dataset contains three RNA-sequencing replicates from the same cell line.	NextSeq 550	1
EGAD50000000920	Total RNA sequencing of fibroblasts from an individual with a methylated CGG repeat >200x in the FMR1 5'UTR. Libraries were prepared with total RNA with Ribo-Zero ribosomal RNA depletion (Illumina), and sequenced paired-end, 75bp on a NextSeq 550 system. This dataset contains three RNA-sequencing replicates from the same cell line.	NextSeq 550	1
EGAD50000000921	Whole genome sequencing raw data for fragile X associated unmethylated expansion carrier 2. DNA was sequenced using the Illumina NovaSeq 6000 system. 8x paired end FASTQ files from one DNA sample (UFM 2), 4x R1 files and 4x R2 files.	Illumina NovaSeq 6000	2
EGAD50000000922	Total RNA sequencing of fibroblasts from an individual with 30-55x CGG repeats in the FMR1 5'UTR. Libraries were prepared with total RNA with Ribo-Zero ribosomal RNA depletion (Illumina), and sequenced paired-end, 75bp on a NextSeq 550 system. This dataset contains three RNA-sequencing replicates from the same cell line.	NextSeq 550	1
EGAD50000000923	Paired fastq files of exome sequencing that belong to 5q myelodysplastic syndrome patients are shared in this submission. Illumina technology was used to obtain such data.	Illumina NovaSeq 6000	5
EGAD50000000924	Whole exome sequencing from FFPE samples. DNA was extracted from FFPE tissue samples from colorectal cancer biopsies. Exome libraries were prepared using the KAPA Hyper Plus Kit (Roche Diagnostics) followed by target enrichment by hybridisation capture using the Roche Hyper Exome Kit. Sequenced on NovaSeq 6000 to generate 2x151 bp paired-end reads.	Illumina NovaSeq 6000	13
EGAD50000000925	We performed single and multi-region nanopore whole-genome sequencing on human osteosarcoma samples from adult and paediatric patients.	PromethION	52
EGAD50000000926	Metastatic colorectal cancer (mCRC) is the main cause of CRC mortality, with limited treatment options. Although immunotherapy has benefited some cancer patients, mCRC typically lacks the molecular features that respond to this treatment. However, recent studies indicate that the immune microenvironment of mCRC may be modified to enhance the effect of immune checkpoint inhibitors. This dataset was used to explore the metastatic immune microenvironment by comparing immune cell populations in colorectal liver (CLM), lung (mLu) and peritoneal (PM) metastases.	Illumina NovaSeq 6000	40
EGAD50000000927	This dataset includes single-nuclei RNA sequencing (snRNA-seq) and spatial transcriptomics data from biopsies of the right atrial appendage and pericardial fluid, collected during open-heart surgeries. The samples represent various clinical contexts, including control, ischemic heart disease, heart failure, and myocardial infarction. For each biopsy, nuclei were isolated, and RNA sequencing libraries were prepared using the 10x Genomics Chromium v3 platform. Additionally, spatial transcriptomics was performed on the heart samples. The dataset provides raw sequencing data, which includes fastq files generated from these libraries, offering valuable insights into the cellular composition of heart tissues and associated fluids across different disease states.	Illumina NovaSeq 6000 NextSeq 500	81
EGAD50000000928	SNP-based demultiplexing of single-cell RNA-seq data	Illumina NovaSeq 6000	10
EGAD50000000929	To explore T cell-intrinsic driving forces of CRISPR-induced aneuploidy at the TRAC gene locus, we performed scKaryo-seq (Bolhaqueiro et al., Nature Genetics 2019). We compared the impact of the small molecule pifithrin-alpha in activated TRAC KO T cells. In a separate experiment, we determined the effect of pifithrin-alpha on non-activated T cells edited at the TRAC locus or treated with non-targeting Cas9 RNPs.	NextSeq 2000	14
EGAD50000000931	Tissue site for RNA sequencing data. Tissue site is associated with the clinical biomarker data and can be linked to the biomarker data using the SAMPLE and PAT identifiers.	Illumina NovaSeq 6000	271
EGAD50000000932	T cell signatures from tumor RNAseq in CITYSCAPE with overall survival data. T cell signatures can be associated with the other clinical CITYSCAPE data.		293
EGAD50000000933	This file set has 448 Greenlandic individuals sequenced using Illumina 150 bp paired-end sequencing and has an average sequencing depth of 35X. The data is unfiltered, in VCF format, and covers 19,751,308 variants.		448
EGAD50000000934	This file set has 1478 Greenlandic individuals scored on the Illumina MEGA array (1,748,250 sites). The data is in PLINK bed/bim/fam format. The individuals originate from the B2018 population survey.		1478
EGAD50000000935	72 fastq sequencing files of 72 brain-organoid paired-end sequenced samples (R1/R2). RNA sequencing libraries were prepared using the NEBNext Ultra II RNA Library Prep Kit for Illumina following the manufacturer’s instructions (New England Biolabs, USA). mRNAs were first enriched with Oligo(dT) beads. The samples were sequenced using a 2x150 paired-end configuration (ver. 1.5) on an Illumina NovaSeq 6000. The raw sequence reads were aligned to the human reference genome GRCh38 using the STAR aligner (Dobin et al., 2013). Gene expression levels were quantified using the FeatureCounts tool (Liao et al., 2014) (file: "All_Counts_final.xlsx"). For more information, see "Study Description".	Illumina NovaSeq 6000	72
EGAD50000000936	Atherosclerosis is a pervasive contributor to ischemic heart disease and stroke. Despite the advance of lipid lowering-therapies and antihypertensive agents, the residual risk of an atherosclerotic event remains high and developing therapeutic strategies has proven challenging. This is due to the complexity of atherosclerosis with a spatial interplay of multiple cell types within the vascular wall. Here we generate an integrative high-resolution map of human atherosclerotic plaques combining single-cell RNA-seq from multiple studies and spatial transcriptomics data from 12 human specimens, with different stages of atherosclerosis. We show cell-type and atherosclerosis-specific expression changes and spatially constrained alterations in cell-cell communication. We highlight the possible recruitment of lymphocytes via ACKR1 endothelial cells of the vasa vasorum, the migration of vascular smooth muscle cells towards the lumen by transforming into fibromyocytes, and cell-cell communication in the plaque region, indicating an intricate cellular interplay within the adventitia and the subendothelial space in human atherosclerosis.	Illumina NovaSeq 6000	12
EGAD50000000937	PBAT sequencing of primary first trimester cytotrophoblast and mural trophectoderm.	Illumina HiSeq 1000	5
EGAD50000000938	This dataset contains single-cell RNA sequencing and time-course bulk RNA sequencing data of brain organoids grown from multiple cell lines using four different protocols recapitulating dorsal and ventral forebrain, midbrain, and striatum.	Illumina NovaSeq 6000 NextSeq 500	1221
EGAD50000000941	Dataset with eQTL and e2QTL metadata. The analyses were performed using CD8+ T cells from 461 healthy European participants stimulated with high doses of 5 different carcinogens to induce early apoptosis. These include Methyl-methanesulfonate (MMS), tert-butyl-hydroperoxide (TBOOH), benzo(a)pyrene-7,8-diol-9,10-epoxide (BPDE), 4-hydroxycyclophosphamide (HC) and UVC radiation.		2728
EGAD50000000942	Controlled human infection experiments enable longitudinal profiling of immune responses to a pathogen. 36 healthy volunteers aged 18-29 years, with no evidence of previous infection or vaccination, were inoculated with SARS-CoV-2 virus and quarantined for 14 days. Blood samples (n=374) for RNA sequencing were collected into PAXgene tubes before virus challenge, 6 hours after challenge, daily thereafter for 14 days and on day 28. Mid-turbinate nose swabs (n=96) for RNA sequencing were collected before virus challenge, and on days 1, 3, 5, 7, 10 and 14 after challenge, preserved in RNAprotect. 18 of 36 participants developed a replicative SARS-CoV-2 infection as evidenced by consecutive PCR-positive swabs for the virus. For every participant, blood RNA from selected days was extracted and depleted for genomic DNA and globin mRNA, before cDNA libraries were constructed using KAPA RNA HyperPrep with RiboErase kits. Libraries were sequenced on the Illumina NovaSeq 6000 platform using NovaSeq 6000 S4 Reagent Kits (200 cycles). Nose swab RNA samples were extracted and depleted for genomic DNA before cDNA libraries were constructed using KAPA mRNA HyperPrep Kits. Libraries were sequenced on the Illumina NextSeq platform using the NextSeq 500/550 High Output Kit (75 cycles).	Illumina NovaSeq 6000 NextSeq 550	470
EGAD50000000943	This dataset provides insight into the immune response in extrapulmonary tuberculosis (EPTB), using bulk RNA-seq data to analyze different disease severities. Hierarchical clustering identified three distinct levels of severity, revealing disease progression driven by interferon and IL-1β-mediated signaling. Additionally, the dataset helped develop a diagnostic gene expression signature for both EPTB and pulmonary TB.	Illumina NovaSeq 6000	50
EGAD50000000944	For this study, we compared the transcriptome of samples exposed to the influence of the FaDu cells (N=6) vs in the control condition (N=2). Muscle fragments were collected intraoperatively from patients with HNSCCs (N=9). The collection of these samples was made possible as part of an ancillary study linked to the Magnolia clinical trial (NCT 04842162). These fragments most often came from the sternocleidomastoid muscle.	Illumina NovaSeq 6000	20
EGAD50000000946	Mitochondrial DNA sequencing of n=157 single muscle fibers in Parkinson's disease patients and controls, n=27 bulk tissue samples and n=2 synthetic controls	Illumina HiSeq 2500	186
EGAD50000000947	This data set contains 2 paired fastq files (WGS).	Illumina NovaSeq 6000	1
EGAD50000000948	This data set contains 18 paired fastq files (RNA-seq).	Illumina NovaSeq 6000	9
EGAD50000000949	The dataset contains shallow whole genome sequencing data of 128 high-grade serous carcinoma (HGSC) patients sequenced with NovaSeq 6000. The 128 samples are plasma samples that have been collected before treatment. The files provided are paired fastq files.	Illumina NovaSeq 6000	128
EGAD50000000950	This research project was a collaboration between University College London, UK and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 4,627 Bipolar case/control samples from collaborators in UK. Genomic DNA from each sample was sequenced to a mean depth of 20x. The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing CRAM files.	HiSeq X Ten	4627
EGAD50000000953	This dataset includes WES data for 116 runs, corresponding to 58 pairs of normal(PBMC)/tumor samples from 58 patients (116 BAM files).	Illumina HiSeq 2500	116
EGAD50000000954	This dataset consists of RNA-seq data from 104 CRLM patients. Reads were mapped to the GRCh37 genome version using STAR. The data include patients with no treatment, as well as those treated with FOLFOX, FOLFOX-bevacizumab, atezolizumab, and FOLFOX-bevacizumab-atezolizumab.	Illumina HiSeq 2500	104
EGAD50000000955	Synthetic - This dataset contains the pheno-clinical and genomic information of 42046 individuals from COVID Population 11 Finland, Subgroup 2. 2010 are affected by Phenotype 1, 2010 are affected by Phenotype 2, 188 are affected by Phenotype 1 and 2, and 37838 are control. The dataset also contains the information about the smoking habits of each individual.		42046
EGAD50000000956	Understanding of the transcriptomic profile of individuals in early influenza infection is limited. To investigate this, longitudinal whole-blood samples (n=178) were taken from adult participants following controlled inoculation with Influenza A H3N2 virus (sampling at baseline (0) and days 1, 2, 3, 7, 10 and 14 post-challenge). Most participants became influenza PCR-positive; a minority remained PCR-negative. Total RNA was extracted from PAXgene tubes before undergoing globin and rRNA depletion. DNA libraries were constructed using the NEBNext® Ultra™ II Directional RNA Library Prep Kit for Illumina. All samples were then sequenced using Illumina NovaSeq 6000.	Illumina NovaSeq 6000	178
EGAD50000000957	The dataset includes RNA-sequencing results from four patient-derived glioblastoma cell lines treated with DMSO (vehicle control) or the splicing modulating-drug indisulam (at 6 and 16 hours), resulting in 12 RNA-seq samples. Paired-end sequencing produced 24 FASTQ.gz files. The RNA libraries were prepared using PolyA selection and ERCC spike-ins and sequenced on a NovaSeq platform. This dataset supports the analysis of drug-induced transcriptomic changes and the prediction of subsequent non-canonical tumor-specific antigens (ncTSAs) using the NovumRNA pipeline.	Illumina NovaSeq X	12
EGAD50000000958	RNA sequencing data from sequential biopsies obtained at baseline, week 3 (post-immuno, before SBRT), and week 7 (after SBRT)	Illumina NovaSeq 6000	85
EGAD50000000959	Transcriptome profiling of Human dorsal root ganglia after plexus injury based on low Input Total RNA-seq from FFPE material	NextSeq 2000	15
EGAD50000000960	This dataset includes 14 RNASeq samples from 8 patients. For some patients, organoids at different passages have been profiled. They can be identified by their alias, composed of the following structure: <subjectId>_O<organoid_number>_<passage>_RNA. For example, ICSBCS002_O1_1_RNA indicates the earlier passage of organoid 1 for subject ICSBCS002, and ICSBCS002_O1_2_RNA indicates the later passage. Additionally, WCM2137_O1_RNA and WCM2137_O2_RNA indicate two organoids (O1 and O2) derived from different tissues. More information about the passages and tissues can be found in the sample information table.	Illumina HiSeq 4000	14
EGAD50000000961	This dataset includes 18 Whole Exome Sequencing (WES) samples from 5 subjects. WES is performed on a matched pair of case/control samples, e.g., tumor/control or organoid/control. For each patient, the same control sample is used for the analysis of tumor and organoid samples. The sample name structure identifies the type of sample: <subjectId>_[TON][12]_[Case|Ctrl]_EX2; where T, O, and N refer to Tumor tissue, Organoid, and control sample, respectively. The number 1 or 2 refers to the specific tissue, and Case and Ctrl indicate a tumor tissue (or organoid) and a control sample, respectively. For example, ICSBCS007_T1_Case_EX2 is the first tumor tissue of subject ICSBCS007; ICSBCS007_O1_Case_EX2 is the tumor organoid derived from that tumor sample, and ICSBCS007_N1_Ctrl_EX2 is the matching control sample. More information can be found in the sample information table.	Illumina NovaSeq 6000	18
EGAD50000000962	This dataset comprises RNA-sequencing data from ten colorectal cancer (CRC) organoids, derived from histologically verified primary tumor tissues of patients undergoing surgical resection. Paired-end sequencing produced 20 FASTQ.gz files. The organoids were cultured in Geltrex-based droplets using PDO culture medium and harvested for RNA extraction. Total RNA was isolated and subjected to full-length mRNA sequencing, producing high-quality RNA-seq data. The dataset includes transcriptomic profiles of seven microsatellite stable (MSS) and three microsatellite instable (MSI) tumors, providing a valuable resource for the analysis of non-canonical tumor-specific antigens (ncTSAs) using the NovumRNA pipeline. This dataset aids in uncovering potential immunotherapy targets in CRC, particularly in hard-to-treat low-TMB MSS cases.	Illumina NovaSeq X	10
EGAD50000000964	This dataset contains 10X Multiome data generated on post-mortem brain from 92 individuals (70 PD, 22 controls). For the Roche_PD dataset, nuclei were isolated using Nuclei Pure Prep Nuclei Isolation Kit (Sigma Aldrich) with the following modifications. The tissue was lysed in Nuclei Pure Lysis Solution with 0.1% Triton X, 1mM DTT and 0.4U/ul SUPERase-In™ RNase Inhibitor (ThermoFisher Scientific) freshly added before use and homogenized with the help first of a 23G and then of a 29G syringe. Cold 1.8M Sucrose Cushion Solution, prepared immediately before use with the addition of 1mM DTT and 0.4U/ul RNase Inhibitor, was added to the suspensions before they were filtered through a 30μm strainer. The lysates were then carefully and slowly layered on top of 1.8M Sucrose Cushion Solution previously added in new Eppendorf tubes. Samples were centrifuged for 45 minutes at 16000xg at 4°C. Pellets were re-suspended in Nuclei Storage Buffer with RNase Inhibitor, transferred in new Eppendorf tubes and centrifuged twice for 5 minutes at 500xg at 4°C. Finally purified nuclei were re-suspended in Nuclei Storage Buffer with RNase Inhibitor, stained with trypan blue and counted using Countess II (Life Technologies). After count, nuclei permeabilization was carried out following the demonstrated protocol for single cell multiome ATAC + Gene Expression sequencing from 10x Genomics. A total of 12,000 estimated nuclei from each sample was used for the transposition step and then loaded on the Chromium Next GEM Single Cell Chip J. ATAC library and gene expression library construction was performed using the Chromium Next GEM Single Cell Multiome ATAC + Gene Expression kit according to the manufacturer’s instructions. Libraries were sequenced using Illumina NovaSeq 6000 System and NovaSeq 6000 S2 Reagent Kit v1.5 (100 cycles), aiming at a minimum sequencing depth of 30K reads/nucleus. Genotype was generated as previously described (Bryois et al, Nature Neuroscience, 2022).	Illumina NovaSeq 6000	92
EGAD50000000965	This dataset includes snRNA-seq generated from post-mortem brain and genotype from 60 individuals. The snRNA-seq data was generated using the 10X Single Cell Next GEM Chip targeting a minimum 5,000 nuclei per sample and libraries prepared using the Chromium Single Cell 3′ Library and Gel Bead v3 kit according to manufacturer’s instructions. cDNA libraries were sequenced using the Illumina NovaSeq 6000 system at a minimum sequencing depth of 30,000 paired-end reads per nucleus. Samples were pooled (max 4 per pool) and sequenced over 4 lanes. Pool-specific genotype files (containing genotype information at individual level) were used to demultiplex mapped pools into individual-level single-cell data. Donor DNA from samples processed at Imperial College were genotyped using the Illumina Infinium Global Screening Array v2.0.	Illumina NovaSeq 6000	104
EGAD50000000968	The dataset consists of fastq files generated with RNA sequencing of 55 samples from 29 patients with stage I-III TNBC treated with anthracycline-taxane chemotherapy plus fasting-mimicking diet plus/minus metformin in the context of the BREAKFAST trial (NCT04248998). Samples are fresh-frozen tumor samples at baseline and 14-21 days after the first treatment cycle sequenced to identify early predictors of treatment activity. Each sample is sequenced in multiple lanes (four or eight lanes each) and in paired-end mode (R1 and R2 fastq files). Samples were sequenced on an Illumina NextSeq 500 platform.	NextSeq 500	55
EGAD50000000969	This dataset contains paired-end 10x scRNAseq, 10x ATACseq and bulk WGS data of fibroblast samples from LFS patients and patient-derived xenograft samples of medulloblastoma. Sequencing was performed on Illumina NovaSeq 6000.	Illumina NovaSeq 6000	40
EGAD50000000970	The uploaded data includes sequencing data of 862 individuals from the nasopharyngeal carcinoma (NPC) screening study. Samples from this cohort were sequenced using targeted sequencing methods for the Epstein Barr Virus (EBV) and selected autosomal DNA, but only ‘off-target’ reads were used for fragmentomic analyses. We have also performed genome-wide (non-targeted sequencing) for 1) Individuals with the highest and lowest cell-free DNA concentration (40 individuals); 2) A subsequent collection of the subjects with the highest and lowest DNA concentrations after six years (26 individuals); 3) 30 cases of pregnancy and 4) 20 patients with hepatocellular carcinoma. All sequencing data are of extracted plasma cfDNA from human subjects. The targeted sequencing samples from 862 individuals were sequenced on the NextSeq500 System (Illumina) and aligned to the EBV genome (AJ507799.2) and the human genome (hg19). The alignments were provided in bam format. The original fastq files of the remaining non-target sequencing samples were provided.	NextSeq 2000 NextSeq 500	977
EGAD50000000971	RNA sequencing files that were used in our study. The cohort was composed of 39 cases of RNA sequencing data. The RNA-seq library was prepared with an rRNA depletion method, and sequencing was performed using a DNBSEQ sequencer.	DNBSEQ-G400	39
EGAD50000000972	Targeted DNA sequencing files that were used in our study. The cohort was composed of 36 cases of tumor-only targeted DNA sequencing data. All files are UMI processed. The panel was designed with Agilent SureSelect and sequencing was performed using a DNBSEQ sequencer.	DNBSEQ-G400	36
EGAD50000000973	This is an RNAseq experiment from healthy donor CD8 T cells isolated from human PBMCs. T cells were engineered to express a CMV-specific TCR and were stimulated with CMV-peptide-loaded antigen presenting cells in the presence or absence of recombinant IL-27. Dataset consists of 6 fastqs.	Illumina NovaSeq 6000	6
EGAD50000000974	Genome-wide NanoRCS of cell-free DNA from plasma of healthy controls (4 samples). Genome-wide sequencing of reference blood sample of healthy controls (4 samples).	Illumina NovaSeq X PromethION	8
EGAD50000000975	Whole genome sequencing of childhood acute lymphoblastic leukaemia patients. Matched diagnostic and germline samples were obtained from bone marrow aspirates and sequenced on the Illumina platform to characterise the underlying genetic features (DOI: 10.1038/s41375-022-01806-8).	Illumina HiSeq X	418
EGAD50000000979	Targeted sequencing dataset used for the paper entitled "Development of a rapid and comprehensive genomic profiling test supporting diagnosis and research for gliomas". The cohort includes 53 cases of targeted DNA sequencing, 15 of which have matched control sequences. The capture panel was made with Agilent SureSelect and sequencing was performed using a DNBSEQ or Illumina sequencer.	DNBSEQ-G400	68
EGAD50000000980	This dataset contains targeted RNA-Seq generated with the Illumina RNA Pan-Cancer panel. It comprises 2200 cancers from children and adults. The cancers represent a wide range of CNS, solid and haematopoietic tumours. Samples are from pathology specimens including FFPE, fresh frozen, bone marrow aspirate and peripheral blood. Samples were tested in a clinical diagnostic lab to identify relevant SNVs, indels and gene fusions.	NextSeq 550	2200
EGAD50000000981	This dataset contains the raw data of the RNA-seq experiments described in the paper "RNA-Sequencing: a reliable tool to unveil transcriptional landscape of pediatric B-other acute lymphoblastic leukemia". It contains the raw data from 60 pediatric patients diagnosed with B cell progenitor acute lymphoblastic leukemia lacking the main genetic alterations. RNA material was extracted from bone marrow or peripheral blood, and RNA libraries were prepared using the Illumina® TruSeq™ stranded mRNA kit following the manufacturer's instructions and sequenced on a NextSeq 550 using a v2 HighOutput 150 cycle kit (2x75bp). Funding: Instituto de Salud Carlos III (ISCIII), PI21/00213	NextSeq 500	60
EGAD50000000982	Raw RNA-seq reads (FASTQ format) and mapped sequencing reads (BAM format) of 20 prostate cancer patients in the Southern African Prostate Cancer Study (SAPCS). Each patient exhibits different pathogenic variations. Illumina RNA sequencing was performed on RNA extracted from blood (using the QIAamp RNA Blood Mini Kit). rRNA was first removed using SortMeRNA. The STAR aligner was used for mapping to human genome GRCh38.p14, resulting in an average genome coverage of 9x.	unspecified	20
EGAD50000000983	This research project was a collaboration between UCLA and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 1,965 Bipolar case/control samples from collaborators in the Netherlands and UCLA. Genomic DNA from each sample was sequenced to a mean depth of 20x. The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing CRAM files.	HiSeq X Ten	1965
EGAD50000000984	This dataset consists of functional genomic data from iPSC-derived macrophages with or without CRISPR deletions and controls. It contains 21 paired-end fastq files consisting of 6 total RNA-seq samples and 9 ATAC samples; for ChIP there are 2 H3K4me3 samples and 2 H3K27ac samples along with 2 paired input samples. The samples were sequenced on the Illumina HiSeq 4000 and Illumina NextSeq 500 platforms.	Illumina HiSeq 4000 NextSeq 500	21
EGAD50000000985	This data set contains the RNA-seq BAM files for der(1;7)(q10;p10) MDS cases, -7/del(7q) MDS cases, and OTHER MDS cases. The CD34 selected RNA were collected from patient BM/PB. The dataset was used to identify the unique expression profiles for der(1;7)(q10;p10) MDS cases compared to non-der(1;7)(q10;p10) MDS cases.	Illumina NovaSeq 6000	93
EGAD50000000986	This dataset contains the BAM files of WES of 26 myeloid neoplasm patients with der(1;7)(q10;p10)(+) cases. The dataset was used to identify key driver genes in patients with der(1;7)(q10;p10). Already known drivers along with novel driver genes were identified through the use of this WES.	Illumina HiSeq 2500	52
EGAD50000000987	Shallow Whole Genome Sequencing of two (2) Gastric Cancer samples from the same subject. sWGS was used for Copy number aberration analysis to assess clonal relationship of the collision of two independent EBV+ and EBV- tumors.	Illumina HiSeq 4000	3
EGAD50000000988	This dataset includes two FASTQ files (R1 and R2) from an Illumina germline nuclear DNA whole-genome sequencing run of a single sample. The sample corresponds to a 34-year-old male with a rare deficiency in complement component C1s, linked to systemic lupus erythematosus (SLE). Genomic analysis revealed two novel mutations in the C1S gene.	Illumina NovaSeq 6000	1
EGAD50000000989	The saliva microbiota of 407 participants of the PANIC study was profiled with 16S rRNA gene sequencing (regions V3-V4). Amplification was performed using the TruSeq-switched tail amplicons. Sequencing was performed with 2 × 301 base-pair reads on the Illumina MiSeq PE300 platform. Data includes information on sex, age, and caries status of participants. File format is fastq.gz.	Illumina MiSeq	407
EGAD50000000991	The dataset contains RNAseq profiles of 182 patients from the CA017-003 clinical trial. The Allprep DNA/RNA FFPE kit was used to simultaneously purify genomic DNA and total RNA from formalin-fixed, paraffin embedded (FFPE) tissue sections. RNAseq libraries (50PE, 50M) were constructed using Illumina TruSeq RNA Access method. Fastq files are included.	Illumina HiSeq 2500	182
EGAD50000000992	Shallow Whole Genome Sequencing of 92 mCRC samples. Chromosomal copy number alterations analysis was performed with the samples to predict response to bevacizumab.	NextSeq 2000	184
EGAD50000000995	This dataset includes transcriptome profiling of 88 head and neck primary patient tumor samples and 45 paired patient-derived xenograft samples that successfully engrafted in mice. NGS was performed using the Illumina TruSeq stranded total RNA sample preparation kit on the Illumina NovaSeq X Plus platform at the Princess Margaret Genomics Centre. This dataset consists of paired-end fastq files.	Illumina NovaSeq X	133
EGAD50000000996	TAPS data in the form of BAM files from 214 samples: plasma (at 80x) and matched germline (at 30x) pairs from 61 cancer and 30 non-cancer subjects, fresh-frozen matched tumour biopsies (at 80x) from 16 subjects, as well as several follow-up plasma samples (at 80x) from 10 subjects with colorectal cancer.	Illumina NovaSeq 6000	214
EGAD50000000997	In-house snRNASeq dataset generated from 8 post-mortem hypothalami	Illumina NovaSeq 6000	58
EGAD50000000998	Buccal samples and paired esophageal epithelium were obtained using the three sizes of swabs and endoscopic biopsy, respectively. Forty samples from 10 subjects were analyzed via duplex sequencing. This dataset contains bam files that were mapped to the GRCh37 reference genome.	DNBSEQ-G400	40
EGAD50000000999	16 buccal swab samples were analyzed by Nanoseq. S3 in the file names indicates 28 mm^2 swab size, and S5 indicates 254 mm^2 swab size. We analyzed 28 mm^2 swab samples using paired 254 mm^2 swab samples as controls. The files were generated by the Nanoseq pipeline using GRCh37 as the reference genome.	DNBSEQ-G400	32
EGAD50000001000	This dataset contains 22 long-read WGS fastq.gz files sequenced on ONT PromethION. In total the dataset includes 20 individuals; for all of them sequencing starting from blood was performed, and for one of the samples sequencing from urine and from a buccal swab was also included. In all the samples methylation calls are included in the fastq file.	PromethION	22
EGAD50000001001	Single cell atlas of the human airways from 10 patients suffering from early-stage Chronic Obstructive Pulmonary Disease and 12 healthy age-matched volunteers. This dataset was sequenced by 10X Genomics 3’ RNA-seq profiling. It is composed of more than 400,000 cells coming from 119 samples total. Sampling was performed by brushings and biopsies from the nose to the 6th division of the airways.	NextSeq 500	117
EGAD50000001002	Single cell atlas of the human airways from 10 patients suffering from early-stage Chronic Obstructive Pulmonary Disease and 12 healthy age-matched volunteers. This dataset was sequenced by 10X Genomics 3’ RNA-seq profiling. It is composed of more than 400,000 cells coming from 119 samples total. Sampling was performed by brushings and biopsies from the nose to the 6th division of the airways. 14 samples were also sequenced using Nanopore long-read sequencing.	PromethION	14
EGAD50000001005	This dataset contains 4 paired fastq files sequenced on an Illumina NovaSeq 6000. WES analysis (Illumina platform, PE150, 100x coverage) was performed on a pool of CNV-verified CTCs that were enriched using a high-throughput microfluidic device from entire leukopaks from patients with metastatic prostate cancer (GU-1 and GU-2).	Illumina NovaSeq 6000	2
EGAD50000001006	This dataset contains 352 paired fastq files sequenced on an Illumina NovaSeq 6000. Single-cell low-pass WGS analysis (Illumina platform, PE150, 1-2x coverage) was performed on CTCs enriched using a high-throughput microfluidic device from entire leukopaks from patients with metastatic prostate cancer (GU-1 and GU-2) and metastatic hepatocellular carcinoma (HCC-1 and HCC-2). Patient-matched white blood cells (WBC) served as internal controls.	Illumina NovaSeq 6000	176
EGAD50000001007	Shallow whole genome sequencing of advanced colorectal tumors from patients included in the FOCUS clinical trial. Performed to calculate associations of irinotecan with copy-number alterations. Copy number aberration analysis of 349 enrolled patients.	Illumina HiSeq 4000	349
EGAD50000001008	To elucidate the heterogeneity of tumour immune cell infiltration, we performed single cell RNAseq sequencing on CD45+ cells enriched from matched fresh tumour tissue following surgical resection of 12 treatment-naïve patients and matched PBMCs.	Illumina HiSeq 1500	30
EGAD50000001010	RNA sequencing (RNA-seq) was performed on baseline (before low-dose 5-aza treatment) and on-treatment (low-dose 5-aza treatment for 5 or 10 days) tumor samples from eight participants with head and neck cancer who were refractory to anti-PD-1 therapy. RNA library preparations, sequencing reactions, and initial bioinformatics analyses were conducted at GENEWIZ, LLC. In total, 16 150-bp paired-end RNA-seq data were generated.	Illumina HiSeq 4000	16
EGAD50000001011	Whole exome sequencing (WES) was performed on baseline (before low-dose 5-aza treatment) blood and tumor paired samples from 12 participants with head and neck cancer who were refractory to anti-PD-1 therapy. DNA extractions were performed by the Broad Institute of MIT and Harvard using Qiagen AllPrep DNA/RNA kits (Cat. 80204). In total, 24 150-bp paired-end WES data were generated by Illumina.	Illumina HiSeq 4000	24
EGAD50000001012	Genome variation data for 2723 Brazilian individuals from the DNA do Brasil Project.	Illumina NovaSeq 6000	2723
EGAD50000001013	Raw paired FASTQ data of patients at baseline from the PEVOsq cohort	Illumina NovaSeq 6000	154
EGAD50000001014	A total of 46 frozen tumor biopsies underwent WES analysis, while 41 blood samples were used as germline controls. Tumor samples qualified for whole exome sequencing (WES) if they contained ≥10% tumor cells. For bulk RNA-seq, 20 paired baseline and on-treatment frozen tumor biopsies were analyzed. Tumor samples qualified if they contained ≥30% tumor cells. For GeoMx spatial transcriptomic study we used 7 patients from the ICARUS LUNG01 trial focusing on tumour and immune cell compartments. We selected 7 paired baseline and on-treatment patient samples for analysis. We selected up to 4 regions of interest (ROI) per sample tissue. We identified tumour and immune cell areas of illumination (AOI) using CK and CD45 antibodies respectively.	Illumina NovaSeq 6000 NextSeq 500	205
EGAD50000001015	Methylation sequencing dataset for "Molecular counting enables accurate and precise quantification of methylated ctDNA." This dataset was created by running samples through Northstar Response, where bisulfite converted samples are amplified in a multiplex PCR targeting more than 500 genomic regions. Samples in this dataset include contrived samples made from 84 different tumors DNA, each mixed with their matched buffy coat DNA at varying ratios, as well as cfDNA and buffy coat DNA samples collected from 67 cancer patients undergoing treatment. All samples were sequenced on Illumina NextSeq 2000 with single end 100 cycle sequencing; files are in fastq.gz format.	NextSeq 2000	174
EGAD50000001016	Targeted panel somatic variant sequencing dataset for "Molecular counting enables accurate and precise quantification of methylated ctDNA." This dataset was created by running 60 cfDNA samples from patients with metastatic cancer to the lung through a research-use, tumor-naive, targeted-panel assay for measuring somatic mutations across a panel of 80 genes. All samples were sequenced on Illumina NextSeq 2000 with 150x2 paired end sequencing; files are in fastq.gz format. For the original study, the variant allele fractions of these somatic mutations were compared with the Tumor Methylation Scores calculated from Northstar Response in the same samples.	NextSeq 2000	60
EGAD50000001017	Chromium Next GEM Single Cell 3’ Gel Bead Kit (Dual Index) v3.1 (10X Genomics) of endometrium from 27 women including 5 without PCOS and 12 with PCOS. Following baseline biopsies, PCOS participants underwent a 16-week randomized controlled trial with metformin (n=7) or lifestyle intervention (n=3). 25,000 raw reads per nucleus	Illumina NovaSeq 6000	27
EGAD50000001018	- WES files from primary or metastatic prostate cancer biopsies, from patients with metastatic prostate cancer. - This dataset includes those cases profiled by WES and included in the manuscript Homologous recombination repair status in metastatic prostate cancer by next-generation sequencing and functional tissue-based immunofluorescence assays, Arce-Gallego et al, Cell Rep Med 2024 - The dataset includes WES for 80 tumor biopsies from 65 different patients, including paired tumor/normal samples.	Illumina NovaSeq 6000	145
EGAD50000001019	This dataset contains paired-end bulk RNA-seq data of advanced chordoma samples from 7 patients. Sequencing was performed on Illumina HiSeq 4000 or NovaSeq 6000.	Illumina HiSeq 4000 Illumina NovaSeq 6000	7
EGAD50000001021	This dataset contains the transcriptome analysis of human Schwann cell cultures obtained from sural nerve biopsies from six patients with polyneuropathies (N=3/patient, total N=18) as well as the nerve fascicles from which they were isolated (N=1/patient, total N=6). In addition, this dataset includes 8 samples from commercial human Schwann cell cultures and one sample of a fibroblast culture, which serve as controls for further analysis.	NextSeq 2000	33
EGAD50000001022	Whole genome bisulfite sequencing using tagmentation of T cells from skin and blood. The improved TWGBS protocol (Weichenhan et al., Methods Mol. Biol. 2018, vol. 1708, pp. 105-122) was used.	Illumina NovaSeq 6000	30
EGAD50000001023	Single-cell sequencing and genotyping for Cambridge samples analyzed as part of a project evaluating single-cell gene expression & lymphocyte receptor sequences in CSF and PBMC of MS and other neuroinflammatory disorders.	Illumina NovaSeq 6000	76
EGAD50000001024	One Jakun individual was sequenced on Illumina HiSeq 2000. Variant calling was performed using GATK.	Illumina HiSeq 2000	1
EGAD50000001025	The dataset contains RNA sequencing data from N=48 transurethral resections of bladder tumours (TURBTs) from N=48 patients with muscle-invasive bladder cancer who were treated with neoadjuvant chemotherapy followed by radical cystectomy. All TURBTs originate from formalin-fixed paraffin embedded tissue. We used the Illumina TruSeq RNA exome kit, previously known as the TruSeq RNA Access Library Prep Kit. This kit converts total RNA into template molecules of known strand origin, followed by sequence-specific capture of coding RNA. All samples were run in an Illumina NovaSeq6000 instrument. Paired FASTQ files from all 48 patients are provided.	Illumina NovaSeq 6000	48
EGAD50000001026	This research project was a collaboration between the Karolinska Institute BioBank and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 4,765 control samples from collaborators in Sweden. Genomic DNA from each sample was sequenced to a mean depth of 20x. The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing CRAM files.	HiSeq X Ten	4765
EGAD50000001027	The iPSC line EDi018-A / SAMEA4771918 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi018-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001028	The iPSC line BIONi010-C-8 / SAMEA4454011 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line BIONi010-C-8 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001029	The iPSC line EDi019-A / SAMEA4774918 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi019-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001030	The iPSC line RCi009-A / SAMEA4339688 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line RCi009-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001031	The iPSC line SIGi001-A-13 / SAMEA104386250 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line SIGi001-A-13 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001032	The iPSC line EDi016-A / SAMEA4562366 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi016-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001033	The iPSC line EDi017-A / SAMEA4768918 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi017-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001034	The iPSC line BIONi010-C-3 / SAMEA4342740 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line BIONi010-C-3 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001035	The iPSC line BIONi010-C-7 / SAMEA4454010 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line BIONi010-C-7 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001036	The iPSC line EDi019-C / SAMEA4777168 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi019-C is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001037	The iPSC line BIONi010-C-6 / SAMEA4454009 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line BIONi010-C-6 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001038	The iPSC line EDi010-A / SAMEA4459354 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi010-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001039	The iPSC line BIONi010-C / SAMEA3158050 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line BIONi010-C is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001040	The iPSC line BIONi010-C-9 / SAMEA4454012 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line BIONi010-C-9 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001041	The iPSC line SIGi001-A-3 / SAMEA4448571 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line SIGi001-A-3 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001042	The iPSC line UKKi019-C / SAMEA17626918 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line UKKi019-C is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001043	The iPSC line BIONi010-C-4 / SAMEA4452060 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line BIONi010-C-4 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001044	The iPSC line EDi011-B / SAMEA4459359 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi011-B is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001045	The iPSC line EDi011-C / SAMEA4459360 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi011-C is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001046	The iPSC line EDi012-A / SAMEA4459361 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi012-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001047	The iPSC line EDi015-A / SAMEA4459373 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi015-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001048	The iPSC line EDi015-C / SAMEA4459376 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi015-C is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001049	The iPSC line EDi017-B / SAMEA4770418 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi017-B is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001050	The iPSC line EDi011-A / SAMEA4459357 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi011-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001051	The iPSC line EDi014-A / SAMEA4459369 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi014-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001052	The iPSC line EDi013-B / SAMEA4459367 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi013-B is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001053	The iPSC line EDi014-B / SAMEA4459371 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi014-B is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001054	The iPSC line BIONi010-A / SAMEA3105765 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line BIONi010-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001055	The iPSC line BIONi010-B / SAMEA3158000 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line BIONi010-B is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001056	The iPSC line EDi010-B / SAMEA4459356 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi010-B is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001057	The iPSC line EDi013-A / SAMEA4459365 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi013-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001058	The iPSC line RBi001-A / SAMEA3368212 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line RBi001-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001059	The iPSC line RCi006-A / SAMEA3962402 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line RCi006-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001060	The iPSC line SIGi001-A-7 / SAMEA4448730 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line SIGi001-A-7 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001061	The iPSC line UKKi017-C / SAMEA17621668 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line UKKi017-C is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001062	The iPSC line BIONi010-C-5 / SAMEA4452061 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line BIONi010-C-5 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001063	The iPSC line EDi012-B / SAMEA4459363 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi012-B is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001064	The iPSC line EDi012-C / SAMEA4459364 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi012-C is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001065	The iPSC line EDi013-C / SAMEA4459368 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi013-C is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001066	The iPSC line EDi015-B / SAMEA4459375 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi015-B is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001067	The iPSC line UKKi018-C / SAMEA103988380 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line UKKi018-C is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001068	The iPSC line BIONi010-C-2 / SAMEA4342705 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line BIONi010-C-2 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001069	The iPSC line RCi004-A / SAMEA3106011 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line RCi004-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001070	The iPSC line RCi004-B / SAMEA3106205 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line RCi004-B is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001071	The iPSC line RCi005-A / SAMEA3961534 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line RCi005-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001072	The iPSC line RCi007-C / SAMEA4084916 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line RCi007-C is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001073	The iPSC line SIGi001-A-1 / SAMEA4451096 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line SIGi001-A-1 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001074	The iPSC line SIGi001-A-10 / SAMEA4451117 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line SIGi001-A-10 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001075	The iPSC line SIGi001-A-11 / SAMEA4451118 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line SIGi001-A-11 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001076	The iPSC line SIGi001-A-12 / SAMEA104237570 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line SIGi001-A-12 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001077	The iPSC line SIGi001-A-2 / SAMEA4451116 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line SIGi001-A-2 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001078	The iPSC line SIGi001-A-6 / SAMEA4447426 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line SIGi001-A-6 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001079	The iPSC line UOXFi007-A / SAMEA103988274 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line UOXFi007-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001080	The iPSC line EDi018-B / SAMEA4773418 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi018-B is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001081	The iPSC line EDi018-C / SAMEA4774168 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi018-C is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001082	The iPSC line UKKi020-C / SAMEA103988344 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line UKKi020-C is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001083	The iPSC line UKKi021-B / SAMEA103988346 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line UKKi021-B is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001084	The iPSC line UKKi022-C / SAMEA103988349 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line UKKi022-C is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001085	The iPSC line WTSIi009-A / SAMEA2593858 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line WTSIi009-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001086	The iPSC line EDi016-B / SAMEA4767418 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi016-B is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001087	The iPSC line EDi017-C / SAMEA4771168 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi017-C is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001088	The iPSC line SIGi001-A-4 / SAMEA4448632 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line SIGi001-A-4 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001089	The iPSC line UOXFi008-B / SAMEA103887561 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line UOXFi008-B is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001090	The iPSC line EDi016-C / SAMEA4768168 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi016-C is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001091	The iPSC line EDi019-B / SAMEA4776418 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line EDi019-B is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001092	The iPSC line SIGi001-A-5 / SAMEA4448708 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line SIGi001-A-5 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001093	The iPSC line SIGi001-A-8 / SAMEA4448777 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line SIGi001-A-8 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001094	The iPSC line UKKi019-A / SAMEA17624668 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line UKKi019-A is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001095	The iPSC line SIGi001-A-9 / SAMEA4447499 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line SIGi001-A-9 is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001096	The iPSC line UKKi019-B / SAMEA17626168 has undergone whole genome sequencing. Whole genome sequencing was done using Illumina HiSeq X Five platform with a reference genome of GRCh38. Raw data as FASTQ files and analysed data as CRAM files are available for this sample, in this dataset. The iPSC line UKKi019-B is available for research use at www.EBiSC.org.	HiSeq X Five	1
EGAD50000001097	This dataset contains 115 samples of WGS, ATAC, 4C and RNAseq samples of patients with acute myeloid leukemia. The sequencing was performed on Illumina HiSeq 4000, 2000 and HiSeq X using Illumina TruSeq Nano DNA and Agilent Strand Specific RNA Kits. The sequencing was always paired.	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 4000 Illumina NovaSeq 6000 Illumina NovaSeq X NextSeq 550	115
EGAD50000001098	This dataset contains 10 samples of WGS, ATAC, 4C and RNAseq samples of patients with acute myeloid leukemia. The sequencing was performed on Illumina HiSeq 4000, 2000 and HiSeq X using Illumina TruSeq Nano DNA and Agilent Strand Specific RNA Kits. The sequencing was always paired.	HiSeq X Ten Illumina HiSeq 2000 Illumina HiSeq 4000	10
EGAD50000001099	WES sequencing of two cases of MET amplified gastric cancer. Six unfixed fresh frozen samples were obtained from the primary tumors (three samples from each) and three samples from corresponding non-neoplastic mucosa. In addition, formalin-fixed and paraffin-embedded (FFPE) samples were obtained from the primary tumors (n=2), lymph node metastases (n=7) and non-neoplastic mucosa (n=2).	Illumina NovaSeq 6000	19
EGAD50000001100	RNAseq TPM matrix (transcriptome profiling by high-throughput sequencing): matrix (30,727 features by 2,803 samples) of log2-transformed TPM counts from RNAseq data for the four IMvigor trials.		2803
EGAD50000001101	Clinical data include demographics (gender, race), specimen type (primary vs. metastatic), tract (upper vs. lower), liver metastasis status, PD-L1 IHC, tumor mutation burden, CD8 T cell infiltration status, neutrophil enrichment score, treatment arm, objective response rate, overall survival and progression free survival, and ctDNA status (IMvigor010 only) for 2803 patients across the four IMvigor trials.		2803
EGAD50000001102	This dataset contains raw and processed data of four scRNA-seq PBMC samples from four Psoriasis patients (Pso3 = male, Pso4 = female, Pso7 = female, Pso8 = male). Raw sample libraries were prepared with 10x 5'-kit v2 multiplexing the individual samples using TrueSeq barcoded antibodies to stain the cells. Resulting sequencing data consists of four FASTQs (2 x antibody multiplex to demultiplex the samples + 2 x transcriptome data containing the typical cell x UMI barcode data for count matrix generation). The raw sequencing data was processed with cellranger 6.0.1 with the GRCh38 3.0.0 reference and cell to sample assignment was subsequently done with counhto 1.1.0 (https://github.com/dmalzl/counhto) employing the same algorithm as cellranger multi (which could not be used here because multi does not handle 5'-chemistry data). Processed data contains all main outputs of cellranger count in raw text format or in h5 formats, including the feature reference used (i.e. antibody barcode to sample assignment) as well as the cell to sample assignment computed by counhto. For further details including all used code in the analysis of the data please consult https://doi.org/10.5281/zenodo.13846873	Illumina NovaSeq 6000	1
EGAD50000001103	This dataset contains one VCF file with 94 individuals genotyped with the Human Origins Array.		94
EGAD50000001105	This dataset contains RNA sequencing (RNAseq) data of 814 patients from the CheckMate 649 clinical trial whose ICF allows data deposition into a public repository. Gene expression profiling was performed retrospectively using RNAseq on a subset of baseline tumor samples. Paired-end FASTQ files were processed on Seven Bridges platform (Seven Bridges Genomics).	Illumina NovaSeq 6000	814
EGAD50000001108	This dataset contains five bone marrow samples and three mobilised peripheral blood ones from different healthy individuals. The three bone marrow samples have been sequenced with the 10x multiome kit using Illumina sequencers. The three mobilised peripheral blood ones have been split in two aliquots, one for 10x multiome and the other for 10x CITE-seq. Both library types have been sequenced with an Illumina sequencer. Raw data has been processed with cell-ranger and mapped to the human genome (GRCh38) to obtain the bam files.	Illumina NovaSeq 6000	18
EGAD50000001109	Samples were derived from five lines of human iPSC-derived (hiPSC-derived) astrocytes, both alone and in co-culture with neurons +/- alpha-synuclein oligomer treatment. Paired-end FASTQ files for each of the samples are provided.	Illumina NovaSeq 6000	44
EGAD50000001110	Samples were derived from a human iPSC-derived astrocyte line, both alone and in co-culture with neurons +/- alpha-synuclein oligomer treatment.	Illumina HiSeq 4000	6
EGAD50000001111	Dataset contains RNA-sequencing data (fastq files) from 45 patients treated on the Australasian Leukaemia and Lymphoma Group (ALLG) ALL06 study, an MRD-stratified Paediatric Protocol. Patient samples (bone marrow or peripheral blood) were taken at diagnosis. mRNA was extracted from blast cells and underwent paired-end sequencing (75bp) using the Illumina NextSeq platform. Genomic data was used for identification of genomic drivers of acute lymphoblastic leukaemia (ALL).	unspecified	45
EGAD50000001112	FASTA files from quality-checked Sanger Sequencing results from BCR-specific PCR products. B cells from 4 different MPO+ AAV patients were single cell sorted and processed to sequence the specific BCR, resulting in 51, 175, 12 and 3 heavy chain BCR sequences for AAV1, 2, 3, and 5 respectively. Sequence names indicate patient ID and isotype (IgG, IgM or IgA).	AB 3730xL Genetic Analyzer	4
EGAD50000001113	This project generated a whole-exome sequencing (WES) dataset of 83 boys with a pathogenic variant in the DMD gene, along with WES data from the parents of 38 of them, totaling 159 samples. In addition to DMD, 12 boys also had ID (12 DMD-ID samples), 36 boys had ASD (DMD-ASD samples) and 35 did not present ID or ASD diagnosis (DMD-Control samples).	Illumina HiSeq 2500 Illumina NovaSeq 6000	159
EGAD50000001116	This dataset contains raw sequencing data from eQTL Capture Hi-C, ATAC-seq and RNA-seq experiments in monocytes isolated from 34 healthy male donors, as well as from Hi-C, 4C-seq and CTCF ChIP-seq in subsets of individuals. The data are released as paired-end Illumina FASTQ files. Hi-C and Capture Hi-C used DpnII digestion and Tn5-based low-input library preparation. In Capture Hi-C, this was followed by targeted capture with custom-designed biotinylated RNA probes targeting ~1100 eQTL regions. ATAC-seq was performed on PFA-fixed cells using a digitonin-based protocol. 4C-seq and CTCF ChIP-seq were performed using standard protocols.	Illumina NovaSeq 6000	34
EGAD50000001117	This dataset contains 5 scRNA samples of human glioblastoma samples, human cortical spheroid derived from hiPSC line 028-4 and 3 samples of mouse xenografts. Sequencing has been performed on Illumina NovaSeq 6000. Sequencing has always been paired.	Illumina NovaSeq 6000	8
EGAD50000001118	The dataset comprises single-cell RNA sequencing (scRNA-seq) data derived from around 60,000 PBMCs collected from donors across different severity groups of extrapulmonary tuberculosis (EPTB) and healthy controls (3 healthy, 3 mild, 3 intermediate, 3 severe). The data include both transcriptomic and surface protein expression profiles obtained using the BD Rhapsody Single-Cell Multiomics System. Libraries were prepared for whole transcriptome analysis (WTA), AbSeq, and sample tagging, with sequencing performed on a NovaSeq 6000 system.	Illumina NovaSeq 6000	2
EGAD50000001119	Single homogenized human stool samples from five healthy donors and mix were analyzed by shotgun sequencing to generate a baseline for the establishment of humanized microbiome to be transferred into mice.	NextSeq 500	6
EGAD50000001120	The dataset contains bulk RNA sequencing data from 12 pediatric thymus samples collected during cardiac surgeries. These samples represent immunologically healthy donors with a mean age of 6 months. The sequencing was performed using Illumina HiSeq4000, and the processed files are available in standard formats such as FASTQ.	Illumina HiSeq 4000	12
EGAD50000001121	Fastq files of single-cell RNAseq data from Cancer Associated Fibroblasts (CAFs), cancer cells and immune cells isolated from 6 primary Invasive Lobular Carcinomas (ILCs)	Illumina NovaSeq 6000	6
EGAD50000001123	This research project was a collaboration between the Karolinska Institute and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 5340 SWEBIC Bipolar case samples from collaborators in Sweden. Genomic DNA from each sample was sequenced to a mean depth of 20x. The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing CRAM files.	Illumina NovaSeq 6000	5311
EGAD50000001124	Micro-C sequencing data of 35 B-cell precursor (BCP) acute lymphoblastic leukemia (ALL) cases were constructed on an Illumina NovaSeq 6000 Sequencing System.	Illumina NovaSeq 6000	35
EGAD50000001125	RNA sequencing data of 33 B-cell precursor (BCP) acute lymphoblastic leukemia (ALL) cases were constructed on an Illumina NextSeq 2000 Sequencing System.	NextSeq 2000	33
EGAD50000001126	Samples were obtained from patients diagnosed with serrated polyposis syndrome (SPS). To provide further detail, the dataset comprises two whole-exome sequencing samples from patients within the same family, uploaded in BAM file format with IDs 449758 and 449549. Exomic regions were sequenced using HiSeq2000 Platform (Illumina, San Diego, USA) and SureSelectXT All Exon v5 kit (Agilent, Santa Clara, CA, USA) for exon enrichment. Only those potentially pathogenic germline genetic variants shared by both individuals were considered.	Illumina HiSeq 2000	2
EGAD50000001127	The current sequencing dataset consists of 91 whole genome sequencing data files (Fastq files) of human cell-free DNA (cfDNA) obtained from plasma and urine samples. Specifically, we extracted 20 ucfDNA and 17 plasma DNA samples from healthy pregnant women, 11 ucfDNA samples from pregnant women with preeclampsia, 19 ucfDNA samples from healthy controls, and 24 ucfDNA samples from patients with proteinuria. These extracted cfDNA samples were then constructed into DNA libraries using the TruSeq Nano DNA Library Prep Kit (Illumina) and subsequently sequenced on the Nextseq2000 platform (Illumina).	Illumina NovaSeq 6000 NextSeq 2000	91
EGAD50000001128	Single cell mRNAseq of T-ALL samples including presentation samples for 13 with refractory disease and 8 with responsive disease, and Day 28 samples for 8 patients with refractory disease. Also includes bulk mRNA sequencing on 6 of the patients with responsive disease.	HiSeq X Ten Illumina NovaSeq 6000	29
EGAD50000001129	We report single-nuclei RNA sequencing data (n=18) from human livers in metabolic dysfunction-associated steatotic liver disease (MASLD) patients. There are 5 MASH, 4 MASL, 5 control obese and 4 control lean samples.	Illumina NovaSeq 6000	18
EGAD50000001130	This dataset consists of transcriptome data of 25 human samples with myocarditis, of which: 8 without COVID-19, 10 post COVID-19, 4 after COVID-19 vaccination and 3 with MIS-C (Multisystem Inflammatory Syndrome in Children, PIMS).	Illumina NovaSeq 6000	25
EGAD50000001131	Tumour somatic variants (in VCF format) called from paired tumour-germline whole genome sequencing (using IDT library preparation system followed by sequencing on Illumina NovaSeq platform) of three high-grade serous ovarian tumour samples. The dataset comprises three VCF variant files.	Illumina NovaSeq 6000	3
EGAD50000001132	Tumour variants (in VCF format) from exome sequencing (using Agilent SureSelect and TWIST library preparation systems followed by sequencing on Illumina HiSeq and NovaSeq platforms) of 108 high-grade serous ovarian tumour samples. Germline variants from EGAD00001006030 for these individuals were used to call somatic and tumour-only variants separately, creating two VCF files per sample (except for one sample without paired germline data- somatic variant file only provided). The dataset comprises 215 VCF variant files.	Illumina HiSeq 2500 Illumina NovaSeq 6000	215
EGAD50000001133	This dataset contains WES bam/bai files associated with 119 breast cancer patients as part of the Liberate Tracer Study	Illumina NovaSeq 6000	119
EGAD50000001135	Repression of CADM1 transcription by HPV type 18 is mediated by three-dimensional rearrangement of promoter-enhancer interactions	Illumina HiSeq 2000 Illumina HiSeq 4000	40
EGAD50000001136	Here, 34 iNHL samples were sequenced using whole exome sequencing to describe the mutational landscape of indolent primary renal B-Cell lymphomas. Mutational profiles were compared to other Marginal Zone Lymphomas.	Illumina NovaSeq 6000	34
EGAD50000001137	Whole exome sequencing on 4 uveal melanoma samples and corresponding germline samples. Libraries were prepared using the Nextera Rapid Capture Expanded Exome kit. Paired-end libraries (2 x 75 bp) were sequenced on HiSeq 4000 instrument (Illumina).	Illumina HiSeq 4000	8
EGAD50000001138	Total RNA sequencing on 4 uveal melanoma samples. Libraries were prepared using the TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Gold. Paired-end libraries (2 x 75 bp) were sequenced on HiSeq 4000 instrument (Illumina).	Illumina HiSeq 4000	4
EGAD50000001139	Whole Genome Bisulfite Sequencing on normal primary human uveal melanocytes. WGBS libraries were prepared using the Accel-NGS Methyl-Seq DNA Library Kit (Swift Biosciences, Catalog No. 30024), the EZ DNA Methylation-Gold Kit (Zymo, D5005) and DNA Clean & Concentrator-5 (Zymo, D4013), following the instruction manual Accel-NGS® Methyl-Seq DNA Library (Revision 160510). Paired-end libraries (2 × 150 bp) were sequenced on a NovaSeq 6000 instrument.	Illumina NovaSeq 6000	3
EGAD50000001140	Long-term Survival Update and Extended RAS Mutational Analysis of the CAIRO2 Trial: Addition of Cetuximab to CAPOX/Bevacizumab in Metastatic Colorectal Cancer. Whole Exome Sequencing (WES) of 64 metastatic colorectal cancer (mCRC) from CAIRO2 trial. WES data was used for mutational analysis.	Illumina HiSeq 2500	50
EGAD50000001141	Long-term Survival Update and Extended RAS Mutational Analysis of the CAIRO2 Trial: Addition of Cetuximab to CAPOX/Bevacizumab in Metastatic Colorectal Cancer. Shallow Whole Genome Sequencing (sWGS) of 52 metastatic colorectal cancer (mCRC) samples from CAIRO2 trial. sWGS was used for mutational analysis.	Illumina HiSeq 2500	49
EGAD50000001142	This dataset contains 44 pairs of bulk RNA sequencing and 42 pairs of single cell RNA sequencing data derived from bone marrow aspirations obtained from patients treated with CAR-T (CD3+, CAR T+, CD34+ sorted populations).	Illumina NovaSeq 6000	34
EGAD50000001143	Cryopreserved cells from human nodal follicular lymphoma and lymph node samples were thawed and processed for scRNA-seq analysis. CD3+ CD19− 7-AAD− live T cells (six FL samples from the Original cohort) or 7-AAD− whole live hematopoietic cells (five lymph node and nine FL samples from Validation cohort 1) were sorted using a FACS Aria II or III instrument (BD Biosciences). To generate single-cell emulsions, 6,000–10,000 live cells were loaded into the 10X Genomics Chromium Controller for single-cell partitioning, along with the reverse transcriptase reagent mixture and 3' or 5' gel beads. Sorted T or whole hematopoietic cells were converted to barcoded scRNA-seq libraries using 10X Genomics Chromium single cell 3' reagent kits (v3) for the Validation cohort 1, 5' reagent kits (v1) for the Original cohort, according to the manufacturer’s instructions. The libraries were sequenced on an Illumina HiSeq X Ten or NovaSeq X Plus system, mapped to the human genome (build GRCh38), and demultiplexed using CellRanger pipelines (v3.1.0, 10x Genomics).	Illumina HiSeq X	20
EGAD50000001144	Surgical treatment of gastric cancer: 15-year follow-up results of the randomised nationwide Dutch D1D2 trial. Molecular characterisation of N=80 resectable gastric adenocarcinomas from D1D2 trial, describing mutations obtained by matched (Tumor-Normal) whole exome sequencing (WES). Samples are annotated for TCGA’s molecular classification (MSI, GS and CIN).	Illumina HiSeq 4000	160
EGAD50000001145	Chemotherapy versus chemoradiotherapy after surgery and preoperative chemotherapy for resectable gastric cancer (CRITICS): an international, open-label, randomised phase 3 trial. Molecular characterisation of N=36 resectable gastric adenocarcinomas from CRITICS trial, describing mutations obtained by matched (Tumor-Normal) whole exome sequencing (WES). Samples are annotated for TCGA’s molecular classification (MSI, GS and CIN).	Illumina HiSeq 4000	72
EGAD50000001146	Shallow Whole Genome Sequencing (sWGS) of 148 FFPE DNA samples from peripheral T cell lymphoma (PTCL) samples. Associated with the "Molecular biomarkers for stratification peripheral T cell lymphoma" study.	Illumina HiSeq 2000	147
EGAD50000001147	Full description of the dataset and all necessary information on metadata may be found under (https://zenodo.org/records/11237107): DAC-2023-07-05-Ritz (DAC-007), raw data in EGA, metadata in Zenodo (https://zenodo.org/records/11237107)		1
EGAD50000001148	PDX samples of breast cancer IDC mouse models created from patient tumor material of the NKI. Genomescan prepared the samples according to the procedure for Hybridization Capture using the Agilent SureSelectXT Human All Exon V7 kit. The prepared libraries were sequenced with Illumina sequencing technology. The samples are in fastq format and consist of the following PDX samples: IDC001, IDC125, IDC147, IDC200, IDC233, IDC244, IDC251, IDC272, IDC278, IDC281 and ILC120.	Illumina HiSeq 2500	11
EGAD50000001149	PDX samples of breast cancer IDC mouse models created from patient tumor material of the NKI. Genomescan prepared the samples according to the procedure for Hybridization Capture using an Agilent SureSelect custom 0.5-2.9Mb kit. The prepared libraries were sequenced with Illumina sequencing technology. The samples are in fastq format and consist of the following PDX samples: IDC025, IDC026, IDC029, IDC031, IDC032, IDC038, IDC057, IDC062, IDC065, IDC069, IDC072, IDC090B, IDC092, IDC097, IDC099, IDC107, IDC113, IDC117, IDC143, IDC152, IDC159A, IDC159B, IDC180, IDC186, IDC192, IDC197, IDC198, IDC207, IDC209, IDC216, IDC218, IDC222, IDC229, IDC232, IDC274, IDC282, IDC290, IDC299, IDC307, IDC338, IDC344, IDC346, ILC006, ILC012, ILC083 and ILC248.	Illumina HiSeq 2500	47
EGAD50000001150	Whole exome sequencing of DNA samples from tumor samples of patients in the Neo-Pembro trial. For each patient, up to four samples were sequenced; one tumor biopsy at baseline, one tumor biopsy after induction chemotherapy, a tumor sample from resection material at cytoreductive surgery, and a reference normal DNA sample isolated from whole blood. DNA was obtained from fresh-frozen samples where possible, or formalin-fixed paraffin-embedded when no fresh-frozen sample was available.	Illumina NovaSeq 6000	103
EGAD50000001151	Whole transcriptome sequencing of RNA samples from tumor samples of patients in the Neo-Pembro trial. For each patient, up to three samples were sequenced; one tumor biopsy at baseline, one tumor biopsy after induction chemotherapy, and a tumor sample from resection material at cytoreductive surgery. RNA was obtained from fresh-frozen samples where possible, or formalin-fixed paraffin-embedded when no fresh-frozen sample was available.	Illumina NovaSeq 6000	71
EGAD50000001152	Sequence reads were obtained from a set of 13 families with familial pulmonary fibrosis in the Canary Islands, Spain, using Illumina paired-end reads, respectively, at the Institute of Technology and Renewable Energy (ITER). Briefly, we used bcl2fastq v2.18 to perform sample demultiplexing and BWA-MEM v0.7.15 (https://github.com/lh3/bwa) to align reads to GRCh37/hg19 reference. Resulting BAM files were assessed with SAMtools v1.3 (http://www.htslib.org) and Picard v2.10.10 (https://broadinstitute.github.io/picard/) for quality control steps. Small insertions/deletions (< 50 bp) and single nucleotide variants (SNVs) were identified using an in-house bioinformatics pipeline based on GATK HaplotypeCaller v3.8 (https://gatk.broadinstitute.org/hc/en-us/articles/360037225632-HaplotypeCaller). This pipeline follows the Best Practices recommendations for germline variant calling and its description is publicly available (https://github.com/genomicsITER/benchmarking/tree/master/WES).	unspecified	61
EGAD50000001153	Whole exome sequencing (WES) data of patients with resectable esophageal adenocarcinoma treated with neoadjuvant atezolizumab and chemoradiation (PERFECT). 78 matched samples (Tumor-Normal) from 39 patients with resectable esophageal adenocarcinoma of the PERFECT trial. Whole exome sequencing performed for mutation detection analysis.	Illumina HiSeq 4000	78
EGAD50000001154	This dataset comprises whole exome sequencing data from 29 tumor-normal pairs and RNA-seq data from 39 tumors of newly diagnosed IDH-wt GBM patients. It includes analyses such as variant calls, copy number variations, the transcriptomic expression matrix, and patient metadata and outcomes.	Illumina NovaSeq 6000	97
EGAD50000001155	The Chromium Controller and Chromium X platform of 10X Genomics were used for single cell partitioning and barcoding. Each cell's transcriptome was barcoded during reverse transcription, pooled cDNA was amplified and Single Cell 5' Gene Expression (GEX), V(D)J and Feature Barcode (FB) Libraries were prepared according to the manufacturer's protocols (CG000330 and CG000331, 10X Genomics). All libraries were quantified and normalized on library QC data generated on the Bioanalyzer system according to manufacturer's protocols (G2938-90321 and G2938-90024, Agilent Technologies). Based on the expected target cell counts, a balanced library sub-pool of samples was composed for SC5'GEX, V(D)J and FB libraries. Library sub-pools were quantified by qPCR, according to the KAPA Library Quantification Kit Illumina(R) Platforms protocol (KR0405, KAPA Biosystems). Based on qPCR results a final sequencing pool was composed. Paired end sequencing was performed on a NovaSeq 6000 Instrument (Illumina) using NovaSeq 6000 Reagent Kits v1.5 100 cycles (cat. no. 20028401, 20028319, 20028316 Illumina), using 28 cycles for Read 1, 10 cycles for Read i7, 10 cycles for Read i5 and 90 cycles for Read 2.	Illumina NovaSeq 6000	31
EGAD50000001156	Acute myeloid leukemia (AML) represents a group of aggressive hematological malignancies, the clinical management of which is made challenging due to the persistence of rare and therapy resistant leukemia stem cells (LSCs) which serve as a source of disease relapse and poor outcomes. There is currently a paucity of methods to reliably enrich and study LSCs, hindering the development of therapies that specifically target LSCs. In this study, we deeply characterize the OCI-AML8227 culture model, which maintains a functional stemness hierarchy originating from its highly primitive CD34⁺CD38⁻ cells, to elucidate LSC biology and uncover LSC-specific therapeutic vulnerabilities. We analyzed both bulk and single-cell proteomics, transcriptomics, and epigenomics to generate an LSC protein-protein interaction network, which was then integrated with an LSC-focused small molecule screen using this model. From these findings, CDK6 was discovered as a therapeutic vulnerability specific to LSCs, which was validated in findings from the BEAT-AML cohort and a patient-derived xenograft (PDX) panel of AML samples through palbociclib treatment. Taken together, our studies validate CDK6 as a druggable vulnerability in LSCs, and authenticate OCI-AML8227 cells as an LSC target discovery engine.	Illumina HiSeq 2500 Illumina NovaSeq 6000	30
EGAD50000001157	This dataset contains multiplexed single-cell RNA sequencing (scRNA-seq) data from human thymic tissue, focusing on ILC1 progenitors and NK cell differentiation. The data were generated using the BD Rhapsody system and sequenced on the Illumina NovaSeq S4 platform. A demultiplexing file is required to assign sample identities.	Illumina NovaSeq X	4
EGAD50000001158	Between November 2010 and May 2014, 20 cases—17 thymomas and 3 thymic carcinomas (including 2 squamous and 1 neuroendocrine carcinoma)—were newly sequenced from surgically removed TETs at Seoul National University Hospital, with written informed consent. Tumors and normal tissues were carefully separated and immediately preserved in liquid nitrogen after resection. DNA and RNA were extracted from the tumors, adjacent normal tissues, and/or blood samples. Libraries for RNA-seq were prepared using the TruSeq Stranded mRNA LT Sample Prep Kit following standard Illumina protocols.	Illumina HiSeq 2500	40
EGAD50000001159	Between November 2010 and May 2014, 20 cases—17 thymomas and 3 thymic carcinomas (including 2 squamous and 1 neuroendocrine carcinoma)—were newly sequenced from surgically removed TETs at Seoul National University Hospital, with written informed consent. Tumors and normal tissues were carefully separated and immediately preserved in liquid nitrogen after resection. DNA and RNA were extracted from the tumors, adjacent normal tissues, and/or blood samples. Libraries for WES were prepared using the SureSelect XT (Human All Exon + UTR v5) Library Prep Kit following standard Illumina protocols.	Illumina HiSeq 2500	40
EGAD50000001160	Among the 20 newly sequenced cases, we conducted additional whole-genome sequencing (WGS) on five out of eight CN-type thymomas, excluding three with low tumor cell fractions (< 0.1). Libraries for WGS were prepared using the TruSeq DNA PCR-free Prep Kit following standard Illumina protocols.	HiSeq X Ten	10
EGAD50000001161	Samples associated to: "A phase Ib/II study of regorafenib and paclitaxel in patients with beyond first-line advanced esophagogastric carcinoma (REPEAT)". sWGS from 97 Esophagogastric Carcinoma samples. sWGS was done to carry out Copy Number Analysis. Dataset includes: .fastq, .bam and .bai files.	Illumina HiSeq 4000	40
EGAD50000001162	Exome sequencing (BAM and VCF files) in advanced urothelial carcinoma in diagnostic and post-treatment samples	Illumina NovaSeq 6000	70
EGAD50000001163	Associated with the study: Blood-based Monitoring of Relapsed/Refractory Hodgkin Lymphoma Patients Predict Responses to Anti-PD-1 Treatment. 26 ctDNA Samples from 4 patients followed longitudinally. sWGS performed for copy number aberration (CNA) analysis.	Illumina HiSeq 4000	26
EGAD50000001164	Associated with Molecular biomarkers in progression from refractory celiac disease to the lethal cancer variety enteropathy associated T cell lymphoma (EATL) Study. Whole Exome Sequencing (WES) of 63 samples. WES analysis done to detect mutations and copy number aberration predictive for progression from RCD to EATL.	Illumina NovaSeq 6000	63
EGAD50000001165	Associated with Molecular biomarkers in progression from refractory celiac disease to the lethal cancer variety enteropathy associated T cell lymphoma (EATL) Study. shallow Whole Genome Sequencing (sWGS) of 63 samples. sWGS was done to analyze copy number aberration.	Illumina NovaSeq 6000	63
EGAD50000001166	RNA sequencing of fibroblasts from pediatric patients with childhood epilepsy. Fibroblast RNA was extracted at passage 3-6 using MN nucleospin, and >90M 150bp paired-end reads per sample were obtained with Illumina sequencing. Reads were aligned to hg38.	Illumina NovaSeq 6000	41
EGAD50000001167	This dataset comprises BAM files from 18 control samples analyzed with the eSENSES targeted NGS panel. Spanning approximately 2 Mb, the panel features around 15,000 evenly distributed genome-wide SNPs, over 500 focal SNPs targeting key breast cancer driver regions, and more than 2,000 exons from 81 commonly altered genes.	Illumina NovaSeq 6000	18
EGAD50000001168	Targeted deep sequencing data of 386 T-ALL patients in several high-risk T-ALL genes. Samples were prepared according to Agilent's HaloPlex HS Target Enrichment Protocol and sequenced as 75bp paired-end reads. Data is available as paired-end FASTQ files for each sample.	NextSeq 2000	386
EGAD50000001171	Datasets associated with Young Boost Trial for Breast Cancer patients. Whole Exome Sequencing (WES) performed on 109 Samples (75 Tumor and 34 Normal). WES was performed to identify potential predictive biomarkers for treatment response.	Illumina HiSeq 4000	109
EGAD50000001172	Associated with Molecular profiling of DLBCL patients treated in the PETAL trial. Whole Exome Sequencing (WES) of 224 Samples (162 Tumor + 62 Normal). WES was used for Molecular Profiling by simultaneous screening of mutations. Data is Complementary to sWGS.	Illumina HiSeq 4000	224
EGAD50000001173	We undertook single-cell RNA sequencing to establish gene expression and lymphocyte receptor sequences of CSF and peripheral blood leukocytes derived from patients with a variety of inflammatory, infectious and non-inflammatory neurological disorders. The aim of the study was the investigation of functional changes of the cells involved in neuroinflammatory responses.	Illumina NovaSeq 6000	132
EGAD50000001174	This dataset contains tumor-normal WES data in BAM files acquired from patients with renal cancer tumors and treated with Sunitinib	Illumina HiSeq 2500	162
EGAD50000001175	Whole Genome Sequencing data of 100 Breast Cancer patients from "Ultrasensitive Detection and Monitoring of Circulating Tumor DNA using Structural Variants in Early-Stage Breast Cancer" study.	Illumina NovaSeq X	100
EGAD50000001176	Whole exome sequencing of cetuximab-sensitive and -resistant patient-derived xenograft models from two HNSCC patients.	Illumina NovaSeq 6000	6
EGAD50000001177	This dataset includes paired tumor and non-tumor WGS data from patients diagnosed with newly diagnosed (n = 6), refractory/relapsed multiple myeloma (n = 3) or plasma cell leukemia (n = 1). These samples presented >95% tumor PCs in bone marrow aspirates. Plasma cells and T cells (non-tumor cells) were isolated from bone marrow aspirates using a magnetic approach.	Illumina NovaSeq 6000	20
EGAD50000001178	This dataset includes Fastq files from bulk RNA seq from myeloma cell lines (MM.1S and NCI-H929) expressing a DOX-inducible LAMP5 shRNA, S1 (CACTTCAAAGACGCAGTCAGT) or S5 (GCACACAGAATACAACCTCAT), or a scramble control treated with DOX during 48h.	Illumina NovaSeq X	29
EGAD50000001179	This dataset includes FastQ files of single-cell RNA sequencing data of 5' GEX and VDJ libraries from CD138+ plasma cells enriched from 72 bone marrow aspirates from patients diagnosed with a monoclonal gammopathy from the multiple myeloma spectrum, including MGUS (n=13), SMM (n = 22), NDMM (n = 17), RRMM (n = 15) and PCL (n = 2); and 3 controls.	Illumina NovaSeq 6000	72
EGAD50000001180	The dataset "Labcorp® Plasma Detect™ assay: whole genome sequencing analyses of plasma cfDNA, white blood cells and FFPE tumor tissue" includes CRAM and CRAI files derived from WGS of 555x plasma cfDNA, 209x white blood cells (normal DNA) and 209x FFPE tumor tissue (tumor DNA) samples for 209 stage III colorectal cancer patients. WGS (150bp PE) was performed on a NovaSeq 6000 with a target depth of coverage of 80x for tumor DNA, 40x for normal DNA and 30x for cfDNA. Reads were aligned to GRCh37 (plasma cfDNA) or GRCh38 (tumor/normal DNA).	Illumina NovaSeq 6000	969
EGAD50000001181	This dataset contains RNA-Seq data of acute lymphoblastic leukemia in LLAG-0707 study. Diagnostic samples of 188 patients were sequenced. Reads were aligned to hg19 genome reference using STAR. Aligned bam files are provided in this dataset.	unspecified	188
EGAD50000001182	16S rRNA sequencing data of fecal samples of 1211 RODAM cohort participants in three research sites: rural Ghana, urban Ghana and Amsterdam, the Netherlands.	Illumina MiSeq	1211
EGAD50000001183	This dataset contains WGBS sequencing data of lymph node metastasis samples from prostate cancer. Sequencing was performed on Illumina HiSeq X Ten. The sequencing was always paired.	HiSeq X Ten	15
EGAD50000001184	This dataset contains whole exome sequencing of ETV::RUNX1 positive acute lymphoblastic leukemia. Germline and leukemia samples were sequenced in pairs. A total of 94 samples were sequenced and reads were aligned to hg19 genome reference. Aligned bam files are provided in this data set.	unspecified	94
EGAD50000001185	Single-cell RNA sequencing (scRNA-seq) was generated using the 10x Genomics Chromium Next GEM Single Cell 3ʹ platform	Illumina NovaSeq 6000	1
EGAD50000001186	Single-cell RNA sequencing (scRNA-seq) was generated using the 10x Genomics GEM-X Flex Gene Expression Multiplexed platform	Illumina NovaSeq 6000	20
EGAD50000001187	This dataset contains fecal metagenomic sequencing of Lifelines NEXT cohort samples (main batch sequenced until first data freeze, end of 2023).	Illumina HiSeq 2000	4577
EGAD50000001188	Dataset contains 12 ChIP-Seq samples sequenced on the Illumina NextSeq500 sequencer. 3 samples are control samples (HcAct-Chip 1,HcAct-Chip 2,HcAct-Imput 2) 9 samples are case samples (HcAct2DG-Chip 1,HcAct2DG-Chip 2,HcAct2DG-Imput 1,JIAAct-Chip1,JIAAct-Chip2,JIAAct-Imput2,JIAAct2DG-Chip1,JIAAct2DG-Chip2,JIAAct2DG-Imput2)	NextSeq 500	12
EGAD50000001189	Dataset contains 12 RNA-Seq samples sequenced on the Illumina NextSeq500 sequencer. 6 samples are controls (D1-Tact-Plasma,D2-Tact-Plasma,D3-Tact-Plasma,D1-TUnstim-Plasma,D2-TUnstim-Plasma,D3-TUnstim-Plasma) 6 samples are case (D1-Tact-SF,D2-Tact-SF,D3-Tact-SF,D1-TUnstim-SF,D2-TUnstim-SF,D3-TUnstim-SF)	NextSeq 500	12
EGAD50000001190	This dataset contains paired-end WES data of salivary gland tumor samples and control blood samples from 53 patients. Sequencing was performed on Illumina HiSeq 2500, HiSeq 4000 or NovaSeq 6000.	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000	106
EGAD50000001191	This dataset contains paired-end RNA-seq data of salivary gland tumor samples from 94 patients. Sequencing was performed on Illumina HiSeq 2500, HiSeq 4000, HiSeq X Ten or NovaSeq 6000.	HiSeq X Ten Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina NovaSeq 6000	95
EGAD50000001192	This dataset contains paired-end WGS data of salivary gland tumor samples and control blood samples from 50 patients. Sequencing was performed on Illumina HiSeq 2500, HiSeq 4000, HiSeq X Ten or NovaSeq 6000.	Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina HiSeq X Illumina NovaSeq 6000	101
EGAD50000001193	This dataset contains processed RNA count data for the 49 Heart Failure samples profiled by RNAseq for the Indian HFrEF Cohort from the study Integrated Transcriptomic and Regulatory RNA Profiling Reflects Complex Pathophysiology and Uncovers a Conserved Gene Signature in End Stage Heart Failure RNA-Seq data.	Illumina HiSeq 4000	49
EGAD50000001194	The dataset contains miRNA profiling from a cohort of 49 HFrEF patients. The myocardium samples were collected from these patients during heart transplant or LVAD implantation procedure. The RNA samples were collected in RNAlater. Small RNA libraries were prepared using QIAseq miRNA Library Kit, Qiagen and sequenced on Illumina HiSeq4000.	Illumina HiSeq 4000	49
EGAD50000001195	This dataset contains processed miRNA count data for the 49 Heart Failure samples profiled by miRNAseq for the Indian HFrEF Cohort from the study Integrated Transcriptomic and Regulatory RNA Profiling Reflects Complex Pathophysiology and Uncovers a Conserved Gene Signature in End Stage Heart Failure RNA-Seq data.	Illumina HiSeq 4000	49
EGAD50000001196	The dataset contains whole transcriptomics profiling from a cohort of 49 HFrEF patients. The myocardium samples were collected from these patients during heart transplant or LVAD implantation procedure. The RNA samples were collected in RNAlater. Total transcriptome libraries were prepared using TruSeq Stranded Total RNA Library Prep with Ribo-Zero Gold kit and sequenced on Illumina HiSeq4000.	Illumina HiSeq 4000	49
EGAD50000001197	This dataset includes single-cell RNA sequencing profiles from five mice bearing Group 3 medulloblastoma (CSCG) treated with oncolytic measles virus (MV-NIS) and three mice treated with MV-NIS combined with anti-PD1 therapy. Additionally, bulk RNA sequencing data were generated from 12 samples across PBS control, heat-inactivated (HI) control, MV mid-treatment, and treatment resistance conditions. Pediatric patient data were collected from the PNOC005 trial, which enrolled patients into three strata: Stratum A (focal recurrence, local MV-NIS administration post-surgical resection), Stratum B (disseminated recurrence, single MV-NIS dose via lumbar puncture), and Stratum C (disseminated recurrence, two consecutive MV-NIS doses via lumbar puncture on Day 0 and Day 7). Whole blood and PBMC samples were collected from 17 patients across all strata, generating 72 bulk RNA-seq datasets at multiple time points. All datasets are provided as raw sequencing data in FASTQ format.	Illumina HiSeq 4000 Illumina NovaSeq 6000	93
EGAD50000001198	Whole Exome Sequencing (WES) analysis was performed on three distinct tumor biopsies collected at different time points, along with their matched germline non-tumor sample, from the same LUAD patient. DNA was extracted using the DNeasy Blood and Tissue Kit (Qiagen, Germantown, MD) according to the manufacturer’s instructions and quantified using the Qubit Fluorometer assay (Life Technologies, Carlsbad, CA). To minimize FFPE-related sequencing artifacts (e.g., C:G > T:A transitions), the extracted DNA was treated with the DNA repair enzyme Uracil-DNA-Glycosylase (UDG) following the manufacturer’s protocol (New England Biolabs, Ipswich, MA). Whole-exome capture was performed with the Twist Human Core Exome + RefSeq + Mito-Panel kit (Twist Bioscience), in accordance with the manufacturer’s guidelines. Sequencing generated paired-end 100-bp reads on the Illumina NovaSeq 6000 platform. The resulting reads were aligned to the reference human genome (GRCh38) using the Burrows-Wheeler Aligner (BWA, v0.7.12).	Illumina NovaSeq 6000	4
EGAD50000001199	This dataset consists of 72 samples profiled on whole exome sequencing and 94 samples profiled on whole transcriptome sequencing. The Agilent SureSelect Human All Exon v6 kit (Agilent Technologies) was used for whole exome sequencing. For RNA-seq experiments, library preparation was conducted using the Tru-Seq Stranded Total RNA with Ribo-Zero Gold kit protocol (Illumina). Libraries were sequenced on a HiSeq4000 sequencer using the paired-end 150 bp read option. Sequences in cram format are provided.	Illumina NovaSeq X	166
EGAD50000001200	RNASeq fastq files from PC12 cells (derived from rat PPGL) cultivated in normoxia conditions (37˚C, 5% CO2 and 21% O2 balanced with N2) or in hypoxia conditions in a hypoxic incubator (37˚C, 5% CO2 and 1% O2 balanced with N2), for short time (12h, 24h, and 48h) or prolonged time (36 days). Samples at each timepoint were cultured in triplicate (total 24 paired end fastq files). Sequencing was performed with the Illumina NovaSeq6000.	Illumina NovaSeq 6000	24
EGAD50000001201	Fastq files from Whole Exome Sequencing of 12 paired germline DNA (from blood/saliva) and 14 PPGL tumor DNA (from formalin-fixed paraffin-embedded (FFPE) blocks from surgery) from patients with cyanotic congenital heart disease (CCHD) and pheochromocytoma and paraganglioma (PPGL). Sequencing was performed using the Illumina HiSeq4000.	Illumina HiSeq 4000	26
EGAD50000001203	Single-cell RNA-sequencing on malignant and benign tissue samples from untreated donors with colorectal adenocarcinoma and liver metastasis was performed in order to deduce tissue adaptive expression patterns of the disease. Matching tissue from five donors with untreated CRC and liver metastasis were sampled.	Illumina NovaSeq 6000	19
EGAD50000001204	Briefly, naive CD4+ T cells were isolated from PBMCs obtained from buffy coats from three healthy human donors, and differentiated into Th0, iTreg, Th2, Th1, Th17, and IFN-β-activated subsets. This allowed us to study the differentiation of naive CD4+ T cells into distinct helper subsets under the influence of a specific cytokine environment. Following 5 days of differentiation, T cells were left unstimulated or re-stimulated with phorbol 12-myristate 13-acetate (PMA) and ionomycin to assess their functional responses. Thus, this SUM-seq experiment consisted of 36 multiplexed conditions (3 donors x 6 stimulations x with/without re-stimulation). Single-cell chromatin accessibility and gene expression data is provided as demultiplexed fastq files.	unspecified	72
EGAD50000001205	We combined SUM-seq with arrayed CRISPR screening, modulating the expression of key lineage transcription factors (TFs) (GATA2 - mesoderm, SOX17 - endoderm, and NR4A2 - neuroectoderm) in hiPSCs via CRISPR interference (CRISPRi) or activation (CRISPRa) over a time course of spontaneous differentiation (days cultured in vitro: 0, 4, 12, and 18). This totalled to 54 samples (4 day/time points x 3 target TFs x 2-3 gRNAs per target x CRISPRi/a). Single-cell chromatin accessibility and gene expression data is provided as demultiplexed fastq files.	unspecified	108
EGAD50000001206	We stimulated iPSC-derived M0 macrophages with LPS and IFN-γ to induce M1 polarization or IL-4 to induce M2 polarization. To discern early and sustained responses at chromatin accessibility and gene expression levels, we collected samples at five time points along the two polarization trajectories; prior to stimulation (M0) and at 1-hour, 6-hour, 10-hour, and 24-hour intervals, each sampled in duplicates totaling 18 samples, and subjected them to SUM-seq library preparation. Sequenced files for both data modalities are provided as demultiplexed fastq files.	Illumina NovaSeq 6000	36
EGAD50000001208	We used the tuberculin skin test (TST) as human challenge model to study the temporal evolution of the immune response to Mycobacterium tuberculosis (Mtb) antigens in vivo. Study participants comprised healthy HIV seronegative adults, 18-60 years of age. Latent tuberculosis (TB) infection was defined as immune memory for Mtb-specific antigens identified by positive peripheral blood IFNg release assays, but no clinical or radiological evidence of active TB. Two units tuberculin each were injected intradermally into the contralateral forearms of participants. After 2 days (n=216) or 7 days (n=158), 3 mm skin punch biopsies were taken from the injection sites and processed for whole genome transcriptional profiling by bulk RNA sequencing. The time points reflect maximum clinical inflammation (day 2) and maximum T cell infiltration (day 7) of the TST. TST samples were compared to skin biopsies taken two days after control injection of saline in a separate group of individuals (n=33), comprising healthy volunteers as well as patients with active or latent TB, ranging from 18-75 years of age. This dataset includes 407 samples from 256 individuals (n=33 saline, n=223 with Day 2 and/or Day 7 TST). Skin biopsies were collected into RNAlater (Qiagen) for RNA extraction. Total RNA was obtained with the Precellys Evolution homogenizer (Bertin Instruments) and Qiagen RNeasy mini kit. RNA was subjected to DNase treatment to remove contaminating genomic DNA using a TURBO DNA-free kit (Ambion, Life Technologies). The KAPA mRNA HyperPrep Kit (Roche Diagnostics) was used to construct stranded mRNA-Seq libraries from up to 500 ng intact total RNA. Paired-end sequencing was performed on the Illumina Nextseq using the Nextseq 500/550 High Output 75 cycle kit (Illumina). Runs were demultiplexed using bcl2fastq by Illumina.	NextSeq 500	407
EGAD50000001209	Whole exome sequencing (WES) on early oral squamous cell carcinoma (OSCC) with clear margins, and on resection margins, 86 patients with a median follow-up of 58 months (range 30.4-83 months). Raw data (fastq format).	Illumina NovaSeq 6000	241
EGAD50000001210	This dataset contains raw sequencing data of mRNA from granulosa cell samples from 8 human patients undergoing in vitro fertilization. All patients were normal responders referred for male infertility and were grouped into two age groups (younger than 31 or older than 38 years old). Each sample contains all granulosa cells collected from one patient on one collection date. Raw sequencing fastq files are available in this dataset, count tables are available in ArrayExpress (E-MTAB-13496).	NextSeq 2000	8
EGAD50000001211	CITE-seq dataset of human bone marrow from four healthy donors. Single-cell RNA sequencing was performed using the 10x Genomics 3’ v3 platform. Approximately 7,000–10,000 cells were loaded in each 10x channel, after which the single-cell libraries were prepared according to manufacturer’s instructions, except for the added CITE-seq steps. Demultiplexed FASTQ files are provided.	NextSeq 500	4
EGAD50000001212	This research project was a collaboration between Cardiff University, UK and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 1,105 Bipolar case/control samples from collaborators in UK. Genomic DNA from each sample was sequenced to a mean depth of 20x. The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing CRAM files.	HiSeq X Ten	1106
EGAD50000001213	MCL cells were isolated from different tissues of five patients with MCL at diagnosis, and at the time of the first clinical relapse after failure of standard immunochemotherapy based on alternation of R(ituximab)-CHOP and R-high-dose-araC. All patients signed informed consent according to the Declaration of Helsinki. Four patients were diagnosed with classical nodal aggressive MCL, and all these patients experienced early relapse (i.e., progression of disease POD < 12 months). One patient was diagnosed with indolent non-nodal MCL and relapsed 9 years after initial MCL diagnosis. The dataset presents single cell RNA sequencing data as well as whole exome sequencing data for these five patients.	Illumina NovaSeq 6000 NextSeq 500	28
EGAD50000001214	SCRNAseq and TCRseq are performed using 10X genomics technology on lymphocytes from tumor of 8 individuals (4 different conditions and 2 replicates per condition): - ISO : isotype sample_1 - ISO_dup :isotype sample_2 aPD1 : sample anti-PD1_1 aPD1_dup :sample anti-PD1_2 aCTLA4: sample anti-CTLA4_1 aCTLA4_dup :sample anti-CTLA4_2 combo : sample_anti-PD1_anti-CTLA4_1 combo_dup : sample_anti-PD1_anti-CTLA4_2	Illumina NovaSeq 6000	16
EGAD50000001216	Both control and vascularized organoids were processed using the 10x Chromium 3' RNA method. Sequencing reads were aligned with STARsolo.	Illumina NovaSeq 6000	8
EGAD50000001217	Whole exome sequencing data in FHHNC patients stratified in extreme phenotypes to identify potential phenotype modifier variants	Illumina NovaSeq 6000	36
EGAD50000001218	The dataset includes FASTQ source files of scRNA-seq for 56 CRC patients, sequenced with Illumina 10x platform. All patients were diagnosed with colorectal cancer between 2014 and 2022 at the Gustave Roussy Institute. Cohort includes 32 samples from untreated patients and 24 samples from patients who received chemotherapy. Chemotherapy included FOLFOX or FOLFIRI as first line, in some cases in combination with anti-VEGF or anti-EGFR drugs (avastin, bevacizumab or cetuximab) as the second and third lines. The post-treatment cohort included 18 responders (partial response, stable disease and complete response) and 6 progressors (progressive disease).	Illumina HiSeq 1000	56
EGAD50000001220	Targeted Illumina sequencing of 41 tumor tissue samples and 3 urine tumor DNA samples from 34 patients with Lynch syndrome associated urothelial cancer. Target capture was performed using the UroScout hybridization capture panel targeting the coding regions of 25 urothelial cancer genes. Sequencing libraries were constructed using the Twist Biosciences Enzymatic Fragmentation DNA Library Prep kit. Sequencing was performed using an Illumina NovaSeq 6000 instrument with 2x150 bp paired sequencing.	Illumina NovaSeq 6000	44
EGAD50000001222	This dataset includes the snRNA-seq raw sequencing files from the placentas of 4 normal weight control and 8 obese individuals. The files are paired-end fastq files. The snRNA-seq libraries were generated with Chromium Single Cell 3’ kit v3.1 (10X Genomics) and sequenced on Illumina NovaSeq 6000 at Novogene, UK.	Illumina NovaSeq 6000	12
EGAD50000001223	This dataset includes the Prime-seq raw sequencing data of the 3 replicates of trophoblast organoids co-cultured with adipose spheroids and 3 without adipose spheroids. The libraries were generated according to the published Prime-seq library preparation protocol, and were multiplexed with all the samples from the experiment. The libraries were sequenced on Illumina NovaSeq 6000 at Novogene. The files are paired fastq files.	Illumina NovaSeq 6000	1
EGAD50000001224	6 scRNA-seq and 4 scATAC-seq datasets on cardiac fibroblasts derived from two patients (P1, P2). The sequencing dataset consists of two control samples pretreated with vehicle (resting, only scRNA-seq data available); two samples stimulated with TGFb; and two samples pre-treated with KAT5 inhibitor, followed by stimulation with TGFb (iKAT5).	NextSeq 2000	6
EGAD50000001225	Tumour biopsies were obtained from uveal melanoma patients (n = 35) pre and 16 days post treatment with tebentafusp. Biopsies, which were either snap frozen or put in RNA later, were analysed by bulk RNA sequencing (50 million reads per sample) using the Illumina NovaSeq system	Illumina NovaSeq X	70
EGAD50000001226	Monocyte derived macrophages were polarised into M1 or M2 using IFNg+ ILP or IL-4 respectively (n = 4). Pan T cells were cultured with or without IL-2 for 4 days (n = 4-6). Pan T cells were co-cultured with THP1 tumour cells for 20 hours in the presence or absence of ImmTAC molecules and in the presence or absence of M2 macrophages. Sorted T cells and macrophage populations prior and post co-culture were analysed by bulk RNA sequencing (60 million reads per sample) using the Illumina NovaSeq system	Illumina NovaSeq X	52
EGAD50000001227	The dataset includes FASTq files and normalised gene expression counts from bulk RNA-sequencing of pre-treatment tumour (n=39) and adjacent non-tumour tissue samples (n=13) from advanced hepatocellular carcinoma (HCC) patients treated with atezolizumab plus bevacizumab. Specifically, RNA was extracted from the FFPE tumour tissue using RNeasy Plus Mini Kit (Qiagen), according to the manufacturer’s instructions. RNA quality and quantity were evaluated using Agilent RNA 600 Pico Chips (Agilent Technologies). Sequencing libraries were prepared using the SMARTer Stranded Total RNA-seq kit v3 (Takara Bio). Sequencing was performed on an Illumina NovaSeq 6000 system. After obtaining the Fastq files, read 2 UMIs and adapters were trimmed using seqtk (v1.2), reads were aligned using STAR (v2.7.7a) to the human reference genome (GRCh38) and quantified with HTSeq (v2.0.2).	Illumina NovaSeq 6000	51
EGAD50000001229	This dataset contains genomic profiles of 7 PPM1D-mutated patients across the spectrum of myeloid disorders using single-cell analyses (Tapestri technology) on diagnostic and longitudinal samples.	Illumina NovaSeq 6000	7
EGAD50000001231	This study performed long-read whole genome sequencing on the Oxford Nanopore Technologies platform to detect causative structural variants in patients with non-syndromic autism spectrum disorder. The study was performed on 23 such children in whom prior karyotyping, Fragile-X analysis (in males), chromosomal microarray and whole exome sequencing did not identify a causative variant.	MinION	23
EGAD50000001232	This raw sequencing dataset includes 10x Genomics 5' scRNA-seq, TCR-seq, BCR-seq, and CITE-seq profiles of cells isolated from the deep cervical lymph nodes of three healthy controls.	Illumina NovaSeq 6000	3
EGAD50000001233	This raw sequencing dataset includes 10x Genomics 5' scRNA-seq, TCR-seq, BCR-seq, and CITE-seq profiles of cells isolated from the deep cervical lymph nodes of six multiple sclerosis (MS) patients.	Illumina NovaSeq 6000	6
EGAD50000001234	The dataset includes Illumina (n=1) and Ultima (n=92) sequencing of circulating cell-free DNA. The dataset was generated through standard whole genome sequencing (Illumina n=1 & Ultima n = 15) and using a duplex unique molecular identifier whole genome sequencing workflow (Ultima, n=77). This dataset includes cancer samples and cancer-free control samples.	HiSeq X Ten unspecified	93
EGAD50000001235	10X 3' (RNA) samples of isolated single nuclei from highly inflamed leptomeninges and adjacent grey matter from 3 MS patients used in the paper.	Illumina NovaSeq 6000	6
EGAD50000001236	Methylome and the corresponding transcriptome of autoproliferating/Dim B cells and resting/hi B cells in Multiple sclerosis patients	Illumina NovaSeq 6000	10
EGAD50000001237	10X 3' (RNA) and 10X 5' (VDJ) samples of B cells (non-proliferating/hi and autoproliferating/Dim) for healthy donor and Multiple sclerosis patients used in the paper.	Illumina NovaSeq 6000	126
EGAD50000001238	Advances in whole-genome sequencing (WGS) have significantly enhanced our ability to detect genomic variants underlying inherited diseases. In this study, we performed long-read WGS on 24 patients with inherited retinal dystrophies (IRDs) to validate the utility of nanopore sequencing in detecting genomic variations. We confirmed the presence of all previously detected variants and demonstrated that this approach allows for precise refinement of structural variants (SVs). Furthermore, we could perform genotype phasing by sequencing only the probands, confirming variants were inherited in trans. Moreover, nanopore sequencing enabled the detection of complex variants, such as transposon insertions and structural rearrangements. This comprehensive assessment illustrates the power of long-read sequencing in capturing diverse forms of genomic variation and in improving diagnostic accuracy in IRDs.	PromethION	23
EGAD50000001239	NGS dataset including patients affected by genomic variants at gene PRPH2. It includes NGS test using custom IRD gene panels, clinical exome sequencing (CES) and whole-exome sequencing (WES).	NextSeq 550	61
EGAD50000001240	DNA sequencing of sgRNAs in CRISPR-Cas9 screening and RNA sequencing of SF3B4-overexpressing liver organoids	Illumina NovaSeq 6000 Illumina NovaSeq X Sequel II	18
EGAD50000001242	LAM-HTGTS dataset after DIS3 ASO or Control ASO treatment on human cells under plasmablast differentiation.	Illumina MiSeq	8
EGAD50000001243	RNAseq experiment after DIS3 ASO or Control ASO treatment on human cells under plasmablast differentiation.	Illumina NovaSeq 6000	26
EGAD50000001245	Please refer to the Supplemental Materials of the paper for more information.	Illumina NovaSeq 6000	13
EGAD50000001246	Ancient individual from Picuris Pueblo. Please refer to paper for more information.	Illumina NovaSeq 6000	16
EGAD50000001247	The dataset includes 152 FASTQ files from paired-end WXS sequencing on Illumina HiSeq2500 or Novaseq 6000 for 31 pMMR patients. This includes 31 tumor DNA and 31 blood germline samples.	Illumina HiSeq 2500 Illumina NovaSeq 6000	62
EGAD50000001248	The dataset includes 29 FASTQ files from single-end total RNA sequencing on Illumina HiSeq2500 and 66 files from 33 paired-end RNA sequencing samples on Novaseq 6000 for 31 pMMR colon cancer patients.	Illumina HiSeq 2500 Illumina NovaSeq 6000	62
EGAD50000001250	Whole Exome Sequencing (WES) of 153 FFPE DNA samples of Peripheral T-Cell Lymphoma. Associated with "Molecular biomarkers for stratification periheral T cell lymphoma" study.	Illumina NovaSeq 6000	153
EGAD50000001251	This dataset will include Spatial Transcriptomics, Single-Cell RNA-Seq, Bulk RNA-Seq, Clinical data, WES, and H&E data from 15 Muscle-invasive Bladder Cancer patients, treated with upfront cystectomy. Researchers from private or public institutions outside the MOSAIC Consortium will be able to apply to access this data and, pending approval, use the data for their research.		15
EGAD50000001253	The investigators would like to correlate NGS data from OAK to immunophenotype designation performed manually based on a panCK/CD8 immunohistochemistry stain for a subset of squamous cell carcinomas.		1
EGAD50000001254	This dataset comprises low-coverage (~0.5X) whole genome sequencing data from 179 single cells isolated from paired meningioma samples (88 cells from initial grade II and 91 cells from recurrent grade III tumor from the same patient). The single-cell libraries were constructed using Primary Template Amplification and ResolveDNA protocol, with libraries subsequently sequenced on NextSeq 1000 using 2x150bp paired-end reads targeting 8M reads per cell. This dataset includes the sequencing data in Fastq format.	NextSeq 1000	2
EGAD50000001255	Inherited retinal dystrophies (IRD) are a group of rare diseases that cause a large disability rate. THRB (MIM*190160), located on 3p24.2, encodes the thyroid hormone receptor beta (TRβ). THRB mutations may be associated with human cone disorders, as this gene regulates the expression of red cone opsins. The aim of this work is to refine the ophthalmologic phenotype and describe a novel variant in THRB associated with IRD, thus confirming the role of this gene in the etiopathogenesis of IRD.	Illumina HiSeq 2000	5
EGAD50000001257	In this dataset, samples correspond to pools of genetically unique biospecimens derived from individuals with or without Type 1 Diabetes. Each sample alias contains all the DNA test numbers for that pool. Each experiment corresponds to scmxGEX (multiome RNA gene expression), scmxATAC (multiome ATAC sequencing), scRNA (single cell RNA sequencing), scATAC (single cell ATAC sequencing) for each pool. Runs link the appropriate fastq files to each experiment. Deconvolution using genetic information is required for single sample-based testing.	Illumina NovaSeq 6000	7
EGAD50000001258	The dataset includes 141 RNA-seq samples from mUM patients. Tumour biopsies were obtained from uveal melanoma patients pre and post treatment with tebentafusp. Biopsies, which were either snap frozen or put in RNA later, were analysed by bulk RNA sequencing (50 million reads per sample) using the Illumina NovaSeq system.	Illumina NovaSeq X	141
EGAD50000001259	Small RNA sequencing data for 3 patients with type 2 diabetes and 3 patients without diabetes. Single-end sequencing was performed with a read length of 100.	Illumina NovaSeq 6000	6
EGAD50000001260	Paired-end RNA-seq FASTQ files from amniotic fluid samples of 49 extreme premature births, affected and unaffected by the fetal inflammatory response.	NextSeq 500	49
EGAD50000001261	TIRE-seq is a novel RNA sequencing method that integrates mRNA purification directly into library preparation, eliminating separate RNA extraction. The technique demonstrated utility across three biological applications, including capturing transcriptional changes in human T cells, identifying genes driving murine dendritic cell differentiation, and analyzing the dose-response effects of temozolomide on patient-derived neurospheres. With its simplified approach and high sequencing efficiency, TIRE-seq offers a cost-effective solution for large-scale gene expression studies.	NextSeq 2000	1
EGAD50000001267	This dataset comprises paired-end and time-matched FastQ files from 356 cell-free DNA samples and 356 white blood cell DNA samples, derived from 299 patients with metastatic renal cell carcinoma or urothelial carcinoma. A baseline sample was available for all patients. Sequencing was performed using Illumina technology.	Illumina NovaSeq 6000	712
EGAD50000001268	Enzymatic methylation sequencing data generated from cell-free tumor DNA extracted from cerebrospinal fluid (CSF). CSF was gathered from pediatric brain tumor patients of varying diagnoses treated at Connecticut Children's Medical Center, or from the Children's Brain Tumor Network. After DNA extraction, libraries were prepped using the Enzymatic Methyl-seq V2 kit (New England Biolabs).	Illumina NovaSeq X	17
EGAD50000001269	Methylome of resting/hi and autoproliferating/Dim memory B cells in Multiple sclerosis patients. In brief, non-proliferating (CFSEhi) and AP+ (CFSEdim) CD19+ CD27+ B cells from PBMCs of 6 natalizumab-treated female RRMS (NAT) patients were sorted following 7 days of autoproliferation. Methylome libraries for 12 samples were constructed by applying the post-bisulfite adaptor tagging (PBAT) method on lysed cells and sequencing was conducted on Illumina NovaSeq 6000. CpGs with minimum 10X coverage were used for differential methylation position (DMP) analysis.	Illumina NovaSeq 6000	12
EGAD50000001270	Transcriptional profile of resting/hi and autoproliferating/Dim memory B cells in Multiple sclerosis patients. In brief, non-proliferating (CFSEhi) and AP+ (CFSEdim) CD19+ CD27+ B cells from PBMCs of 5 natalizumab-treated female RRMS (NAT) patients were sorted following 7 days of autoproliferation. Transcriptome libraries were constructed from the same samples, except for one sample pair, which was excluded due to limited input material (5 donors as for Methylome, D1; D2, D3, D5, D6; samples from donor D4 unfortunately technically failed). SMARTer Total Stranded RNA-seq library construction with ribosomal depletion was performed and then sequenced on NovaSeq S4 lane.	Illumina NovaSeq 6000	10
EGAD50000001271	The dataset contains raw data of 27 bulk RNA-sequencing data from 9 cHL cell lines. Paired-end fastq with polyA capture are available for each cell line.	NextSeq 2000	9
EGAD50000001272	The dataset contains 36 ChIP-seq data for H3K27Ac and their corresponding inputs for 9 cHL cell lines. Data are raw paired-end fastq files for each histone and input, and for each cell line.	NextSeq 2000	9
EGAD50000001273	The dataset contains 21 ATAC-seq samples from 9 cHL cell lines. Files within the dataset are raw fastq files for each cell line.	NextSeq 2000	7
EGAD50000001274	RNA from snap-frozen breast tissue biopsies was purified after lysing by tissuelyser (Qiagen) and using RNA Purification Plus Kit (Norgen biotek CORP, 47700) with additional on-column DNase-I treatment (Qiagen, 79254) at 27 °C. RNA purity and integrity (RIN) were quantified using RNA 6000 Nano kit (Agilent Technologies, 5067-1511) on the 4200 TapeStation (Agilent, Santa Clara, USA). Library preparation and 2x75bp paired-end sequencing of 160 ng total RNA input was performed using Illumina Stranded Total RNA Prep Ligation kit and Illumina HiSeq4000 system (Illumina, San Diego, CA, USA). RNA sequencing data from HiSeq4000 were quality checked and aligned to GRCh38 (GCA_000001405.15) reference genome using HISAT2 2.0.5 and submitted to subread v.1.5.2 for feature counts calculation. Finally, 36 paired biopsies samples (metformin n=26 and placebo n=12) were sequenced and included in the final analyses.	Illumina HiSeq 4000	72
EGAD50000001275	Five tumor-normal human breast cancer cases driven by APOBEC mutagenesis, including normal tissue, primary, and metastatic samples for each patient.	Illumina HiSeq 2000	15
EGAD50000001278	This study contains bulk and single nucleus RNA-seq of human epiglottis and subglottis tissue.	Illumina NovaSeq X	211
EGAD50000001279	This dataset contains single nucleus RNA-seq of human epiglottis and subglottis tissue performed using the 10X platform	unspecified	29
EGAD50000001280	CANCAP03 single-nuclear RNA sequencing. Tissue punch cores for six patients, 3 from each of olaparib and olaparib and degarelix cohorts, were processed for nuclear isolation and extraction, prior to single-nuclear RNA sequencing. 10x libraries were constructed. Sequencing on NovaSeq 6000. FASTQ files were processed using CellRanger pipeline. Outputs from the CellRanger pipeline were analysed using the Seurat package in R.	Illumina NovaSeq 6000	6
EGAD50000001285	DNA fragmentation from M116 blood sample (350 bp), library preparation, bisulfite conversion, and paired-end-150 whole genome bisulfite sequencing were conducted by Novogene®. Sequencing was carried out with a 10X coverage (approximately 60 Gb and 200 million reads per sample) using the NovaSeq X Plus platform (Illumina). File types included in this dataset are raw Fastq files.	Illumina NovaSeq X	1
EGAD50000001286	Whole Genome Sequencing (WGS) was performed using DNA extracted from M116’s blood, saliva, and urine samples. DNA was fragmented to 350 bp, followed by library preparation, paired-end 150 bp WGS, and variant calling, all conducted by Novogene®. Sequencing was carried out with a 10X coverage (approximately 60 Gb and 200 million reads per sample) using the NovaSeq X Plus platform (Illumina). File types included in this dataset are raw Fastq files.	Illumina NovaSeq X	4
EGAD50000001287	Genomic DNA was obtained from M116 peripheral blood sample and was used for targeted deep sequencing (TDS) studies. Barcoded libraries were prepared according to the manufacturer’s instructions, using a probe-based panel (KAPA HyperCap, Roche®) targeting frequently mutated regions of 50 myeloid-related genes. Samples were run on a MiSeq (Illumina®) sequencer for paired-end 2x75 bp reads with a mean coverage of 1000X.	Illumina MiSeq	1
EGAD50000001288	M116 stool was collected on 3 different days (3 biological replicates) and kept at -20ºC until DNA extraction (Biobanc IDIBGI and Goodgut®). Total genomic DNA was extracted from 150–200 mg of each stool sample after homogenisation using the DNeasy Powersoil Pro kit (Qiagen®) according to manufacturer’s instructions. Quality and quantity of DNA were evaluated using Qubit® BR kit on a Qubit®2.0 fluorimeter (ThermoFisher Scientific®) and on a Nanodrop ND-2000 UV-Vis spectrophotometer (ThermoFisher Scientific®). The v3-v4 region of the bacterial 16S rRNA gene was amplified and sequenced (paired-end 250-bp) following standard practices at external facilities (Novogene®) using previously described primers 515F. File types included in this dataset are raw Fastq files.	Illumina NovaSeq 6000	3
EGAD50000001289	scRNA-seq was performed using Chromium Next GEM Single Cell 3’ Kit v3.1 (10X Genomics) according to the manufacturer’s instructions. Briefly, M116's PBMCs were thawed and quantified in order to calculate the number of cells to be loaded. Then, barcoded Single Cell 3ʹ Gel Beads, a master mix containing PBMCs, and Partitioning Oil were combined onto the Chromium Next GEM Chip G to generate Gel Beads-In-Emulsion (GEMs), where polyadenylated mRNAs were reverse transcribed. This way, all generated cDNAs from the same cell shared a common 10X Barcode. Following the reverse transcription, GEMs were broken and cDNAs were amplified and cleaned up with SPRIselect beads (Beckman Coulter®). Next, a portion of these cDNAs was enzymatically fragmented and subjected to adaptor ligation before being used as a PCR template for the incorporation of i5/i7 indexes. Finally, libraries were purified with SPRIselect beads, quantified and quality checked using the 2200 TapeStation (Agilent®), and subjected to paired-end 150-bp sequencing (Novaseq systems, Illumina®) following standard practices at external facilities (Novogene®). File types included in this dataset are raw Fastq files.	Illumina HiSeq X	1
EGAD50000001290	The dataset contains 10x Chromium single-cell transcriptomics data from 43 samples from 38 different patients with ovarian cancer. 36 samples were collected from the tumor, 7 samples from ascites. Sequencing was performed in paired-end mode and sequencing data is provided in fastq format.	Illumina NovaSeq 6000 NextSeq 500	43
EGAD50000001291	The dataset contains 10x Chromium single-cell DNA sequencing data from 42 samples from 38 different patients with ovarian cancer. 35 samples were collected from the tumor, and 7 samples from ascites. Sequencing was performed in paired-end mode and sequencing data is provided in fastq format.	Illumina NovaSeq 6000 NextSeq 500	42
EGAD50000001292	The Tumor Profiler Study is an observational trial combining a prospective diagnostic approach to assess the relevance of in-depth tumor profiling to support clinical decision-making with an exploratory approach to improve the biological understanding of the disease. This dataset contains scDNA-seq data for 3 ovarian cancer samples from the Tumor Profiler Study used to validate LongSom. LongSom is a computational workflow leveraging high-quality long-read scRNA-seq data to call de novo somatic single-nucleotide variants (SNVs), including in mitochondria (mtSNVs), copy-number alterations (CNAs), and gene fusions, to reconstruct tumor clonal heterogeneity.	Illumina NovaSeq 6000	3
EGAD50000001293	The dataset contains 10x Chromium single-cell transcriptomics data from 6 samples from 4 different patients with ovarian cancer. 3 samples were collected from the tumor, 3 samples from ascites. Sequencing was performed in paired-end mode and sequencing data is provided in fastq format. This dataset contains additional samples from patients not classified as having high-grade serous adenocarcinoma.	Illumina NovaSeq 6000 NextSeq 500	6
EGAD50000001294	The dataset contains 10x Chromium single-cell DNA sequencing data from 6 samples from 5 different patients with ovarian cancer. 4 samples were collected from the tumor, and 2 samples from ascites. Sequencing was performed in paired-end mode and sequencing data is provided in fastq format. This dataset contains additional samples from patients not classified as having high-grade serous adenocarcinoma.	Illumina NovaSeq 6000 NextSeq 500	6
EGAD50000001295	69 tissue samples from various parts of the developing human embryo brain were dissociated and single cells were collected and processed without bias for mRNA-seq using 10X chromium 3' protocol. Libraries were sequenced on Illumina NovaSeq and reads aligned against the human GRCh38 genome. This is an addition to EGAD00001006049 Human development single cell sequencing, consisting of samples added after the original submission to EGA, but included in final paper.	Illumina NovaSeq 6000	69
EGAD50000001296	8 Fastq files of single-cell RNAseq data from tumor-on-chip (TOC) after 42h of TNBC and CAF mono- or co-culture	Illumina NovaSeq 6000	4
EGAD50000001297	This dataset was derived from 360 formalin-fixed paraffin-embedded (FFPE) samples distributed across (i) 118 primary melanomas, (ii) 132 nevi and (iii) 110 melanocytic tumors. Next generation sequencing (NGS) libraries were prepared following the FFPE DNA Archer VariantPlex Somatic Protocol for Illumina. Input DNA was determined by the PreSeq DNA QC assay, with 100 ng of DNA used for samples with high quality scores, 200 ng for low quality scores and 300 ng for samples with bad quality scores as defined by the PreSeq DNA QC assay. NGS libraries were quantified using the KAPA Library Quantification Kit for Illumina (KK4824) and sequencing performed on the Nextseq 500/2000 without custom primers using paired-end sequencing (151bp for Read 1 and Read 2) with index reads (8bp for Index Read 1 and Index Read 2). For each sample, resulting paired-end fastq files are available as part of this dataset. The prefix of each sample name indicates the type of sample: NAE: nevi samples. MEL: Melanoma samples. BL: melanocytic tumor.	unspecified	360
EGAD50000001298	Pre-processed Seurat objects with cellular and functional annotation information from the scRNAseq study on PC fusion biopsy samples. The data consists of rds files composed of normalised counts matrices, normalised and scaled counts, functional enrichment counts at the cellular level and metadata with patient grouping and cellular annotation variables. There are 3 .rds files, one with the total cells (SCP_data_analysis), with the selection of cells annotated as cancerous (selected_cancer_analysis) and cancer-associated fibroblasts (selected_caf_analysis).		31
EGAD50000001299	This dataset has RNA-Seq fastq files which were generated as part of the snRNA splicing signature study. Analysis can be performed to regenerate splicing signatures found for RNU4-2, RNU5B-1 and RNU5A-1 by comparing cases versus controls.	NextSeq 550	46
EGAD50000001300	Targeted NGS sequencing data (bam and index files aligned to hg38) for a total of 573 newly diagnosed, untreated RA patients and 163 healthy controls. The sequencing panel included 65 genes recurrently mutated in myeloid malignancies, to capture clonal hematopoiesis of indeterminate potential (CHIP) mutations. The median target coverage was 1700x across samples.	Illumina NovaSeq 6000	736
EGAD50000001301	Dataset includes scRNA-sequencing data of cells isolated by dissociation of ~20 BJ WT day 20 hiPSC-derived MN embryoid bodies generated in ULA 96w plates following the protocol described in Rodriguez-Muela et al. (JCI 2018). Droplet-based scRNA-seq was performed using the 10X Genomics Chromium Single Cell Kit v3. The resulting sequencing reads were processed as described in Buchner et al 2024. Analysis of the count matrices was performed using the R toolkit for single cell genomics, Seurat v5.1.0, with R v3.3.3 “Angel Food Cake.”	Illumina NovaSeq 6000	1
EGAD50000001302	ONT whole-genome sequencing data for "HPV integration induces gene fusions". We sequenced five HPV-positive head and neck cancer samples using the Oxford Nanopore platform. The sequences were submitted in fastq format.	PromethION	5
EGAD50000001303	RNA-seq data for "HPV integration induces gene fusions". We performed RNA-seq analysis of five HPV+ head and neck cancer samples using Illumina short reads. Sequences are submitted in bam format. We also sequenced one sample with PacBio long reads, and the reads are submitted in fastq format.	Sequel II	5
EGAD50000001304	PacBio whole-genome sequencing data for "HPV integration induces gene fusions". We performed long read whole-genome sequencing on four HPV+ head and neck cancer samples using PacBio HiFi. The sequence reads were submitted in fastq format.	Sequel II	4
EGAD50000001305	Illumina whole-genome sequencing data for "HPV integration induces gene fusions". We performed short read whole-genome sequencing of five HPV+ head and neck cancer samples using Illumina. The reads are submitted in bam or fastq file format.	Illumina NovaSeq 6000	5
EGAD50000001306	This dataset consists of 67 bulk RNA-seq data (sorted aligned BAM files) from formalin-fixed paraffin-embedded (FFPE) oropharynx tumor samples from the University of Michigan (UM). Samples were collected as part of the UM Head and Neck SPORE between 2008-2014. Library preparation was performed using the ribo-depletion method with the FastSelect kit and sequenced on the NovaSeq 6000 at the UM Advanced Genomics Core, resulting in 150bp paired-end reads. Raw reads were trimmed using Cutadapt (v3.4) and aligned to hg38 and high-risk HPV genomes using STAR (v2.7.9a). 62 of 67 samples were determined to be HPV RNA-positive.	Illumina NovaSeq 6000	67
EGAD50000001307	This submission contains raw ATAC and RNA sequencing data from human primary CD4+ and CD8+ T cells, either stimulated or unstimulated with anti-CD3/CD28 Dynabeads, from three healthy donors. It includes 24 paired-end FASTQ files consisting of 12 poly(A)-enriched RNA-seq samples and 12 ATAC-seq samples sequenced on the Illumina NextSeq 500 platform.	NextSeq 500	24
EGAD50000001308	This dataset includes sample numbers TSI_0426 to TSI_0527 for the PANGEA (PCR amplicon) cohort of the study. Samples were generated by pooling four overlapping PCR amplicons (see Gall et al 2012, DOI: 10.1128/JCM.01516-12). The sequencing was conducted using the Illumina NovaSeq system. Each sample has a BAM file, and its associated reference file.	Illumina NovaSeq 6000	102
EGAD50000001309	This dataset includes sample numbers TSI_0001 to TSI_0202 and TSI_0316 to TSI_0425 for the PANGEA (veSEQ-HIV) cohort of the study. Samples were sequenced using the Illumina NovaSeq system and the veSEQ-HIV method (see Bonsall, Golubchik et al 2020, DOI: 10.1128/JCM.00382-20). Each sample has a BAM file, and its associated reference file.	Illumina NovaSeq 6000	312
EGAD50000001310	This dataset includes sample numbers TSI_0203 to TSI_0315 for the BEEHIVE cohort of the study. Samples were sequenced using the Illumina NovaSeq system and the veSEQ-HIV method (see Bonsall, Golubchik et al 2020, DOI: 10.1128/JCM.00382-20). Each sample has a BAM file, and its associated reference file.	Illumina NovaSeq 6000	113
EGAD50000001311	This dataset contains WES bam/bai files associated with 133 patients as part of the LEMA Study	Illumina NovaSeq 6000	133
EGAD50000001312	This dataset contains total RNA-seq of 10 patient samples: 5 castration resistant prostate cancer (CRPC) and 5 neuroendocrine prostate cancer (NEPC). Total RNA-seq libraries were constructed using a ribosomal RNA-depleted random hexamer-primed RNA library preparation kit and sequenced on Illumina at paired-end 150bp. Total RNA-seq allows for circular RNA and mRNA analyses. Each sample was sequenced on two separate lanes; therefore, two pairs of raw fastq files (4 files in total) are provided for each sample.	Illumina HiSeq 2000	10
EGAD50000001314	This dataset contains RNA sequencing data of two hepatoblastoma PDX cell models. Cells were either treated with KSP inhibitor filanesib or with DMSO (controls).	unspecified	16
EGAD50000001316	This research project was a collaboration between University College London, UK and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 1,623 Schizophrenia, Bipolar and Control samples from collaborators in UK. Genomic DNA from each sample was sequenced to a mean depth of 20x. The exome used Twist capture and samples were sequenced on Illumina HiSeqX machines producing CRAM files.	Illumina NovaSeq 6000	1623
EGAD50000001317	This dataset contains WGBS of a stepwise-edited, human melanoma model including in vitro samples and corresponding xenografts.	Illumina NovaSeq 6000	17
EGAD50000001318	This dataset contains healthy and tumor samples from patients covering breast adenocarcinoma, colon adenocarcinoma, kidney renal clear cell carcinoma, hepatocellular carcinoma, lung adenocarcinoma and pancreas adenocarcinoma.	Illumina NovaSeq 6000	34
EGAD50000001319	This dataset contains WGBS of seven melanoma patients and one cultured primary melanocyte sample.	Illumina NovaSeq 6000	8
EGAD50000001320	Transcriptomic profile of whole blood after 24 hours of stimulation	Illumina NovaSeq 6000	1369
EGAD50000001321	This data set contains scRNA-seq data from MPN patients and healthy individuals. It is accompanied by amplicon data for the JAK2V617F mutation. For each individual, an IFNa-treated and an untreated sample are included.	NextSeq 500	12
EGAD50000001322	Stem cell-derived beta cells differentiated using two different cell lines: HUES8 (non-clinical grade) and RC9 (clinical grade). iPSC stem cell lines were differentiated to pancreatic islets using the differentiation protocol detailed by Rajaei et al. (2025, Science Translational Medicine, accepted for publication). scRNAseq was done at stage 7 of differentiation (day 30) using 10X genomics 3' sequencing. Cells were annotated, then subsetted for beta cells for downstream analysis.	Illumina NovaSeq X	1
EGAD50000001323	The dataset contains files with single nucleotide variants in VCF format for a total of 942 DNA samples, selected to represent a cross-section of the Swedish population. The samples originate from the Swedish Twin Registry (STR) and have been obtained from different geographical regions. For each of the 942 individuals, DNA was extracted from a blood sample and subject to whole genome sequencing (WGS). The WGS was performed using 2x150 bp paired-end chemistry on Illumina HiSeq X Ten instrumentation at the SciLifeLab National Genomics Infrastructure (NGI) in Stockholm and Uppsala. FASTQ files generated by WGS were analyzed using the nf-core pipeline Sarek, which includes pre-processing, alignment to the human GRCh38 reference genome, and germline variant calling. All participants gave their written informed consent and the TwinGene study was approved by the regional ethics committee (Regionala Etikprövningsnämnden, Stockholm, dnr 2007-644-31, dnr 2014/521-32). Access to phenotypic information can be requested from the Swedish Twin Registry (http://ki.se/en/research/the-swedish-twin-registry).		942
EGAD50000001324	The dataset contains files with single nucleotide variants in VCF format for a total of 58 DNA samples originating from the Northern Sweden Population Health Study (NSPHS). For each of the 58 individuals, DNA was extracted from a blood sample and subject to whole genome sequencing (WGS). The WGS was performed using 2x150 bp paired-end chemistry on Illumina HiSeq X Ten instrumentation at the SciLifeLab National Genomics Infrastructure (NGI) in Stockholm and Uppsala. FASTQ files generated by WGS were analyzed using the nf-core pipeline Sarek, which includes pre-processing, alignment to the human GRCh38 reference genome, and germline variant calling. The NSPHS study was approved by the local ethics committee at the University of Uppsala (Regionala Etikprövningsnämnden, Uppsala, 2005:325 and 2016-03-09). All participants gave their written informed consent to the study including the examination of environmental and genetic causes of disease in compliance with the Declaration of Helsinki.		58
EGAD50000001325	The dataset contains whole-genome sequencing data (aligned read files) in CRAM-format (lossless compression) for a total of 58 DNA samples originating from the Northern Sweden Population Health Study (NSPHS). For each of the 58 individuals, DNA was extracted from a blood sample and subject to whole genome sequencing (WGS). The WGS was performed using 2x150 bp paired-end chemistry on Illumina HiSeq X Ten instrumentation at the SciLifeLab National Genomics Infrastructure (NGI) in Stockholm and Uppsala. FASTQ files generated by WGS were analyzed using the nf-core pipeline Sarek, which includes pre-processing, alignment to the human GRCh38 reference genome, and germline variant calling. The NSPHS study was approved by the local ethics committee at the University of Uppsala (Regionala Etikprövningsnämnden, Uppsala, 2005:325 and 2016-03-09). All participants gave their written informed consent to the study including the examination of environmental and genetic causes of disease in compliance with the Declaration of Helsinki.		58
EGAD50000001326	The dataset contains whole-genome sequencing data (aligned read files) in CRAM-format (lossless compression) for a total of 942 DNA samples, selected to represent a cross-section of the Swedish population. The samples originate from the Swedish Twin Registry (STR) and have been obtained from different geographical regions. For each of the 942 individuals, DNA was extracted from a blood sample and subject to whole genome sequencing (WGS). The WGS was performed using 2x150 bp paired-end chemistry on Illumina HiSeq X Ten instrumentation at the SciLifeLab National Genomics Infrastructure (NGI) in Stockholm and Uppsala. FASTQ files generated by WGS were analyzed using the nf-core pipeline Sarek, which includes pre-processing, alignment to the human GRCh38 reference genome, and germline variant calling. All participants gave their written informed consent and the TwinGene study was approved by the regional ethics committee (Regionala Etikprövningsnämnden, Stockholm, dnr 2007-644-31, dnr 2014/521-32). Access to phenotypic information can be requested from the Swedish Twin Registry (http://ki.se/en/research/the-swedish-twin-registry).		942
EGAD50000001327	Serum samples (DC, CRSsNP, CRSwNP, N-ERD, n=20 each) were subjected to the Olink Explore 3072 analysis including the following panels: Olink Explore 384 Cardiometabolic and Cardiometabolic II, Olink Explore 384 Inflammation and Inflammation II, Olink Explore 384 Neurology and Neurology II, Olink Explore Oncology and Oncology II. The latter had 4 failed samples. Of the 2925 unique proteins originally determined, 84 proteins had missing values in all samples and were removed. A further 164 proteins had >50% of values below LOD across all four groups and were removed, with remaining values below threshold adjusted to LOD. This data cleaning left 2695 unique assays (2677 unique proteins) in serum to be analysed. Additionally, six assays (LMOD1, IDO1, SCRIB, CXCL8, IL6, TNF) were measured four times each in different panels.	unspecified	1
EGAD50000001328	Olink Target 96 Inflammation panels were applied to nasal secretions (DC, CRSsNP, CRSwNP, N-ERD, n=20 each). Across both panels, 184 assays (180 unique proteins) were determined, of which 4 appeared in both panels and were measured twice (CCL11, IL10, IL5, IL-6)	unspecified	1
EGAD50000001333	Shallow whole genome sequencing of samples from the paper Beddowes et al. A large-scale retrospective study in metastatic breast cancer patients using circulating tumour DNA and machine learning to predict treatment outcome and progression-free survival. Mol. One 2025. This includes 1048 longitudinal plasma samples from 149 patients and 35 normals. Samples were sequenced with shallow whole genome sequencing for copy number estimation. All the files are single-end fastq files.	Illumina HiSeq 2500	1048
EGAD50000001334	This dataset contains single cell RNA sequencing (192 fastq files for 59 samples), whole exome sequencing (20 BAM files -hg38- for 20 samples) and bulk RNA sequencing (138 paired-end fastq files and 17 single-end fastq files for 8 samples)	Illumina HiSeq 2500 Illumina NovaSeq 6000	87
EGAD50000001335	This dataset contains transcriptome data generated from naive and memory B cells isolated from patients with inflammatory bowel disease and healthy controls. Data was generated from both patients with Crohn's disease and ulcerative colitis. In addition, it contains BCR repertoire data from patients with IBD and healthy controls. The repertoire data was generated using amplicon sequencing of RNA isolated from peripheral blood, lymph nodes from inflamed gut and gut associated lymphatic tissue.	Illumina HiSeq 4000 Illumina MiSeq	895
EGAD50000001336	This dataset includes sequence files from whole exome sequencing and bulk RNA sequencing of tissue and blood biospecimens from a phase II clinical trial investigating epigenetic priming followed by immune checkpoint blockade in non-small cell lung cancer (NSCLC; NCT01928576). There are 78 whole exome sequencing files and 36 bulk RNA sequencing files included in this dataset.	Illumina HiSeq 2500 unspecified	114
EGAD50000001337	Illumina platform sequencing of whole genome libraries prepared from paired tumour/normal samples from 1 case of uveal melanoma with MSI	Illumina NovaSeq 6000	2
EGAD50000001338	This data-set contains two sequencing runs of the same scRNA-seq library. Once with and once without cell targeted amplification of transcripts.	unspecified	2
EGAD50000001339	This data table represents each patient molecular response based on ctDNA (using Guardant 360) percentage at baseline and cycle 3 day 1		92
EGAD50000001340	This table describes each variant detected using Guardant 360 for each sample included in this analysis.		92
EGAD50000001341	This table summarizes Guardant 360 ctDNA variant summary statistics including frequency of variant types for each sample included in this analysis.		92
EGAD50000001342	This dataset contains RNA sequencing data of pleural and peritoneal mesothelioma. The number of samples is 18. Sequencing was performed on Illumina NovaSeq 6000 and Illumina NovaSeqXPlus. The sequencing was always paired.	Illumina NovaSeq 6000 Illumina NovaSeq X	18
EGAD50000001343	This dataset contains Whole Genome sequencing data of pleural and peritoneal mesothelioma. The number of samples is 40. Sequencing was performed on Illumina NovaSeq 6000 and Illumina NovaSeqXPlus. The sequencing was always paired.	Illumina NovaSeq 6000 Illumina NovaSeq X	40
EGAD50000001344	RNA-seq of LuCaP 189.4 cell line. Treatment with DCC DMSO (n=3 replicates) or 1nm R1881 (n=3 replicates). All 6 samples were sequenced using cDNA oligo-dT using paired-end reads on Illumina NovaSeq 6000. Each sample has 2 fastq files, one for the forward and reverse read, respectively.	Illumina NovaSeq 6000	6
EGAD50000001345	ChIP-seq of 2 LuCaP cell lines with different factors. LuCaP cell line 189.4 was ChIP-sequenced for 6 factors with different conditions as well as the input DNA (n=31 in total). LuCaP 189.4 ChIP-sequence experiments include; 3 H3K27ac/DMSO, 3 H3K27ac/vorinostat, 3 H3K27ac/dexamethasone, 3 H3K27ac/dexamethasone/vorinostat, 3 GR dexamethasone, 3 GR dexamethasone/vorinostat, 3 HDAC3/DMSO, 3 AR, 3 FOXA1, 3 H3K27ac and 1 input sample. LuCaP 35 ChIP-sequence experiments as well as input include (n=10 in total); 3 H3K27ac, 3 AR, 3 FOXA1 and 1 input sample. All 41 samples were sequenced using paired-end reads on Illumina NovaSeq 6000. Each sample has 2 fastq files, one for the forward and reverse read, respectively.	Illumina NovaSeq 6000	41
EGAD50000001346	This dataset includes single cell (sc)RNA-seq of N=13 AML patients. The dataset includes bam files that have been purged of reads mapping to cells where a majority of the reads from that cell mapped to the human genome. All scRNA-seq data were generated using the Chromium Next GEM Single Cell 5' Reagent Kits v2 (Dual Index) from 10X Genomics.	Illumina NovaSeq 6000	13
EGAD50000001347	This dataset contains BAM files of whole-genome sequencing data of single human HSPCs after colony expansion, for three individuals who harbor a DNMT3A mutant clonal hematopoiesis clone in their blood system after hematopoietic cell transplantation. 29 samples are present in this dataset; 10 samples for LTHIT005, 10 samples for LTHIT131, and 9 samples for LTHIT069. In the sample name, samples are annotated for being DNMT3A wildtype (wt) or mutant (mut), except for LTHIT069 (wt samples: F13, H4, K2, L16, N12, mut samples: C8, E10, L9, N5). Whole-genome sequencing was performed after standard Illumina WGS library preparation and sequenced on a Novaseq 6000 using 150 bp paired-end sequencing. The sequencing depth is 15x genome coverage, or 50Gbases per sample.	Illumina NovaSeq 6000	29
EGAD50000001348	Genotyping datasets used in the article "Nested admixture during and after the Trans-Atlantic Slave Trade in the island of Sao Tome" by Ciccarella M et al. 2025 The genotype data corresponds to 2,104,148 autosomal SNPs genotyped from the IlluminaOmni 2.5 Million BeadChip for 97 volunteer participants sampled on São Tomé e Príncipe, family unrelated at the 2nd degree based on population genetics analyses. In particular, the dataset is composed of 95 individual samples collected on São Tomé and 2 collected on Príncipe. SNP rsID, Chromosome position and genetic position in (bp) are in Build GRCh38.		97
EGAD50000001349	Dataset consisting of 172 whole genome sequencing runs from healthy controls and Parkinson's disease patients. The data was sequenced on the Oxford Nanopore (PromethION) platform. The data has been mapped to the T2T-CHM13v2.0 reference genome. The data processing pipeline is available at the following link: https://zenodo.org/records/13385065. The data is in CRAM format and a subset of donors also have information on C (hydroxy)methylation (MM and ML tags). The reads are also phased (HP tag).	PromethION	172
EGAD50000001352	This dataset will include Spatial Transcriptomics, Single-Cell RNA-Seq, Bulk RNA-Seq, Clinical data, WES, and H&E data from 10 Glioblastoma (GBM) Cancer patients. Researchers from private or public institutions outside the MOSAIC Consortium will be able to apply to access this data and, pending approval, use the data for their research.		10
EGAD50000001353	The dataset for “Genome-wide analyses of cell-free DNA for therapeutic monitoring of patients with pancreatic cancer” includes 496 cram files from whole genome next-generation sequencing on the Illumina NovaSeq 6000. The samples analyzed include tumor tissue, matched buffy coat and longitudinal plasma samples from individuals with pancreatic cancer.	Illumina NovaSeq 6000	496
EGAD50000001354	The dataset includes single cell RNA sequencing data from 17 AML/MDS patients undergoing a decitabine clinical trial. In total 47 samples were interrogated using 10X genomics ChromiumTM 140 Single Cell 5' Reagent Kit. The dataset includes bam files for each sample generated using CellRanger with default parameters.	unspecified	47
EGAD50000001355	This dataset includes illumina paired-end reads (fastq) of whole-exome sequencing (WES) data from three acute lymphoblastic leukemia patients. Tumor samples were obtained from patients, each with matched patient’s peripheral blood mononuclear cells (PBMCs). PBMCs are provided as a germline reference.	Illumina MiniSeq	6
EGAD50000001356	This dataset contains RNA-seq and T cell receptor (TCR) sequencing files from a cohort (n=20) 3 months post positive PCR for SARS-CoV-2. A total of 10 hospitalised (samples H1-10) and 10 non-hospitalised (samples NH1-10) are contained within this dataset. RNA sequencing was performed using the Illumina Stranded mRNA Prep Ligation kit to investigate the protein coding transcriptome of PBMC cells. T cell receptor (TCR) repertoire profiling was performed with the TakaraBio SMART-Seq Human TCR (with UMIs) on PBMC cells to investigate the depth and persistence of SARS-CoV-2 targeting TCRs.	NextSeq 2000	20
EGAD50000001357	This study investigates the clonal evolution and metastatic dissemination of prostate cancer using multiregional single-nuclei RNA sequencing (snRNA-seq) and low-pass whole-genome sequencing (WGS) data from 43 spatially distinct tumor areas in five patients with locally advanced prostate cancer, including both primary and regional lymph node samples. We employed the Chromium Next GEM Single Cell 3’ Kit (v3/3.1, 10x Genomics) for single-nuclei transcriptome profiling and constructed bulk WGS libraries using the NEBNext Ultra II FS DNA Library Prep Kit (New England Biolabs) from the same pool of nuclei extracted from each area. snRNA-seq data were aligned to the human genome (GRCh38-1.2.0_premrna) using 10x Genomics Cell Ranger v5.0.0, while WGS reads were aligned to the human reference genome (GRCh38) with BWA MEM, achieving an average sequencing depth of 0.3X for low-pass WGS and 16X after resequencing.	Illumina NovaSeq 6000 NextSeq 500	43
EGAD50000001358	This dataset contains raw RNA sequencing data for hepatoblastoma PDX cell line HB-303-LEF	Illumina NovaSeq 6000	2
EGAD50000001359	Targeted DNA sequencing was performed on DNA extracted from 87 formalin-fixed paraffin-embedded (FFPE) samples, representing a subset of the 382 sequenced patients from the GAINED cohort. A custom “Lymphopanel” was used to identify mutations in 70 genes associated with lymphomagenesis. Library preparation, exome capture, sequencing, and analysis were conducted by IntegraGen SA (Evry, France). Sequencing was performed on an Illumina HiSeq4000 platform, with variant calling and annotation using established bioinformatics tools such as MuTect and VEP.	Illumina HiSeq 4000	87
EGAD50000001361	This dataset contains bulk RNA-seq data from 12 autosomally isogenic human iPSCs. Three biological replicates each from XXY (sample ID 419295-7), XX (sample ID 419298-419300), XY (sample ID 419302, 419305-6), and XO (sample ID 419308, 419310, 419312) hiPSCs are included. RNA for this dataset was collected using the Trizol method, and a polyA library type was used. The libraries were sequenced using an Illumina device. The sample reads are deposited as fastq files. Analysis was done using alignment to the GRCh38 reference genome.	Illumina NovaSeq 6000	12
EGAD50000001362	This dataset contains Whole exome sequence generated using human skin fibroblast sample (ID 419281) from Klinefelter XXY syndrome patient fibroblasts. These fibroblasts were used to generate autosomally isogenic human iPSCs. The sample DNA was collected using commercial kit and Human All Exon v5 capture paired library was sequenced using Illumina HiSeq device. The fastq file (ID 493224) and analysis (ID 3209) bam files have been deposited.	Illumina HiSeq 4000	1
EGAD50000001363	Whole exome sequencing from 51 samples from High-grade B-cell lymphoma, not otherwise specified: an LLMPP study.	unspecified	51
EGAD50000001364	Targeted capture sequencing for 366 newly added samples in High-grade B-cell lymphoma, not otherwise specified: an LLMPP study and 29 samples from a previously uploaded dataset.	Illumina HiSeq X unspecified	395
EGAD50000001365	RNA-seq for 26 newly added samples in High-grade B-cell lymphoma, not otherwise specified: an LLMPP study, and 32 samples from a previously uploaded dataset.	Illumina NovaSeq X unspecified	58
EGAD50000001366	Whole genome sequencing for 10 newly added samples in High-grade B-cell lymphoma, not otherwise specified: an LLMPP study, and 6 samples from a previously uploaded dataset.	Illumina NovaSeq X unspecified	16
EGAD50000001367	Low pass whole genome sequencing from 46 samples from High-grade B-cell lymphoma, not otherwise specified: an LLMPP study	unspecified	46
EGAD50000001368	The dataset consists of RNA sequencing data of T cells from CLL patients or age-matched healthy donors. In brief, CLL PBMCs are thawed and the sample is split in two: one part is left as it is and stained for sorting, and the other part is stimulated using anti-CD3/CD28 soluble antibodies. After 2 days the stimulated condition is also stained and FACS sorted. The T cell fractions from healthy donors and CLL patients at baseline and after stimulation were sent for bulk sequencing.	Illumina NovaSeq 6000	18
EGAD50000001369	Whole exome sequencing libraries were made from three vials derived from QHJI14s04 human iPSC secondary cell stock, where Nimblegen SeqCap EZ library kit v3.0 was used. The libraries were sequenced with Illumina HiSeq 2500. This dataset contains 3 paired fastq files.	Illumina HiSeq 2500	3
EGAD50000001370	The dataset contains RNA sequencing data of 272 high-grade serous carcinoma (HGSC) patients sequenced with NovaSeq 6000, MGISEQ or HiSeq X Ten. The 814 samples are fresh frozen tissue samples that have been collected from different tissues and time points before and during treatment. The files provided are paired fastq files.	HiSeq X Ten Illumina NovaSeq 6000 MGISEQ-2000RS	814
EGAD50000001371	The aim of the study was to identify mutations in key ovarian cancer genes (BRCA1, BRCA2, TP53, PTEN, ATM, ATR, NF1) in archival formalin-fixed paraffin-embedded (FFPE) tumor tissue from platinum-resistant ovarian cancer patients enrolled in the phase I/II GANNET53 clinical trial. From the enrolled 133 patients, archival FFPE tissue was available and DNA was extracted thereof. The DNA samples passing quality control (n=118) were NGS sequenced using the SureSeq™ Ovarian Cancer Panel and the NGS Library Preparation kit (Oxford Gene Technology), which covers all exons of 7 key ovarian cancer genes for single nucleotide variants (SNV) and indels, and the Illumina MiSeq platform. For the paired-end runs, one Read 1 (R1) and one Read 2 (R2) FASTQ file were created for each sample. FASTQ files were compressed and created with the extension .fastq.gz.	Illumina MiSeq	118
EGAD50000001372	10x Single Cell Gene Expression library SCRNA10X_CS_CHIP0167_003 for Diffuse large B-cell lymphoma patient sample TFRIPAIR10	Illumina HiSeq 2500	1
EGAD50000001373	BCR sequencing done with 10x Genomics V(D)J kit + platform for the paper: Integrated single cell analysis reveals co-evolution of malignant B cells and tumor micro-environment in transformed follicular lymphoma. Dataset contains paired FASTQ files for 35 samples, each was processed with the 10x Genomics V(D)J kit to obtain the sequence of the B-cell receptor sequence.	Illumina HiSeq 2500	35
EGAD50000001378	This dataset comprises targeted sequencing data of 52 genes previously implicated in severe COVID-19 outcomes. The study includes samples from 764 individuals with severe COVID-19 and 3,939 population-based controls from the GCAT cohort (Spain). Molecular Inversion Probes (MIPs) were utilized for cost-effective and precise sequencing of the selected genes. The targeted genes include: Inflammasome/IL-1/TNF Pathway: NLRP3, CASP1, CASP8, IL1B, TNF, RIPK1, RIPK3, MYD88, TNFRSF13B SARS-CoV-2 Entry/Replication: ACE2, TMPRSS2, FURIN, SLC6A20, DDX1, DDX58, TLR4, FYCO1, CTSB, CTSL, ADAM17 Complement System: MBL2, CFH, CFI, CFB, ADAM10, CD46 Interferon Signaling: TLR3, IFIH1, IFITM3, TBK1, TLR7, IL10RB, IFNAR1, IFNAR2, SIGLEC1, MYD88, IFNGR1 Chemokine Receptor Signaling: CCR1, CCR3, CCR2, CCR9, IL8, CXCL3, CXCL10, CXCR6, XCR1, CCL2, CCL20 Immunodeficiency Genes: CASP8, CD46, CFB, CFH, CFI, IFNAR1, IFNAR2, IFNGR1, IFIH1, MYD88, NLRP3, RIPK1, TBK1, TLR3, TLR7	Illumina NovaSeq 6000	2294
EGAD50000001379	We analyzed 11 ETMR patient samples using single-cell transcriptomics (6 from fresh and 5 from frozen material). Samples were profiled by paired-end scRNA-sequencing using the Smart-Seq2 protocol on a NextSeq 500 sequencer (Illumina). The dataset contains 8,062 fastq files of 4,031 high-quality single cells/nuclei. Raw fastq files of high-quality cells are shared.	NextSeq 500	16
EGAD50000001380	For this dataset we performed total RNAseq on blood samples from 8 patients with muscle invasive bladder cancer. The data consists of 8 sets of paired fastq files with accompanying UMI fastq files. Sequenced on the Illumina NovaSeq 6000 platform using S2 flow cells (v1.5, 2x150 cycles), with a mean yield of 313 million reads.	Illumina NovaSeq 6000	8
EGAD50000001381	For this dataset we performed single cell RNAseq paired with single cell TCR-seq on tumor and blood samples from 4 patients. This dataset contains 4 tumor samples as well as 4 blood samples. Each sample is made up of 2 sets of paired fastq files. The first pair contains reads corresponding to RNA transcripts (_Transcripts in file name), while the second pair contain reads corresponding to TCRs (_VDJ in file name). Sequenced on the Illumina NovaSeq6000 platform in a paired-end run using an SP flow cell (v1.5, 300 cycles).	Illumina NovaSeq 6000	8
EGAD50000001382	For this dataset we performed bulk TCRseq on tumor and serial blood samples from patients with muscle invasive bladder cancer (n patients = 119, n blood = 209, n tumor = 28) or non-muscle invasive bladder cancer (n patients = 30, n blood = 56, n tumor = 19) for a total of 149 patients and 312 samples. Each sample consists of a paired fastq file. Sequenced on the Illumina NovaSeq6000 platform using SP and S1 flow cells (v1.5, 2x101 cycles), with a mean yield of 42 million reads covering the CDR3β region.	Illumina NovaSeq 6000	312
EGAD50000001384	This dataset contains RNA profiling of paired FL and tFL samples (n=11 pairs) sequenced by Illumina NextSeq500 instrument. Libraries used in this mRNA-seq dataset were prepared with QuantSeq FWD library preparation kit (Lexogen, Catalog Number: 015) with the addition of UMI module. RNAs used for library preparation were isolated from FFPE tissues using ALLPREP RNA/DNA FFPE kit (QIAGEN,Catalog number: 80234). Data are uploaded as FASTQ format.	NextSeq 500	22
EGAD50000001385	This dataset contains miRNA-seq of paired FL and tFL samples (n=10 pairs) sequenced by Illumina NextSeq500. Libraries used in this miRNA-seq dataset were prepared with NEBNext Multiplex Small RNA Library Prep Set for Illumina (NEB,Catalog number: E7560S). RNAs used for library preparation were isolated from FFPE tissues using High Pure miRNA Isolation Kit (Roche,Catalog number: 05080576001). Data are uploaded as FASTQ format.	NextSeq 500	20
EGAD50000001386	Dataset includes count matrices, barcode tsvs, and features tsv for each sample, generated using Cell Ranger.	unspecified	48
EGAD50000001390	Linked-read whole genome sequencing data of 15 normal blood samples from brain tumour patients. The dataset consists of BAM files generated by the Long Ranger pipeline from 10x Genomics.		15
EGAD50000001391	Linked-read whole genome sequencing data of 31 glioma samples from 20 patients, including 7 matched blood samples		38
EGAD50000001392	We performed single and multi-region nanopore whole-genome sequencing on human bone and soft-tissue sarcoma samples. We also performed nanopore whole-genome sequencing on matched whole blood samples.	PromethION	125
EGAD50000001394	Mapping the spatial organization of DNA-level somatic copy number changes in tumors can provide insight to understanding higher-level molecular and cellular processes that drive pathogenesis. We describe an integrated framework of spatial transcriptomics, tumor/normal DNA sequencing, and bulk RNA sequencing to identify shared and distinct characteristics of an initial cohort of eleven gliomas of varied pathology and a replication cohort of six high-grade glioblastomas.	Illumina NovaSeq 6000 NextSeq 2000	17
EGAD50000001395	Dataset contains mRNA-sequencing data (fastq files) from 2 patients with NUP214::ABL1 disease (one T-ALL and one B-ALL patient). Patient samples (peripheral blood or bone marrow) were taken at diagnosis (B-ALL) or relapse (T-ALL). mRNA was extracted from lymphoblasts and underwent paired-end sequencing (75bp) using the Illumina NextSeq platform. Transcriptomic data was used for identification of gene fusions, SNVs and InDels. mRNA-seq data for an additional B-ALL patient with NUP214::ABL1 disease utilised in this study has already been published (EGAS00001006460).	NextSeq 500	2
EGAD50000001396	A scRNAseq dataset consisting of stem cell-derived islets treated with gradient separation (HUES8_Pure_Islets) and unpurified control (HUES8_control_islets). scRNAseq was done on cryopreserved stage 7 stem cell (HUES8 cell line)-derived islets. Details about stem cell differentiation and sequencing can be found in the publication by Rajaei et al. (2025, Science Translational Medicine). This dataset corresponds to Fig 4 of the publication.	Illumina NovaSeq X	2
EGAD50000001397	150 nt paired-end metagenomic shotgun sequencing (Illumina HiSeq 2000) of fecal samples from 130 (mostly treated, i.e. adhering to gluten-free diet) celiac disease patients and 106 controls from the CeDNN cohort from the University Medical Center Groningen, The Netherlands. For six celiac disease patients, data from two timepoints (before and during gluten-free diet) are included, resulting in a total of 242 samples. The dataset includes 2 encrypted fastq files per sample (.fq.gz.c4gh) and sample phenotypes (.txt.c4gh).	Illumina HiSeq 2000	242
EGAD50000001398	This dataset contains 8 unpaired fastq files sequenced with Illumina NextSeq 2000. The files contain transcriptome data of host and pathogen from infected and uninfected samples from extracted total RNA at 30 hours post infection.	NextSeq 2000	8
EGAD50000001399	This dataset comprises 124 human faecal samples collected across Indonesia between 2015 and 2021, together with 6 controls. It was collected to study patterns of human microbiome variation across Indonesia’s diverse geography and lifestyles, and to generate the Indonesian Microbiome Ecology and Evolution (IndoMEE) metagenome-assembled genome (MAG) reference database. The dataset consists of 250 bp paired-end metagenomic reads from 124 faecal samples acquired from 116 individuals, at ca. 10.6 Gb / sample, together with 2 positive controls (ZymoBIOMICS Microbial Community DNA Standard by Zymo Research Cat# D6305), and 4 negative controls (for sampling: Blank, for extraction: Buffer EB by Qiagen Cat# 19086, for library prep: Nuclease Free Water by Ambion-Invitrogen Cat# AM9937). Library preparation was carried out using Illumina's TruSeq DNA Nano DNA kit, and sequencing was performed on the Illumina NovaSeq 6000 platform.	Illumina NovaSeq 6000	130
EGAD50000001400	Spatial transcriptomic analysis of two Japanese ELOC-mutated renal cell carcinoma	Illumina HiSeq X	2
EGAD50000001401	This dataset contains WES bam/bai files associated with 51 patients as part of the ACUITI Study	Illumina NovaSeq 6000	51
EGAD50000001402	Long-read direct cDNA sequencing (Oxford Nanopore Technologies) of three independent post-mortem human retina samples of 17 lncRNA loci.	PromethION	3
EGAD50000001403	This dataset contains 10x Genomics single-nucleus RNA sequencing data from postoperative brain tissues of patients with focal cortical dysplasia type II (FCD II). The samples were processed using the Chromium platform.	Illumina HiSeq 2500 Illumina NovaSeq 6000	10
EGAD50000001404	Archival peripheral blood or bone marrow plasma samples from non small-cell lung cancer, B-cell lymphoma and acute myeloid leukemia patients, and healthy donors. Shallow whole genome sequencing of 100 Samples by two distinct library preparation methods (PCR = 45, PCR-Free = 55). sWGS done for Chromosomal CN detection.	Illumina HiSeq 4000	100
EGAD50000001405	Bulk RNA-sequencing of peripheral blood mononuclear cells collected from mild COVID-19 (n = 4), critical COVID-19 (n = 12) and control (n = 8) patients. Sequencing data is provided as FASTQ files.	Illumina NovaSeq 6000	24
EGAD50000001406	micro RNA-sequencing of peripheral blood mononuclear cells collected from mild COVID-19 (n = 6), critical COVID-19 (n = 13) and control (n = 8) patients. Sequencing data is provided as FASTQ files.	NextSeq 500	27
EGAD50000001407	The dataset includes 51 whole exome sequencing datasets generated using Illumina paired-end sequencing technology. These data derive from 26 patient-derived xenograft (PDX) tumors and their matched normal tissues, which consist of either normal gastric mucosa or blood samples. In one case (GTR0607), the matched normal tissue data are unavailable. As a result, the dataset comprises 102 FASTQ files, providing raw sequencing reads for comprehensive genomic analysis.	Illumina HiSeq 2000	51
EGAD50000001409	This dataset contains all available fastq files for germline, baseline plasma/biopsy, and response assessment patient samples for the DIRECT study. The total number of samples was 729 (188 germline, 190 baseline plasma, 159 baseline biopsy, and 192 response assessment). The dataset files were generated from targeted (exome) Illumina sequencing performed on paired biopsy tissue and plasma cell free DNA.	Illumina NovaSeq X NextSeq 2000	729
EGAD50000001410	The dataset contains amplicon sequencing data from 48 samples from 35 different patients with ovarian cancers. Cell free DNA was collected from plasma. Sequencing was performed on an Ion Torrent platform and the sequencing data is provided in bam format.	Ion Torrent S5	48
EGAD50000001411	The dataset contains bulk transcriptomics data from 4 samples from 3 different patients with ovarian cancer. 3 samples were collected from the tumor, 1 sample from ascites. Sequencing was performed in paired-end mode and sequencing data is provided in fastq format. These samples are from patients not classified as having high-grade serous adenocarcinoma.	Illumina NovaSeq 6000 NextSeq 500	4
EGAD50000001412	The dataset contains amplicon sequencing data from 7 samples from 7 different patients with ovarian cancers. Cell free DNA was collected from plasma. Sequencing was performed on an Ion Torrent platform and the sequencing data is provided in bam format. These samples are from patients not classified as having high-grade serous adenocarcinoma.	Ion Torrent S5	7
EGAD50000001413	The dataset contains bulk transcriptomics data from 39 samples from 36 different patients with ovarian cancer. 34 samples were collected from the tumor, 5 samples from ascites. Sequencing was performed in paired-end mode and sequencing data is provided in fastq format.	Illumina NovaSeq 6000 NextSeq 500	39
EGAD50000001414	Metagenomic data of stool samples from 111 patients with either bipolar disorder or schizophrenia spectrum disorder. Data includes fq.gz files, mostly 1 forward and 1 reverse read per sample. Samples analysed in multiple lanes have 4 (paired) fastq files, which will have the same patient ID (example: 048_1 and 048_2, or 088_L1 and 088_L2). DNA was extracted using the QIAamp Fast DNA Stool Mini Kit (Qiagen) following the manufacturer’s instructions. Shotgun metagenomic sequencing was carried out using Illumina NovaSeq 6000.	Illumina NovaSeq 6000	130
EGAD50000001415	Targeted next-generation sequencing was performed on circulating cell-free DNA extracted from plasma samples of patients in the Neo-Pembro trial. For each patient, circulating cell-free DNA from the baseline plasma sample was extracted and sequenced using the AVENIO Expanded Kit v2.	NextSeq 550	33
EGAD50000001416	SCANDARE (NCT03017573) is a multicentric biobanking study, enrolling adult patients with newly diagnosed head and neck squamous cell carcinoma (HNSCC), triple negative breast cancer (TNBC), ovarian and cervical cancer. Tumor tissue and blood samples are collected at several time points during patient's journey, including at diagnosis, post-neoadjuvant chemotherapy in case of neoadjuvant treatment, at surgery, at recurrence and at disease progression following treatment initiated at recurrence. Since its launch in 2017 at Institut Curie, SCANDARE has enabled the longitudinal collection and preservation of samples for in-depth analyses. Here, we describe molecular alterations in 124 frozen samples (n=84 at baseline and n=40 paired at post-NAC surgery) from 91 TNBC patients using total RNAseq technology, with FASTQ files deposited in the dataset.	Illumina NovaSeq 6000	124
EGAD50000001417	These Fastq-files contain the raw sequencing output from our two sequencing experiments (Shotgun and Capture). All libraries with "SG" refer to Shotgun genomes, while all files containing "TF" refer to 1240K capture data.	Illumina HiSeq 4000	9
EGAD50000001418	WGS data from biliary tract cancer samples (Beaudry et al, 2025; n=55)		110
EGAD50000001419	Bulk RNA Sequencing of 33 epithelioid sarcomas (EpS) and 3 extracranial extra-renal rhabdoid tumors (EERT) Whole-exome sequencing of 30 EpS and 3 EERT Single-cell RNA Sequencing of 8 EpS Visium Spatial Transcriptomics of 3 EpS	Illumina NovaSeq 6000 NextSeq 500	99
EGAD50000001420	Clinical and ctDNA data for IMpassion031, including survival, response, and ctDNA data from baseline through post-surgery time points. 222 samples run on Signatera assay. File type is csv.		222
EGAD50000001422	This dataset includes whole-exome sequencing (WES) and RNA sequencing (RNA-seq) data from a uveal melanoma patient. Tumor samples were obtained from three distinct regions, each with matched RNA-seq and WES data. Additionally, WES data from the patient’s peripheral blood mononuclear cells (PBMCs) is provided as a germline reference.	unspecified	4
EGAD50000001424	This dataset includes single cell (sc)RNA-seq of N=3 patient-derived xenograft (PDX) samples profiled at baseline (no drug treatment) and N=12 samples from one PDX profiled during the course of treatment (N=1 sample each for 4 treatment arms x 3 timepoints). The dataset includes bam files that have been purged of reads mapping to cells where a majority of the reads from that cell mapped to the mouse genome. All scRNA-seq data were generated using the Chromium Next GEM Single Cell 5' Reagent Kits v2 (Dual Index) from 10X Genomics.	Illumina NovaSeq 6000	15
EGAD50000001425	This dataset includes whole exome sequencing (WES) data from PDX samples post-drug treatment (N=1 vehicle, N=1 gilteritinib, N=1 venetoclax and N=1 VenGilt). Bam files are provided for each sample. The VenGilt-treated sample underwent whole genome amplification using the Qiagen REPLI-g kit (#150345) prior to sequencing due to the very low cell number.	Illumina HiSeq 2500	4
EGAD50000001434	A total of 24 samples of meninges from 13 normal human embryos of post-conceptional week (PCW) 5.0-13.0 were dissected and subject to 10Xgenomics chromium single cell RNA sequencing with the purpose of identifying cell types and gene expression programs for normal meninges development. All libraries were sequenced on an Illumina NovaSeq 6000 using S4 to a target sequencing depth of 100,000 reads/cell. Final data after processing with STARSolo 2.7.10a, using human genome GRCh38.p12 and transcript annotations from ENSEMBL release 93 from reference package GRCh38-3.0.0 as available from 10Xgenomics, consists of 24 BAM files.	Illumina NovaSeq 6000	24
EGAD50000001436	The RISE-UP study contains raw metagenomic sequencing data of Crohn's disease patients. Paired-end sequencing was performed using the Illumina HiSeq 2000 platform.	Illumina HiSeq 2000	4
EGAD50000001443	Histone modification ChIP-seq for chromatin state annotations of myometrium and UL subclasses. Altogether 5 analyses and 210 runs. Includes fastq files; both ChIP and Input for each sample. Some Inputs are used for multiple ChIP experiments. ChIP-seq alignment files for trimmed, mapping q20 and nonredundant reads; both ChIP and Input for each sample. Software: Trim Galore v0.6.7; Bowtie 2 v2.5.0; samtools v1.6	Illumina HiSeq 2500 Illumina NovaSeq 6000	82
EGAD50000001445	The dataset for “Detection of brain cancer using genome-wide cell-free DNA fragmentomes” includes 310 bam files from whole genome next-generation sequencing on the Illumina NovaSeq 6000. The dataset includes plasma cfDNA samples from patients with cancer, patients with neurological conditions, and healthy individuals.	Illumina NovaSeq 6000	310
EGAD50000001446	The dataset comprises targeted deep methylation sequencing data used for both tissue-of-origin determination and donor-derived cfDNA quantification. It includes plasma cfDNA data from stable kidney transplant recipients (n = 31), stable liver transplant recipients (n = 20) and healthy controls (n = 23). In addition, plasma cfDNA samples collected early after transplantation were analyzed from kidney transplant recipients (n = 44) and liver transplant recipients (n = 40). The dataset also features genomic DNA data generated from whole blood (n = 10) and buffy coat (n = 3) from healthy controls, as well as from T cells (n = 1), B cells (n = 2), hepatocytes (n = 1) and kidney epithelium (n = 1). Lastly, genomic DNA data was generated from buffy coat of transplant recipients (n = 17). Here, “n” indicates the sample number.	Illumina NovaSeq 6000 NextSeq 550	193
EGAD50000001449	This dataset included human plasma EBV DNA target-capture sequencing data. The capture probes were designed to cover the entire EBV genome and selected human autosomal regions. After enrichment, the products were sequenced on the NextSeq500 System (Illumina) and aligned to the EBV genome (AJ507799.2) and the human genome (hg19) in BAM format. A total of 558 cases were analyzed, including seven subjects who developed NPC in the second-round screening, 431 non-NPC subjects with first-round transiently positive EBV, and 120 non-NPC subjects with first-round persistently positive EBV.	NextSeq 500	558
EGAD50000001450	This dataset contains 241 paired FASTQ files for 61 early-onset diabetes patients and 174 controls sequenced with MGI Tech DNBSEQ-T10 and DNBSEQ-T7.	unspecified	241
EGAD50000001451	This dataset consists of 8 ovarian cancer DNA sample libraries prepared using the Illumina TruSight Oncology 500 Kit (Illumina, cat no. 20076480) following the manufacturer's protocol. Library quality and quantity were assessed with High Sensitivity/D5000 Screentape Assay (Agilent Technologies, cat no. 5067-5592) on 4150/4200 Tapestation Quant-IT/Qubit dsDNA HS (Qiagen, cat no. Q32851) assay according to the supplier’s recommendations. Libraries were then pooled together in equal ratios and sequenced using PE-150bp mode on NovaSeq S1 flow cell 300 cycles kit (Illumina, 20028317) or S4 flow cell 300 cycles kit (Illumina, 20028312) aiming for 100 million reads per sample. The dataset contains sequence reads in fastq file format.	Illumina NovaSeq X	8
EGAD50000001452	Raw transcriptomic data from endothelial cells of post-COVID patients and healthy controls. Samples were collected from post-COVID patients at 3 months (n = 6) and 6 months (n = 6) post-infection, as well as from healthy control individuals (n = 5). All post-COVID patients were hospitalized during the first wave of the pandemic. Among them, half developed pulmonary embolism (TEP; n = 6), while the other half did not (no TEP; n = 6).	Illumina HiSeq 3000	17
EGAD50000001453	Sample number: 8 Sample details: Total 8 samples, including Homeostatic-like assembloids (4 samples), Fibrotic-like assembloids (4 samples).	Illumina NovaSeq 6000	8
EGAD50000001454	Sample number: 40 Sample details: Total 40 samples, including hepatocyte organoids (74 samples), primary human hepatocytes (15 samples), cholangiocyte organoids (11 samples), hepatocytes cultured in monolayer (5 samples) and portal fibroblasts (3 samples).	Illumina NovaSeq 6000	101
EGAD50000001455	This is the RNAseq CD4 dataset (paired FASTQ files, n = 86 paired files) generated from pre-methotrexate treatment JIA patients cohort samples. CD4 cells were sorted from isolated total PBMC by cell sorter (BD FACSAriaTM III, BD Biosciences). Clinical metadata is also included in this dataset. Samples were sequenced on the NovaSeq6000 instrument (Illumina, San Diego, US) at 300pM, using a 100bp paired read run with corresponding 8bp dual sample index and 5bp or 8bp unique molecular index reads. Run data were demultiplexed and converted to FASTQ files using Illumina’s BCL Convert Software v3.75.	Illumina NovaSeq 6000	416
EGAD50000001456	This is the RNAseq CD8 dataset (paired FASTQ files, n = 85 paired files) generated from pre-methotrexate treatment JIA patients cohort samples. CD8 cells were sorted from isolated total PBMC by cell sorter (BD FACSAriaTM III, BD Biosciences). Clinical metadata is also included in this dataset. Samples were sequenced on the NovaSeq6000 instrument (Illumina, San Diego, US) at 300pM, using a 100bp paired read run with corresponding 8bp dual sample index and 5bp or 8bp unique molecular index reads. Run data were demultiplexed and converted to FASTQ files using Illumina’s BCL Convert Software v3.75.	Illumina NovaSeq 6000	416
EGAD50000001457	This is the RNAseq PBMC dataset (paired FASTQ files, n = 82 paired files) generated from pre-methotrexate treatment JIA patients cohort samples. Total PBMC were isolated from pre-treated blood of JIA patients and clinical metadata is also included in this dataset. Samples were sequenced on the NovaSeq6000 instrument (Illumina, San Diego, US) at 300pM, using a 100bp paired read run with corresponding 8bp dual sample index and 5bp or 8bp unique molecular index reads. Run data were demultiplexed and converted to FASTQ files using Illumina’s BCL Convert Software v3.75.	Illumina NovaSeq 6000	416
EGAD50000001458	This is the RNAseq CD14 dataset (paired FASTQ files, n = 80 paired files) generated from pre-methotrexate treatment JIA patients cohort samples. CD14 cells were sorted from isolated total PBMC by cell sorter (BD FACSAriaTM III, BD Biosciences). Clinical metadata is also included in this dataset. Samples were sequenced on the NovaSeq6000 instrument (Illumina, San Diego, US) at 300pM, using a 100bp paired read run with corresponding 8bp dual sample index and 5bp or 8bp unique molecular index reads. Run data were demultiplexed and converted to FASTQ files using Illumina’s BCL Convert Software v3.75.	Illumina NovaSeq 6000	416
EGAD50000001459	This is the RNAseq CD19 dataset (paired FASTQ files, n = 83 paired files) generated from pre-methotrexate treatment JIA patients cohort samples. CD19 cells were sorted from isolated total PBMC by cell sorter (BD FACSAriaTM III, BD Biosciences). Clinical metadata is also included in this dataset. Samples were sequenced on the NovaSeq6000 instrument (Illumina, San Diego, US) at 300pM, using a 100bp paired read run with corresponding 8bp dual sample index and 5bp or 8bp unique molecular index reads. Run data were demultiplexed and converted to FASTQ files using Illumina’s BCL Convert Software v3.75.	Illumina NovaSeq 6000	416
EGAD50000001460	This study contains data obtained from 210 CRC samples (87 pairs and 36 singletons for whom only one of the sample pair was available). Our sample set consisted of 105 primary tumor and 105 metastatic samples. Data are analysis-ready BAM files. Sequencing libraries were prepared with the SureSelect Human All Exon V8 kit by Agilent. Sequencing was done on NovaSeq X Plus instrument (Illumina, San Diego, CA, USA) using 150-bp paired-end reads.	Illumina NovaSeq 6000	210
EGAD50000001461	60X WGS of single cell-derived colonies of DHL4 cells treated with platinum or RT (or unexposed control). 16 bam files aligned using MGP1000 pipeline (GRCh38).	Illumina NovaSeq X	16
EGAD50000001462	Granulomas are the hallmark of Mycobacterium infections, forming structured immune environments that drive disease persistence. However, their spatial and functional organization remains unclear. Using spatial RNA sequencing on 33 patient explants, we try to reveal the detailed gene expression and cell composition landscape of granulomas.	NextSeq 500	33
EGAD50000001464	Liver ductal organoids expanded in either conventional (Huch et al., 2015) or refined medium were split and seeded either in the same medium or in DM+. They were sequenced on day 10 after splitting. Droplet-based scRNA-seq was performed using the 10x Genomics Chromium Single Cell Kit v3, and libraries were sequenced on an Illumina NovaSeq 6000 S4. Raw sequencing data were processed using Cell Ranger software (v6.0.1) with the human genome reference (hg38) and gene annotation (Ensembl 98). Dataset integration and downstream analysis were mainly performed using Seurat (v4.3.0.1) with R v4.1.3 “One Push-up”.	Illumina NovaSeq 6000	12
EGAD50000001466	The dataset encompasses 852 Runs from the WGSPD Project 3 - Genomic Strategies to Identify High-impact Psychiatric Risk Variants Project	Illumina NovaSeq 6000	-
EGAD50000001467	Spatial transcriptomics analysis for 43 breast cancer lobular samples. Note that there are 2 runs per sample. Images and read counts will be made available on Zenodo upon publication.	Illumina NovaSeq 6000	43
EGAD50000001469	Sample availability: WES was performed on 41 available baseline tumor and blood samples (22 CCRs/MPRs, 5 PPRs, 14 CRN/NPRs). Tumor DNA was extracted from formalin-fixed paraffin-embedded (FFPE) primary tumor and lymph node metastasis sections containing at least 10% viable tumor cells present in the sample. A pathologist (LS) scored the tumor percentage and indicated the most tumor-dense region on a hematoxylin and eosin (H&E) stain slide for subsequent DNA isolation. According to the manufacturer's protocol, five to 10 FFPE slides (10 µm) were used for DNA isolation using the AllPrep DNA FFPE isolation kit (Qiagen, 80234) and the QIAcube. Germline DNA was isolated from baseline PBMCs using the AllPrep DNA / RNA / miRNA Universal isolation kit (Qiagen, 80224) and the QIAcube, according to the manufacturer’s protocol. Genetic Diagnostics and Sequencing Services (CeGaT) performed whole-exome sequencing in Germany. Data processing: Demultiplexing of the sequencing reads was performed with Illumina bcl2fastq (v2.20). Adapters of the reads were trimmed with Skewer (v0.2.2), without quality trimming. Sequencing reads were aligned with BWA (v0.7.17) to the human reference genome GRCh38 (Ensembl, v105). Duplicated reads were marked using Picard (v2.25.0) MarkDuplicates, after which quality scores were recalibrated using GATK4 (4.2.2.0) BaseRecalibrator. FastQC (v0.12.1), MultiQC (v1.14), Mosdepth (v0.3.3) and NGScheckmate (v1.0.1) were used for assessing data quality, on FASTQ files and intermediate processing steps.	Illumina NovaSeq 6000	82
EGAD50000001470	Tumor availability: Bulk RNA sequencing was performed on tumor biopsies of 48 patients at baseline and 45 patients across time points (baseline n=48, week 1 n=38, week 2 n=34, week 4 n=37), including lymph node metastasis that were sequenced in addition to the primary tumor. Tumor RNA was extracted from formalin-fixed paraffin-embedded (FFPE) primary tumor and lymph node metastasis sections containing at least 10% viable tumor cells present in the sample, except for on-treatment samples with a complete pathological response in which we isolated samples with at least 10% immune-infiltration with no viable tumor cells. A pathologist (LS) scored the tumor percentage and indicated the most tumor-dense region on a hematoxylin and eosin (H&E) stain slide for subsequent RNA isolation. According to the manufacturer's protocol, five to 10 FFPE slides (10 µm) were used for RNA isolation using the AllPrep DNA/RNA FFPE isolation kit (Qiagen, 80234) and the QIAcube. Genetic Diagnostics and Sequencing Services (CeGaT) performed RNA sequencing in Germany. Data processing: Demultiplexing of the sequencing reads was performed with Illumina bcl2fastq (v2.20). Adapters of the reads (Pico v2 SMART adapter, first three nucleotides of the second sequencing read) were trimmed with Skewer (v0.2.2), without quality trimming. RNA-sequencing data was aligned to GRCh38 (Ensembl v109) using STAR (v2.7.9) in 2-pass mode with default settings. Gene counts were generated with the HTSeq (v2.0.2). FastQC (v0.12.1) and MultiQC (v1.14) were used for assessing data quality, on FASTQ files and intermediate processing steps.	Illumina NovaSeq 6000	157
EGAD50000001471	Raw and processed RNA-seq data from a patient-derived xenograft (yielding 18 samples) and the VOA1066 undifferentiated endometrial carcinoma cell-line (yielding 15 samples). PDX samples were treated with either compound-14, dBRD9, or were untreated. VOA1066 samples were treated with either compound-12, dBRD9, or were untreated. Samples were sequenced on an Illumina NextSeq 500 using 37-bp single-end sequencing. Raw fastq files and gene count tables are included for each sample.	NextSeq 500	33
EGAD50000001472	Raw and processed ATAC-seq data from the VOA1066 undifferentiated endometrial carcinoma cell-line (10 samples). Samples were treated with DMSO, dBRD9, or compound 12, without doxycycline; some samples instead received doxycycline-induced ARID1A treatment, or doxycycline alone. Samples were sequenced on Illumina NextSeq 500 using 37 bp paired-end sequencing parameters. Raw fastq files, bigwig files giving normalized genomic coverage per region, and called broad peaks are included for each sample.	NextSeq 500	10
EGAD50000001473	ChIP-seq data of 18 samples from the VOA1066 undifferentiated endometrial carcinoma cell-line with various pulldowns: BRG1 (abcam, ab110642, GR3255117-11), BRD9 (CST, E4Q3F, Lot 1), ARID1A (CST, D2A8U, Lot 4), SS18 (CST, D6I4Z, Lot 1), PBRM1 (CST, D3F70, Lot 3), GLTSCR1 (CST, E6I3A, Lot 1), H3K27ac (CST, C36B11, Lot 16), or none. Samples were either untreated or received doxycycline-induced ARID1A treatment. Samples were sequenced on Illumina NextSeq 500 using 37 bp single-end sequencing parameters. Raw fastq files, along with processed bigwig files containing normalized genomic coverage, and called narrow peaks, are included for each sample.	NextSeq 500	18
EGAD50000001475	To assess the DNA methylation landscape of naïve human embryonic stem cells derived from ERKi or control blastocysts using PXGL or UXGL media.	unspecified	16
EGAD50000001477	Phenotypic data, whole genome sequencing, blood RNA-seq, imputed variant calls, cell type estimates, gene expression levels, splicing measurements, and analysis covariates from South African Blood Regulatory (SABR) resource participants, which includes South Eastern Bantu-speaking groups.	Illumina NovaSeq 6000	754
EGAD50000001482	WES (using the SureSelect Human All Exon V5 + UTRs target enrichment kit) in 23 Angolan and 7 Cape Verdean triple-negative breast cancer samples	Illumina NovaSeq 6000	46
EGAD50000001486	The dataset contains six composite lymphoma and leukemia cases (four composite B-cell lymphomas (case 1-4), and two combinations of B-cell and T-cell lymphoma (case 5+6)). Two tumor and one non-tumor (NTC) sample were sequenced from each case after library prep with NEBNext Ultra DNA Library Prep Kit. Exome capturing was done with NimbleGen SeqCap EZ Choice kit, and libraries were sequenced on an Illumina HiSeq 2000 HO and given as fastq files. WGA (Qiagen REPLI-g kit) was used for case 2 and 3. Case 1-4 were microdissected, case 5+6 were flow-sorted.	NextSeq 2000	99
EGAD50000001487	This dataset contains 85 WES bam/bai files of HNSCC cancer patients as part of the LIONESS study	Illumina NovaSeq 6000	76
EGAD50000001488	This dataset consists of 694 bulk RNA-seq data (sorted aligned BAM files) from whole blood of ALS participants meeting the Gold Coast definition of ALS during their clinical visit (n=422) and healthy controls (n=272) at the University of Michigan (UM). Samples were collected as part of the UM Pranger ALS Clinic between 2011-2021, and prepared using either poly-A selection or rRNA depletion (see Experiments). RNAseq was performed on the NovaSeq 6000 S4 at the UM Advanced Genomics Core, resulting in 150bp paired-end reads. Raw reads were trimmed using Cutadapt (v2.3) and aligned to hg38 using STAR (v2.5.3a). Age provided is the age at blood sample collection date.	Illumina NovaSeq 6000	693
EGAD50000001489	This dataset comprises two mtscATAC-seq libraries (customized 10X ATAC-seq) generated from donor M80 and donor H05, and one DOGMA-seq library (customized 10X multiome) generated from donor H05, all derived from primary human peripheral blood mononuclear cells. Patient M80 is an 80-year-old male with mitochondrial encephalomyopathy, lactic acidosis, and stroke-like episodes (MELAS), while donor H05 is a healthy 5-year-old pediatric donor. The raw FASTQ files have been provided.	Illumina NovaSeq 6000 Illumina NovaSeq X	2
EGAD50000001490	This dataset is composed of 159 RNAseq data (fastq, bam and count tables) and 35 targeted DNAseq data (fastq).	Illumina NovaSeq 6000	194
EGAD50000001491	The dataset comprises multi-omics profiles from 232 biopsies, including 217 from patients with large B-cell lymphoma (114 newly diagnosed and 103 relapsed/refractory) and 15 benign control samples. Specifically: single-nucleus RNA sequencing (snRNA-seq) was performed on all 232 biopsies; single-nucleus ATAC sequencing was conducted on 96 biopsies; bulk RNA sequencing was generated for 208 biopsies; whole-exome sequencing (WES) was performed on 174 tumor biopsies with 118 matched germline samples; Low-pass whole-genome sequencing (lpWGS) was performed on 174 tumor biopsies with 41 matched germline samples. All raw data files are provided in FASTQ format.	Illumina NovaSeq 6000	1062
EGAD50000001493	This dataset consists of WGS fastq files from 58 clonal organoids and 18 fresh-frozen (FT) bulk-tissue samples from surgically resected primary and metastatic tumors before and after anticancer therapies in 6 patients with matched blood control samples from each patient. After DNA extraction from each sample using DNeasy Blood and Tissue Kit (Qiagen), library preparation was done using TruSeq DNA PCR-free kit. Then, paired-end fastq files were generated using Illumina NovaSeq 6000.	Illumina NovaSeq 6000	83
EGAD50000001495	This dataset consists of single-cell RNA-seq profiling of colorectal cancer organoids derived from two patients. The organoids were profiled both after 4 weeks of treatment and after a drug holiday of two weeks. There are 49 RNA sequence files generated using single cell RNA sequencing technology from 10X genomics.	Illumina NovaSeq 6000	49
EGAD50000001496	Pulmonary arterial hypertension (PAH) and hereditary hemorrhagic telangiectasia (HHT) are two distinct vascular diseases linked to impaired signaling through bone morphogenetic protein (BMP) receptor complexes in endothelial cells. Although BMP-9 plays a central role in activating this pathway by binding to ALK1 and BMPR-II, its precise function in the pulmonary microvasculature has remained unclear. In this study, we uncover a previously unrecognized role for BMP-9 in regulating pulmonary vascular architecture and homeostasis. Our findings demonstrate that BMP-9 signaling intersects with VEGF pathways and contributes to the delicate balance between vascular growth and remodeling in the lungs. We also show that disruption of this pathway can shift vascular responses toward an HHT-like state, potentially altering disease susceptibility. These insights offer a unique perspective on how BMP-9 and ALK1 shape pulmonary vascular biology and suggest that targeting this axis could inform future strategies for treating complex vascular diseases such as PAH.	NextSeq 500	5
EGAD50000001497	We profiled large B-cell lymphoma (LBCL) patient plasma samples with ultra-low-pass (shallow) whole genome sequencing. The DNA sequencing was performed on pre-capture genomic libraries, with an intended mean coverage of at least 0.1X. FASTQ files were cleaned by fastp and mapped to the reference human genome (GRCh38) with bwa.	Illumina NovaSeq 6000	160
EGAD50000001499	21 samples were taken from human embryonic heart of PCW 5.5-14 of a total of 13 fetuses. Samples were subject to 10Xgenomics chromium single cell 3' sequencing version 2 or 3. Data was analyzed using STARSolo version 2.7.10a, using human genome GRCh38.p12 and transcript annotations from ENSEMBL release 93 from reference package GRCh38-3.0.0 as available from 10Xgenomics.	Illumina NovaSeq 6000	21
EGAD50000001500	We performed deep single-cell RNA sequencing of two rare glioblastoma cases where tissue could be sampled from tumor core to macroscopically normal cortex. We processed two to three replicates per sample with the 10X Genomics V3.1 kits. Deep sequencing was performed to about 100,000 reads per cell on the Illumina NovaSeq 6000 platform, resulting in a total of 37 BAM files.	Illumina NovaSeq 6000	37
EGAD50000001501	Four single-cell RNA libraries were generated from an unbiased sample of cells of a human four weeks post conception embryo using the 10X Chromium Single-Cell Instrument and NextGEM Single Cell Multiome ATAC+Gene Expression kit (10X Genomics) according to the manufacturer’s protocol.	Illumina NovaSeq 6000	4
EGAD50000001502	Four single-cell ATAC libraries were generated from an unbiased sample of cells of a human four weeks post conception embryo using the 10X Chromium Single-Cell Instrument and NextGEM Single Cell Multiome ATAC+Gene Expression kit (10X Genomics) according to the manufacturer’s protocol.	Illumina NovaSeq 6000	4
EGAD50000001503	This dataset includes small RNA sequencing data from circulating cell-free RNAs (CF-miRNAs). miRNA libraries were prepared using the Qiaseq miRNA Library Kit with adaptations for low RNA input and sequenced on the Illumina NextSeq 550 platform	NextSeq 550	222
EGAD50000001504	This dataset includes small RNA sequencing data from extracellular vesicle-derived RNA. miRNA libraries were prepared using the Qiaseq miRNA Library Kit with adaptations for low RNA input and sequenced on the Illumina NextSeq 550 platform	NextSeq 550	222
EGAD50000001505	This data set contains paired-end NGS files (in FASTQ format) generated from a methylation profiling experiment of cell-free DNA derived from the plasma of sepsis patients and healthy controls. Methylation profiling was performed using the TET-assisted pyridine borane sequencing (TAPS) protocol.	Illumina NovaSeq X	58
EGAD50000001506	This study consists of the spatial transcriptomics data generated by Visium CytAssist Spatial Gene Expression for FFPE (10x) for steatotic liver disease-associated hepatocellular carcinoma (SLD-HCC) (n=7) and non-SLD-HCC (n=5). The goal of this data is to compare SLD-HCC vs non-SLD-HCC as well as response to immunotherapy.	Illumina NovaSeq 6000	12
EGAD50000001507	This study consists of the spatial transcriptomics analysis data generated by CosMxTM Human Universal Cell Characterization RNA Panel (1000-plex); NanoString, USA for n=4 SLD-HCC and n=4 non-SLD-HCC. The goal of this data is to compare SLD-HCC vs non-SLD-HCC as well as response to immunotherapy.		4
EGAD50000001509	This dataset comprises whole-genome Oxford Nanopore long-read sequencing data from six high-grade serous ovarian cancer (HGSOC) patients, including cryopreserved, pre-treatment tumor tissues and matched peripheral blood samples (12 samples in total). High-molecular-weight DNA was prepared using the SQK-LSK114 Ligation Sequencing Kit and sequenced on FLO-PRO114M flow cells (R10.4.1). Basecalling was conducted using the super-accurate (SUP) model, enabling simultaneous detection of 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC). The dataset includes BAM files containing base modification information for downstream genomic and epigenomic analyses.	PromethION	12
EGAD50000001510	Peripheral blood mononuclear cells (PBMCs) were sampled from 3 healthy donors to sequence IgM heavy chain V(D)J repertoire libraries. Libraries were generated by targeting IgM heavy chain V(D)Js with a 5’MTPX primerset and sequenced on an Illumina MiSeq machine. Each donor was sequenced 8 times, varying only PCR amplification conditions, for a total of 24 sequencing libraries and therefore 48 paired-end fastq files.	Illumina MiSeq	3
EGAD50000001511	Dataset contains all newly generated single cell RNAseq data generated for the study. It includes 5 EMM soft tissue samples, 5 EMM bone marrow samples (EMM_BM) and 6 RRMM BM samples without EMM (RRMM_BM). Samples that are sequenced twice are marked by _1 or _2 in the sample name. The sample name mentioned corresponds to the sample name used in the publication. The rest of the EMM single cell RNAseq samples mentioned in the publication, EMM_03_1, EMM_11, EMM_14_1, EMM_14_2, EMM_16, and EMM_15_1, are already uploaded in the EGA archive under dataset ID EGAD50000000053.	Illumina NovaSeq 6000 Illumina NovaSeq X	18
EGAD50000001513	Single-cell G&T seq from an untreated PDX mouse with neuroblastoma (MYCN amplified).	unspecified	96
EGAD50000001514	Single-cell G&T seq from an untreated PDX mouse with neuroblastoma (MYCN amplified).	unspecified	96
EGAD50000001515	The SHH medulloblastoma tumor 7316-178 was received through the Childhood Brain Tumor Network (CBTN). From the patient tumor, disassociated cryopreserved cells stored in 10% DMSO/FBS were used. At least 50 mg of tissue (1 M cells) was used for both samples. Disassociated cells were prepared for Single Cell Multiome ATAC + Gene Expression sequencing (10× Genomics) according to the manufacturer’s instructions. Sequencing was performed on an Illumina NovaSeq S4 200 to a depth of at least 250 M reads for snATAC-seq and 200 M reads for snRNA-seq.	Illumina NovaSeq 6000	30
EGAD50000001517	Peripheral-blood or monocyte DNA from two healthy donors and two individuals with multiple sclerosis were sequenced on a PromethION-2 Solo (P2 Solo) instrument using R10.4.1 flow-cells and the Ultra-Long DNA Sequencing Kit (SQK-ULK114). For each participant we provide FASTQ reads (on-target), and haplotype-resolved assemblies of the complete IGH locus (~1.3 Mb) in FASTA format.	PromethION	4
EGAD50000001518	This dataset includes: amplicon sequencing data for a total of 3 BCP-ALL patients - fastq files	Illumina MiSeq	5
EGAD50000001519	This dataset includes: whole exome sequencing data for a total of 23 BCP-ALL patients - BAM files	Illumina NovaSeq X	46
EGAD50000001520	Liquid Biopsy High Grade Serous Ovarian Cancer WGS.	Illumina NovaSeq 6000	17
EGAD50000001521	This dataset contains ~1.2 TB RNA sequencing data in fastq format from 112 samples of fresh-frozen ovarian tumours from a total of 111 women, two samples being replicates. The samples were collected from the U-CAN collection at Uppsala Biobank and include both benign (n = 18) and malignant (n = 94) tumours. The RNA sequencing samples were sequenced using paired-end sequencing (2x150 bp) on an Illumina NovaSeq 6000 instrument at the SciLifeLab National Genomics Infrastructure (NGI) in Uppsala.	Illumina NovaSeq 6000	112
EGAD50000001522	Here, we conducted peripheral blood CD34+ cells single-cell RNA sequencing of patients with sickle cell anemia (n=4) and healthy donors (n=3)	Illumina NovaSeq X	7
EGAD50000001523	PCR-based VDJ profiling in IGH+ and IGHUND HGBCL-DH-BCL2(-BCL6) lymphomas	Illumina MiSeq	22
EGAD50000001524	GCB DLBCL MB2 DE and HGBCL-DH-BCL2(-BCL6) cases classified according to IGH status in IGH+ or IGH-undetectable	NextSeq 550	22
EGAD50000001525	High-throughput sequencing of IGHG1 alleles amplified from genomic DNA and the IGG1 (IGVH-CG1) functional transcript, flanking the gRNA-targeted region, from IGHUND HGBCL-DH-BCL2 COH-DHL1 cell line. IGHG1 targeting gRNAs were cloned into lentiCRISPRv2-Puro lentiviral vector (Addgene #98290). COH-DHL1 transduced cells were selected in Puromycin for 7 days, and expanded for further 17 days in the absence of any antibiotic selection.	Illumina MiSeq	6
EGAD50000001526	NGS of 5'RACE IGKV/IGLV PCR-products from IGH+ and IGHUND HGBCL-DH-BCL2(-BCL6) lymphomas	Illumina MiSeq	14
EGAD50000001527	HGBCL-DH-BCL2(-BCL6) lymphoma cases, classified according to IGH status in IGH+ or IGH-undetectable	Illumina NovaSeq 6000	12
EGAD50000001528	FL-HGBCL-DH-BCL2(-BCL6) metachronous lymphoma pairs, classified according to IGH status in IGH+ or IGH-undetectable	Illumina NovaSeq 6000	6
EGAD50000001529	scRNAseq dataset of circulating T cells from 3 FL patients (P011, P014 and P020) included in the GALEN clinical trial. There are two time points for each patient: one before treatment (D0) and one at day 7 of Lenalidomide treatment (D7).	Illumina NovaSeq 6000	6
EGAD50000001531	DNAmet analysis of olfactory mucosa (OM) cells derived from cognitively healthy and individuals with AD exposed to traffic-related ultrafine particles (UFPs) for 72h in submerged cultures. The UFPs used for exposures were: A0 and A20. Exposures were compared to the corresponding blank samples.		27
EGAD50000001532	miRNA Sequencing of olfactory mucosa (OM) cells derived from cognitively healthy and individuals with AD exposed to traffic-related ultrafine particles (UFPs) for 72h in submerged cultures. The UFPs used for exposures were: A0 and A20. Exposures were compared to the corresponding blank samples.	NextSeq 550	36
EGAD50000001534	RNAseq data of human single oocytes exposed to THC and its metabolites at either the mean concentration measured in the follicular fluid (THC1, n=95; 25 ng/mL Δ9-THC, 5 ng/mL 11-OH-THC, 50 ng/mL 11-COOH-THC) or a supraphysiologic concentration (THC2, n=93; 100 ng/mL Δ9-THC, 50 ng/mL 11-OH-THC, 200 ng/mL 11-COOH-THC).	Illumina HiSeq 4000	86
EGAD50000001536	This dataset contains clinical outcomes for 344 Diffuse Large B-Cell Lymphoma patients. For each subject we have recorded AGE (binarized: > 60, <= 60), SEX (male or female), LDH (serum lactate dehydrogenase), ECOG (Performance status 3 or 4), STAGE (Disease stage III or IV according to the Ann Arbor staging system), EXBM (Two or more extranodal sites of disease, i.e. outside of a lymph node), BSYMP (Presence of B symptoms), and BM (Bone marrow involvement). All data are delivered as a single XLSX file with one row per patient.		344
EGAD50000001537	The dataset consists of bulk whole exome sequencing (WES) and whole genome sequencing (WGS) data from baseline and disease-progressed tumors after immune checkpoint inhibitor (ICI) treatment, derived from patients, human cell lines, and mouse models. Whole exome sequencing includes 17 clinical cases and one human cell line model with parental and acquired resistant cell lines (n = 5). Whole genome sequencing encompasses the YUMMER mouse model, including vehicle-treated controls (n = 2) and acquired resistant tumors (n = 3). The sequence reads have been aligned to hg19 and mm9 reference genomes, with mapped files (BAM) made available for analysis.	Illumina HiSeq X Illumina NovaSeq X NextSeq 2000 unspecified	66
EGAD50000001538	We perform genome-wide EM-seq of 14 patients with concurrent matched normal, dysplasia and early gastric samples from South Korea. The dataset contains cram files from 38 samples.	Illumina NovaSeq 6000	38
EGAD50000001539	We generated whole genome sequencing data from 15 pairs of IM and matched normal. The dataset contains cram files from 30 samples.	Illumina NovaSeq 6000	30
EGAD50000001540	We perform targeted DNA sequencing of intestinal metaplasia of 463 IM and paired germline samples. An additional 48 IM and paired germline samples were included as validation cohort. The dataset contains cram files from 1022 IM and paired normal samples.	Illumina NovaSeq 6000	1022
EGAD50000001541	RNA-seq analyses were performed on tumor tissue samples of 115 advanced clear cell renal cell carcinoma patients who participated in the NIVOREN GETUG-AFU-26 trial. The samples were obtained prior to treatment initiation. RNA-seq was performed to evaluate previously published gene expression signatures. These included the IMmotion T effector (CD8A, EOMES, PRF1, IFNG and CD274) and the IMmotion Myeloid (IL-6, CXCL1, CXCL2, CXCL3, CXCL8, and PTGS2) gene signatures, which represent gene expression patterns associated with T effector cells and myeloid cells, respectively, and the JAVELIN Renal 101 Immuno signature (T-cell receptor signaling: CD3G, CD3E, CD8B, THEMIS, TRAT1, GRAP2, CD247; T-cell activation, proliferation, and differentiation: CD2, CD96, PRF1, CD6, IL7R, ITK, GPR18, EOMES, SIT1, NLRC3; Natural killer cell mediated cytotoxicity: CD2, CD96, PRF1, CD244, KLRD1, SH2D1A; Chemokine: CCL5, XCL2; and Other immune response genes: CST7, GFI1, KCNA3, PSTPIP1), which represents gene expression patterns associated with both innate and adaptive immune response.	Illumina HiSeq 4000	115
EGAD50000001542	The single-cell RNA-seq prefrontal cortex sample was selected from a cohort of patients with hydrocephalus ≥18 years of age planning to undergo CSF diversion surgery either with a ventriculoperitoneal shunt placement (VP) or ventriculocisternostomy (VCS). The tissue was dissociated, followed by red cell and debris removal and counting of the cells to be sequenced (scRNA-seq; 10x Genomics kit v2). After dissociation, only non-neuronal cells were retrieved.	Illumina HiSeq 2500	1
EGAD50000001544	Deposited here are time series (6, 24 and 48hrs) bulk RNA-seq data generated from isogenic wild-type (WT) RPE1 untreated or treated with Brequinar (and other dNTP-pool disruptor compounds)	Illumina NovaSeq 6000	48
EGAD50000001545	Transcriptome profiling of CRPS-affected skin using low-input total RNA-seq (paired-end mode with 54 and 68 nt read length for R1 and R2 respectively, Illumina NextSeq 2000). 42 snap-frozen skin biopsies were analyzed together with longitudinal clinical data. The dataset contains FASTQ files, the processed read count matrix, and the corresponding clinical metadata.	NextSeq 2000	42
EGAD50000001546	This dataset consists of whole-genome sequences from 1,364 Korean breast cancers, with transcriptome data available for most cases	Illumina NovaSeq 6000	76
EGAD50000001547	Normal and Tumor paired WGS sequencing of UC Tissues	Illumina NovaSeq 6000	38
EGAD50000001548	Hybrid Capture of PyBKV integration in Urothelial Carcinoma from Kidney Transplant. The Panel was designed using Agilent SureDesign on SureSelect XTHS Hybrid Capture Panel. Sequencing was done on Illumina NovaSeq 6000	Illumina NovaSeq 6000	38
EGAD50000001549	This dataset contains single cell transcriptome sequencing of 10 control samples. Sequencing has been performed on Illumina NovaSeq 6000 using 10xGenomics Chromium Next GEM Single Cell 5 Reagent Kits. Sequencing was always paired. This dataset additionally contains a .QS file from Seurat Analysis performed in R, generated as described in the related publication.	Illumina NovaSeq X	10
EGAD50000001550	The ccRCC-ITH study WES dataset is derived from a morphology-guided multi-regional sequencing dataset used to study the relationship between morphology and genetic evolution. It contains whole exome sequencing data of 18 tumor and 2 adjacent normal samples from 2 UTSW patients, who have consented to depositing their genomic data in a public repository. Tumor samples come from different tumor sites, including primary tumor, thrombus and metastases. WES was performed using 75bp paired-end fragments at an average read depth > 50x on a NovaSeq 6000 platform (Illumina, San Diego, CA, USA). The raw data is in fastq format.	Illumina NovaSeq 6000	20
EGAD50000001551	Data from the D4Z4End2End study (Xiao et al. 2025), for genetic and epigenetic analysis of the D4Z4 macrosatellite in health and disease.	PromethION	12
EGAD50000001552	Using the TaKaRa SMARTer Stranded Total RNA-Seq Kit v2, RNA-seq libraries were generated from 5 ng of RNA extracted from FFPE tissue. A low-input workflow was applied with 5 PCR cycles for the first amplification and 16 for the second. Library quality was assessed by Bioanalyzer and Qubit, and libraries were sequenced (2×150 bp) on the Illumina NextSeq 550 platform.	NextSeq 550	6
EGAD50000001553	RNA was extracted from FFPE human tissue samples and used to prepare RNA-seq libraries with the Illumina Stranded Total RNA Prep Ligation kit with Ribo-Zero Plus. cDNA synthesis was performed starting from 100 ng of DNA-free RNA. Libraries were amplified with 16 PCR cycles and sequenced in paired-end mode (2×150 bp) on the Illumina NextSeq 550 platform. Library quality was assessed with Bioanalyzer and Qubit.	NextSeq 550	6
EGAD50000001555	This dataset includes scRNA-seq fastq files and Seurat counts and metadata for 6 samples of endothelial cells enriched from hiPSC-derived kidney organoids cultured in vitro or transplanted in chicken embryos. Samples were obtained at 2 timepoints: day 7+13 (1 day after transplantation) and day 7+20 (8 days after transplantation).	DNBSEQ-T7	6
EGAD50000001556	Dataset of transcriptomics data from a cohort of individuals diagnosed with ovarian cancer. This dataset includes RNAseq sequencing data from 68 individuals. A total of 109 tumor samples were sequenced. Samples were prepared with the Illumina TruSeq Stranded RNA library. Sequencing data are paired end.	Illumina HiSeq 4000	109
EGAD50000001557	This dataset includes FASTQ files of single-nucleus RNA sequencing of cryopreserved kidney biopsy cores from adult patients with diagnosed primary FSGS (n = 9, all nephrotic), maladaptive FSGS (n = 9, not nephrotic), proteinuric controls (PLA2R-positive membranous nephropathy, n = 3), and healthy controls (n = 4). A total of 120,751 high-quality nuclei were identified, including 2,471 podocytes and 1,574 parietal epithelial cells (PECs). In addition to the raw FASTQ files, the dataset includes processed data files from all 25 samples, generated using Seurat in R: barcode files, features, count matrices, and associated metadata. Details regarding the bioinformatics pipeline can be found at https://github.com/lambrechtslab/FSGS_Deleersnijder_et_al	Illumina NovaSeq 6000	26
EGAD50000001558	The Emirati Genome Project (EGP) Variome comprises allele frequency data derived from 43,608 individuals sequenced as part of the national genome program in the United Arab Emirates. Samples were processed at the M42 EGP Facility and sequenced to a minimum of 30x coverage using Illumina NovaSeq 6000 short-read technology. Variant calling and alignment were performed using the DRAGEN pipeline (v3.9) against the GRCh38 reference genome. This dataset contains a total of 421,605,069 short variants (SNVs and indels), stored in VCF format. Each variant is annotated with population-level metrics in the INFO field, including AC (alternate allele count), AF (alternate allele frequency), RC (reference allele count), and RF (reference allele frequency). For convenience, VCF files are split by chromosome (chr1–22, X, Y), compressed and indexed using bgzip and tabix.	Illumina NovaSeq 6000	24
EGAD50000001559	High-coverage (>30X) whole genome data (FASTQ) for 25 southern African Khoe-San individuals (from five populations). These are the same individuals as in Schlebusch et al. 2020 (EGAD00001006183), but different data.	Illumina HiSeq X	25
EGAD50000001560	High-coverage (>30X) whole genome data (FASTQ) for 29 central African rainforest hunter-gatherer individuals (from five populations) and for 20 central African rainforest hunter-gatherer neighbor individuals (from four populations).	Illumina HiSeq X	49
EGAD50000001562	The dataset included Dynatag data for the occupancy of transcription factors and single nuclei RNA-seq data. Corresponding metadata is provided for both experiments.	NextSeq 2000	52
EGAD50000001563	This dataset includes whole genome sequencing of two patient-derived xenograft (PDX) samples with NUP98-Rearranged Acute Myeloid Leukemia. Bam files are provided for each sample.	Illumina NovaSeq X	2
EGAD50000001564	This dataset includes RNA-seq of two patient-derived xenograft (PDX) samples with NUP98-Rearranged Acute Myeloid Leukemia. Bam files are provided for each sample.	Illumina NovaSeq X	2
EGAD50000001565	Exome sequencing of healthy and M. haemophilum-infected participants	Illumina NovaSeq 6000	7
EGAD50000001566	Whole blood from each healthy control and MH-infected participant was stimulated ex vivo with heat-killed M. abscessus and M. haemophilum or left untreated with saline	Illumina NovaSeq 6000	21
EGAD50000001567	WGS data from the Unicorn-PASS cohort for the Tandem Duplicator Phenotype paper		14
EGAD50000001568	Whole genome sequencing data from infant (<1 y) and paediatric (<18 y) KMT2A-rearranged acute lymphoblastic leukemias. Dataset includes fastq and BAM files from diagnostic and remission (control) samples of 3 patients. Dataset consists of two experiments depending on library preparation method; "Experiment 1" is made using normal WGS library preparation (pcr+), and "Experiment 2" is made using pcr-free library preparation method.	Illumina HiSeq X Illumina NovaSeq 6000	6
EGAD50000001569	This dataset contains raw RNA sequencing data from melanoma patient tumor samples collected before and during immunotherapy. The data support investigation of treatment-related changes in tumor cell populations. Samples were analyzed as part of a study exploring cellular responses to immune-based therapies.	Illumina HiSeq 4000	8
EGAD50000001570	Whole genome Illumina sequencing of cfDNA of 5 metastatic urothelial carcinoma patients. White blood cell (WBC) samples are used as patient matched germline samples (n=5). All files are in paired-end fastq format.	Illumina NovaSeq X	10
EGAD50000001571	Targeted tissue (n=210), cfDNA (n=402), and WBC (n=245) Illumina sequencing of metastatic urothelial carcinoma patients. All cfDNA is collected in the metastatic setting, while archival tissue is from across the disease course. White blood cell (WBC) samples are used as patient matched germline samples. All files are in paired-end fastq format.	Illumina HiSeq 2500	857
EGAD50000001572	This dataset contains whole sequencing data from 30 patients with pediatric AML. For each patient, there is one dataset derived from tumor cells and one dataset derived from a matched non-malignant control.	Illumina NovaSeq 6000	60
EGAD50000001573	This dataset contains fastq-files from single cell 5' RNA sequencing of the AML cell line HNT34 and normal T cells following co-culture with and without an antibody blocking SLAMF6 (TNC-1). The libraries were prepared using 10X GEM-X Universal 5' Gene Expression v3 Reagent Kit. In total, the dataset contains sequenced gene expression libraries from four samples (HNT34 co-cultured with T cells from two different donors; for both donors there is one sample with and one sample without the blocking antibody).	Illumina NovaSeq 6000	4
EGAD50000001574	This dataset contains bam-files from mate-pair whole genome sequencing of 98 AML samples. DNA was extracted from either bone marrow or peripheral blood from primary AML samples. The libraries were prepared using Illumina Nextera mate pair library preparation kit, generating long-insert (2-8 kb) paired end libraries. These were sequenced on an Illumina NextSeq 500 using 2x76bp paired end chemistry. The fastq files generated by sequencing were aligned to the human hg19 reference genome (ucsc.hg19.fasta from the GATK resource bundle) using bwa (0.7.15-r1140) and duplicate reads were identified using samblaster (0.1.24).	NextSeq 500	98
EGAD50000001575	This dataset contains bam-files from whole exome sequencing of 120 paired tumor-normal pairs from AML. Tumor DNA was extracted from either bone marrow or peripheral blood from primary AML samples. Normal DNA was extracted from cultured skin fibroblast samples. The libraries were prepared using the Nextera rapid capture exome kit and sequenced on an Illumina NextSeq 500 using 2x151bp paired end chemistry. The fastq files generated by sequencing were aligned to the human hg19 reference genome (ucsc.hg19.fasta from the GATK resource bundle) using bwa (0.7.9a-r786 or 0.7.15-r1140) and duplicate reads were identified using samblaster (0.1.24).	NextSeq 500	232
EGAD50000001576	This dataset contains fastq-files from bulk RNA sequencing of 120 AML samples. RNA was extracted from either bone marrow or peripheral blood from primary AML samples. The libraries were prepared using Illumina Truseq RNA library preparation kit v2 and sequenced on an Illumina NextSeq 500 using 2x151bp paired end chemistry.	NextSeq 500	120
EGAD50000001577	This dataset contains fastq-files from single cell 3' RNA- and feature barcode sequencing of AML samples using 10X technology (Chromium Single Cell 3ʹ Reagent Kit v3). In total, the dataset contains data from 38 AML samples from either bone marrow or peripheral blood, representing the following genetic subtypes: NPM1-mutated, myelodysplasia related, TP53-mutated, CBFB::MYH11, RUNX1::RUNX1T1, AML without class-defining mutations, and AML meeting criteria of two subtypes. The data consists of sequenced gene expression libraries from 38 samples and feature barcoding libraries covering a panel of 10-14 surface proteins for 34 samples.	unspecified	72
EGAD50000001584	Whole Genome Sequencing data from the COMPASS cohort and trial		536
EGAD50000001586	Mutation analysis of cfDNA from urine (stored at various storage conditions) and plasma using the QIAseq HRR panel	Illumina NovaSeq 6000	82
EGAD50000001587	Shallow WGS data from cell-free DNA extracted from plasma and urine (stored under different storage conditions)	Illumina NovaSeq 6000	154
EGAD50000001588	Mutation analysis in cfDNA from plasma and urine (stored under different storage conditions) from CRC patients using the AVENIO Expanded Kit	Illumina NovaSeq 6000	36
EGAD50000001589	Clonal CRISPRi/a hiPS cell lines were plated to induce spontaneous differentiation in KSR medium (KSR medium: knockout DMEM/F12 (Gibco, 12660012), 2 mM GlutaMAX, 1× NEAAs, 20% knockout serum replacement (Gibco, 10828010) and 0.1 mM β-mercaptoethanol (Merck-Millipore)). non-TC plates and dishes were used. At the day of plating, KSR medium was supplemented with Y-27632 (10 µM). The next day, medium was replaced with fresh KSR medium. Medium changes were performed every other day. EBs were transferred to 15 ml conical tubes and allowed to sediment at room temperature for 5 min. The supernatant was replaced with fresh KSR medium and cells were transferred onto non-TC treated plates and maintained at 37 °C, 5% CO2. On day 12 of differentiation, the KSR medium was replaced with Essential 6 medium (Gibco, A1516401). EBs were collected and subjected to SUM-seq library preparation as described in Lobato-Moreno et al. 2025 Nature Methods.	unspecified	16
EGAD50000001590	We analyzed subcutaneous adipose tissue (scAT) and periadrenal adipose tissue (paAT) from patients with Aldosterone-Producing Adenomas (APA) and, as controls, non-functional adrenal adenomas (NFA). RNA sequencing and immunohistochemistry (IHC, only on scAT) of AT were performed. IHC markers included adipokines (leptin, adiponectin) and transcription factors (e.g., c-jun, CaMKII)	NextSeq 2000	41
EGAD50000001592	The dataset includes 2 samples of Single cell RNA-seq (10X Genomics) from Human Embryonic stem cells. 2 samples with 2 replicates of timepoints 17 and 28 of dopaminergic neuron differentiation.	Illumina NovaSeq 6000	4
EGAD50000001593	Bam-file of the one relapsed KMT2A-r case. Diagnostic (ALLT-375D) and remission (ALLT-375R) samples can be found in EGA submission EGAD50000001568.	Illumina NovaSeq 6000	1
EGAD50000001594	Blood samples were collected from eight clinical trials and a British Columbia-based biobank. All plasma cfDNA and leukocyte DNA samples were processed with uniform methodology, and underwent targeted sequencing using a custom hybridization capture panel and Illumina instruments. Four different generations of custom hybridization capture panel were used, all providing coverage of the complete coding regions of 72 prostate cancer genes, and later panel generations also providing exhaustive coverage of the AR locus (including introns and flanking regulatory regions), as well as a genome-wide SNP grid for copy number analysis.	Illumina NovaSeq 6000	4189
EGAD50000001595	Sequencing dataset for the Predictive Endocrine ResistanCe Index (PERCI) in Breast Cancer cases of the WSG-ADAPT trial (NCT01779206). The Oncomine Comprehensive v3 assay (ThermoFisher) and a customized amplicon-based NGS panel (DERv2) were performed.	Ion GeneStudio S5 Prime	635
EGAD50000001596	This dataset comprises single-nucleus transcriptomic profiles from 10 human neuroblastoma samples. Nuclei were isolated and processed using the SMART-Seq2 protocol to enable full-length, high-resolution RNA sequencing at the single-nucleus level. cDNA libraries were generated and sequenced on the Illumina HiSeq 2500 platform. The resulting raw sequencing data are provided as .fastq files for downstream analysis.	Illumina HiSeq 2500	10
EGAD50000001597	This study presents RNA sequenced from several intestinal phenotypes such as from patients with damaged intestinal mucosa as well as from those without a damaged intestine but still an autoantibody response to tissue transglutaminase (“potential celiac disease”) or those with treated celiac disease. By combining phenotypes in a much larger number of tissue samples from both peripheral blood and mucosal biopsies, we found highly differentially expressed genes, but also identify genes involved in controlling and triggering celiac disease associated changes. This study identifies molecular mechanisms involved in celiac disease and new possible targets for treatment and for identification of individuals at risk.	Illumina NovaSeq 6000	281
EGAD50000001598	This data set contains genomic sequencing data and variant calls for PNG15 and PNG16.	Illumina NovaSeq 6000 PromethION Sequel II unspecified	2
EGAD50000001599	FASTQ-files of RNA-seq experiments of 8 primary human thymocyte subsets, isolated by flow-cytometry (FACS) from neonatal thymus biopsies. All samples were sorted to be negative for the expression of the lineage surface markers CD11c, CD19, CD56, CD141 and CD303 and additionally had the following surface phenotypes: DN1 (CD4-CD8-CD7+CD1a-CD161-CD38-); DN2 (CD4-CD8-CD7+CD1a-CD161-CD38+CD3-TCRab-); DN3 (CD4-CD8-CD7+CD1a+); ISP (CD4+CD8-CD3-TCRab-CD1a+); DPE (CD4+CD8+TCRab-CD69-); DPL (CD4+CD8+TCRab+CD69+); SP4 (CD4+CD8-); SP8 (CD4-CD8+). Samples were sorted from 2-4 pediatric patients undergoing heart surgery.	NextSeq 2000	28
EGAD50000001600	FASTQ-files of EM-seq experiments of 6 primary human thymocyte subsets, isolated by flow-cytometry (FACS) from neonatal thymus biopsies. All samples were sorted to be negative for the expression of the lineage surface markers CD11c, CD19, CD56, CD141 and CD303 and additionally had the following surface phenotypes: DN2 (CD4-CD8-CD7+CD1a-CD161-CD38+CD3-TCRab-); DN3 (CD4-CD8-CD7+CD1a+); ISP (CD4+CD8-CD3-TCRab-CD1a+); DPE (CD4+CD8+TCRab-CD69-); DPL (CD4+CD8+TCRab+CD69+); SP4 (CD4+CD8-). Samples were sorted from 1-2 pediatric patients undergoing heart surgery.	NextSeq 500	12
EGAD50000001601	This dataset contains BCR sequences of buffy coat purified B-cells from 5 healthy donors. For each sample, sequences were obtained in two conditions: with control Aso and Dis Aso treatment.	NextSeq 2000	10
EGAD50000001602	Deposited here are WGS data generated from isogenic wild-type (WT) RPE1 untreated and treated with brequinar for 42 days (and other dNTP-pool disruptor compounds)	Illumina NovaSeq 6000	15
EGAD50000001603	Dataset contains RNA-sequencing data (fastq files) from 46 patients treated on the Australasian Leukaemia and Lymphoma Group (ALLG) ALL09 study (ACTRN12618001734257). All patients were diagnosed with CD19 positive B-cell acute lymphoblastic leukaemia and were negative for Philadelphia chromosome positive disease. Patient samples (bone marrow or peripheral blood) were taken at diagnosis. mRNA was extracted from lymphoblasts and underwent paired-end sequencing using either the Illumina NextSeq (75 bp PE) or MGI DNBSEQ-G400 (100 bp PE) platforms. Transcriptomic data was used for identification of genomic drivers of acute lymphoblastic leukaemia (ALL).	DNBSEQ-G400 NextSeq 500	46
EGAD50000001605	ZFHX4 was identified as a transcription factor specifically induced during the in vitro differentiation of iPSC-derived mDANs. Depletion of ZFHX4 during neuronal differentiation led to a significant reduction in the number of mDANs, while its overexpression did not lead to any changes in mDAN count. Identification of ZFHX4 target genes using CUT&Tag profiling and RNA-seq upon ZFHX4 knock-down (KD) revealed its involvement in cell cycle regulation during mDAN differentiation	unspecified	21
EGAD50000001606	Bulk RNA sequencing of organoids from colon, ileum, jejunum, duodenum and pancreas duct with or without TS2/16 antibody. The extracellular matrix used also differs (BME vs collagen).	NextSeq 2000	31
EGAD50000001607	This dataset includes sequencing data from participants of the LongVar study. Specifically, it comprises low-coverage whole genome sequencing (WGS) data which was derived from cell-free DNA from blood samples. These data were generated to compare imputation accuracy. 133 individuals agreed to share their data, and are healthy volunteers.	Illumina NovaSeq 6000	133
EGAD50000001609	Sequencing was performed on the PromethION Flow Cell R10 (M version) using the P2 Solo Sequencer (MinION release 24.02.16). The library was divided into three portions and loaded in a tapered manner across three time points. At the first time point, the initial portion of the library was loaded, and sequencing began in whole-genome mode (without adaptive sampling), to monitor QC (N50 within the expected range). After one hour, adaptive sampling was activated. A custom BED file was used to define the target regions (N=326). Each region was extended by 20 kb upstream and downstream, and overlapping regions were merged into non-redundant intervals using bedtools, covering a total of 1.3% of the human genome. At the second time point (20–24 hours after sequencing began, or when fewer than 2000 pores remained active), the flow cell was washed following ONT’s Flow Cell Wash Kit protocol, and the second portion of the library was loaded. At the third time point (40–48 hours), the second portion of the library was retrieved, and merged with the third portion. The flow cell was washed and reloaded with the mixed library. POD5 files were basecalled using Dorado v0.8.1 in super-accuracy (SUP) mode with the dna_r10.4.1_e8.2_400bps_SUP@v5.0.0 model. Reads from the initial 1 h Whole Genome Sequencing (WGS) and subsequent 72 h adaptive sampling were filtered for Q-scores >10. Passing reads were demultiplexed using Dorado’s demux function, then combined per sample and mapped to the GRCh38 reference genome using Minimap2 v2.22. Files were converted to CRAM using samtools.	PromethION	17
EGAD50000001613	Whole-exome sequence reads were obtained from a set of 934 unrelated individuals from the study of sepsis and acute respiratory distress syndrome in Spain (GEN-SEP study), using Illumina paired-end reads.		934
EGAD50000001614	RNA-Sequencing data of CAFs (patient derived CAF180 and CAF181 as well as the CT5.3 CAF cell line) and HT-29 tumor spheroid samples performed in the context of the study of the interaction of Fusobacterium nucleatum with CRC CAFs. Fusobacterium nucleatum interacts with cancer-associated fibroblasts to enhance colorectal cancer progression	NextSeq 2000 unspecified	13
EGAD50000001615	68 paired-end raw FASTQ files obtained from 38 tissue sections across 16 hearts, spanning embryonic human heart development between 5.5 and 14 postconceptional weeks. Some libraries were re-sequenced for better coverage. The tissue sections were processed using the 10x Genomics Visium Gene Expression kit, and libraries were paired-end sequenced using Nextseq2000 (Illumina).	NextSeq 2000	68
EGAD50000001616	CLUSTER consortium RNA sequencing dataset from UK JIA patient cohort with documented uveitis status. Peripheral blood mononuclear cells (PBMCs) were isolated from blood samples and cryopreserved. Samples were subsequently thawed and flow-sorted for CD19+ B cells prior to RNA sequencing. Clinical metadata includes patient demographics and uveitis status at time of sampling.	Illumina NovaSeq 6000	133
EGAD50000001618	Six EATLs and three MEITLs (among which 5 had only tumoral tissue available) were sequenced using the xgen research panels v2.0 from IDT DNA (Newark, New Jersey, USA).	Illumina HiSeq 4000	15
EGAD50000001619	Libraries for sixty-eight cases (23 EATL and 45 MEITL) were prepared starting from 1 µg of total RNA using the TruSeq Stranded total RNA Gold kit from Illumina (San Diego, California, USA) and were sequenced in three batches on a HiSeq 4000 machine from Illumina with an average output of 63 million reads per sample (range 49-126).	Illumina HiSeq 4000	75
EGAD50000001620	Fourteen MEITLs and one EATL previously had been sequenced using the SureSelectXT Target Enrichment System (Agilent Technologies, Santa Clara, CA, USA).	Illumina HiSeq 2500	30
EGAD50000001621	Twenty-six EATLs and 22 additional cases of MEITLs (among which 5 had only tumoral tissue available) were sequenced using the xgen research panels v1.0 and v2.0 from IDT DNA (Newark, New Jersey, USA).	Illumina HiSeq 4000	78
EGAD50000001622	Data derived from PBMC and blood samples for the single cell and bulk RNAseq analysis for the sc-DECISION paper. https://doi.org/10.1101/2025.02.04.25321370	Illumina NovaSeq 6000 NextSeq 500	20
EGAD50000001623	Little is known about the effects of ISGs on NK cell responses in chronic viral hepatitis. To determine the expression, regulation and function of ISGs in NK cells, we performed high-throughput single cell RNA sequencing of circulating CD3-CD56+ NK cells from healthy donors (HD) and patients with chronic HBV or HCV infections.	NextSeq 550	16
EGAD50000001625	The dataset contains 18 samples. It includes 12 paired whole genome sequencing, and 2 matched short read and long read RNA sequencing of liver organoids, 2 paired RIP RNA sequencing samples. All the experiments were performed on Illumina platform with raw reads stored in fastq format or bam format.	Illumina NovaSeq 6000 Illumina NovaSeq X	18
EGAD50000001626	New germline WGS data from 70 prostate cancer patients	Illumina NovaSeq 6000	70
EGAD50000001628	This dataset contains raw sequencing reads in FASTQ format from 12 samples, comprising paired tumour and matched blood DNA from six patients diagnosed with poorly differentiated thyroid carcinoma (PDTC). Genomic DNA was extracted from fresh-frozen tumour tissue and peripheral blood using standard protocols. Whole-genome sequencing was performed on the Illumina HiSeq X Ten platform, generating paired-end reads at an average depth of approximately 20X. These data support investigations into the somatic mutational landscape and genomic alterations associated with PDTC.	Illumina HiSeq X	12
EGAD50000001629	This dataset contains fastq files from scRNAseq dataset of myeloid cells from secondary lymphoid organs (lymph nodes and tonsil) from lymphoma patients (3 FL and 3 DLBCL) and controls (3 reactive tonsils and 3 reactive lymph nodes). For HTO libraries, demultiplexing barcodes are available in samples descriptions.	Illumina NovaSeq 6000	8
EGAD50000001630	This dataset comprises molecular and immunogenomic profiles from 60 individual NSCLC patients (30 male, 30 female), with samples collected at multiple timepoints including initial biopsies (IB), resections (RES), and peripheral blood mononuclear cells (PBMC) at different timepoints during treatment. It includes 32 FFPE tumor samples sequenced on Illumina with the FoundationOne panel, 20 FFPE samples sequenced on IonTorrent with the Oncomine Comprehensive Assay Plus, and 41 samples assessed with IonTorrent via Oncomine Immune Response Research Assay panel for targeted gene expression. T-cell receptor (TCR) repertoire profiling was performed on 62 tumor samples (initial biopsies and resections, Oncomine TCR Beta-SR Assay), and 155 blood-derived samples were sequenced with Oncomine TCR Beta-LR Assay. Data files include processed sequencing results in bam format.	Ion GeneStudio S5 Prime Ion Torrent S5 unspecified	310
EGAD50000001631	ATAC-seq performed in OCI-AML22 subpopulations sorted based on CD34 and CD38 status	Illumina NovaSeq 6000	8
EGAD50000001632	Aligned bam files from 5 patients, with sequential samples comprising cfDNA and FFPE-extracted tumour DNA	Illumina NovaSeq 6000	18
EGAD50000001633	single-nucleus RNA seq characterization of human fresh frozen prostate cancer samples and non-malignant controls for the same patients. Biopsies were sectioned in 100 micron thick sections, adjacent to those used for spatial transcriptomics.	NextSeq 2000	16
EGAD50000001634	Spatial transcriptomic characterization of human fresh frozen prostate cancer and non-malignant regions from the same patient. 100 micron thick sections were made from the biopsies, adjacent to snRNA sequencing, and subjected to 10x Genomics Visium ST analysis.	NextSeq 2000	16
EGAD50000001639	This dataset is the immune profiling of the trial patients. Cryopreserved PBMCs obtained from the patients at different time points were processed to single-cell transcriptomic or TCR libraries using 10x next GEM single Cell 5' v2 kit. We also performed whole exome sequencing (WES) of the tumor to identify the mutation burden. The WES data was generated by sequencing company Novogene. In total, 69 single cell libraries, 69 single cell TCR libraries, and 13 tumor WES datasets were generated from 18 trial patients recruited at University of Florida.	Illumina NovaSeq 6000	82
EGAD50000001640	Drug-induced photoreactions following exposure to UVA have been documented for several therapeutic compound classes, including psoralens and fluoroquinolones. We used deep (~80X) whole genome sequencing to examine the mutagenic spectrum of CX5461, a quinolone-derived small molecule, in normal tissues of cancer patients undergoing treatment. We compared the distribution of somatic base substitutions and short indels from skin biopsies taken from non-sun exposed areas pre and post CX5461 exposure of patients. To further characterize the mutational pattern of CX5461 photoreactivity, we compared mutational profiles of human retinal pigment epithelial cells exposed to UVA, CX5461, or a combination of both UV and CX5461.	Illumina HiSeq X Illumina NovaSeq X	38
EGAD50000001641	This dataset includes NGS data of large B-cell lymphoma, including bulk RNAseq data of 71 biopsies and germline WES data of 70 biopsies.	Illumina NovaSeq 6000	141
EGAD50000001642	ATAC-seq performed in peripheral blood samples of patients affected by Acute Myeloid Leukemia (AML)	Illumina NovaSeq 6000	77
EGAD50000001643	We analyzed the genome of plasma cells collected from bone marrow samples or plasmacytoma from patients prior to (when available) and at progression on anti-GPRC5D T cell engagers. Dataset includes tumor and matched normal WGS data aligned to GRCh38.p14 in bam format, as well as tumor small variant and CNV calls.	Illumina NovaSeq X NextSeq 1000	47
EGAD50000001644	RNA sequencing data of tumors from 53 patients with advanced or metastatic cancer	Illumina NovaSeq 6000	133
EGAD50000001645	The data set contains the fastq files of the single cell RNAseq study associated with the iCope manuscript. There are fastq files for gene expression and for tcr enrichment for each sample.	Illumina NovaSeq 6000	34
EGAD50000001646	We profile the whole-transcriptome (bulk RNAseq) of 7 patient-derived Sézary Syndrome (SS) cells to identify expression patterns, functional programs and expressed gene mutations that may provide clues on new therapeutic options for SS patients. The libraries were sequenced on NextSeq500 (Illumina) with a paired-end read length of 2x75bp. Raw data (FASTQ) and obtained processed data (VCF) including all called raw variants are available.	NextSeq 500	7
EGAD50000001647	To investigate gene expression dynamics during myogenic differentiation in myotonic dystrophy type 1, RNA was collected from immortalized control, patient-derived, and CRISPR/Cas9-corrected myoblasts at proliferation (day -2) and at days 1 and 5 of differentiation. High-quality RNA was isolated and sequenced using poly(A)-enriched or rRNA-depleted protocols, yielding 100 bp paired-end reads with high coverage. A subset of samples at day 1 underwent deeper sequencing to enhance transcriptome resolution.	BGISEQ-500	30
EGAD50000001649	BAM files for 42 IDH wildtype, untreated, human glioblastoma samples from the GB-UK cohort, published in Noorani & Haughey et al. Cancer Discovery (2025).	Illumina NovaSeq 6000	42
EGAD50000001650	Sequencing of laser-capture micro dissected colorectal tumour glands using a targeted capture panel. Tumours were previously subjected to WGS in the parent EPICC study. UMI-resolved aligned bam files, aligned to hg38.	Illumina NovaSeq 6000	94
EGAD50000001651	High-throughput amplicon sequencing was performed on CD34+ hematopoietic stem and progenitor cells (HSPCs) derived from 4 healthy donors and one unaffected heterozygous carrier of a CYBB c.252G>A variant causing X-linked granulomatous disease. Targeted genome editing of the CYBA and CYBB genes was performed in the HSPCs using CRISPR-Cas9 and recombinant AAV6. To assess gene editing outcomes in CD34+ HSPCs, the targeted loci were amplified from genomic DNA and sequenced on an Illumina iSeq 100.	Illumina iSeq 100	127
EGAD50000001652	Human brain single nuclei amplified by droplet MDA. Nanopore long-read whole genome sequencing. Two cases of multiple system atrophy, one control.	MinION PromethION	38
EGAD50000001654	SCANDARE (NCT03017573) is a multicentric biobanking study, enrolling adult patients with newly diagnosed head and neck squamous cell carcinoma (HNSCC), triple negative breast cancer (TNBC), ovarian and cervical cancer. Tumor tissue and blood samples are collected at several time points during patient's journey, including at diagnosis, post-neoadjuvant chemotherapy in case of neoadjuvant treatment, at surgery, at recurrence and at disease progression following treatment initiated at recurrence. Since its launch in 2017 at Institut Curie, SCANDARE has enabled the longitudinal collection and preservation of samples for in-depth analyses. Here, we describe molecular alterations in 232 FFPE samples at baseline from 83 HNSCC patients using DNA targeted panel sequencing, with FASTQ files deposited in the dataset.	unspecified	232
EGAD50000001655	SCANDARE (NCT03017573) is a multicentric biobanking study, enrolling adult patients with newly diagnosed head and neck squamous cell carcinoma (HNSCC), triple negative breast cancer (TNBC), ovarian and cervical cancer. Tumor tissue and blood samples are collected at several time points during patient's journey, including at diagnosis, post-neoadjuvant chemotherapy in case of neoadjuvant treatment, at surgery, at recurrence and at disease progression following treatment initiated at recurrence. Since its launch in 2017 at Institut Curie, SCANDARE has enabled the longitudinal collection and preservation of samples for in-depth analyses. Here, we describe molecular alterations in 264 FFPE samples at baseline from 90 HNSCC patients using 3'-tag RNA-seq technology, with FASTQ files deposited in the dataset.	unspecified	264
EGAD50000001657	H3K27Ac and H3K4me3 CUT&Tag fastq data using resected tissue from 6 cell types: Microglia (PU1+), Brain endothelial cells (ERG+), Mural cells (NOTCH3+), Neurons (NEUN+), Oligodendrocytes (OLIG2+) and Astrocytes (RFX4+). Transcription factor CUT&Tag for PU1, ERG and OLIG2. Promoter capture HiC for Primary pericytes, Primary Brain endothelial cells and Astrocytes.	NextSeq 2000	41
EGAD50000001658	SCANDARE (NCT03017573) is a multicentric biobanking study, enrolling adult patients with newly diagnosed head and neck squamous cell carcinoma (HNSCC), triple negative breast cancer (TNBC), ovarian and cervical cancer. Tumor tissue and blood samples are collected at several time points during patient's journey, including at diagnosis, post-neoadjuvant chemotherapy in case of neoadjuvant treatment, at surgery, at recurrence and at disease progression following treatment initiated at recurrence. Since its launch in 2017 at Institut Curie, SCANDARE has enabled the longitudinal collection and preservation of samples for in-depth analyses. Here, we describe molecular alterations in 91 frozen samples (n = 53 at baseline, n = 31 Post-NAC, n= 7 Recurrence, and n = 62 germline DNA) from 64 ovarian cancer patients using WES technology, with FASTQ files deposited in the dataset.	Illumina NovaSeq 6000	153
EGAD50000001659	Longitudinal blood samples for exploratory biomarker analysis were collected from 51 out of 83 patients enrolled in CA011-001 (NCT02419417) study of BMS-986152, totaling 425 samples collected on three different dosing schedules (A, B, and C). Whole transcriptome expression profiling was performed on the collected blood samples using RNASeq, with Illumina TruSeq library preparation and paired sequencing. Gene expression data was quantified as transcripts per million (TPM).	NextSeq 550	51
EGAD50000001661	SCANDARE (NCT03017573) is a multicentric biobanking study, enrolling adult patients with newly diagnosed head and neck squamous cell carcinoma (HNSCC), triple negative breast cancer (TNBC), ovarian and cervical cancer. Tumor tissue and blood samples are collected at several time points during patient's journey, including at diagnosis, post-neoadjuvant chemotherapy in case of neoadjuvant treatment, at surgery, at recurrence and at disease progression following treatment initiated at recurrence. Since its launch in 2017 at Institut Curie, SCANDARE has enabled the longitudinal collection and preservation of samples for in-depth analyses. Here, we describe molecular alterations in 247 frozen samples at baseline and 138 germline DNA from 139 TNBC patients using WES technology, with FASTQ files deposited in the dataset.	Illumina NovaSeq 6000	385
EGAD50000001662	SCANDARE (NCT03017573) is a multicentric biobanking study, enrolling adult patients with newly diagnosed head and neck squamous cell carcinoma (HNSCC), triple negative breast cancer (TNBC), ovarian and cervical cancer. Tumor tissue and blood samples are collected at several time points during patient's journey, including at diagnosis, post-neoadjuvant chemotherapy in case of neoadjuvant treatment, at surgery, at recurrence and at disease progression following treatment initiated at recurrence. Since its launch in 2017 at Institut Curie, SCANDARE has enabled the longitudinal collection and preservation of samples for in-depth analyses. Here, we describe molecular alterations in 108 FFPE samples at baseline and 108 germline DNA from 108 TNBC patients using WGS technology, with FASTQ files deposited in the dataset.	Illumina NovaSeq X	216
EGAD50000001664	This dataset includes sequence files from bulk RNA sequencing of tissue biospecimens from a phase II study investigating epigenetic priming followed by immune checkpoint blockade in platinum-resistant epithelial ovarian cancer (NCT02900560). There are 48 bulk RNA sequencing files from 24 specimens included in this dataset.	Illumina NovaSeq 6000	24
EGAD50000001665	This dataset comprises 166 samples extracted from 1) FFPE blocks of the most recent biopsy (metastasis when applicable, or primary tumor) of the breast tumor and 2) plasma and blood samples at baseline, at Day 1 of Cycle 2, at Day 1 of Cycle 3, and at progression or treatment discontinuation. Targeted sequencing was performed using the GREAT_A_v3 panel.	NextSeq 500 NextSeq 550	166
EGAD50000001666	This data set contains 5 paired fastq files (WGS).	Illumina NovaSeq 6000	4
EGAD50000001668	High-resolution amplicon-based sequencing to screen for PIK3CA hotspot mutations	NextSeq 550	156
EGAD50000001669	High resolution mutation analysis using the AVENIO ctDNA Expanded Kit V2, which targets 77 cancer-associated genes, in cfDNA in metastatic breast cancer.	Illumina NovaSeq 6000	158
EGAD50000001670	mFAST‑SeqS, or modified Fast Aneuploidy Screening Test‑Sequencing System, is a streamlined and cost-effective approach to estimate the fraction of ctDNA in blood samples by detecting genome-wide aneuploidy in cfDNA	NextSeq 500 NextSeq 550	161
EGAD50000001673	Ewing sarcoma is characterized by pathognomonic translocations, most frequently fusing EWSR1 with FLI1 (EF1). In addition, Ewing sarcoma can also display alterations in STAG2, TP53 and CDKN2A (SPC). Starting from Ewing sarcoma-derived human mesenchymal stem cells (MSCpat), we recapitulated this translocation and the SPC alterations using a CRISPR/Cas9 approach and generated a bona fide Ewing sarcoma model (EWIma1) displaying transcriptomic (RNA-seq) and epigenetic (ChIP-seq) hallmarks of EwS.	Illumina HiSeq 2500 Illumina NovaSeq 6000	66
EGAD50000001674	Whole exome sequencing for CIAO clinical trial with tumor and paired normal tissue	Illumina HiSeq 4000	56
EGAD50000001675	Whole exome sequencing for HNSCCs treated with immune checkpoint blockade	Illumina HiSeq 4000	19
EGAD50000001676	RNAseq (QuantSeq) from FFPE tissues for the CIAO clinical trial	Illumina HiSeq 4000	28
EGAD50000001678	DNA samples were quantified with Qubit BR kit on a Qubit2.0 fluorimeter (ThermoFisher Scientific), and integrity was checked using agarose gel electrophoresis. DNA from samples passing the quality control was bisulfite-converted using the EZ DNA Methylation-Gold™ Kit (Zymo Research, Cat. D5005) following the manufacturer's instructions. Then, bisulfite-converted DNA was hybridized into the Methylation EPICv1.0 BeadChip (Illumina) array interrogating > 850,000 CpG sites according to manufacturer's instructions	unspecified	3
EGAD50000001679	For extracellular vesicles (ECV) extraction from plasma, 1M ammonium acetate was added to precipitate ECVs on ice for 45 min. Then, 100 mM ammonium acetate was added to the mixture, and ECVs were precipitated by centrifugation at 20,000g for 30 min. ECVs were washed with 50 mM ammonium bicarbonate (Sigma-Aldrich, Cat. 1066-33-7). Then, 600 µl of 1% ammonium deoxycholate (Sigma Aldrich, Cat. K2755-1MG) were added. The concentration of protein in each sample was measured using a bicinchoninic acid assay (BCA assay). Ammonium bicarbonate was used to dilute 500 µg of protein into a final volume of 500 µl. Next, dithiothreitol (Sigma-Aldrich, Cat. D9779) was added to obtain a final concentration of 20 mM, followed by iodoacetamide (Sigma-Aldrich, Cat. I6125) to a final concentration of 40 mM. Next, trypsin (Roche, Cat. RTRYP-RO) was added to the sample in a 1:25 protein ratio and incubated at 37°C overnight. Next day, formic acid (ThermoFisher Scientific, Cat. 28905) in a final concentration of 0.1% was added and extraction of proteins was done using Empore™ Solid Phase Extraction Cartridges (3M), following manufacturer’s instructions. The eluted samples were then centrifuged for 90 min. using a speed vacuum centrifuge (Thermo, RC1010), followed by snap freezing in liquid nitrogen. Then, the samples were kept in a freeze dryer (LyoDry Compact Benchtop, MechaTech) overnight. Next, the samples were reconstituted in 30 µl of 0.1% formic acid (FA) and an o-Phthaladehyde (Oparil) assay was performed to determine the concentration of each sample. After that, the sample was prepared in a concentration of 0.5 µg/µl using 0.1% FA and alcohol dehydrogenase (ADH). The samples were prepared in glass mass spectrometry vials for proteomic analysis using a Waters Synapt G2Si High-Definition Mass Spectrometry (Waters Corporation) operated by the MassLynx 4.1., 110 min. running time with 2 µl of an injection containing 1 µg of peptide. Quality controls were also run along with samples to guarantee consistency. Pooled quality controls were made from all samples, in which the samples were run at the beginning, middle and end of the mass spectrometry run. Samples were randomized before running the experiment. The proteomic data was then imported into Progenesis software 4.2 (Nonlinear Dynamic, UK) to identify and quantify peptides and proteins.	unspecified	1
EGAD50000001680	This dataset contains single-cell RNA-seq libraries prepared using the Chromium Single Cell 5' Library & Gel Bead Kit v1.1 (10x Genomics). Cells were partitioned into nanoliter-scale GEMs for barcoding and reverse transcription, followed by amplification and purification of cDNAs. Libraries were sequenced on an Illumina NovaSeq 6000 using 150 bp paired-end reads, ensuring high-resolution transcriptomic profiling.	Illumina NovaSeq 6000 NextSeq 550	34
EGAD50000001681	RNA libraries were prepared from 100 ng of total RNA using the Illumina Stranded Total RNA Prep with Ribo-Zero Plus kit, then sequenced (2×150 bp) on the Illumina NextSeq550. Transcript-level quantification was done with kallisto, followed by gene-level summarization (tximport) and differential expression analysis (DESeq2). Quality control included PCA, hierarchical clustering, and deconvolution with CIBERSORTx to exclude non-myeloid genes. Final analyses identified transcriptomic differences among healthy donors and three patient subgroups based on response to dendritic cell therapy.	NextSeq 550	27
EGAD50000001682	This dataset contains a merged VCF file generated from WES data of patients diagnosed with familial Meniere disease (FMD). Variant calling followed GATK best practices using the nf-core/Sarek pipeline (v3), and variants were filtered using genotype-level thresholds consistent with gnomAD filters. Multiallelic variants were split and INDELs were left-aligned during normalization. Variant Quality Score Recalibration (VQSR) was applied separately to SNVs and INDELs using well-established truth sets, with a 90% sensitivity threshold to maximize the detection of rare variants. Final variants were annotated with Ensembl VEP.	Illumina NovaSeq 6000	93
EGAD50000001683	This dataset contains a merged VCF file generated from WES data of patients diagnosed with sporadic Meniere disease (SMD). Variant calling followed GATK best practices using the nf-core/Sarek pipeline (v3), and variants were filtered using genotype-level thresholds consistent with gnomAD filters. Multiallelic variants were split and INDELs were left-aligned during normalization. Variant Quality Score Recalibration (VQSR) was applied separately to SNVs and INDELs using well-established truth sets, with a 90% sensitivity threshold to maximize the detection of rare variants. Final variants were annotated with Ensembl VEP.	Illumina NovaSeq 6000	287
EGAD50000001684	II2_hg38.bwa.QC.vcf III1_III2_hg38.bwa.QC.vcf The VCF files of the mother (II:2) and both affected siblings (III:1+III:2, merged).	NextSeq 550	2
EGAD50000001686	Out of the 2,509 Estonian Microbiome Project participants, a sub-cohort of over 300 individuals provided an additional stool sample after a median follow-up period of 4.4 years. All participants of the EstMB cohort gave informed consent for the data and samples to be used for scientific purposes. At the second time point, the participants were instructed to send the samples by post immediately after taking the sample and to time their sample collection to avoid shipping on weekends. The median time between sampling and arrival at the freezer in the core facility for the second time point was 53 hours and 1 minute (mean: 59 hours and 22 minutes; minimum: 34 minutes; maximum: 168 hours and 7 minutes [note: additionally, one outlying measurement was 508 hours and 27 minutes]). The shotgun metagenomics paired-end sequencing was performed by Novogene Bioinformatics Technology Co., Ltd., using the Illumina NovaSeq 6000 platform, resulting in 4.62 ± 0.44 Gb of data per sample (insert size 350 bp, read length 2 × 250 bp).	Illumina NovaSeq 6000	328
EGAD50000001687	This dataset contains whole genome sequencing (WGS) data from 42 IPMN-PDAC and 12 normal samples, with tumours collected from multiple regions within each tumour to capture intra-tumour heterogeneity. Tumour and matched normal samples were sequenced to study somatic mutations, structural variants, copy number alterations, mutational signatures and clonal evolution during IPMN-PDAC progression. The sequencing was performed using Illumina NovaSeq instrument with 150 bp paired-end reads.	Illumina NovaSeq 6000	54
EGAD50000001689	This dataset comprises single-nucleus transcriptomic profiles from 9 human paraganglioma samples. Nuclei were isolated and processed using the SMART-Seq2 protocol to enable full-length, high-resolution RNA sequencing at the single-nucleus level. cDNA libraries were generated and sequenced on the Illumina HiSeq 2500 platform. The resulting raw sequencing data are provided as .fastq files for downstream analysis.	Illumina HiSeq 2500	9
EGAD50000001690	Seven samples from slice cultures, kept for 4-29 days, from the forebrain of the developing human embryo at ages between post-conceptional week 7.5 and 10.5 were dissociated, and single cells were collected and processed without bias for mRNA-seq using the 10x Genomics Chromium 3' protocol version 3. Libraries were sequenced on an Illumina NovaSeq 6000 and reads were aligned against the human GRCh38 genome.	Illumina NovaSeq 6000	7
EGAD50000001691	Whole-genome sequencing of normal blood and brain tissue using Biomodal multi-omics technology.	Illumina NovaSeq 6000	2
EGAD50000001692	Three AML samples from patients with mutations in DNMT3A, IDH2, TET2, and a control sample of whole genome amplified DNA.	GridION	4
EGAD50000001693	WGS data for paired tumor and normal tissue from 260 patients in the TRACERx ctDNA study. The patient cohort is comprised of NSCLC patients undergoing surgical resection of their tumour. The patients were then followed up for a median of ~5 years after surgery, and serial blood samples were collected. The WGS data here was generated to inform the manufacture of custom ctDNA panels.	Illumina NovaSeq X	520
EGAD50000001694	Bulk RNA-seq of genome-edited HEK293T cell lines engineered to model specific PKD1 variants. These clones were created with the purpose of enabling the exploration of genotype-phenotype correlations in Polycystic Kidney Disease (PKD)	NextSeq 550	12
EGAD50000001696	Targeted DNA sequencing bam files from pre-treatment peripheral blood mononuclear cells collected from patients on the NCT02644369 Phase II trial of Pembrolizumab for metastatic solid tumours		88
EGAD50000001697	Bam files with all reads covering 976 fCpGs identified in Gabbutt and Duran-Ferrer 2025 to develop the EVOFLUx methodology. These include 10 samples, 6 normal B cells, and 2 CLL patients developing RT.	unspecified	10
EGAD50000001700	This data set contains 12 paired fastq files (RNA-seq) from one wild-type iPS cell, two HLA-KO iPS clones and three CD14+ monocytes differentiated from them. The RNA-seq libraries were generated with Illumina TruSeq Stranded Total RNA library prep kit. The libraries were sequenced with Illumina NovaSeq 6000 in paired-end mode.	Illumina NovaSeq 6000	6
EGAD50000001701	The dataset for the study “ctDNA residual disease analyses during perioperative nivolumab or nivolumab plus ipilimumab in resectable diffuse pleural mesothelioma” includes 169 bam files from whole genome sequencing on the Illumina NovaSeq6000 platform. The samples analyzed include tumor (n=28) and normal (n=28) DNA samples, and serial plasma cfDNA (n=113) samples from 28 individuals with resectable diffuse pleural mesothelioma, treated with nivolumab or nivolumab plus ipilimumab on the NCT03918252 trial.	Illumina NovaSeq 6000	169
EGAD50000001702	This dataset will include Spatial Transcriptomics from the 10x Genomics Visium SD technology, Single-Cell RNA-Seq from the 10x Genomics Chromium Flex protocol, Bulk RNA-Seq (sequenced on Illumina), WES (sequenced on Illumina), H&E, and Clinical data from 10 DLBCL patients. For Visium and Chromium data we are sharing the outputs of respectively Space Ranger and Cell Ranger. For Bulk RNA-Seq we are including processed files such as counts.tsv, FPKM.tsv, fusion_genes.tsv, TPM count matrix.tsv, DESeq2 count matrix.tsv, multiqc_report. For WES, we are sharing the presence or absence of single nucleotide variants (SNVs), small insertions and deletions (indels), and copy number variants (CNVs).* Researchers from private or public institutions outside the MOSAIC Consortium will be able to apply to access this data and, pending approval, use the data for their research.		10
EGAD50000001703	This dataset consists of 12 RNA samples derived from peripheral blood mononuclear cells (PBMCs) collected from three healthy donors. The samples cover four experimental conditions: fresh PBMCs (ID: 517210–517212), PBMCs subjected to a single freeze–thaw cycle (ID: 517213–517215), PBMCs refrozen in freezing mix (ID: 517216–517218), and PBMCs refrozen in TRIzol (ID: 517219–517221). Each sample is provided as a FASTQ file, and RNA sequencing was performed using the Illumina platform. This dataset enables the assessment of cryopreservation effects on PBMC transcriptomic profiles.	Illumina NovaSeq 6000	12
EGAD50000001704	This dataset will include Spatial Transcriptomics from the 10x Genomics Visium SD technology, Single-Cell RNA-Seq from the 10x Genomics Chromium Flex protocol, Bulk RNA-Seq (sequenced on Illumina), WES (sequenced on Illumina), H&E, and Clinical data from 15 Ovarian patients. For Visium and Chromium data we are sharing the outputs of respectively Space Ranger and Cell Ranger. For Bulk RNA-Seq we are including processed files such as counts.tsv, FPKM.tsv, fusion_genes.tsv, TPM count matrix.tsv, DESeq2 count matrix.tsv, multiqc_report. For WES, we are sharing the presence or absence of single nucleotide variants (SNVs), small insertions and deletions (indels), and copy number variants (CNVs).* Researchers from private or public institutions outside the MOSAIC Consortium will be able to apply to access this data and, pending approval, use the data for their research.		15
EGAD50000001705	Data from 54 samples, with their 54 paired FASTQ files, sequenced on a NovaSeq 6000 platform. Samples correspond to the three biological replicates of AC16 cells (i.e. AC16, C1, C2). Each biological replicate was edited to carry one of the three genotypes of the rs1136201 variant (i.e. AA, AG, GG), and each genotype was enriched to express HER2 protein at three levels (i.e. low, medium and high expression) as selected by FACS. Additionally, treatment of each sample was performed and sequenced in duplicate (i.e. 1 and 2).	Illumina NovaSeq 6000	54
EGAD50000001706	This dataset will include Spatial Transcriptomics from the 10x Genomics Visium SD technology, Single-Cell RNA-Seq from the 10x Genomics Chromium Flex protocol, Bulk RNA-Seq (sequenced on Illumina), WES (sequenced on Illumina), H&E, and Clinical data from 10 Mesothelioma patients. For Visium and Chromium data we are sharing the outputs of respectively Space Ranger and Cell Ranger. For Bulk RNA-Seq we are including processed files such as counts.tsv, FPKM.tsv, fusion_genes.tsv, TPM count matrix.tsv, DESeq2 count matrix.tsv, multiqc_report. For WES, we are sharing the presence or absence of single nucleotide variants (SNVs), small insertions and deletions (indels), and copy number variants (CNVs).* Researchers from private or public institutions outside the MOSAIC Consortium will be able to apply to access this data and, pending approval, use the data for their research.		10
EGAD50000001707	Somatic L1 retrotransposition was mapped in high-grade serous carcinoma (HGSC) tumors using LDI-PCR/Nanopore-seq method. This method identifies somatic L1 insertions originating from two active L1 loci at chr22q12.1 and chrXp22.2. Multiple tumors from the same patient were utilised to understand L1-mediated intrapatient heterogeneity. Long-read data generated by LDI-Nanopore sequencing was aligned to the reference genome (GRCh38).	MinION	49
EGAD50000001708	scRNA-seq was performed on the BD Rhapsody platform. CD45+ lymphocytes from two FOXN1het patients and one cord blood sample as control were sorted from frozen cryopreserved samples. The samples were split and each sample was stained with 4 Sampletag antibodies before Abseq staining with the BD Immune Discovery panel and 6 additional antibodies was performed to better classify immune cell subsets. The sorted and stained populations were pooled before loading onto the BD Rhapsody cartridge. The manufacturer's protocol was followed for library construction and samples were sequenced on an Illumina NovaSeq 6000.	Illumina NovaSeq 6000	1
EGAD50000001709	Small RNA-Seq dataset of microRNAs found in tear extracellular vesicles (EVs) and hiPSC-derived retinal pigment epithelium (RPE) cells of Usher syndrome patients and control samples. 14 samples were isolated and assessed for quality before preparing libraries using the miRNeasy and TruSeq Small RNA Library Preparation Kit. Sequencing was performed on the Illumina NextSeq 2000 platform, generating paired-end FASTQ files.	NextSeq 2000	14
EGAD50000001710	Direct RNA Sequencing offers the capability of standard transcriptomic analyses + RNA modification analysis, which holds promising applications for future clinical workflows. To this end, we tested the newly available RNA004 kit from ONT on human peripheral blood and provide orthogonal GLORI and IVT measurement for benchmarking of m6A-detection. The samples are from peripheral blood of a healthy control, taken at three different timepoints.	PromethION	3
EGAD50000001711	This is a Phase II clinical trial. This research study is studying a combination of targeted therapies as a possible treatment for estrogen-receptor positive (ER+) endometrial cancer and low-grade serous ovarian cancer. The drugs involved in this study are: - Abemaciclib (also known as Verzenio™) - Letrozole (also known as Femara®) - Metformin (also known as Glucophage®) - Zotatifin (also known as eFT226) - Gedatolisib (also known as PF-05212384)	Illumina NovaSeq 6000	15
EGAD50000001712	The human nasal region is a complex structure derived from neural crest and placodal lineages, but its development remains poorly understood due to limited fetal tissue access and its intricate architecture. To address this, we created a single-nucleus and spatial transcriptomic atlas of the human fetal nasal region using tissue from 10 fetuses between 7 and 12 post-conceptional weeks. Single-nucleus RNA sequencing (snRNA-seq) revealed 34 distinct cell types, including epithelial, cartilaginous, immune, neuronal, glial, muscular, and vascular cells. By integrating snRNA-seq with multiplexed error-robust fluorescence in situ hybridization (MERFISH), we tracked dynamic changes in cell composition and gene expression over time in the olfactory epithelium (OE) and surrounding nasal regions. We identified novel markers of olfactory sensory neuron development and key pathways regulating epithelial patterning and OE morphogenesis. Notably, we observed early molecular signatures consistent with the "one neuron-one receptor" model, with spatially organized expression of 171 olfactory receptor genes. Together, this study presents the first integrated molecular and spatial framework of early human nasal development. It offers an unprecedented molecular blueprint of olfactory system formation and serves as a foundational resource for embryologists and developmental biologists studying sensory neurogenesis, epithelial patterning, and congenital disorders.	NextSeq 500	8
EGAD50000001713	scRNA-seq was performed on the BD Rhapsody platform. ILCs (ILC1, ILC2, ILC3) and NK cells were bulk sorted from fresh cord blood of 3 donors. The 4 samples containing ILCs and NK cells from each donor were stained with Sampletag antibodies (12 samples in total) before Abseq staining with the BD Immune Discovery panel and 6 additional antibodies was performed to better classify immune cell subsets. The sorted populations were pooled before loading onto the BD Rhapsody cartridge. The manufacturer's protocol was followed for library construction and samples were sequenced on an Illumina NovaSeq 6000.	Illumina NovaSeq 6000	1
EGAD50000001714	This dataset contains T cell receptor (TCR) sequencing data from 51 RNA samples, each processed into three independent libraries (153 FASTQ files in total). Sequencing was performed on an Illumina NextSeq 500 platform using the NextSeq 500/550 Mid Output Kit v2.5 (300 Cycles), generating paired-end FASTQ files with dual 10 bp barcodes and 5% PhiX as internal control. Libraries were quantified with the Qubit HS dsDNA kit and assessed for fragment size distribution using the Agilent Tapestation D1000 system before isochoric pooling to 4 nM and sequencing in two independent runs. The resulting dataset enables analysis of TCR β-chain CDR3 sequences for clonal expansion and diversity studies.	NextSeq 550	153
EGAD50000001715	This dataset contains expression profiling by high-throughput sequencing, including gene counts for all samples in two groups. It includes 26 samples in total (Lean healthy group includes 12 samples: s001/SR01_Bright, s002/SR01_Dim, s004/SR02_Bright, s005/SR04_Dim, s006/SR04_Bright, s007/SR05_Dim, s008/SR05_Bright, s009/SR06_Bright, s011/SR07_Dim, s012/SR07_Bright, s013/SR08_Bright, s014/SR08_Dim; DM2 group includes 14 samples: s015/SR201_Bright, s017/SR202_Bright, s018/SR202_Dim, s019/SR203_Bright, s020/SR203_Dim, s021/SR204_Bright, s022/SR204_Dim, s023/SR205_Dim, s025/SR206_Dim, s026/SR206_Bright, s027/SR207_Dim, s028/SR207_Bright, s029/SR208_Bright, s030/SR208_Dim). It is a TAB file.	Illumina NovaSeq 6000	26
EGAD50000001721	This dataset contains 432 plasma cfDNA samples from healthy individuals, comprising a diurnal cohort of 16 participants and a cross-sectional cohort of 144 participants. Samples underwent capture-based target enrichment using a custom probe set covering 4,991 genomic regions, followed by targeted enzymatic methyl-sequencing on Illumina instruments in 150 bp paired-end mode. The dataset consists of raw FASTQ files generated from the sequencing runs, accompanied by a metadata file containing individual demographic details and sampling information.	unspecified	432
EGAD50000001722	This dataset contains the raw count matrix generated in the PTBP1 project. Sequencing data were obtained from bulk directional RNA-seq and RIP-seq experiments performed on a NovaSeq 6000 platform (2×100 bp paired-end reads). Read counts were computed using HTSeq-count, without normalization. The dataset comprises two groups of samples: controls and patients. For each group, both input and immunoprecipitated samples are available. Patient samples correspond to skin biopsies from individuals affected by a neurodevelopmental disorder associated with variable skeletal dysplasia and disproportionate short-limbed short stature.	Illumina NovaSeq 6000	15
EGAD50000001723	This dataset contains 15 paired FASTQ files from the PTBP1 project. Sequencing data were generated using bulk directional RNA-seq and RIP-seq experiments performed on a NovaSeq 6000 platform (2×100 bp paired-end reads). The dataset comprises two groups of samples: controls and patients. For each group, both input and immunoprecipitated samples are available. Patient samples correspond to skin biopsies from individuals affected by a neurodevelopmental disorder associated with variable skeletal dysplasia and disproportionate short-limbed short stature.	Illumina NovaSeq 6000	14
EGAD50000001726	The dataset consists of exome sequencing data from a male patient with a maternally inherited SPTBN1 mutation. The clinical presentations include autism spectrum disorder, cognitive delay, feeding difficulties, gastrointestinal manifestations, and musculoskeletal abnormalities.	unspecified	1
EGAD50000001727	Bam and indexed bam files after removal of duplicates and trimming of the unique molecular identifiers. Sequencing was performed on NovaSeq 6000 platform.		92
EGAD50000001728	Fibroblasts of 4 WS patients and 4 sex-matched controls were transformed into iPSCs, which were then differentiated to Neuroepithelial Stem Cells (NESCs), and further differentiated into neurons. RNA sequencing was performed at the NESC stage and after 4 and 8 weeks of neural development for 3 replicates per individual using the Illumina TruSeq stranded library polyA method on the NovaSeq6000 platform at NGI Stockholm. The data were preprocessed using the nf-core/rnaseq pipeline (https://nf-co.re/rnaseq), which includes quality control, data cleaning, and read alignment to GRCh38 (STAR, v2.6.1d).	Illumina NovaSeq 6000	70
EGAD50000001729	single cell multiome dataset (snRNA-seq + snATAC-seq) from PBMC of patients with long COVID disease.	Illumina NovaSeq 6000	18
EGAD50000001730	This dataset consists of FASTQ files from 8 patients: 4 in LR1 and 4 in LR2. The dataset was generated using 10x Genomics and NovaSeq X.	Illumina NovaSeq X	2
EGAD50000001731	Whole-exome sequencing (WES) data from 13 patients with metastatic melanoma in a TIL-ACT cohort, generated to characterize tumor genomics and neoantigen landscapes relevant to adoptive T cell therapy.	unspecified	26
EGAD50000001732	Associated with CNA differences between RNA-based subtypes of PDAC. Shallow Whole Genome Sequencing (sWGS) generated 47 .bam files from 37 patients. sWGS was used for Copy number aberrations (CNA) determination.	Illumina HiSeq 2500	47
EGAD50000001733	This dataset contains the raw fastq files of RNA and whole exome sequencing of the head and neck organoid biobank. For RNA sequencing, 41 organoid samples collected at different timepoints as well as before and after genetic modification are included. For WES, 25 samples of early organoid cultures or matching tumor tissues are included.	Illumina NovaSeq 6000	66
EGAD50000001737	This dataset comprises fastq or bam files derived from sequencing data of dsDNA and ssDNA libraries from 138 plasma cfDNA samples.	Illumina NovaSeq 6000 Sequel II	138
EGAD50000001738	The dataset encompasses 1110 Runs from the WGSPD Project 3 - Genomic Strategies to Identify High-impact Psychiatric Risk Variants Project	HiSeq X Ten	1110
EGAD50000001739	The dataset encompasses 804 Runs from the WGSPD Project 3 - Genomic Strategies to Identify High-impact Psychiatric Risk Variants Project	HiSeq X Ten	804
EGAD50000001740	1075 members of the LBC1936 were sequenced using the Illumina HiSeq X platform. This dataset contains the bam files.	HiSeq X Ten	1073
EGAD50000001742	Neuronal cells derived from female healthy control hiPSC lines LUMCi003-A, LUMCi003-B and LUMCi023-A were transfected with in total four different splice-switching AONs targeting three different human transcripts. Untransfected samples as well as samples transfected with a scrambled AON were included as controls. Three days after transfection, RNA was isolated for sequencing using Illumina Novaseq6000 in order to perform differential gene expression analysis and differential splicing analysis to determine off-target effects. Sequencing was performed in two batches: one containing the samples from cell line LUMCi003-A and the other containing cell lines LUMCi003-B and LUMCi023-A.	Illumina NovaSeq 6000	61
EGAD50000001743	This dataset includes scRNA-seq fastq files and Seurat counts and metadata for 3 samples of hiPSC-derived kidney organoid cells cultured in vitro in presence of IFN-γ. Samples were obtained from kidney organoids with APOL1 G1G1 variant, APOL1 G2G2 variant and an isogenic control.	Illumina NovaSeq 6000	3
EGAD50000001744	WGS from KeyLargo from paired tumor (tissue) and normal (blood) samples	Illumina NovaSeq X Plus	50
EGAD50000001745	Targeted panel sequencing was conducted on patient-derived cell lines: Mel-DCC-11 (the parental line, sensitive to the BRAF inhibitor) and Mel-DCC-11-R (resistant to the BRAF inhibitor). The parental cell line was generated from disseminated cancer cells (DCCs) obtained from a lymph node sample of a melanoma patient. The resistant cell line was generated from the parental one through exposure to Vemurafenib. Normal peripheral blood lymphocytes from the same patient served as the control for comparison. Sequencing on NovaSeq6000. Fastq files.	Illumina NovaSeq 6000	3
EGAD50000001746	This dataset contains 48 fastq files of samples derived from CRC patients, including colorectal cancer tissue, normal adjacent tissue and cfDNA samples, processed with the Illumina platform NovaSeq6000.	Illumina NovaSeq 6000	24
EGAD50000001747	Paired-read fastq files were derived from standard Illumina WES NGS sequencing for 103 DLBCL biopsy samples. This is one of three datasets associated with the multi-platform NGS sequencing efforts of the Cornell-NCI DLBCL genomic study.	Illumina NovaSeq 6000	103
EGAD50000001751	Profiling the chromatin accessibility landscape of cultured primary human hepatocytes (PHH) from three female and three male adult, non-diabetic donors.	Illumina HiSeq 4000	6
EGAD50000001752	Transcriptional profiling via high-throughput sequencing was performed on a set of patients from the AVANT and CALGB trials	Illumina HiSeq 2500	1398
EGAD50000001753	This dataset provides 27 phased diploid genome assemblies for Emirati trios, delivered in FASTA format. Assemblies were built with hifiasm (trio mode) from PacBio HiFi reads at ≥25× coverage for each parent and offspring, then scaffolded with NTLink and reference-guided merging using RagTag against CHM13v2. As part of the same program, a complementary set of single-sample diploid assemblies for 30 individuals (yielding 60 haplotypes) is being generated from deep PacBio HiFi data to expand a population-matched pangenome reference. Together, these resources capture Emirati genomic diversity and support downstream variant discovery and annotation.	unspecified	60
EGAD50000001754	This dataset contains a telomere-to-telomere (T2T), trio-based genome assembly in FASTA format for a single female Emirati individual (proband) generated using parental information for accurate phasing. Sequencing data comprised PacBio HiFi (>60X per parent, >120X offspring), ONT ultra-long reads (>110X offspring), and Illumina short-read WGS (>100X for all three individuals). The offspring genome was assembled in a trio framework and finished with NTLink (initial scaffolding), RagTag (reference-guided refinement), and Quartett (gap filling). The resulting contiguous assembly serves as a high-quality, population-relevant reference suitable for downstream variant discovery and integration into the pangenome.	PromethION unspecified	2
EGAD50000001755	This dataset comprises 27 phased diploid genome assemblies (i.e., 54 haplotypes) from Emirati family trios, provided in FASTA format. Sequencing used PacBio HiFi reads generated on the Revio platform at ≥25× coverage for each parent and offspring. Assemblies were produced with hifiasm (trio mode), then scaffolded with NTLink and reference-guided merging via RagTag against CHM13v2. The collection offers haplotype-resolved references suitable for downstream variant discovery and pangenome analyses.	unspecified	54
EGAD50000001756	This dataset is an Emirati telomere-to-telomere (T2T) pangenome graph in GBZ format built from 116 haplotype-resolved assemblies spanning 58 individuals (28 trio-based and 30 single-sample assemblies). Assemblies were generated with long-read sequencing (PacBio HiFi and ONT ultra-long) with standard polishing, then integrated into a graph representation. The genomes show high contiguity (median ≈150 Mb) and high consensus accuracy (median QV 59). The resulting GBZ graph captures globally shared and Emirati-enriched variation, including sequence in complex regions, and serves as a population-matched reference for downstream variant discovery and annotation.	PromethION unspecified	2
EGAD50000001757	Somatic L1 retrotransposition was mapped in high-grade serous carcinoma (HGSC) tumors using LDI-PCR/Nanopore-seq method. This method identifies somatic L1 insertions originating from two active L1 loci at chr22q12.1 and chrXp22.2. Multiple tumors from the same patient were utilised to understand L1-mediated intrapatient heterogeneity. Long-read data generated by LDI-Nanopore sequencing was aligned to the reference genome (GRCh38). This dataset contains the remaining file of this study.	MinION	1
EGAD50000001758	BAM files of healthy controls (3) and Progressive supranuclear palsy (PSP) patients (4). Genomic DNA from donors was fragmented, end-polished, A-tailed, and ligated with Illumina adapters, followed by size selection. Libraries were PCR-amplified (unless PCR-free), purified with AMPure XP, assessed on the Agilent Fragment Analyzer, and quantified using Qubit and qPCR. Qualified libraries were pooled and sequenced on Illumina platforms, and low-quality reads or adapter-contaminated reads were removed. Clean data were then mapped to the hg38 reference genome using BWA.		7
EGAD50000001759	This dataset contains 4 bulk RNA sequencing files (.fastq.gz) and a metadata file (.csv) from Caco-2/TC7 intestinal epithelial cells under normal or inflamed conditions, followed by treatment with 4HTBZ or vehicle. The cDNA libraries were prepared with the QuantSeq 3′ mRNA-Seq Kit (Lexogen, Inc.) and sequenced on an Illumina HiSeq4000 system. The metadata file provides sample information (experimental condition and code identifier). Raw sequencing reads were quality-checked with FastQC, trimmed with Trimmomatic, and aligned to the human reference genome (GRCh38) using HISAT2.	Illumina HiSeq 4000	4
EGAD50000001760	Associated with Molecular classification of small intestinal adenocarcinomas. Shallow Whole Genome Sequencing (sWGS) generated 214 .fastq files from 127 small intestine adenocarcinomas from 125 patients. DNA libraries for shallow whole genome sequencing were prepared using TruSeq Nano kits (Illumina).	Illumina NovaSeq 6000	127
EGAD50000001761	Associated with Molecular classification of small intestinal adenocarcinomas. The TSO500 targeted sequencing panel generated 292 .fastq files from 145 samples (primary and recurrent small intestine adenocarcinomas as well as normal samples) from 133 patients. TruSight Oncology 500 (TSO500; Illumina, San Diego, USA) high-throughput assays were run on genomic DNA samples through next-generation sequencing (NGS) on Illumina’s NovaSeq6000 at GenomeScan.	Illumina NovaSeq 6000	145
EGAD50000001762	Associated with Molecular classification of small intestinal adenocarcinomas. RNA sequencing generated 849 .fastq files from 137 primary and recurrent small intestine adenocarcinoma samples from 135 patients.	Illumina NovaSeq 6000	137
EGAD50000001766	This dataset contains RNA sequencing data in fastq and bam format from 24 samples of human jejunum organoids grown in 3D configuration or as 2D monolayers on anodisc imaging chambers (AICs) in medium formulations OGM, ENR, and ENRRT. The RNA sequencing samples were sequenced using paired-end sequencing (2x150 bp) on an Illumina NovaSeq 6000 instrument at the SciLifeLab National Genomics Infrastructure (NGI) in Uppsala.	Illumina NovaSeq 6000	24
EGAD50000001767	The glioblastoma spatial transcriptomics dataset was generated from 4 FFPE fixed adult glioblastoma samples, using the 10x Genomics Visium platform, producing raw FASTQ files, spatial feature matrices (.h5, .mtx, .tsv), and paired histology images (.tif). Spot-level transcript counts were aligned to tissue architecture and processed in Seurat, with results stored as .rds objects. Data visualization was performed using SpatialFeaturePlot, and quantitative analyses were supported by custom Matlab scripts. The dataset provides both raw and processed files, offering reproducible spatially resolved transcriptomic profiles of glioblastoma tissue sections	Illumina NovaSeq 6000	4
EGAD50000001768	This dataset contains the results of both RNA sequencing and DNA sequencing. For RNA sequencing, it includes data from 12 samples derived from 9 patients, provided as raw sequencing files in FASTQ format. For DNA sequencing, it contains mutation information obtained from targeted panel DNA sequencing, including data from 35 individuals in the mutation annotation file (MAF) format.	Illumina NovaSeq X Plus unspecified	47
EGAD50000001774	Single cell sequencing using the direct nuclear tagmentation and RNA sequencing (DNTR-seq) method was performed on the diagnostic bone marrow sample (ALL40). This dataset consists of the DNA (scWGS) profile from 384 cells. Two fastq files are associated with each cell identifier.	NextSeq 550	384
EGAD50000001775	This dataset contains 144 fastq files sequenced with Illumina HiSeqX.	Illumina HiSeq 4000	144
EGAD50000001776	We performed whole-exome sequencing on Bone-Marrow and Peripheral Blood samples from WM patients, including 9 patients with sequential samples.	Illumina NovaSeq 6000	229
EGAD50000001779	Co-culture experiments of primary lymphoid and DLBCL-derived immortalized FRCs (iFRC) followed by scRNA-seq. iFRCs as well as lymphocytes from DLBCL samples (n=3) were harvested after 24h of incubation and different experimental conditions were multiplexed using in-house cell multiplexing oligonucleotides (CMOs; full list of barcodes provided in Table S9). Cells were incubated with CMOs at a final concentration of 1.8 μM for 20 minutes on ice, followed by four washes with PBS (centrifugation at 400 × g for 3 minutes at 4 °C). Single-cell RNA sequencing (scRNA-seq) and multiplexing libraries were prepared using the 10x Genomics Single Cell 3′ Gene Expression v4 assay, according to the manufacturer’s protocol. Sequencing was performed on an Illumina NovaSeq 6000 platform using paired-end 100 bp reads on an S4 flow cell.	Illumina HiSeq 4000	3
EGAD50000001780	This dataset contains RNA-seq data generated from surgical specimens of proliferative vitreoretinal diseases (including proliferative vitreoretinopathy and epiretinal membranes) and from retinal pigment epithelial cells isolated from post-mortem donor eyes. Total RNA was extracted and cDNA libraries were prepared using the Ovation Solo RNA-seq Kit (Tecan). Sequencing was performed on the Illumina NovaSeq 6000 platform. Raw FASTQ files are included to enable downstream analyses such as gene expression profiling.	Illumina NovaSeq 6000	83
EGAD50000001781	RNA-seq results of sequenced data of patients with t(14;16) translocated T-ALL.	unspecified	9
EGAD50000001782	WGS data for patients with t(14;16) translocated T-ALL.	unspecified	8
EGAD50000001783	Whole exome-sequencing for patients with t(14;16) translocated T-ALL.	Illumina HiSeq 2500	1
EGAD50000001784	ATAC-sequence for patients with t(14;16) translocated T-ALL and control T-ALL patients	unspecified	15
EGAD50000001785	Nanopore sequencing for patients with t(14;16) translocated T-ALL	GridION	5
EGAD50000001786	This dataset contains ChIP-seq data generated from two patient-derived xenograft (PDX) models. For each model, sequencing data are provided from input control (n=1), IgG control (n=2), and FOXF1 immunoprecipitation (n=2). All the 10 samples are in bam file type.	Illumina NovaSeq X	10
EGAD50000001787	This dataset contains H3K27ac HiChIP data from a primary sample and a patient-derived model (PDX) of the novel FOXF1/FENDRR ALL subtype. Both samples are in BAM format.	Illumina NovaSeq X	2
EGAD50000001788	This dataset contains 10x Genomics single-cell RNA-seq data from two primary leukemia samples and their corresponding patient-derived xenograft (PDX) models. All the 3 samples are in bam file type.	Illumina NovaSeq X	3
EGAD50000001789	This dataset contains RNA-seq data from 8 acute leukemia cases (diagnosis or relapse) belonging to a newly identified subtype, termed FOXF1/FENDRR ALL. All the 8 samples are in bam file type.	Illumina NovaSeq X	8
EGAD50000001790	This dataset contains whole-genome sequencing (WGS) data from 14 acute leukemia cases of the newly identified FOXF1/FENDRR ALL subtype, including diagnosis or relapse samples, as well as 7 matched germline samples. All the 21 samples are in bam file type.	Illumina NovaSeq X	21
EGAD50000001791	This dataset contains ATAC-seq data from a primary sample and a patient-derived model (PDX) of the novel FOXF1/FENDRR ALL subtype. Both samples are in BAM format.	Illumina NovaSeq X	2
EGAD50000001792	This dataset contains Whole Genome Sequencing (WGS) data from two cell populations (myeloid and lymphoid) and the germline population sorted from a primary sample of the novel FOXF1/FENDRR ALL subtype. All the three samples are in bam file type.	Illumina NovaSeq X	3
EGAD50000001793	This dataset contains MissionBio Targeted single-cell DNA and Protein sequencing data from a primary sample and a patient-derived xenograft (PDX) model, which is used to investigate the clonal evolution in the FOXF1/FENDRR ALL subtype. All the 2 samples are in bam file type.	Illumina NovaSeq X	2
EGAD50000001795	The dataset includes FASTQ and BAM files from diagnostic and matched remission (control) samples of one TCF3::PBX1 ALL patient (ALLT-376). Sequencing library was made using PCR free-based WGS library preparation (PCR-), using NEB Next® Ultra™ DNA Library Prep Kit. Both diagnostic (90x) and remission (30x) samples were sequenced using HiSeq X.	Illumina NovaSeq 6000	2
EGAD50000001796	The dataset includes FASTQ and BAM files from diagnostic and matched remission (control) samples of one TCF3::PBX1 ALL patient (ALLT-335). Sequencing library was made using standard PCR-based WGS library preparation (PCR+), using NEB Next® Ultra™ DNA Library Prep Kit. Both diagnostic (90x) and remission (30x) samples were sequenced using HiSeq X.	Illumina HiSeq X	2
EGAD50000001797	This RNA-seq dataset comprises 57 samples, from which the proportions of immune cell fractions were estimated using CIBERSORTx.	Illumina NovaSeq 6000	57
EGAD50000001798	The dataset consists of whole exome sequencing FASTQ files and analyzed data, including single nucleotide variants (SNVs) and short insertions and deletions (Indels) of candidate driver genes.	Illumina HiSeq X	129
EGAD50000001799	WXS - Exome capture using the Twist Human Core Exome Kit on germline DNA, paired-end (2x100bp) sequencing on a NovaSeq system. Number of samples: 295	Illumina NovaSeq 6000	295
EGAD50000001800	WGS - PCR-free library preparation (whole blood gDNA), paired-end (2x100bp) sequencing on DNBseq sequencing platform (BGI) Number of samples: 26	unspecified	26
EGAD50000001801	Mononuclear cells were collected from synovial fluid of anti-citrullinated protein antibody (ACPA)-positive rheumatoid arthritis patients (n=8) and ACPA-negative RA patients (n=8). Global cell types (i.e., no enrichment) were obtained from these cryopreserved synovial fluid mononuclear cell (SFMC) samples and immediately fixed and processed using the GEM-X Flex Gene Expression Reagent Kits (10x Genomics) according to protocol. Following Gel Bead-in Emulsion (GEM) generation, samples were processed using the standard manufacturer’s protocol. Once sequencing libraries passed standard quality control metrics, libraries were sequenced on an Illumina NextSeq2000 P4 100 cycle reagent kit with the following read structure: R1: 28, R2: 90, I1: 10, I2: 10. Libraries were sequenced to obtain a read depth greater than 10,000 reads/cell for gene-expression (GEX). FASTQ files are made available. More detailed information can be obtained in Argyriou A. et al, Annals of the Rheumatic Diseases, 2025.	NextSeq 2000	1
EGAD50000001803	Metataxonomic sequencing data targeting the 16S rRNA gene hypervariable regions V3 and V4 on 545 rectal mucus samples from patients suspected to have colorectal cancer and healthy controls.	Illumina MiSeq	545
EGAD50000001805	Dataset contains mRNA capture sequencing data from plasma of 180 different human donors: 132 prostate cancer patients (PRAD) and 48 cancer-free controls. Samples were sequenced on a NovaSeq X and sequencing data is provided in FASTQ format.	Illumina NovaSeq X	180
EGAD50000001806	Dataset contains mRNA capture sequencing data from plasma of 125 different human donors: 88 patients with a non-malignant disease and 37 cancer-free controls. Samples were sequenced on a NovaSeq X and sequencing data is provided in FASTQ format.	Illumina NovaSeq X	125
EGAD50000001808	This dataset contains targeted EM-seq data from 61 plasma cfDNA samples spanning 20 sporadic ALS, 10 C9orf72-ALS, 10 asymptomatic C9orf72 expansion carriers, and 21 non-disease controls. Libraries were prepared with the NEBNext Enzymatic Methyl-seq (EM-seq) kit, captured using the Twist Human Methylome Panel (3.98 M CpGs; 123 Mb), and sequenced on Illumina NovaSeq 6000 (150 bp paired-end; ~45× on-target depth) with unmethylated lambda DNA spike-in and a 20% PhiX balance library. The dataset includes per-sample gzipped FASTQ files (R1/R2) and minimal metadata.	Illumina NovaSeq 6000	60
EGAD50000001809	Exome sequencing data from sixteen phenotypically abnormal human fetal samples.	Illumina NovaSeq 6000	16
EGAD50000001811	RNA sequencing of mCRPC patient biopsies obtained for the profiling of the prevalence of immune transcripts. Manuscript can be found at PMID: 35491356	NextSeq 500	95
EGAD50000001812	scRNA-seq data from circulating gamma delta T cells derived from healthy donors and patients with stage IV melanoma that either responded or did not respond to anti-PD-1 monotherapy or anti-PD-1 and anti-CTLA-4 combination therapy. This dataset includes paired samples from these patients before and 3 months after the start of immunotherapy.	Illumina NovaSeq 6000 NextSeq 2000	28
EGAD50000001813	Here, we applied a whole-genome, tumor-informed approach to study ctDNA in more than 202 advanced-stage cancer patients treated at VHIO. This study demonstrates the efficacy of ultrasensitive ctDNA as a broad biomarker in a large pan-cancer discovery and validation cohort, spanning 24 tumor types treated with diverse immunotherapy modalities. It demonstrates ctDNA’s potential for early response assessment, survival prediction, and distinguishing true progression from pseudoprogression, enhancing its clinical relevance. This dataset captures the plasma sequencing data from the project.	Illumina NovaSeq 6000	383
EGAD50000001814	This dataset includes full tumor transcriptomes from 558 advanced NSCLC tumors. These data originate from pre-treatment samples from a clinical trial for first-line non-small cell lung cancer (IMpower150). The patients in this trial were treated with the PD-L1 inhibitor atezolizumab + bevacizumab + chemotherapy, bevacizumab + chemotherapy, or atezolizumab + chemotherapy.		277
EGAD50000001815	Primary human CD4+ naive T cells were cultured in vitro and sorted based on the defined number of division steps which they have undergone. Of those populations (division 0/div4/div6) EMseq data were generated to assess the whole-genome DNA methylation profile. 2 independent experiments with 2 different donors were conducted	Illumina NovaSeq 6000	6
EGAD50000001816	This dataset contains ChIP sequencing data for β-catenin, H3K27ac and H3K27me3 from three HCC cell lines and five tumour organoids under normal conditions and oleic acid treatment.	Illumina NovaSeq 6000	57
EGAD50000001817	This dataset contains whole exome sequencing data of 13 family members of 5 heritable pulmonary arterial hypertension families.	Illumina NovaSeq 6000	13
EGAD50000001818	Total RNA-sequencing of platelets and immortalized megakaryocyte cell lines for inherited thrombocytopenia. Libraries were prepared using Illumina TruSeq Stranded Total RNA with Ribo-Zero Gold kits. The samples were paired-end sequenced using the NovaSeq 6000 platform at a depth of 60 million reads per sample with a read length of 100 base pairs.	Illumina NovaSeq 6000	62
EGAD50000001821	Eight samples were analyzed on an Oxford Nanopore MinION flow-cell, and to evaluate multiplexing, four samples were run together on one flow-cell on the P2 Solo device. Using adaptive sampling, we targeted 10 regions of different structural variant types, including deletions, translocations and complex rearrangements. MinION sequencing resulted in between 14.1-18.3 Gb of data per flow cell, with mean autosomal on-target coverage of 28.4x and off-target read depth coverage of 5.3x.	MinION	12
EGAD50000001822	The dataset, in .bam file format, consists of whole exome sequencing (WES) data of tumors from patients (n=24) with Renal Medullary Carcinoma (RMC), generated using Illumina NovaSeq 6000 sequencing technology. DNA was extracted from FFPE solid tumor samples using the AllPrep DNA FFPE Kit (Qiagen, CA). Libraries from FFPE tissue were prepared with the SureSelect XT HS2 DNA Kit (Agilent, CA) for exome capture. The All Exon V7 exome probe set (Agilent, CA) was used for hybridization and capture of DNA.	Illumina NovaSeq 6000	24
EGAD50000001825	This dataset consists of 16 FASTQ file pairs containing 3′ RNA-sequencing data of patient-derived fibroblasts and three age- and sex-matched control fibroblast lines (CTR1, CTR2, and CTR3). The patient and CTR1 fibroblast lines have three biological replicates, while CTR2 and CTR3 lines both have one sample. Each sample has two lanes.	Illumina NovaSeq 6000	8
EGAD50000001826	The dataset contains raw data (FASTQ) of whole genome sequencing of C1498 cells (n=1 sample). Libraries were prepared with the Illumina DNA PCR-Free Prep kit (Illumina, San Diego, CA, USA) and 150-bp paired-end reads were generated on the NovaSeq 6000 Sequencing System (Illumina).	Illumina NovaSeq 6000	1
EGAD50000001830	Transcriptional profiles for stemB and proB cells harvested from either primary or secondary xenografts. This dataset contains six types of samples: - diagnostic stemB cells harvested from primary xenografts - diagnostic proB cells harvested from primary xenografts - relapse stemB cells harvested from secondary xenografts injected with stemB cells from primary xenografts - relapse proB cells harvested from secondary xenografts injected with stemB cells from primary xenografts - relapse stemB cells harvested from secondary xenografts injected with proB cells from primary xenografts - relapse proB cells harvested from secondary xenografts injected with proB cells from primary xenografts	NextSeq 500	26
EGAD50000001831	raw fastq files from the single-nuclei RNA-seq experiments generated from 42 paediatric CNS tumour samples by the 10X Genomics technology.	Illumina NovaSeq 6000	42
EGAD50000001832	WGS data for pancreatic cancer samples (COMPASS Trial; Knox et al., Journal of Clinical Oncology, 2025). The dataset includes tumour and normal BAM files sequenced on Illumina HiSeq/NovaSeq and aligned against GRCh38 (n=263 tumour/normal pairs).		536
EGAD50000001833	WGS data for pancreatic cancer samples. These are miscellaneous samples which are part of a tandem duplicator phenotype paper (Farooq et al, NPJ Precision Oncology, 2025). The dataset includes tumour and normal BAM files sequenced on Illumina NovaSeq 6000 and aligned against GRCh38 (n=7 tumour/normal pairs).		14
EGAD50000001834	WGS data from pancreatic cancer samples. These are part of a resected cohort from a tandem duplicator phenotype paper (Farooq et al, NPJ Precision Oncology, 2025). The dataset includes tumour and normal BAM files sequenced on Illumina HiSeq/NovaSeq and aligned against GRCh38 (n=189 tumour/normal pairs).		378
EGAD50000001835	CNS nuclei were isolated from frozen specimens, stained with anti-NeuN and anti-Olig2, FACS purified (DAPI+NeuN-Olig2-), and analyzed with single nucleus RNA-seq. One batch of samples was also labelled with CMO, which allows individual samples to be distinguished within the analysis. A combined Seurat object was generated to include all the analyses, and samples were integrated using Harmony.	NextSeq 1000 NextSeq 500 NextSeq 550	81
EGAD50000001836	We report the first application of long-read single-cell RNA sequencing on nasopharyngeal swabs from COVID-19 patients and healthy controls to resolve transcript-level changes across cell types. Samples were collected from three patients with moderate COVID-19, six patients with critical COVID-19, and three healthy controls. Single-cell libraries were generated using the 10x Genomics Chromium 3' protocol, producing unfragmented cDNA. Libraries were sequenced on the Oxford Nanopore Technologies PromethION platform. Basecalling was performed with Guppy (v1.6), generating a single FASTQ file per sample.	PromethION	12
EGAD50000001837	This dataset contains snRNA-seq data of 5 regionally sampled GBM tissue (peritumoral region, tumor edge, and tumor core). Regionally sampled GBM patient tissue was dissociated and nuclei were processed in an unbiased manner without any sorting procedure. Nuclei dissociation was performed following the nuclei isolation for single-cell ATAC-seq (10x Genomics; CG000169) with matched peri-tumoral and tumor core samples. Subsequent single-cell methylation sample preparation followed Scale Biosciences single-cell methylation kit protocol v1.0. Amplification adaptors through ligation and adding a second barcode through indexed PCR reaction following Scale Biosciences single-cell methylation kit protocol v1.0. We employed the TWIST human methylome panel for targeted enrichment of informative methylation sites.	Illumina NovaSeq X	5
EGAD50000001838	This dataset contains snRNA-seq data of 26 regionally sampled GBM tissue (peritumoral region, tumor edge, and tumor core). Regionally sampled GBM patient tissue was dissociated and nuclei were processed unsorted or by sorting with 7AAD to remove debris and dead cells. A single-nuclei suspension was prepared following the Single Cell Multiome ATAC + Gene Expression Sequencing nuclei isolation protocol using the Chromium Nuclei Isolation Kit; nuclei were then barcoded and RNA and ATAC libraries were constructed, allowing for simultaneous capture of transcriptome and epigenome from the same cells.	Illumina NovaSeq X	26
EGAD50000001839	Paired-end total RNA sequencing was performed by GenomeScan (Leiden, the Netherlands) on the NovaSeq6000 platform for full-length FOS (FOS FL), truncated FOS (FOSΔ), a FOS isoform with an L376N mutation (FOS mut), untransduced fMSC (unt) and fMSC with empty pLV (pLV) at week 0, 1, 2, and 3 in triplicate (total of 60 samples). Raw RNA-seq data were processed by the RNA-seq BioWDL pipeline version 5.0.0, developed by the SASC team at LUMC (Leiden, the Netherlands).	Illumina NovaSeq 6000	60
EGAD50000001840	Sequencing data from mCRPC patients receiving cabazitaxel. Targeted sequencing data of prostate cancer relevant genes using the PCF-SELECT capture panel. cfDNA from baseline, on-treatment and progression samples taken on cabazitaxel was sequenced, as well as patient-matched WBC DNA.	Illumina NovaSeq 6000	348
EGAD50000001843	This dataset contains transcriptome, BCR, TCR and HTO data at the single cell level from follicular lymphoma bone marrow samples at diagnosis (n=19) and one year post rituximab therapy (n=16).	Illumina HiSeq 2000	39
EGAD50000001844	This dataset contains fastq files obtained from paired samples of normal colonic mucosa and tumour tissue, collected from 19 patients diagnosed with MMR-proficient colorectal cancer at an age of 50 years or younger. It includes both whole-exome sequencing and RNA-seq data, generated on the Illumina HiSeq 3000 platform. Additionally, the dataset includes phenotypic information from the patients.	Illumina HiSeq 3000	38
EGAD50000001845	This dataset contains paired-end RNA sequencing data from human T cells treated with ML226, a selective ABHD11 inhibitor, and matched vehicle controls. The experiment was designed to investigate the role of ABHD11 in sterol metabolism and its impact on T cell effector function and autoimmunity. Samples were collected across biological replicates following 24-hour treatment, and RNA was extracted for high-throughput sequencing. Sequencing was performed using an Illumina platform. The dataset supports analysis of differentially expressed genes, alternative transcript usage, and pathway-level effects of ABHD11 inhibition. This dataset is part of the study titled "ABHD11 inhibition drives sterol metabolism to modulate T cell effector function and alleviate autoimmunity".	Illumina NovaSeq 6000	8
EGAD50000001846	Primary DCIS and recurrent DCIS/IBC (total n=82, ipsilateral n=78 and contralateral n=4) as well as non-recurrent DCIS (n=32) cases were identified through hospital databases from Royal Melbourne Hospital (RMH), Nottingham City Hospital (UK), the LifePool cohort and Peter MacCallum Cancer Centre (PMCC). All cases were micro-dissected from haematoxylin stained sections (range 4-20 sections) by manual micro-dissection to achieve >50% tumour tissue purity. DNA was extracted using the FFPE AllPrep kit (Qiagen, MD, USA). Depending on the success and timing of DNA extraction (Figure 1), cases were analysed by a targeted sequencing panel (n=46, samples pre-2020), whole exome sequencing (WES, n=67), or low-coverage whole genome sequencing (LCWGS). WES: DNA was processed by the Australian Genome Research Facility (AGRF) using the Twist Bioscience Human Comprehensive Exome v1 or v2 (Twist Bioscience HQ, South San Francisco, CA, USA) according to the Twist Target Enrichment Standard Hybridisation Protocol. Sequencing was performed on the Illumina NovaSeq 6000 with 150 bp paired-end reads for a median depth of 101.29x, range: 20.39-247x. Whole Genome Sequencing (WGS): Libraries for four paired samples with matched stromal DNA were made, processed and sequenced by AGRF using the IDT xGen Kit (USA) according to the manufacturer’s protocol, aiming to achieve 60x depth for tumour DNA and 30x for matched normal. File type: Paired-end FASTQ.	Illumina NovaSeq 6000 NextSeq 500	196
EGAD50000001849	SCANDARE (NCT03017573) is a multicentric biobanking study, enrolling adult patients with newly diagnosed head and neck squamous cell carcinoma (HNSCC), triple negative breast cancer (TNBC), ovarian and cervical cancer. Tumor tissue and blood samples are collected at several time points during the patient's journey, including at diagnosis, post-neoadjuvant chemotherapy in case of neoadjuvant treatment, at surgery, at recurrence and at disease progression following treatment initiated at recurrence. Since its launch in 2017 at Institut Curie, SCANDARE has enabled the longitudinal collection and preservation of samples for in-depth analyses. Here, we describe molecular alterations in 251 frozen samples at baseline from 143 TNBC patients using shallow WGS technology, with FASTQ files deposited in the dataset.	Illumina NovaSeq 6000	251
EGAD50000001850	Sequence data generated for the study "An allele-resolved nanopore-guided tour of the human placental methylome" includes Illumina WGS, Illumina RNA-seq, Illumina EM-seq, and Oxford Nanopore Technologies PromethION sequence data.	Illumina NovaSeq 6000 Illumina NovaSeq X PromethION	24
EGAD50000001851	low-pass Whole Genome Sequencing (lpWGS) to assess the representativeness of seven body liquids from female patients with metastatic breast cancer. 20 patients. 216 liquid samples, 745 solid samples, 20 matched normal samples. 0.5X coverage. Illumina single read sequencing. Fastq files are provided. lpWGS from Richard et al., Nature Communications, 2025.	Illumina NovaSeq X	981
EGAD50000001852	Whole Exome Sequencing (WES) to explore mutational landscape of blood and non blood liquids in female patients with metastatic breast cancer. 11 patients. 86 liquid samples, 11 matched normal samples. Median coverage 67X. Illumina paired-end sequencing. Fastq files are provided. WES from Richard et al., Nature Communications, 2025. Low-pass Whole Genome Sequencing (lpWGS) to assess the representativeness of seven body liquids from female patients with metastatic breast cancer. 216 liquid samples, 745 solid samples, 20 matched normal samples. 0.5X coverage. lpWGS from Richard et al., Nature Communications, 2025.	Illumina NovaSeq X	97
EGAD50000001855	The dataset contains single-cell RNA-seq data of 73 tumor samples from high-grade serous carcinoma (HGSC) patients sequenced with NovaSeq. The samples were collected at primary, interval, and relapse treatment phases from multiple tissue sites. The files provided are fastq files.	Illumina NovaSeq 6000	73
EGAD50000001856	The dataset includes FASTQ and BAM files from diagnostic and matched remission (control) bone marrow samples of one hypodiploid ALL patient (ALLT-351). The sequencing library was made using the NEBNext® Ultra™ DNA Library Prep Kit. Both diagnostic (60x) and remission (30x) samples were sequenced using HiSeq X.	Illumina HiSeq X	2
EGAD50000001857	Bulk TCR sequences from tumor, peripheral blood mononuclear cell (PBMC), and cell-free DNA (cfDNA) samples from 81 patients with solid tumors treated with pembrolizumab.	Illumina NovaSeq 6000	374
EGAD50000001859	cfDNA shallow Whole-Genome sequencing on 15 samples derived from 5 patients with mTNBC, with 3 technical replicates each.	Illumina NovaSeq 6000	15
EGAD50000001860	sWGS of the blood samples from 14 (fourteen) healthy donors	Illumina HiSeq 2500	14
EGAD50000001861	cfDNA shallow Whole-Genome sequencing from 30 patients with mTNBC in the TONIC trial, including 2 timepoints: baseline and week 12.	Illumina HiSeq 2500	60
EGAD50000001862	cfDNA shallow Whole-Genome sequencing of the remaining patients with mTNBC in the TONIC trial, including 3 timepoints: baseline, week 6 and week 12. The week 6 data of the patients in the pilot run was sequenced here.	Illumina NovaSeq 6000	168
EGAD50000001863	14 TNBC tumors with tumor percentage ~90% were selected for in silico spike-in experiment	Illumina HiSeq 2500	14
EGAD50000001864	20 tumor tissue WES data with paired cfDNA sWGS data. This data is used for concordance analysis between cfDNA-based and tissue-based CNA. The WES libraries were sequenced across 3 independent runs using multiple lanes.	Illumina HiSeq 2500	80
EGAD50000001867	This dataset contains whole-genome shotgun metagenomic sequencing data of 408 rectal mucus samples from patients suspected to have colorectal cancer and healthy controls. Sequencing was performed on the Illumina NovaSeq platform, producing paired-end FASTQ files. These data support microbial community profiling and biomarker discovery analyses.	Illumina NovaSeq 6000	408
EGAD50000001868	This dataset contains sequence data from 154 patients in Australia diagnosed with cutaneous melanoma before the age of 20. Each patient was exome sequenced on the Illumina platform. Generated sequence reads were aligned against version 19 of the human genome assembly and stored in BAM files made available here.	Illumina HiSeq X Illumina NovaSeq 6000	154
EGAD50000001869	Contains a total of 9 patients with whole-genome sequencing of 30 tumours or germline samples, 10X single-cell multiome RNA-seq + ATAC-seq of 24 tumours, 10X 5' single-cell RNA-seq of 10 tumours, and snPATHO-seq of 2 primary tumours. All single-cell experiments have matched whole-genome sequencing. When matched tumour (and germline) whole-genome sequencing is missing, it can be found under EGAS00001006598.	unspecified	40
EGAD50000001870	This study provides a comprehensive benchmarking resource for somatic variant detection in cell-free DNA (cfDNA) from cancer patients. Longitudinal plasma samples from colorectal and breast cancer cohorts were selected to create patient-matched dilution series spanning ultra-low to high circulating-tumour-DNA (ctDNA) fractions, while preserving each individual’s germline and clonal haematopoiesis background. Deep whole-genome sequencing (150×) and ultra-deep whole-exome sequencing (2,000×) generated a reference call set of ~37,000 single-nucleotide variants and ~58,000 insertions/deletions. These data enabled systematic evaluation of nine somatic variant callers across variable ctDNA levels and sequencing depths, and were further used to explore machine-learning–guided parameter tuning. The resulting dataset offers an openly accessible framework for developers and clinicians to assess and optimize somatic variant calling in liquid biopsy applications.	Illumina NovaSeq 6000	12
EGAD50000001871	This dataset comprises enzymatic methylation sequencing (EM-Seq) data generated from 208 rectal mucus samples from patients suspected to have colorectal cancer and healthy controls. Targeted CpG sites were enriched using a panel of genes implicated in colorectal cancer, and libraries were prepared with NEBNext EM-Seq prior to Illumina sequencing. Sequencing was performed on the Illumina NovaSeq 6000 platform, producing paired-end FASTQ files. These data support analyses of DNA methylation patterns in CRC detection and progression.	Illumina NovaSeq 6000	208
EGAD50000001873	This dataset contains raw sequencing data from low-input PCHi-C in human primary CD4+ T cells isolated from healthy donors. The data are released as paired-end Illumina FASTQ files. Promoter Capture Hi-C used DpnII digestion and Tn5-based low-input library preparation, followed by targeted capture with custom-designed biotinylated RNA probes targeting most of the annotated gene promoters.	Illumina NovaSeq 6000	4
EGAD50000001874	scRNAseq dataset of vitiligo dermis and epidermis for 6 patients. For each patient, nonlesional and perilesional skin were sampled and processed separately.	Illumina NovaSeq 6000	12
EGAD50000001875	This dataset contains raw sequencing and processed data of single nuclei transcriptomes. The data were generated with 10x Chromium probe-based single-nucleus assay on FFPE tissues of 63 ST-EPN tumour resections. Raw data contain BAM (and BAM.BAI) files of the sequencing. Processed data contain aligned .h5 counts matrices as generated by 10x CellRanger for raw and filtered running options.	Illumina NovaSeq 6000	63
EGAD50000001876	This dataset contains raw sequencing and processed data of spatially resolved transcriptomes. The data were generated with 10x Visium probe-based assay on FFPE tissues of 30 ST-EPN tumour resections. Raw data contain two-lane fastq files of the sequencing. Processed data contain aligned and filtered .h5 counts matrices as well as positional files as generated by 10x SpaceRanger.	NextSeq 2000	30
EGAD50000001877	Plasma cfDNA sWGS for breast cancer patients (HR+/HER2−, stage II–III), blood drawn pre-operatively (no neoadjuvant), Centre Léon Bérard/MyProbe (France). Libraries: DSP and/or SSP as recorded. Sequencing by IntegraGen (2022–2024); alignment to GRCh38 with BWA-MEM; BAMs sorted/indexed; primary mapped reads only (see analysis metadata for exact filters/MAPQ). Controlled access.	Illumina MiSeq	50
EGAD50000001878	Plasma cfDNA sWGS for NSCLC patients (stage I–III), blood drawn pre-operatively, CHU Montpellier/LUNGDOC study. Libraries: DSP as recorded. Sequencing provider IntegraGen (2022–2024). Alignment to GRCh38 (hg38 canonical), BWA-MEM; BAMs sorted/indexed; only primary mapped reads retained (analysis pages list -F/MAPQ). Controlled access.	Illumina MiSeq	56
EGAD50000001879	Plasma cirDNA sWGS for metastatic colorectal adenocarcinoma at diagnosis prior to systemic therapy. Libraries DSP/SSP (per sample). Sequencing by IntegraGen (2022–2024). Alignment: GRCh38 with BWA-MEM; BAMs sorted/indexed; delivered as primary mapped reads only (filters/MAPQ in analysis metadata). Controlled access.	Illumina MiSeq	30
EGAD50000001880	Paired-end shallow whole-genome sequencing (sWGS) of plasma cfDNA from healthy donors recruited at EFS Montpellier (France). Libraries: DSP (NEBNext Ultra II) and/or SSP (IDT/Swift Accel-NGS 1S Plus). Sequencing by IntegraGen (France, 2022–2024). Reads aligned to GRCh38 (hg38 canonical) with BWA-MEM; BAMs coordinate-sorted and indexed; submission contains primary mapped reads only (unmapped/secondary/supplementary/duplicates removed; typical filter “-F 3332”, MAPQ threshold as noted in analysis records). Each BAM/BAI pair is linked to its Sample and Experiment. Data Use: controlled access.	Illumina MiSeq	62
EGAD50000001881	This dataset includes exome sequencing (ES) from a cohort of patients with infertility. Sequencing was performed on genomic DNA extracted from blood, using Agilent SureSelect XT Low Input exome capture and Illumina NovaSeq 6000 paired-end sequencing and is intended to facilitate research into novel genetic causes of infertility.	Illumina NovaSeq 6000	8
EGAD50000001882	This dataset contains RNA004 DRS data for a single patient sample with heterozygous inactivating SNPs for the methyltransferase METTL5. We provide 1) all reads in a GRCh38 aligned bam file + unaligned reads and 2) filtered reads just for the relevant part of chrR. Both bam files have been called with dorado 0.7.2 to contain the necessary m6A methylation information. Companion datasets are 1) EGAS50000001201 for DRS of a healthy control with intact METTL5 and 2) PRJEB74238 for IVT 18S rRNA vector data as unmodified control. Please note that both companion datasets contain pod5 data to facilitate re-basecalling.	PromethION	1
EGAD50000001883	To investigate the levels of gene expression in clear cell renal cell carcinoma (ccRCC), we performed an RNA-seq analysis in 786-O, A-498, UM-RC-2 clear cell renal carcinoma cell lines, as well as HK-2 normal renal epithelium cell line, cultured under normoxic (20% O2) and hypoxic (0.05% O2) conditions. Data are provided in raw fastq files and csv files with the differential expression results. Differential expression was calculated by the RNAflow pipeline (DESeq2), and log fold changes are calculated with respect to the normal cell line (HK-2), with separate results between normoxic and hypoxic datasets.	Illumina NovaSeq 6000	4
EGAD50000001884	To investigate the 3D chromatin structure in clear cell renal cell carcinoma (ccRCC), we performed ultra high-resolution chromatin capture using MNase (Micro-C) in 786-O, A-498, UM-RC-2 ccRCC cell lines, as well as HK-2 normal renal epithelium cell line, cultured under normoxic (20% O2) and hypoxic (0.05% O2) conditions. Data are provided in raw fastq and the chromatin interactions in hic format. Data were processed using the Juicer pipeline.	Illumina NovaSeq 6000	4
EGAD50000001885	To identify regions of open and accessible chromatin in clear cell renal cell carcinoma (ccRCC), we performed ATAC-seq on the 786-O, A-498, UM-RC-2 ccRCC cell lines, as well as HK-2 normal renal epithelium cell line, cultured under normoxic (20% O2) and hypoxic (0.05% O2) conditions. Data are provided in raw fastq and processed narrowPeak formats. Data were processed using the nf-core atacseq pipeline.	Illumina NovaSeq 6000	4
EGAD50000001886	To investigate the epigenetic landscape of clear cell renal cell carcinoma (ccRCC), we performed ChIP-seq for the CTCF, H3K4me1, H3K4me3, H3K27ac, H3K27me3, H3K36me3, and HIF1A marks in the 786-O, A-498, UM-RC-2 ccRCC cell lines, as well as the HK-2 normal renal epithelium cell line, cultured under normoxic (20% O2) and hypoxic (0.05% O2) conditions. Data are provided in raw fastq and narrowPeak formats. Data were processed using the nf-core chipseq pipeline.	Illumina NovaSeq 6000	4
EGAD50000001888	To overcome the challenges of low DNA yields, degraded DNA by formalin fixation and diluted signal of genomic aberrations by non-carcinoma components in the heterogeneous FFPE samples, we isolated pure carcinoma and stromal cells using the DEPArray™ NxT system, a microchip-based digital sorter that allows isolation of pure, homogeneous subpopulations of cells from FFPE samples. We isolated pure carcinoma and stromal cell populations from 12 FFPE tissues, including tissues from 9 primary and metastatic breast cancer and 3 primary ovarian high-grade serous carcinomas. This was followed by a targeted panel for somatic mutation and sCNA analysis (7 samples), subject to cell availability. Mutation analysis was performed successfully in 6/7 samples and somatic mutations were detected in all of them.	NextSeq 2000	24
EGAD50000001889	To overcome the challenges of low DNA yields, degraded DNA by formalin fixation and diluted signal of genomic aberrations by non-carcinoma components in the heterogeneous FFPE samples, we isolated pure carcinoma and stromal cells using the DEPArray™ NxT system, a microchip-based digital sorter that allows isolation of pure, homogeneous subpopulations of cells from FFPE samples. We isolated pure carcinoma and stromal cell populations from 12 FFPE tissues, including tissues from 9 primary and metastatic breast cancer and 3 primary ovarian high-grade serous carcinomas. This was followed by downstream shallow whole genome sequencing (WGS) for copy number landscape profiling for 10 samples. Seven out of 10 samples (even some with low tumour content or of old age) produced good quality genomic data, detecting sCNA in all carcinoma population samples but not in the stromal populations.	NextSeq 2000	40
EGAD50000001890	To overcome the challenges of low DNA yields, degraded DNA by formalin fixation and diluted signal of genomic aberrations by non-carcinoma components in the heterogeneous FFPE samples, we isolated pure carcinoma and stromal cells using the DEPArray™ NxT system, a microchip-based digital sorter that allows isolation of pure, homogeneous subpopulations of cells from FFPE samples. We isolated pure carcinoma and stromal cell populations from 12 FFPE tissues, including tissues from 9 primary and metastatic breast cancer and 3 primary ovarian high-grade serous carcinomas. This was followed by a targeted panel for somatic mutation and sCNA analysis (7 samples), subject to cell availability. Mutation analysis was performed successfully in 6/7 samples and somatic mutations were detected in all of them.	NextSeq 2000	24
EGAD50000001891	To overcome the challenges of low DNA yields, degraded DNA by formalin fixation and diluted signal of genomic aberrations by non-carcinoma components in the heterogeneous FFPE samples, we isolated pure carcinoma and stromal cells using the DEPArray™ NxT system, a microchip-based digital sorter that allows isolation of pure, homogeneous subpopulations of cells from FFPE samples. We isolated pure carcinoma and stromal cell populations from 12 FFPE tissues, including tissues from 9 primary and metastatic breast cancer and 3 primary ovarian high-grade serous carcinomas. This was followed by downstream shallow whole genome sequencing (WGS) for copy number landscape profiling for 10 samples. Seven out of 10 samples (even some with low tumour content or of old age) produced good quality genomic data, detecting sCNA in all carcinoma population samples but not in the stromal populations.	NextSeq 2000	40
EGAD50000001892	To overcome the challenges of low DNA yields, degraded DNA by formalin fixation and diluted signal of genomic aberrations by non-carcinoma components in the heterogeneous FFPE samples, we isolated pure carcinoma and stromal cells using the DEPArray™ NxT system, a microchip-based digital sorter that allows isolation of pure, homogeneous subpopulations of cells from FFPE samples. We isolated pure carcinoma and stromal cell populations from 12 FFPE tissues, including tissues from 9 primary and metastatic breast cancer and 3 primary ovarian high-grade serous carcinomas. This was followed by downstream shallow whole genome sequencing (WGS) for copy number landscape profiling for 10 samples. Seven out of 10 samples (even some with low tumour content or of old age) produced good quality genomic data, detecting sCNA in all carcinoma population samples but not in the stromal populations.	NextSeq 2000	40
EGAD50000001893	To overcome the challenges of low DNA yields, degraded DNA by formalin fixation and diluted signal of genomic aberrations by non-carcinoma components in the heterogeneous FFPE samples, we isolated pure carcinoma and stromal cells using the DEPArray™ NxT system, a microchip-based digital sorter that allows isolation of pure, homogeneous subpopulations of cells from FFPE samples. We isolated pure carcinoma and stromal cell populations from 12 FFPE tissues, including tissues from 9 primary and metastatic breast cancer and 3 primary ovarian high-grade serous carcinomas. This was followed by downstream shallow whole genome sequencing (WGS) for copy number landscape profiling for 10 samples. Seven out of 10 samples (even some with low tumour content or of old age) produced good quality genomic data, detecting sCNA in all carcinoma population samples but not in the stromal populations.	NextSeq 2000	40
EGAD50000001894	To overcome the challenges of low DNA yields, degraded DNA by formalin fixation and diluted signal of genomic aberrations by non-carcinoma components in the heterogeneous FFPE samples, we isolated pure carcinoma and stromal cells using the DEPArray™ NxT system, a microchip-based digital sorter that allows isolation of pure, homogeneous subpopulations of cells from FFPE samples. We isolated pure carcinoma and stromal cell populations from 12 FFPE tissues, including tissues from 9 primary and metastatic breast cancer and 3 primary ovarian high-grade serous carcinomas. This was followed by downstream shallow whole genome sequencing (WGS) for copy number landscape profiling for 10 samples. Seven out of 10 samples (even some with low tumour content or of old age) produced good quality genomic data, detecting sCNA in all carcinoma population samples but not in the stromal populations.	NextSeq 2000	40
EGAD50000001895	To overcome the challenges of low DNA yields, degraded DNA by formalin fixation and diluted signal of genomic aberrations by non-carcinoma components in the heterogeneous FFPE samples, we isolated pure carcinoma and stromal cells using the DEPArray™ NxT system, a microchip-based digital sorter that allows isolation of pure, homogeneous subpopulations of cells from FFPE samples. We isolated pure carcinoma and stromal cell populations from 12 FFPE tissues, including tissues from 9 primary and metastatic breast cancer and 3 primary ovarian high-grade serous carcinomas. This was followed by downstream shallow whole genome sequencing (WGS) for copy number landscape profiling for 10 samples. Seven out of 10 samples (even some with low tumour content or of old age) produced good quality genomic data, detecting sCNA in all carcinoma population samples but not in the stromal populations.	NextSeq 2000	40
EGAD50000001896	To overcome the challenges of low DNA yields, degraded DNA by formalin fixation and diluted signal of genomic aberrations by non-carcinoma components in the heterogeneous FFPE samples, we isolated pure carcinoma and stromal cells using the DEPArray™ NxT system, a microchip-based digital sorter that allows isolation of pure, homogeneous subpopulations of cells from FFPE samples. We isolated pure carcinoma and stromal cell populations from 12 FFPE tissues, including tissues from 9 primary and metastatic breast cancer and 3 primary ovarian high-grade serous carcinomas. This was followed by a targeted panel for somatic mutation and sCNA analysis (7 samples), subject to cell availability. Mutation analysis was performed successfully in 6/7 samples and somatic mutations were detected in all of them.	NextSeq 2000	24
EGAD50000001897	To overcome the challenges of low DNA yields, degraded DNA by formalin fixation and diluted signal of genomic aberrations by non-carcinoma components in the heterogeneous FFPE samples, we isolated pure carcinoma and stromal cells using the DEPArray™ NxT system, a microchip-based digital sorter that allows isolation of pure, homogeneous subpopulations of cells from FFPE samples. We isolated pure carcinoma and stromal cell populations from 12 FFPE tissues, including tissues from 9 primary and metastatic breast cancer and 3 primary ovarian high-grade serous carcinomas. This was followed by a targeted panel for somatic mutation and sCNA analysis (7 samples), subject to cell availability. Mutation analysis was performed successfully in 6/7 samples and somatic mutations were detected in all of them.	NextSeq 2000	24
EGAD50000001898	To overcome the challenges of low DNA yields, degraded DNA by formalin fixation and diluted signal of genomic aberrations by non-carcinoma components in the heterogeneous FFPE samples, we isolated pure carcinoma and stromal cells using the DEPArray™ NxT system, a microchip-based digital sorter that allows isolation of pure, homogeneous subpopulations of cells from FFPE samples. We isolated pure carcinoma and stromal cell populations from 12 FFPE tissues, including tissues from 9 primary and metastatic breast cancer and 3 primary ovarian high-grade serous carcinomas. This was followed by a targeted panel for somatic mutation and sCNA analysis (7 samples), subject to cell availability. Mutation analysis was performed successfully in 6/7 samples and somatic mutations were detected in all of them.	NextSeq 2000	24
EGAD50000001899	To overcome the challenges of low DNA yields, degraded DNA by formalin fixation and diluted signal of genomic aberrations by non-carcinoma components in the heterogeneous FFPE samples, we isolated pure carcinoma and stromal cells using the DEPArray™ NxT system, a microchip-based digital sorter that allows isolation of pure, homogeneous subpopulations of cells from FFPE samples. We isolated pure carcinoma and stromal cell populations from 12 FFPE tissues, including tissues from 9 primary and metastatic breast cancer and 3 primary ovarian high-grade serous carcinomas. This was followed by a targeted panel for somatic mutation and sCNA analysis (7 samples), subject to cell availability. Mutation analysis was performed successfully in 6/7 samples and somatic mutations were detected in all of them.	NextSeq 2000	24
EGAD50000001902	The current sequencing dataset consists of 85 bisulfite sequencing data files (Fastq files) of human cell-free urine DNA samples. These extracted cfDNA samples were then constructed into DNA libraries using the KAPA HTP Library Preparation Kit (Kapa Biosystems) with bisulfite modification using the EpiTect Bisulfite Kit (Qiagen) and subsequently sequenced on the NextSeq 500 platform (Illumina).	NextSeq 500	85
EGAD50000001903	Read data and BAM alignments for 124 ancient Yakuts.	HiSeq X Five Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 4000 Illumina MiniSeq NextSeq 500	14979
EGAD50000001909	Whole genome sequencing FASTQ files from tumor and normal samples from 202 patients. All experiments reported in this article were performed in the Clinical Laboratory Improvement Amendments (CLIA)-certified and College of American Pathologists (CAP)-accredited laboratories at Personalis Inc., as guided by the Association for Molecular Pathology (AMP) and CAP’s joint recommendations [52]. Preferred input material consisting of FFPE slides (enabling H&E and macrodissection which is part of the standard NeXT Personal process) was not available for the majority (94%) of tumor samples. Most tumor samples (72%) were processed from FFPE curls, with the remaining processed from pre-extracted DNA and fresh frozen material. Where FFPE material on slide was available, tumor sections were macrodissected to increase tumor content, with a minimum allowable cellularity threshold of 20% determined by pathological review. The tumor cellularity threshold of 20% was chosen based on prior work demonstrating that no significant correlation is observed between tumor purity and assay limit of detection (LOD) at this threshold [31]. Genomic DNA was isolated from tumor and normal samples and WGS libraries prepared as previously described [31]. Genomic DNA was isolated from matched tumor and normal samples using the Qiagen AllPrep DNA/RNA FFPE Tissue Kit or the QIAamp DNA Mini Kit (QIAGEN, Germantown, MD, USA) using internally optimized workflows. WGS sequencing libraries were prepared with 120-500 ng of acoustically sheared genomic DNA (Covaris LLC, Woburn, MA, USA) using the KAPA HyperPrep Kit (Roche Sequencing Solutions, Pleasanton, CA, USA) and customized methods. Libraries were cleaned-up using AMPure XP beads and then quantified using the KAPA Library Quantification Kit (Roche Sequencing Solutions, Pleasanton, CA, USA), before being sequenced to 30X depth of coverage using a NovaSeq 6000 instrument (Illumina, San Diego, CA, USA).	Illumina NovaSeq 6000	404
EGAD50000001910	Raw sequencing data for 909 bulk mRNA samples that correspond to 749 metastases, 64 primary tumours and 96 normal tissues. These samples were sequenced using NovaSeq 6000. Libraries were prepared with Lexogen QuantSeq 3' FWD. Raw single-end fastq samples have been deposited as gzipped files.	Illumina NovaSeq 6000	909
EGAD50000001915	Whole-exome sequencing (WES) was performed on 73 tumors, including 58 colorectal cancers (CRC) and 15 urothelial cancers (UC), derived from 58 individuals with Lynch syndrome (LS). The cohort comprised both primary and metachronous tumors.	Illumina NovaSeq 6000	108
EGAD50000001918	This data consists of iPSC derived model systems stimulated with different media (where possible from the top/bottom separately) to provide insight into which more accurately reflects the human gut environment.	unspecified	21
EGAD50000001925	Whole-genome sequencing data from ten present-day individuals from Himachal Pradesh, India. DNA from these ten present-day individuals was converted to double-stranded libraries and sequenced on Illumina NovaSeq 6000 in 150 bp paired-end mode. The dataset consists of raw fastq and bam (hg19) files.	Illumina NovaSeq 6000	10
EGAD50000001926	This dataset contains raw whole-exome sequencing data from 40 samples of intrahepatic cholangiocarcinoma patients, including tumor and matched normal tissue. Samples were sequenced on the NovaSeq platform. The cohort is part of a proteogenomic profiling study integrating genomic and proteomic data.	Illumina NovaSeq X Plus	38
EGAD50000001932	This dataset contains GeoMx digital spatial profiling (DSP) data from 12 FFPE pre-treatment biopsies of metastatic HR-altered breast cancer patients subsequently treated with PARP inhibitors (Olaparib or Talazoparib). Samples were collected from primary breast tumors (n=7) and metastatic sites including lymph nodes (n=2), bone (n=2), and brain (n=1). Tissue sections were stained for pan-cytokeratin (PANCK) to selectively profile tumor cells, with NGS-based transcriptome data generated for both PANCK+ and PANCK- regions. The dataset includes raw data files (176 fastq files) along with clinical annotations.	NextSeq 2000	13
EGAD50000001933	Dataset includes two scRNA-seq samples from control and irradiated organoids. Files included in the dataset are Fastq files. Sample libraries produced with Chromium X were sequenced on an Illumina NovaSeq 6000 sequencer.	Illumina NovaSeq 6000	2
EGAD50000001934	Single cell RNA sequencing and coupled TCR sequencing data generated following immunological treatments in patient-derived ex vivo tumor models. The dataset consists of 46 sequencing samples originating from eight individual patient-derived ex vivo models. Single cell RNA and TCR sequencing data were generated using the 10X Genomics platform with 5' sequencing. The submission contains 46 Fastq files where each patient-derived ex vivo model and its transcriptomic response to a specified treatment is a unique sequencing run. TCR sequencing runs are submitted as one file per individual model, gathering all the treatment conditions.	Illumina NovaSeq 6000	46
EGAD50000001936	This dataset comprises single-cell DNA sequencing data for selected pancreatic ductal adenocarcinoma (PDAC) autopsy samples. It employs a targeted gene panel to identify single nucleotide variants (SNVs) and copy number variations (CNVs) using the Mission Bio Tapestri platform. The samples encompass primary and metastatic tissue from selected PDAC patients. This dataset includes 74 sample-level BAM files and 24 patient-level H5 files, generated by the MissionBio pipeline.	Illumina NovaSeq 6000	98
EGAD50000001950	This dataset includes 25 LUAD samples consisting of control samples derived from lymph node and tumor tissue samples, which were either untreated or treated with formalin and/or the NEBNext FFPE DNA Repair v2 Module. Sequencing was performed on the Illumina NovaSeq X platform using a 2×150 nt configuration, yielding approximately 18 Gbp per sample.	Illumina NovaSeq X	25
EGAD50000001960	Sequencing data from newly diagnosed high metastatic burden prostate cancer patients receiving an AR pathway inhibitor (ARPI) or docetaxel. Targeted sequencing data of prostate cancer relevant genes using the PCF-SELECT capture panel. cfDNA from PreADT, baseline, and on-treatment (Cycle 1 to 6) samples taken on ARPI or docetaxel was sequenced, as well as patient-matched WBC DNA.	Illumina NovaSeq 6000	646
EGAD50000001961	The dataset contains 364 CRAM files from whole-genome next-generation sequencing on the Illumina NovaSeq 6000 at 1-2x coverage. These are plasma samples from healthy individuals and patients with cancer.	Illumina NovaSeq 6000	364
EGAD50000001965	Beta values generated from Illumina EPIC methylation array data for a subset of NICOLA Wave 1 samples.		1976
EGAD50000001975	EM-seq converted WGS for CSF-derived cfDNA from pediatric brain tumor patients	Illumina NovaSeq 6000	268
EGAD50000001976	Whole-transcriptome single-cell RNA sequencing (scRNA-seq) was performed following the manufacturer’s protocol (BD Rhapsody, BD Biosciences). Targeted enrichment of BCR::ABL1 transcripts was conducted using the RoCK and ROI-seq approach, based on the barcoded bead technology of the BD Rhapsody platform. In the RoCK and ROI-seq workflow, the template-switch oligonucleotides (TSOs) attached to BD Rhapsody beads were modified with one or multiple capture sequences specific to transcripts of interest, enabling targeted transcript capture during reverse transcription. ROI primers are added during library preparation to direct amplification and sequencing toward these defined regions of interest (ROIs). Resulting libraries were sequenced on an Illumina sequencer.	Illumina NovaSeq 6000	1
EGAD50000001978	This dataset consists of 71 whole-genome samples and 47 RNA-seq samples. All WGS samples are provided in .bam format and RNA-seq samples are provided as .fq.gz (paired-end). Next-generation sequencing data on whole genomes were processed by aligning sequencing reads to a reference genome (NCBI build 37/hg19) using bwa-mem (0.7.13-r1126).	unspecified	75
EGAD50000001979	These files contain the Patch-seq data presented in Bouwen et al., Nat Comm 2025. Human brain samples were gathered during surgery, and neurons were isolated using a patch-clamp glass pipette. Isolated cells were processed using the Smart-seq2 pipeline and sequenced using an Illumina HiSeq 2500. Files are in fastq format. There are a total of 12 patient samples, each with 2 replication runs.	Illumina HiSeq 2500	12
EGAD50000001981	This dataset is made of 23 pairs of RNAseq and 36 pairs of Exome-seq, both sequenced from Illumina Novaseq. This includes 47 samples, composed of 25 tumors, 13 healthy and 9 xenograft	Illumina NovaSeq 6000	47
EGAD50000001993	This dataset consists of transcriptome data available for most cases of the whole-genome sequences from 1,364 Korean breast cancers	Illumina NovaSeq 6000	1208
EGAD50000001994	This dataset consists of whole-genome sequences from 1,364 Korean breast cancers	Illumina NovaSeq 6000	2726
EGAD50000002007	FASTQ files from scRNAseq (smartseq2 protocol) from 2485 CD8 T cells from 4 non-muscle-invasive bladder cancer patients who received BCG treatment (induction treatment of 6 weekly instillations). Concerning the 'phenotype' column in metadata: pre-treatment samples include tumor samples ('Tumor') and normal adjacent bladder tissue ('Extratumor'). For post-treatment samples: no residual tumor was found in the bladder of the 4 patients after the BCG treatment. The post-treatment samples with 'Tumor' as phenotype denote samples from macroscopically suspect zones, but with microscopically no tumor detected.	Illumina HiSeq X	2485
EGAD50000002008	WES (FASTQ files) of paired tumor and PBMC samples of non-muscle-invasive bladder cancer patients. All CTRL samples originate from PBMC. For patient UC2, DNA from 3 different synchronous tumors was extracted. DNA from tumors was extracted from fresh frozen tumors, except for patient UC4 (FFPE). NB: patient aliases are the following: ROXY: UC1, OMAD: UC2, MAKI: UC3, HAPE: UC4	unspecified	10
EGAD50000002009	RNAseq (FASTQ files) of 5 tumors from 3 non-muscle-invasive bladder cancer patients. For patient UC2, RNA from 3 different synchronous tumors was extracted. Relating to the other datasets from the project: no RNA was available from patient UC4. NB: patient aliases are the following: ROXY: UC1, OMAD: UC2, MAKI: UC3, HAPE: UC4	unspecified	5
EGAD50000002010	We performed bulk RNA-seq of 10 and single cell RNAseq of 2 intestinal metaplasia samples. We also performed stereoseq spatial transcriptomic analysis of 4 gastric cancer samples. The dataset contains cram files from 10 bulk RNAseq samples, and raw fastq files from 2 single cell RNAseq and 4 stereo-seq samples.	DNBSEQ-T7 Illumina NovaSeq 6000	16
EGAD50000002012	Anti-Her2-CAR T cells produced from T cells derived from healthy human donors were treated with DMSO (control) or either of the immunomodulatory metabolites indole‐3‐carboxaldehyde (ICA), valeric acid (VA) and isovaleric acid (IVA) to assess effects on CAR T cell function. scRNA-seq libraries were generated on a 10X Chromium controller with On-Chip-Multiplexing. The individual samples in a library are distinguished during analysis on the single-cell level via the cell barcode class.	NextSeq 2000	4
EGAD50000002013	The dataset includes FASTQ and BAM files from diagnostic and matched remission (control) samples of one T-ALL patient (ALLT-317). Relapsed (BM ALLT-317L and Tonsil ALLT-3107L) sample files are in dataset set2 of this submission. Sequencing library was made using PCR free-based WGS library preparation (PCR-), using TruSeq DNA PCRfree library preparation kit v1 (PE150). Both diagnostic (60x) and remission (30x) samples were sequenced using HiSeq X.	Illumina HiSeq X	2
EGAD50000002014	The dataset includes FASTQ and CRAM files from diagnostic and matched remission (control) samples of one T-ALL patient (ALLK-118), and relapse (from BM and tonsil) and matched remission (control) samples of another T-ALL sample (ALLT-317, tonsil relapse ALLT-3107L). Diagnostic sample of ALLT-317 in the first dataset in this submission. Sequencing library was made using PCR free-based WGS library preparation (PCR-), using NEBNext® Ultra™ II DNA Library Prep Kit (Cat No. E7645). Both diagnostic/relapse (90x) and remission (30x) samples were sequenced using Illumina Novaseq 6000.	Illumina HiSeq X Illumina NovaSeq 6000	5
EGAD50000002015	The dataset includes FASTQ and CRAM files from diagnostic and matched remission (control) samples of two T-ALL patients (ALLK-126, ALLT-503). Patient ALLT-503 has two diagnostic samples, one extracted from the BM (ALLT-503D) and one from the lymph node (ALLT-503lympD). Sequencing library was made using PCR free-based WGS library preparation (PCR-), using Novogene NGS DNA Library Prep Set (Cat No. PT004). Both diagnostic/relapse (90x) and remission (30x) samples were sequenced using Illumina NovaseqX Plus.	Illumina NovaSeq 6000 Illumina NovaSeq X Plus	5
EGAD50000002018	Peripheral blood mononuclear cells were used to produce paired, full-length IGM, IGK and IGL V(D)J libraries from HA-binding single memory B cells using the ISCAPE (Individualized Single Cell Analysis of Paired Expressed antigen receptors) technique. Locus-specific multiplexed nested PCR with primer sets targeting the 5’ UTR/leader and 3’ constant region was performed before well and plate indexing and paired chain sequencing with an Illumina MiSeq instrument. Expressed total IGM, IGK and IGL V(D)J libraries from each donor were acquired and analyzed to define personalized IG allele content. The fastq file names include the donor and target for expressed IGM, IGK or IGL V(D)J libraries.	Illumina MiSeq	4
EGAD50000002019	Peripheral blood mononuclear cells were used to produce paired, full-length IGM, IGK and IGL V(D)J libraries from HA-binding single memory B cells using the ISCAPE (Individualized Single Cell Analysis of Paired Expressed antigen receptors) technique. Locus-specific multiplexed nested PCR with primer sets targeting the 5’ UTR/leader and 3’ constant region was performed before well and plate indexing and paired chain sequencing with an Illumina MiSeq instrument. Expressed total IGM, IGK and IGL V(D)J libraries from each donor were acquired and analyzed to define personalized IG allele content. The fastq file names include the donor and a numerical plate index for the HA-specific ISCAPE BCR HC and LC V(D)J libraries.	Illumina MiSeq	4
EGAD50000002021	Single-cell RNA-seq analysis of human iPSC-derived microglia/astrocytes/neurons in a novel, human in vitro 3D cortical tissue model, as well as 2D co-cultures for comparison. Samples include distinct conditions as described in the Study section. The dataset was generated using the BD Rhapsody technology and sequenced paired-end with Illumina NovaSeq2000. Information on the multiplexing of samples is reported in the Analyses section.	Illumina NovaSeq 6000	4
EGAD50000002022	A selection of samples was analyzed using single-cell RNA sequencing to explore cellular heterogeneity. 4 inflammatory diseases were chosen for this purpose: Still’s disease, recurrent pericarditis, familial Mediterranean fever (FMF), and chronic non-bacterial osteomyelitis (CNO).	Illumina NovaSeq X	36
EGAD50000002023	The RNA was extracted using the AS1390 PROMEGA kit according to the manufacturer’s instructions. The samples underwent sequencing using a Novaseq 6000 machine, equipped with an SP 200 cycles v1.5 cartridge. The sequencing process varied slightly across batches based on the input RNA and the library preparation kits used.	Illumina NovaSeq 6000	433
EGAD50000002024	A genome-wide approach based on next generation sequencing (NGS) is performed on trios (Patient + Father + Mother) with SAID in order to identify known and new gene variants associated with certain forms of the disease. WGS was performed on frozen PBMCs. DNA was extracted using the QIAamp DNA Blood Midi Kit and quantified with Nanodrop. Sequencing was done on a NovaSeq 6000 with S4 flow cells, targeting 30× coverage.	Illumina NovaSeq 6000	75
EGAD50000002025	The purpose is to perform miRNA analysis on FACS-sorted target immune cell populations to uncover novel potential miRNA diagnostic markers and biomarkers predictive of therapy response, and possibly to identify perturbed signaling pathways or miRNA targets for intervention with agomirs/antagomirs.	Illumina NovaSeq 6000	394
EGAD50000002026	A genome-wide approach based on next generation sequencing (NGS) is performed on trios (Patient + Father + Mother) with SAID in order to identify known and new gene variants associated with certain forms of the disease. Whole-genome sequencing (WGS) was performed on frozen PBMCs. DNA was extracted using Chemagic Prime (Perkin Elmer) and quantified with Nanodrop. The raw sequencing data were generated on the NovaSeq 6000 using an S4 flow cell, with a sequencing depth of ~30X.	Illumina NovaSeq 6000	141
EGAD50000002027	16S-sequencing: All stool samples were processed according to the standards of the Flemish Gut Flora Project (Falony et al., 2016). All samples were processed together and sequenced as a single batch (dual index run DI23R22) which included positive controls (PC, identical stool sample), negative controls (PCR-mix) and ‘blanks’ or negative controls of the extraction (no sample). In addition, a Runella strain (RS) was included to check for potential cross-contamination, as this strain is not present in human samples. The V4 region of the 16S rRNA gene was amplified with the primer pair 515F and 806R (GTGYCAGCMGCCGCGGTAA and GGACTACNVGGGTWTCTAAT, resp.), modified to contain a barcode sequence between each primer and the Illumina adaptor sequences to produce dual-barcoded libraries (Tito et al., 2019). Sequencing was performed on the Illumina MiSeq. Sequences were processed using the LotuS (Hildebrand et al., 2014) and DADA2 (Callahan et al., 2016) pipelines using RDP 16 as the reference taxonomy.	Illumina MiSeq	265
EGAD50000002029	This study consists of genomic sequencing data obtained through targeted sequencing of regulatory regions (85,394 human cCREs from ENCODE v2, version 2, https://screenv2.wenglab.org/) in 200 ASD (Autism Spectrum Disorder) trios (father and mother unaffected and proband affected). The aim of the project is to detect genomic variants (rare and inherited variants) in regulatory regions that contribute to autism.	Illumina NovaSeq 6000	600
EGAD50000002031	Single-cell RNA-seq analysis of human iPSC-derived microglia/astrocytes/neurons in a novel, human in vitro 3D cortical tissue model. Samples include distinct conditions (each as biological duplicates): - wild-type cells from cultures after 1 month (WT_1mo, cell lines: A18, KOLF) and 3 months (WT_3mo, cell lines: SA2, KOLF, A18) of culturing - knock-in cells from cultures carrying three familial AD mutations in the APP gene (Swedish, Iberian, Arctic; introduced by CRISPR/Cas9-mediated knock-in) after 3 months of culturing (KI_3mo, cell lines: SA2, A18) - subset of WT 1 month-old samples were exposed to 500nM synthetic Abeta42 for 2 consecutive feeds, and a subset of these subsequently treated with 1µg/ml Aducanumab antibody for 14 days	Illumina NovaSeq X	4
EGAD50000002032	This dataset contains error-corrected next-generation sequencing (ecNGS) data generated from 161 rectal mucus samples. These samples belong to colorectal cancer, adenoma, and control participants. Genomic DNA was extracted and processed using a hybrid-capture panel targeting 50 colorectal cancer–associated genes, incorporating unique molecular identifiers (UMIs) for duplex consensus error correction. Libraries were sequenced on an Illumina NovaSeq 6000 (paired-end 150 bp) producing paired-end FASTQ files. These data support analyses of DNA mutations in CRC detection and progression.	Illumina NovaSeq 6000	161
EGAD50000002034	This dataset contains 10 paired fastq files sequenced on Illumina NextSeq 550 sequencer. Tumors were rapidly dissociated after the surgical procedure using the Miltenyi Biotec Human Tumor Dissociation kit (cat# 130-095-929). Libraries were constructed using the VDJ NextGEM v1.1 10x Genomics Chromium kit. Single cell RNA-seq analysis was performed using Seurat R package.	NextSeq 550	10
EGAD50000002035	This dataset contains raw 10x Genomics single-cell RNA sequencing data from a total of eight runs of circulating tumor cells from three small cell lung cancer patients. Each run represented by four FASTQ files (R1, R2, I1, I2) generated using the Illumina NextSeq 2000 platform. These files include read pairs as well as sample and cellular barcode indices. Circulating tumor cells were enriched from the blood of three SCLC patients using the CTC-iChip followed by magnetic depletion of RBCs. The enriched samples were processed with the 10x Genomics Chromium platform (Chromium GEM-X Single Cell 3' Kit v4) and sequenced on a NextSeq 2000 system.	NextSeq 2000	3
EGAD50000002036	Deposited here are 56 whole-genome sequencing BAM files from 15 Trp53f/w, 12 Trp53f/f, 14 Brca1f/f Trp53f/w, and 15 Brca1f/f Trp53f/f mouse mammary tumor samples sequenced on Illumina NovaSeq PE150. An additional 6 whole genome sequencing BAM files from corresponding normal tissues of mice are also included as controls (BPC_D666, B1PC_D840, BPTC_D992, PC_D503, PC_D720_Kid, PTC_H531). All samples were used in the study “Insights into BRCA1 and TP53 associated breast cancer development from integrated whole genome analysis of mouse model mammary tumors, Huo et al., 2026.”	Illumina NovaSeq 6000	62
EGAD50000002037	Genomic DNA was extracted from PBMCs and whole blood using the Chemagic Prime DNA Blood 4k kit. DNA quality and integrity were assessed by Qubit and Tapestation, and only samples with sufficient quantity, DIN > 6, and confirmed sex were used. Whole-exome libraries were prepared with the SureSelect XT Human All Exon V8 kit (Agilent), covering all human protein-coding regions. DNA was fragmented, adapter-ligated, captured, and amplified following the manufacturer’s protocol. Libraries were quantified, pooled (95 samples), and sequenced on an Illumina NovaSeq 6000 using S4 flow cells, generating up to 12 billion paired-end reads per run.	Illumina NovaSeq 6000	258
EGAD50000002038	This dataset comprises bulk and single-nuclei multiomic profiles from patients with ER+/HER2- early breast cancer enrolled in the NeoRHEA phase 2 trial. The bulk RNA-seq component includes 78 pre-treatment and 49 post-treatment tumor samples from 97 patients. The single-nuclei component includes matched pre- and post-treatment samples from 37 patients, totaling 63 samples analyzed by simultaneous snRNA-seq and snATAC-seq. Combined, the data contains bulk transcriptional profiles and single-nuclei gene expression/chromatin accessibility matrices for over 226,000 high-quality nuclei, encompassing tumor, immune, and stromal cells. Technologies used include Illumina NovaSeq 6000 for sequencing and the 10x Genomics Chromium platform (Multiome kit for single-nuclei, standard RNA library prep for bulk).	Illumina NovaSeq 6000	130
EGAD50000002041	The dataset includes more than 130k nuclei isolated using 10x Genomics Chromium Next GEM Single Cell 3' Reagent Kits v3.1. In total, 30 samples are included from 10 controls, 8 MSA (including one sample that was omitted during QC), or 12 PD patients. Samples were sequenced on Illumina Novaseq 6000 or NextSeq 550 instruments.	Illumina NovaSeq 6000 NextSeq 500	30
EGAD50000002043	The dataset contains BAM files of whole genome sequencing data of African Khoe-San population (n=169, 30X coverage). DNA extracted from blood underwent 2 × 150 bp sequencing on the Illumina HiSeq X instrument (Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research). Raw sequencing reads were aligned against human reference hg38 including alternative contigs using bwa (v.0.7.15). The Genome Analysis Toolkit (GATK, v.3.5-0) was used for duplicate marking, indel realignment and base quality recalibration.	HiSeq X Ten	169
EGAD50000002044	This dataset includes bulk RNA-sequencing data from HEK293-L cells subjected to siRNA-mediated knockdown of Dicer and/or ADAR1. In total, 12 samples are included, representing four experimental groups with triplicate biological replicates. RNA was extracted using TRIzol according to standard protocols and sequenced on an Illumina NovaSeq 6000.	Illumina NovaSeq 6000	12
EGAD50000002045	This dataset is composed of RNA-Seq samples: 49 controls which are probands without snRNA rare variants and 19 affected probands with either monoallelic or biallelic variants in RNU2-2. RNA was extracted from short-term lymphocytes cultures. RNA-Seq libraries were obtained by hybrid capture of fragmented cDNA with RNA probes from exome v8 (Agilent)	NextSeq 550	68
EGAD50000002047	In order to perform comprehensive transcriptomics and gene fusion analyses of a cohort of BCP-LBL and extramedullary (EM) manifestations of B-ALL patients, we performed RNA sequencing of 16 tissue samples from BCP-LBL and EM B-ALL patients. Because the material was available as FFPE and was of relatively low quality, we used a capture-based approach, where NGS libraries obtained from total RNA were captured in 4-plex, using a whole exome capture panel.	Illumina NovaSeq 6000	68
EGAD50000002051	The dataset contains 424 paired-end raw fastq files of 224 plasma cfDNA samples from healthy donors and cancer (Ewing Sarcoma, CIC) patients. WGS and methylation binding domain sequencing were performed at 2 × 150 bp on a NovaSeq 6000 or NextSeq 500. 200 samples underwent both WGS and methylation binding domain sequencing. 24 samples underwent only WGS.	Illumina NovaSeq 6000	224
EGAD50000002053	The dataset consists of 2 batches of human glial progenitors (iPS); cells were sequenced as controls and IFN treated. Using the 10x Genomics platform, 2 libraries were built from the samples: a gene expression library (scRNA-seq) and a captured guides sequencing library (gRNA) on a set of Multiple Sclerosis associated SNPs. Batch 1 includes KRAB and P300 perturbed cells in 2 rounds of sequenced RNA libraries for Ctrl and IFN and 1 round of captured guides for Ctrl and IFN. Batch 2 includes 1 round of both RNA expression library and captured guides for KRAB and P300 perturbations in Ctrl and IFN treated cells.	Illumina HiSeq X Illumina NovaSeq X	18
EGAD50000002054	The dataset consists of 12 samples of nuclei from hiPSCs differentiated using optimized protocols for generating dopaminergic neurons: 6 single nuclei (RNA-ATAC) samples from the XMAS protocol, with cells collected at days 0, 11, 16, 28, 42 and 56 after differentiation; 3 snRNA-seq samples from the AMS protocol, with cells collected at days 14, 16 and 28 after differentiation; and 3 snRNA-seq samples from cells differentiated with the AMS protocol until day 14, then grafted, with grafts processed 5 months after grafting.	Illumina NovaSeq 6000 Illumina NovaSeq X	12
EGAD50000002055	The dataset contains 4 .bam whole transcriptome files that were generated from 2 tumor samples (for each sample, one .bam file was created using exome-capture (hybrid capture) and the other .bam file was created using ribo-depletion (inverse mRNA selection) library preparation technology). The dataset also contains 2 somatic mutation (.vcf) files (one per sample), which serve as input for the Vardetector package (vardetector/vardetector at main · julijselb/vardetector) to produce somatic coverage over those mutations from the .bam file. Therefore, the dataset also contains 4 .csv files (number of reads per somatic mutation that contain the somatic mutation variant and the ones that do not contain the somatic mutation variant) that serve as outputs of the Vardetector package.	Illumina NovaSeq X	4
EGAD50000002056	This dataset contains whole-genome sequencing data of three patients with metastatic salivary gland cancer who underwent autopsy to harvest tumor biopsies from spatially separated metastatic sites and biopsies from healthy skin as a control. Two patients had the subtype adenoid cystic carcinoma (patients 1 and 4) and one patient had the subtype myoepithelial carcinoma.	Illumina NovaSeq 6000	20
EGAD50000002057	Tissue samples were collected from 18 ovarian cancer patients. After mechanical and enzymatic dissociation, a part of these tissues underwent growth factoromics analysis. For four cases, single-cell sequencing analysis was conducted under conditions with or without estradiol. Additionally, the remains of the tissues were frozen and subsequently subjected to exome sequencing on 16 of these samples. Sixteen cases were diagnosed as HGSOC, while two samples were clear cell or mucinous type (Extended Data Table 1). All cases were grade 3, and none had received neoadjuvant chemotherapy.	Illumina NovaSeq 6000 Illumina NovaSeq X	25
EGAD50000002058	scRNAseq of ovarian cancer patient derived organoids with or without estradiol	unspecified	3
EGAD50000002059	bulkRNAseq of four ovarian cancer cell lines (PA-1, A2780, A2780/CP20, TOV-21G)	unspecified	5
EGAD50000002060	Bulk RNA sequencing was performed on mCRC organoids subjected to cetuximab treatment, comparing ATOH1 knockout and control conditions. The dataset is intended to elucidate the contribution of ATOH1-regulated secretory cell population to cetuximab persistence mechanisms.	NextSeq 500	18
EGAD50000002061	Bulk RNA sequencing was performed on mCRC organoids cultured under untreated conditions or treated with crenigacestat alone or in combination with cetuximab. This dataset was generated to investigate the transcriptional changes associated with combination therapy and to compare these profiles with the adaptive responses observed following cetuximab monotherapy in the corresponding ATOH1 wild-type and knockout organoid models.	Illumina NovaSeq X	9
EGAD50000002062	This dataset contains fastq files sequenced with Oxford Nanopore (MK1C).	MinION	9
EGAD50000002063	Epigenetic clock measures for a subset of 1,870 participants within the NICOLA cohort. The following epigenetic clocks were generated: (±PC)PhenoAge, (±PC)GrimAge, (±PC)Horvath1, (±PC)Hannum, PCDNAmTL and DunedinPACE. +PC represents the use of the principal component adjusted version of the clock.		1870
EGAD50000002064	High-grade serous ovarian cancer presents significant challenges due to its poor prognosis and high heterogeneity, both of which complicate treatment responses. This project aims to understand intra-patient tumor evolution by investigating different sampling sites (primary and metastatic) at the time of diagnosis and during disease recurrence. A total of 183 biopsies from 50 patients were collected for this purpose, and bulk mRNA sequencing was performed. The majority of samples originated from following tissue types: omentum, ovary, and ascites.	Illumina NovaSeq 6000	183
EGAD50000002066	Single-cell genotype-to-phenotype (scG2P) of normal esophageal epithelium, based on targeted DNA genotyping of mutation hot spots and targeted RNA capture.	Illumina NovaSeq 6000	6
EGAD50000002068	This dataset contains sequencing data of human urinary cell-free DNA. Whole-genome sequencing (WGS) and whole-genome bisulfite sequencing (WGBS) were performed after DNA extraction from the urine samples. All samples underwent paired-end sequencing on an Illumina system and were aligned to human genome GRCh37 (hg19). WGS and WGBS reads were aligned by SOAP2 and Methyl-Pipe, respectively. The dataset includes samples from healthy subjects, subjects with haematuria (non-cancer), and patients with bladder, kidney, or prostate cancer. Paired-end FASTQ files are provided for all the samples mentioned above.	NextSeq 2000	366
EGAD50000002071	Deposited here are whole-genome sequencing data for 51 paired breast cancer DCIS and matched-normal samples taken from the same individual. Average sequence depth is 110x for DCIS samples and 39x for matched-normals. Matched-normals are from blood (n=14) or adjacent normal breast (n=37). Sequencing was performed on Illumina HiSeq X (n=70) or Illumina NovaSeq (n=32). They are a part of a broader project that uses high-depth WGS to investigate the somatic mutation genomic landscape of DCIS in order to uncover biological insights into breast cancer progression as well as possible methods to stratify DCIS patients for individualised therapy or disease monitoring.	Illumina HiSeq X	102
EGAD50000002072	The dataset contains five samples from P1, one pre-HSCT and four post-HSCT, one sample from P2 and 5-6 HD. RNAseq reads were processed using the LUMC BIOWDL RNAseq pipeline v5.0.0 (https://github.com/biowdl/RNA-seq). Genes with log2CPM≥1 in at least 10% of all samples were retained for downstream analysis. Differential gene expression analysis was performed using the dgeAnalysis R-Shiny application (https://github.com/LUMC/dgeAnalysis/tree/v1.4.4). ATACseq reads were processed using the LUMC BIOWDL Chipseq pipeline 1.0.0-dev (https://github.com/biowdl/ChIP-seq).	Illumina NovaSeq 6000	12
EGAD50000002078	This dataset contains paired-end RNA-Seq data generated from mid-turbinate nasal and throat samples collected as part of a first-in-human SARS-CoV-2 experimental challenge study conducted in the United Kingdom in 2021. Total RNA was extracted using QIAamp Viral RNA kits (Qiagen), and libraries were prepared using TruSeq RNA Exome kits (Illumina). Sequencing was performed on a NovaSeq X Plus instrument configured for 100 bp paired-end reads. The dataset includes raw FASTQ files and associated de-identified metadata.	Illumina NovaSeq X	602
EGAD50000002080	Samples taken from various regions of three human glioblastomas were subject to single cell RNA sequencing.	Illumina NovaSeq 6000	9
EGAD50000002081	RNA sequencing of organoids derived from human glioblastoma subject to hypoxia, plasma, or a combination of both for 72h, sampled at 0, 24, 72, and 144h.	Illumina NovaSeq 6000	55
EGAD50000002082	The Manchester Eye Tissue Repository Genome-Transcriptome (METR-GT) Project comprises whole genome sequencing data and paired short read, high-depth bulk-RNA sequencing data from the neurosensory retina (NSR) and the retinal pigment epithelium (RPE) from 201 unrelated individuals that donated eye-tissue post-mortem. None of the individuals included in the cohort had phenotypic presentation, assessed post-mortem, consistent with late-stage AMD or monogenic ophthalmic disorders. In this dataset we include WGS fastq files, WGS VCF files (at an individual level and an aggregate VCF that has undergone variant-level quality control), RNASeq BAM files of the NSR and RPE, gene expression quantification files of the NSR and RPE (including raw gene counts generated by RNASeQC and estimated TPM and FPKM values generated by RSEM). Additionally, we have included the output files from eQTL mapping in both tissues, generated by tensorQTL and instances of aberrant gene expression events generated by the OUTRIDER module of the DROP pipeline.	Illumina NovaSeq 6000	201
EGAD50000002083	This dataset contains the standardized clinical information related to the Use case 1 - Liver Cancer (HCC) from EuCanImage. The Data is Synthetic.		100
EGAD50000002084	In this study, we characterise PGCCs in ten pleomorphic sarcomas and use topographic single-cell DNA sequencing (scDNA-seq) to investigate their genomic landscape. We selected PGCCs based on their nuclear morphology, including mononucleated or multinucleated bizarre, misshapen nuclei, and analysed them at single-cell resolution. Histopathological analysis showed that PGCCs were often randomly distributed throughout the tumour and did not appear in clusters, suggesting that they arise de novo rather than through clonal expansion. scDNA-seq revealed that PGCCs originate from the dominant tumour population and exhibit extensive copy number heterogeneity, either due to subsequent or ongoing chromosomal instability. Both clonal and subclonal chromothripsis-like events were identified in PGCCs, indicating that chromothripsis is a key driver of heterogeneity in these cells and is linked to multinucleation rather than mononuclear PGCC formation.	Illumina HiSeq 4000	96
EGAD50000002088	The dataset consists of germline whole exome sequencing data generated on the Illumina NovaSeq X Plus platform for 48 patients with MSI-high/ MMRd gastro-intestinal and genito-urinary cancers, in order to detect Lynch syndrome.	Illumina NovaSeq X Plus	48
EGAD50000002089	Targeted tissue (n=81), cfDNA (n=319), and WBC (n=208) Illumina sequencing of metastatic urothelial carcinoma patients. All cfDNA is collected in the metastatic setting, while archival tissue is from across the disease course. White blood cell (WBC) samples are used as patient matched germline samples. All files are in paired-end fastq format.	Illumina NovaSeq X	608
EGAD50000002093	Associated with Molecular profiling of HGBCL-DH-BCL2 patients treated in the HOVON-152 trial. Targeted sequencing of 30 samples from 30 High-grade B-cell lymphoma patients. Targeted sequencing was used to identify genomic alterations associated with patient outcome.	Illumina NovaSeq X	30
EGAD50000002094	Associated with Molecular profiling of HGBCL-DH-BCL2 patients treated in the HOVON-152 trial. Targeted sequencing of 30 samples from 30 High-grade B-cell lymphoma patients. Targeted sequencing was used to identify genomic alterations associated with patient outcome.	Illumina HiSeq 4000	30
EGAD50000002095	This dataset contains bulk RNA-seq data from 15 patients who underwent rotator cuff repair surgery, yielding paired proximal and distal samples of the long head of the biceps tendon (30 samples total). The submission includes 90 files, consisting of R1 and R2 FASTQ files for each sample as well as processed .TXT files containing gene-level counts generated after genome alignment. Raw sequencing reads were generated on an Illumina platform, aligned to the human reference genome GRCh38.p13, and quantified at the gene level.	Illumina NovaSeq 6000	90
EGAD50000002100	Paired-end whole-exome sequencing data derived from Illumina hybrid-capture experiments (Agilent SureSelect Human All Exon kits). Six human probands with rare diseases; one sample per subject. Sequencing reads from each individual were processed into analysis-ready gVCF files following standard best-practice pipelines, including alignment, duplicate marking, base quality calibration, and variant calling. All gVCF files are encrypted using Crypt4GH and are available under controlled access through our DAC in accordance with a strict DUO-based policy.	Illumina HiSeq 2500	6
EGAD50000002101	Raw fastq files for 10x Chromium snRNA-seq of pineal parenchymal tumors profiled in the study "A tumor-associated photoreceptor signature unifies distinct central nervous system malignancies".	Illumina NovaSeq 6000	33
EGAD50000002103	Tumor implantation and expansion were performed in two 6-week-old NOD/SCID mice. Once tumors reached an average volume of approximately 700 mm³, mice were randomly assigned to one of two treatment arms: vehicle (physiological saline, intraperitoneally, twice weekly) or cetuximab (20 mg/kg, intraperitoneally, twice weekly). After 1 week of treatment, mice were sacrificed, and tumors were harvested and preserved in sCelLiVE® Buffer for downstream single-cell RNA sequencing analysis.	unspecified	4
EGAD50000002104	Organoids were seeded in 12-well plates at a density of 5 × 10⁴ cells per well, embedded in 200 µL of Matrigel diluted 1:1 with culture medium, and maintained under EGF-deprived conditions. After four days of growth, organoids were either left untreated or treated with cetuximab (20 µg/mL). Following 72 hours of treatment, organoids were harvested, dissociated, and live single cells were sorted for single-cell RNA sequencing using the 10x Genomics platform.	unspecified	8
EGAD50000002110	This dataset contains single-cell RNA sequencing data derived from freshly dissociated human omental tissue. The study investigates the transcriptomic landscape of the omentum by comparing samples from patients with benign conditions against those with ovarian cancer metastasis. To capture the heterogeneity of the metastatic niche, sampling was performed across distinct anatomical sites, including non-metastasized tissue, distant omental regions, peritumoral zones, and the tumor core. This comprehensive profiling aims to define the cellular remodeling associated with malignant transformation.	Illumina NovaSeq 6000	36
EGAD50000002111	BAM files containing PacBio HiFi reads from carriers of ring and marker chromosomes. The reads were generated using the PacBio Revio platform. Each individual was sequenced to roughly 30X coverage on one flow cell per individual. The chromosome of interest is indicated in the file name.	unspecified	10
EGAD50000002112	The dataset contains 46 samples: a germline and tumour sample from each of 23 donors. Donors are men diagnosed with prostate cancer at a young age, from the Bob Champion Cancer Trust. The data consists of 92 whole exome sequenced paired FASTQ files, compressed with bgzip. Exome sequences were enriched using Agilent SureSelect XT2 Human All Exon V5 baits, and sequenced with Illumina HiSeq 2500.	Illumina HiSeq 2500	45
EGAD50000002113	CRISPR perturbations of IRF4, PRDM1, and AAVS1 (within PPP1R12C gene) in activated B cells	Illumina NovaSeq 6000	96
EGAD50000002114	We developed a B cell stimulation protocol that allowed us to expand a smaller number of naive B cells in culture (2,000 B cells at day 0). On day 6, we performed single-cell RNA- and BCR-sequencing on the entire culture, preserving both clonal diversity and size.	Illumina NovaSeq 6000	32
EGAD50000002115	We purified human naive and memory B cells from PBMCs. We stimulated them with IgM cross-linker providing BCR stimulation, multimeric CD40L that engages costimulatory receptor CD40 on the B cell, and cytokines IL-2 and IL-21. To comprehensively characterise activation dynamics of naive and memory B cells, we performed single-cell RNA sequencing (scRNA-seq) and B cell receptor sequencing (BCR-seq) in resting cells (day 0), before the first division (day 1), at the time of the first division (day 3) and following the full B cell differentiation (day 6)	Illumina NovaSeq 6000	168
EGAD50000002117	This dataset contains raw paired-end FASTQ files from targeted DNA sequencing using the TruSight Oncology 500 panel in patients with glioblastoma, IDH wildtype, treated with surgery, standard chemoradiotherapy and Tumor Treating Fields at Charité – Universitätsmedizin Berlin. Tumour samples were collected at first surgery and are linked to pseudonymised clinical and molecular annotations described in the associated EGA study.	NextSeq 550	63
EGAD50000002120	The dataset contains transcriptome sequencing (RNA-seq) data of 704 soft tissue tumors (STT) from 704 patients. The sequencing data comes in FASTQ format and contains 1408 files. A total of 56 different STT types were included in the present study and each tumor type was represented by >2 samples. The vast majority of the patients had been diagnosed and treated at the sarcoma centers in Lund (Sweden) or Stockholm (Sweden) during the period 1988-2020, but also a few samples from other centers, obtained through previous collaborative projects, were included. RNA was extracted from fresh frozen tumor material and samples were primary lesions unless otherwise indicated. The sequencing was performed using 2x150 bp paired-end chemistry on a series of Illumina instruments at various facilities during the period 2013-2023. Samples were obtained after informed consent from the patients and the study was approved by the regional ethics committee (Etikprövningsmyndigheten, Uppsala, dnr 2023-01550-01, dnr 2025-02997-02).	unspecified	704
EGAD50000002121	This dataset includes exome-captured RNA-Seq made with exome v8 (Agilent) - library preparation on a MAGNIS XT-HS2 starting from 100ng of RNA. RNA has been extracted from lymphocyte cultures from 2 probands with SF3B1 mutations and 14 controls. All 16 RNA-Seq were sequenced on a single HighOutput flowcell 2x75bp (NextSeq550). Raw fastq files have between 20 and 30M paired-reads.	NextSeq 550	16
EGAD50000002122	This dataset contains spatial transcriptomics sequencing data from 61 hepatocellular carcinoma (HCC) tissue samples profiled using the 10x Genomics Visium platform. The cohort includes 38 resected tumors and 23 tumors from patients treated with atezolizumab plus bevacizumab. Sequencing data are provided as paired-end FASTQ files. These data support the analysis of intra-tumoral heterogeneity and its association with clinical outcomes.	NextSeq 2000	49
EGAD50000002123	This RNAseq dataset contains full transcriptome data on n=77 pure DCIS samples. The RNA is derived from fresh-frozen DCIS samples, obtained from two hospital based cohorts (NKI cohort and Oslo cohort). The libraries were sequenced with 54 bp paired end reads using a Novaseq instrument (Illumina). We share fastq data files of paired end sequencing (two files per sample).	Illumina NovaSeq 6000	77
EGAD50000002124	Paired 10X 3' single-nucleus RNA sequencing with PacBio Kinnex long-read sequencing in 7 control and 8 Alzheimer's disease human frontal cortex samples.	Illumina NovaSeq 6000 Sequel II	15
EGAD50000002125	TIIC dataset generated by BCAST project in BCAC from tumor cores from 12,285 breast cancer patients. Includes automated scores (multiply imputed) for CD8, CD20, CD163 and FOXP3 in all tissue, stromal and tumour tissue separately. Linked clinical patient data.		12285
EGAD50000002126	The dataset contains cfMeDIP-sequencing data (bam files) for 18 plasma samples from pleural mesothelioma patients. The libraries were prepared according to the published protocol (Shen, S. Y., Burgener, J. M., Bratman, S. V. & De Carvalho, D. D. Preparation of cfMeDIP-seq libraries for methylome profiling of plasma cell-free DNA. Nat. Protoc. 14, 2749–2780 (2019)). Whole genome sequencing (WGS) was performed for pre-made libraries on NovaSeq X Plus with a 150 paired-end strategy. Low-quality reads and adapters were removed with Trim Galore version 0.6.6 (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore). The trimmed reads were aligned to hg38 with Bowtie2 version 2.3.4.3. SAMTools version 1.9 was used to convert the SAM alignment files to BAM files, sort and index reads, and remove duplicates.	Illumina NovaSeq X	18
EGAD50000002127	The dataset contains WES data (bam files) for 42 pleural mesothelioma samples. Nextera Flex for Enrichment solution (Illumina, San Diego, CA) in combination with SureSelect Human All Exon V7 probes (Agilent, Santa Clara, CA) was used for library preparation and generated libraries were sequenced on NovaSeq 6000 (Illumina, San Diego, CA) in 150 paired-end mode. Quality control of the raw data was conducted using fastQC (v. 0.11.8) (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). WES data were then analyzed with the DRAGEN Bio IT platform 49 (version 4.2.4) and the human GRCh38 as reference genome.	Illumina NovaSeq 6000	42
EGAD50000002128	The dataset contains WES data (bam files) for 45 pleural mesothelioma samples. Nextera Flex for Enrichment solution (Illumina, San Diego, CA) in combination with SureSelect Human All Exon V7 probes (Agilent, Santa Clara, CA) was used for library preparation and generated libraries were sequenced on NovaSeq 6000 (Illumina, San Diego, CA) in 150 paired-end mode. Quality control of the raw data was conducted using fastQC (v. 0.11.8) (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). WES data were then analyzed with the DRAGEN Bio IT platform 49 (version 4.2.4) and the human GRCh38 as reference genome.	Illumina NovaSeq 6000	90
EGAD50000002129	The dataset contains RRBS data (bam files) for 58 pleural mesothelioma samples. The samples were sequenced on NovaSeq6000 (Illumina, San Diego, CA) in paired-end 150bp mode. RRBS raw reads were trimmed for adaptor sequences using trim galore (v. 0.6.5) (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) and filtered for low-quality sequences using fastQC (v. 0.11.8) (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). High quality trimmed reads were mapped to the Human reference genome (UCSC genome assembly GRCh38/hg38) using Bismark (v. 0.22.3) with default parameters.	Illumina NovaSeq 6000	58
EGAD50000002130	The dataset contains RRBS data (bam files) for 25 pleural mesothelioma samples. The samples were sequenced on NovaSeq6000 (Illumina, San Diego, CA) in single-end 150bp mode. RRBS raw reads were trimmed for adaptor sequences using trim galore (v. 0.6.5) (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) and filtered for low-quality sequences using fastQC (v. 0.11.8) (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). High quality trimmed reads were mapped to the Human reference genome (UCSC genome assembly GRCh38/hg38) using Bismark (v. 0.22.3) with default parameters.	Illumina NovaSeq 6000	25
EGAD50000002131	The dataset contains RNA-sequencing data (bam files) for 82 pleural mesothelioma samples. All the samples were sequenced on an Illumina NovaSeq 6000 sequencer in paired-end mode, generating 100 nt length reads, to obtain an average of 60 million clusters for RNA. Demultiplexing was performed using Illumina bcl2fastq2. Fastq quality was assessed using fastQC (v. 0.11.8) (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and low-quality reads were discarded. Sequence reads were aligned to Human reference genome (UCSC genome assembly GRCh38/hg38) using STAR (v. 2.7.0b).	Illumina NovaSeq 6000	82
EGAD50000002132	Samples were collected at the Mayo Clinic. MM tumor samples were enriched using the StemCell EasySep Human CD138 positive selection kit II (StemCell, Cambridge, MA). DNA from CD138+ MM cell (tumor) or peripheral blood samples (germline control) at Mayo Clinic (Rochester or Arizona) underwent WGS. For samples collected at Mayo Clinic Rochester, library preparation was completed using the NEB Ultra II (New England Biolabs, Ipswich, MA) and the Nextera Flex systems (Illumina, San Diego, CA) and sequenced at the Mayo Clinic Genome Analysis Core with an approximate depth of 30X for tumor samples and 60X for the germline control. For samples collected at Mayo Clinic Arizona, Novogene was used for WGS; the DNA was randomly sheared, end-repaired, A-tailed and ligated with Illumina adapters. Sequencing was performed on the Illumina NovaSeq 6000 sequencer. Reference genome: GRCh38.	Illumina NovaSeq 6000	43
EGAD50000002134	Data of advanced basal cell carcinoma patients receiving hedgehog-inhibitor and immune checkpoint-inhibitor combination treatment. The single-cell RNA-sequencing data was generated via the Parse Biosciences platform from PBMC samples of 10 patients at four timepoints each. The spatial transcriptomics dataset was generated using the 10x Visium platform (+CITE-seq) and contains altogether 10 biopsies at different timepoints during treatment from 3 patients. The processed matrices with the spatial images can be downloaded from: https://zenodo.org/records/17814765	Illumina NovaSeq 6000	30
EGAD50000002135	This dataset contains the genotype status of rs774984872G>T, a loss-of-function variant in ANGPTL8. 8 individuals had the reference allele (G/G), 5 were heterozygous (G/T) and 1 was a knockout (T/T).		14
EGAD50000002137	Whole exome sequencing of patient-derived xenograft (PDX) models from NOD/SCID mice without treatment. These PDX are obtained from tumors from patients with metastatic colorectal cancer after serial passaging in mice.	Illumina NovaSeq X	92
EGAD50000002138	Standard polyA enriched RNA-seq profiles of patient-derived xenograft (PDX) models from NOD/SCID mice without treatment. These PDX are obtained from tumors from patients with metastatic colorectal cancer after serial passaging in mice.	Illumina NovaSeq X	76
EGAD50000002141	We performed longitudinal genomic analysis of samples from patients with colorectal-derived neoplasias from different stages through metastasis and treatment.	Illumina Genome Analyzer	488
EGAD50000002148	Low coverage whole genome sequencing from CSF-derived cell free DNA for matching samples with EM-seq cohort	Illumina NovaSeq X	57
EGAD50000002156	The dataset includes FASTQ and BAM files from diagnostic and matched remission (control) samples of four B-other ALL patients (ALLT-321, ALLT-311, ALLT-314, ALLT-319). Sequencing library was made using PCR free-based WGS library preparation (PCR-), using TruSeq DNA PCRfree library preparation kit v1. Both diagnostic (60x) and remission (30x) samples were sequenced using HiSeq X.	Illumina HiSeq X	8
EGAD50000002157	The dataset includes FASTQ and BAM files from diagnostic and matched remission (control) samples of five B-other ALL patients (ALLT-327, ALLT-334, ALLT-357, ALLK-108, ALLT-361). Sequencing library was made using NEB Next® Ultra™ DNA Library Prep Kit. Both diagnostic (60x) and remission (30x) samples were sequenced using HiSeq X.	Illumina HiSeq X Illumina NovaSeq 6000	10
EGAD50000002158	The dataset includes FASTQ and BAM/CRAM files from diagnostic and matched remission (control) samples of 10 B-other ALL patients (ALLT-369, ALLT-378, ALLK-109, ALLK-110, ALLK-115, ALLT-391, ALLK-101, ALLK-113, ALLK-123, ALLT-393). Sequencing library was made using PCR free-based WGS library preparation (PCR-), using NEBNext® Ultra™ II DNA Library Prep Kit. Both diagnostic (90x) and remission (30x) samples were sequenced using Illumina NovaSeq 6000.	Illumina NovaSeq 6000	20
EGAD50000002159	The dataset includes FASTQ and CRAM files from diagnostic and matched remission (control) samples of two B-other ALL patients (ALLT-389, ALLK-124). Sequencing library was made using Novogene NGS DNA Library Prep Set. Both diagnostic (90x) and remission (30x) samples were sequenced using Illumina NovaSeq 6000.	Illumina NovaSeq 6000	4
EGAD50000002160	The dataset includes FASTQ and BAM/CRAM files from diagnostic and matched remission (control) samples of two B-other ALL patients (ALLT-396, ALLK-131). Sequencing library was made using PCR free-based WGS library preparation (PCR-), using Novogene NGS DNA Library Prep Set. Both diagnostic (90x) and remission (30x) samples were sequenced using NovaSeqX Plus.	Illumina NovaSeq X Plus	4
EGAD50000002161	The dataset includes FASTQ and CRAM files from diagnostic and matched remission (control) samples of one B-other ALL patient (ALLK-138). Sequencing library was made using Novogene NGS DNA Library Prep Set. Both diagnostic (90x) and remission (30x) samples were sequenced using NovaSeqX Plus.	Illumina NovaSeq 6000	2
EGAD50000002162	DNA was purified using AMPure XP magnetic beads (Beckman Coulter [BC] Life Sciences, Indianapolis, IN) and its quality was evaluated in an Agilent 4200 Tapestation using the Genomic DNA ScreenTape system (Agilent, Santa Clara, CA). DNA was quantified using the Qubit System (Invitrogen). Extracted DNA from AL patients was fragmented to 200bp by a Covaris S220 ultrasonicator (Covaris, Woburn, MA) and used to generate and capture libraries following the manufacturer’s indications from Twist Bioscience’s Human Core Exome Enrichment Kit (Twist Bioscience). Extracted DNA from MM patients and HA was fragmented to an average size of 225 bp using a Covaris S220 ultrasonicator (Covaris, Woburn, MA) and subjected to DNA library construction using Chromium Genome Reagent Kit V2 for Exome Assays (10X Genomics). Target enrichment was performed with SureSelectXT Human All Exon V6 Capture Library (Agilent) or SureSelectXT2 Human All Exon V5+UTRs (Agilent), and sequence targets were captured and amplified in accordance with the manufacturer’s protocol. Final libraries were combined with standard Illumina sequencing primers and paired-end sequenced (150bp) on a HiSeq X platform (Illumina, CA) or in a NovaSeq 6000 (Illumina).	Illumina NovaSeq 6000	221
EGAD50000002163	To delineate the mutational landscape of classical Hodgkin Lymphoma, targeted sequencing using a capture panel ("mediastinal lymphoma panel") targeting 217 genes was performed on HRS cells isolated from classical Hodgkin Lymphoma tumors (n=107) and corresponding normal samples (n=25)	NextSeq 550	132
EGAD50000002164	To delineate the mutational landscape of classical Hodgkin Lymphoma Whole Exome sequencing (WES) was performed on HRS cells isolated from classical Hodgkin Lymphoma tumors and corresponding normal samples (n=8)	NextSeq 550	16
EGAD50000002166	3'-end low-depth RNA sequencing data from 108 patients with chronic lymphocytic leukemia and 8 patients with lymphoma. Each primary patient sample was treated in vitro with DMSO as a control and with up to ten small-molecule inhibitors (ibrutinib, duvelisib, everolimus, trametinib, nutlin-3a, I-BET762, MK2206, selinexor, compound 26, and a combination of ibrutinib and compound 26). Sequencing was performed on the Illumina HiSeq4000. The dataset contains FASTQ files.	Illumina HiSeq 4000	1476
EGAD50000002167	For patient P01, the HiChIP experiment was performed using the Arima-HiC+ kit (A510008, Arima genomics) according to the manufacturer’s protocols.	unspecified	1
EGAD50000002169	Samples were sequenced with NovaSeq6000 and the sequencing files were processed using the RSSnextflow workflow (https://gitlab.com/b8307038/rssnextflow). The raw read counts from all batches were combined. Analysis-ready batch-corrected read counts were generated using ComBat-seq.	Illumina NovaSeq 6000	97
EGAD50000002170	Samples were sequenced with NovaSeq6000 and the sequencing files were processed using the RSSnextflow workflow (https://gitlab.com/b8307038/rssnextflow). The raw read counts from all batches were combined. Analysis-ready batch-corrected read counts were generated using ComBat-seq. Read counts normalisation for DGE analysis was performed with limma-voom function (log2 CPM) as seen in the pipeline.	Illumina NovaSeq 6000	97
EGAD50000002172	This dataset comprises raw sequencing data (FASTQ files) derived from 13 Formalin-Fixed Paraffin-Embedded (FFPE) melanoma tissue samples profiled using the 10x Genomics Visium platform. These spatially resolved transcriptomic profiles serve as the primary dataset for the STimage study, enabling the development and benchmarking of deep learning models that predict gene expression directly from H&E histology images.	Illumina NovaSeq 6000	10
EGAD50000002178	Tumor versus Germline variant calling on UPST-SCCHN3 cohort	Illumina Genome Analyzer II Illumina NovaSeq 6000	460
EGAD50000002179	Transcriptomics of Tumor samples on UPST-SCCHN3 cohort	Illumina Genome Analyzer II	217
EGAD50000002180	This dataset contains summarized somatic variant call data derived from paired tumor–blood whole-exome sequencing of human samples. Access to the dataset is controlled and granted only upon submission and approval of a formal data access request. Requests are reviewed by the Data Access Committee led by the Principal Investigator to ensure compliance with applicable legal, ethical, and data protection requirements. Users must not attempt to re-identify study participants and must comply with all relevant confidentiality and data protection regulations. In accordance with local legal and ethical requirements, individual-level raw sequencing data are not shared.		10
EGAD50000002181	Bulk-RNA sequencing from hiPSC-derived cells (hiPSC; Endothelial Cells; Neural Crests Cells; Vascular Smooth Muscle Cells) from CADASIL patient lines (n=3) and respective isogenic-controls (n=3). CADASIL is a hereditary brain small vessel disease caused by pathogenic variants in the NOTCH3 gene, which lead to deposits of NOTCH3 protein in the walls of small arteries. This causes pathological vessel wall changes including degeneration of vascular smooth muscle cells. To investigate the impact of pathogenic NOTCH3 variants in hiPSC and differentiated cells we performed a bulk-RNA sequencing.	Illumina HiSeq X	24
EGAD50000002182	This dataset comprises whole exome sequencing data and RNA-seq data of advanced rare cancer patients.	Illumina NovaSeq 6000	555
EGAD50000002183	This dataset comprises shallow whole-genome sequencing data (~3× coverage) of circulating tumor DNA extracted from baseline and on-treatment plasma samples of patients with relapsed or refractory germ cell tumors treated with salvage high-dose chemotherapy. Libraries were generated from 20 ng of cfDNA using the KAPA Hyper Prep protocol with unique dual indexing and sequenced on an Illumina NovaSeq 6000 platform to enable genome-wide copy number and biomarker analyses associated with treatment response and clinical outcome.	Illumina NovaSeq 6000	74
EGAD50000002185	The dataset includes FASTQ and CRAM files from diagnostic and matched remission (control) samples of one BCR::ABL1 ALL patient (ALLT-399). Sequencing library was made using PCR free-based WGS library preparation (PCR-), using Novogene NGS DNA Library Prep Set Kit (Cat No. PT004). Both diagnostic (90x) and remission (30x) samples were sequenced using NovaSeqX Plus.	Illumina NovaSeq X Plus	2
EGAD50000002186	55 supratentorial ependymoma patient samples were profiled by scRNA-sequencing using Smart-Seq2 and 10x protocol (51 samples) and 10x Genomics protocol (4 samples) on a NextSeq 550 sequencer (Illumina).	NextSeq 550	55
EGAD50000002187	Processed data for 100 WES from the Data Processing Freeze 1		100
EGAD50000002189	This dataset contains raw single-cell DNA sequencing data generated from 31 patients with aplastic anemia using the Mission Bio Tapestri platform. Libraries were prepared with a targeted DNA amplicon panel and sequenced on an Illumina platform. These raw data capture single-cell variant profiles and clonal architecture and serve as the basis for downstream analysis of somatic mutations and clonal hematopoiesis.	Illumina NovaSeq 6000	82
EGAD50000002190	Raw 10X 5' v3 scRNA-seq data (GEX) for PBMC of 3 healthy donors and two AGLCD patient timepoints as well as of LPMC of 3 Crohn's disease patients and one timepoint of the AGLCD patient.	Illumina NovaSeq 6000	9
EGAD50000002191	Raw 10X 5' v3 scTCR-seq data (TCR) for PBMC of 3 healthy donors and two AGLCD patient timepoints as well as of LPMC of 3 Crohn's disease patients and one timepoint of the AGLCD patient.	Illumina NovaSeq 6000	9
EGAD50000002195	This dataset contains raw single-cell whole-genome sequencing data generated from a single patient with aplastic anemia using the BioSkryb platform. Single cells were isolated based on surface marker expression (CD3+, CD33+, and CD34+) and subjected to whole-genome amplification followed by sequencing on an Illumina platform.	Illumina NovaSeq X	665
EGAD50000002199	This dataset comprises nine neuroblastoma cell lines used to construct the reference matrix for the RRBS-CL dataset in the corresponding benchmarking study. In addition, it includes one extra neuroblastoma cell line and a pooled cfDNA sample from multiple donors, which were used to generate the in silico mixtures. All the samples were profiled using DNA methylation sequencing.	Illumina NovaSeq 6000	11
EGAD50000002201	Ex vivo CD4 T-cells from 4 donors were stimulated with antigen-coupled beads to aCD3, ABD, EBNA1, ANO2 or CRYAB and the LiveCD3+CD4+CFSEdim cells were sorted by flow cytometry. Cells were labelled with barcoded antibodies to identify the original stimulus. All stimulations from each donor were pooled into four reactions, i.e. one reaction per donor. Libraries were assembled with the 10X Chromium Next GEM Single Cell 5’ Kit for TCR and transcriptome with feature barcodes. Dataset contains raw fastq reads as well as supplementary information needed to link fastq files to modality, i.e. gene-expression, protein or VDJ, and includes supplementary information to de-multiplex the reactions by stimulation condition.	Illumina NovaSeq X	4
EGAD50000002202	The dataset includes whole transcriptome RNAseq BAM files of both organoids and fibroblast mono- and co-cultures which have been processed by Picard markdups. The experiment featured 6 fibroblast cell lines derived from 5 different early CRC biobank patients (pt10, pt12, pt15, pt19, pt22) as well as from different regions within those samples (normal tissue, core or invasive front). The organoids were generated from patient pt5 carcinoma samples.	NextSeq 2000	31
EGAD50000002203	The dataset includes single-cell RNAseq data generated by Plate-Seq technology from scDiscoveries. The dataset contains samples from a total of 5 different early CRC biobank patients (pt5, pt11, pt13, pt14, pt16) from 4 different sequencing runs. Punch biopsies of different regions of the tumor (Core, Invasive front and, if available, Adenoma) were FACS sorted onto plates prior to sequencing.	NextSeq 500	16
EGAD50000002204	The dataset includes whole genome sequencing data generated from regional patient derived colorectal cancer (CRC) and adjacent normal tissue organoids. The dataset contains samples from a total of 10 different early CRC biobank patients from 2 different sequencing runs.	Illumina NovaSeq 6000	23
EGAD50000002205	The goal of IMMUcan is to understand how the immune system interacts with tumors and how therapeutic interventions influence this interaction. This cohort includes patients with confirmed locally recurrent and/or metastatic squamous cell carcinoma of the head and neck (SCCHN) who receive first-line treatment with immune checkpoint inhibitors (ICI) or ICI combined with chemotherapy. This dataset contains whole exome sequencing data from tumor and matching germline samples processed with the Vegan pipeline. It includes SNP and CNV calling.	Illumina Genome Analyzer II	499
EGAD50000002206	The goal of IMMUcan is to understand how the immune system interacts with tumors and how therapeutic interventions influence this interaction. This cohort includes patients with confirmed locally recurrent and/or metastatic squamous cell carcinoma of the head and neck (SCCHN) who receive first-line treatment with immune checkpoint inhibitors (ICI) or ICI combined with chemotherapy. This dataset contains transcriptomics data processed with the STAR pipeline. Two sequencing kits were employed: Roche KAPA for high-quality RNA samples and Takara SMART-Seq Stranded for samples with limited RNA quantities.	Illumina Genome Analyzer II	202
EGAD50000002209	FASTQ files from bulk RNA sequencing of an iPSC-derived rapid neuronal model (i3Ns) harbouring the SNCA A53T mutation in a WTC11 background as a model of synucleinopathy, with or without treatment of 300nM RSL3 for 6 hours. Libraries were prepared using the Watchmaker rRNA and globin depletion kit and sequenced on a NovaSeq X with an average depth of ~27 million paired-end reads with a fragment length of 100 base pairs, per sample. 19 samples are included in this dataset, each with two paired fastq files (R1 and R2). The dataset includes control neurons (WT: dcas9 parental line and C4 CRISPR control) and A53T clones (B7, H8 and L8) across two neuronal inductions (ind1 vs ind2), with or without treatment of RSL3 (UT vs. RSL3).	Illumina NovaSeq X	19
EGAD50000002210	10x Chromium Single Cell 3′ single-cell cDNA sequencing data for adrenal tissue sampleA and sampleB.	Illumina NovaSeq X Plus	2
EGAD50000002211	Full-length single-cell cDNA PacBio Kinnex whole-transcriptome sequencing data for adrenal sampleA and sampleB.	Revio	2
EGAD50000002212	Twist custom capture–based targeted PacBio long-read scRNA-seq datasets for human adrenal sampleA and sampleB	Revio	2
EGAD50000002213	Participants collected a fresh stool sample immediately after defecation using a sterile Pasteur pipette and placed it in a 15 mL conical polypropylene tube. They were instructed to collect the sample as close as possible to their study center visit. Samples were transported and stored at −80 °C until DNA extraction. Microbial DNA was extracted from 200 mg of stool using a QIAamp DNA Stool Mini Kit (Qiagen), following manufacturer's instructions, after all samples were collected. DNA was quantified using a Qubit 2.0 Fluorometer with a dsDNA Assay Kit (Thermo Fisher Scientific). Library preparation and circularization of equimolarly pooled libraries were done with MGIEasy FS DNA Library Prep Set (MGI Tech Co, Ltd, Shenzhen, China) according to the standard protocol. The sequencing was performed with MGISEQ-2000 High-throughput Sequencing Set (FCL PE150) according to the manufacturer’s instructions (MGI Tech, Shenzhen, China)	unspecified	2071
EGAD50000002215	This dataset contains bulk RNA sequencing data generated from 36 multi-region tumour samples obtained from patients with intraductal papillary mucinous neoplasms (IPMN) and associated pancreatic ductal adenocarcinoma (PDAC). A total of 160 sequencing runs are included. Libraries were prepared from poly(A)-selected RNA and sequenced on an Illumina platform. The data are provided as raw sequencing files for downstream transcriptomic analyses.	Illumina NovaSeq 6000	160
EGAD50000002217	This dataset contains sequencing data from Bulk RNAseq, PCR-based RepSeq and SMRT sequencing to compare immunoglobulin repertoire profiling strategies using matched human B-cell samples	Illumina MiSeq Illumina NovaSeq X unspecified	3
EGAD50000002218	This dataset contains raw FASTQ files from bulk RNA sequencing of two patient-derived PDAC organoid lines (P28 and P40) carrying a WNT7B reporter knock-in. Organoids were sorted into mNeonGreen high (WNT-high) and low (WNT-low) populations, with three biological replicates per sorted population. RNA was extracted using the Qiagen QiaSymphony SP system, and libraries were prepared with the Illumina Truseq stranded polyA kit. Sequencing was performed on an Illumina NextSeq2000 platform using single-end 50bp reads. Quality control was conducted with FastQC, and reads were trimmed with TrimGalore as appropriate. Data are controlled-access and intended for transcriptomic analysis of WNT7B expression-dependent transcriptional programs in PDAC organoids.	NextSeq 2000	12
EGAD50000002219	This dataset contains raw FASTQ files from bulk RNA sequencing of three patient-derived pancreatic ductal adenocarcinoma (PDAC) organoid lines (P28, P40, P47). Each organoid line was treated with either the WNT secretion inhibitor LGK974 or DMSO for 24 hours, with three biological replicates per treatment condition. RNA was extracted using the Qiagen QiaSymphony SP system, and libraries were prepared with the Illumina Truseq stranded polyA kit. Sequencing was performed on an Illumina NextSeq2000 platform using single-end 50bp reads. Quality control was conducted with FastQC, and reads were trimmed with TrimGalore as appropriate. Data are controlled-access and intended for transcriptomic analysis of WNT pathway perturbation in PDAC organoid models.	NextSeq 2000	18
EGAD50000002220	This dataset contains raw FASTQ files from single cell RNA sequencing of three patient-derived pancreatic ductal adenocarcinoma (PDAC) organoid lines (P28, P40, P47) cultured under standard conditions. For each line, single cells were FACS sorted into 384-well capture plates, with each well containing a 50nl droplet of barcoded primers. Plates were processed following an adapted SORT-seq protocol, and cDNA libraries were generated using CEL-Seq2 with TruSeq small RNA primers (Illumina). Sequencing was performed on an Illumina NextSeq500 platform using paired-end reads (read 1: 26 cycles, index read: 6 cycles, read 2: 60 cycles). Data are controlled-access and intended for single cell transcriptomic analysis of PDAC organoid heterogeneity and WNT pathway activity.	NextSeq 500	3
EGAD50000002221	This is whole genome sequencing data from a cohort of 18 patients with germ cell tumours or Hodgkin Lymphoma, who developed clinically significant bleomycin-induced pneumonitis despite low clinical risk. DNA was extracted from whole blood, and 150-base paired-end reads were generated on an Illumina NovaSeq 6000 instrument to a minimum 30x depth. Sequence reads were aligned to the GRCh38 reference with BWA MEM, and germline variants were called following GATK best practice procedures. The dataset comprises per-sample read alignment files in BAM format, and a joint-called germline variant file in VCF format.		18
EGAD50000002222	This dataset includes whole-exome sequencing data from precancerous colorectal lesions (adenomas and advanced adenomas) obtained from 19 patients with Lynch syndrome. Forty-four samples are analyzed, including 33 mismatch repair–deficient (dMMR) and 11 mismatch repair–proficient (pMMR) lesions as assessed by immunohistochemistry. Library preparation was performed using the Twist Human Core Exome + RefSeq + Mitochondrial Panel, followed by sequencing on the Illumina NovaSeq 6000 platform. Aligned BAM files were generated with the DRAGEN pipeline using the human reference genome hg38	Illumina NovaSeq 6000	44
EGAD50000002224	This dataset contains the paired fastq files generated from 221 samples collected from patients with moderate-to-severe plaque psoriasis and sequenced on the Illumina NovaSeq 6000 platform. Patients were randomized in a 1:1:1:1:1 ratio to receive oral Zasocitinib at doses of 2 mg, 5 mg, 15 mg, 30 mg, or placebo once daily for 12 weeks. Clinical response was defined as achieving a PASI 75 response at week 12, while histologic response was defined as the absence of KRT16-positive cells in lesional skin after 12 weeks of treatment.	Illumina NovaSeq 6000	221
EGAD50000002230	Liquid biopsies generated for minimal residual disease monitoring of non-small cell lung cancer. Ultra deep targeted sequencing of lung cancer related genes with UMIs. This dataset contains the dedication cohort	Illumina NovaSeq 6000	505
EGAD50000002231	Liquid biopsies generated for minimal residual disease monitoring of advanced non-small cell lung cancer. Ultra deep targeted sequencing of lung cancer specific genes, with UMIs. This dataset consists of the biobank cohort	Illumina NovaSeq 6000	152
EGAD50000002235	The goal of IMMUcan is to understand how the immune system interacts with tumors and how therapeutic interventions influence this interaction. This cohort includes patients with non-small cell lung cancer who had surgical resection. This is a subcohort of the EORTC SPECTAlung study. This dataset contains whole exome sequencing data from tumor samples processed with the Vegan pipeline. In the absence of matching germline samples, variant calling is performed using two panels of normals (one for female and one for male). It includes SNP and CNV calling.	Illumina Genome Analyzer II	191
EGAD50000002236	The goal of IMMUcan is to understand how the immune system interacts with tumors and how therapeutic interventions influence this interaction. This cohort includes patients with non-small cell lung cancer who had surgical resection. This is a subcohort of the EORTC SPECTAlung study. This dataset contains transcriptomics data processed with the STAR pipeline. Two sequencing kits were employed: Roche KAPA for high-quality RNA samples and Takara SMART-Seq Stranded for samples with limited RNA quantities.	Illumina Genome Analyzer II	125
EGAD50000002237	Deposited here are whole-genome sequencing data for 26 paired breast cancer DCIS and matched-normal samples taken from the same individual. Average sequence coverage is 118x for DCIS samples and 41x for matched-normals. Matched-normal samples are from blood. Sequencing was performed on an Illumina HiseqX. Due to specific restrictions imposed by the ethical approval at sample collection, the use of the germline data is restricted to filtering of somatic mutation calls only and cannot be used outside this purpose. They are a part of a broader project that uses high-depth WGS to investigate the somatic mutation genomic landscape of DCIS in order to uncover biological insights into breast cancer progression as well as possible methods to stratify DCIS patients for individualised therapy or disease monitoring.	HiSeq X Ten	52
EGAD50000002238	Paired plasma cell-free DNA (cfDNA) samples were collected from 11 breast cancer patients before and after radiotherapy. Sequencing libraries were prepared using the Agilent SureSelect XT HS2 kit and sequenced on the Illumina NovaSeq 6000 platform. Neo-RT (NCT03818100) is a non-randomised, single-arm feasibility study evaluating neoadjuvant radiotherapy combined with endocrine therapy in women with ER-positive, HER2-negative breast cancer. Eligible patients had grade 1 or 2 disease (grade 3 permitted if chemotherapy was contraindicated) and a palpable tumour size ≥20 mm, for whom radiotherapy was intended to facilitate breast-conserving surgery.	Illumina NovaSeq 6000	22
EGAD50000002239	Illumina short read WGS of 12 oesophagogastric cancer samples	Illumina NovaSeq 6000	12
EGAD50000002240	A multi-omic dataset of single-nucleus paired ATAC-seq + RNA-seq data of nuclei from the post-mortem human primary motor cortex from patients with ALS/ALS-FTD and unaffected controls.	Illumina NovaSeq 6000	32
EGAD50000002241	6 samples from 3 mice related to Alveolar Rhabdomyosarcoma. In this study, we employ scRNA-seq to analyze ARMS tumors from a conditional mouse model of ARMS. The conditional knock-in mouse model of ARMS, in which the Pax3-Foxo1 allele is activated in the skeletal muscle cell lineage via Myf6Cre (Myf6CrePax3P3Fm/P3FmTrp53F2-10/F2-10; referred to hereafter as GEMM-ARMS) has been described previously. These mice also carry Cre-activated reporters, eYFP to label Pax3::Foxo1-expressing cells and dsRED2 to label Myh2-expressing cells. Some tumors are derived from GEMM-ARMS mice, and some are allografts of GEMM-ARMS tumors into immunodeficient mice	unspecified	6
EGAD50000002242	RNA-Sequencing of whole blood collected from patients with biallelic RNU4ATAC/RNU6ATAC variants, parents who carry the variants, healthy controls and type 1 diabetes controls.	Illumina NovaSeq X	23
EGAD50000002243	A dataset of single-nucleus RNA-seq data of nuclei from the post-mortem human primary motor cortex from patients with ALS-FTD. The nuclei were sorted by flow cytometry into NeuN+ (neuronal) TDP-43 low vs. TDP-43 high fractions.	Illumina NovaSeq X Plus	11
EGAD50000002244	Microsatellite stable (MSS) colorectal cancers (CRC) are largely unresponsive to immune checkpoint inhibition (ICI), prompting investigation into strategies to enhance sensitivity. The MAYA trial, which utilized temozolomide (TMZ) in MGMT-silenced MSS mCRC, hypothesized that TMZ-induced hypermutation could sensitize tumors to ICI. This phase II trial met its primary endpoint, demonstrating durable clinical responses with TMZ combined with ipilimumab and nivolumab. To elucidate factors influencing response heterogeneity, we conducted multi-omic spatial profiling of samples from patients who participated in the MAYA trial, including baseline and on-treatment tissue and blood specimens. While increased neoantigen load following TMZ exposure did not consistently predict for deep responses, spatial profiling revealed key determinants. Lymphocyte proportions, particularly CD8+KI67+ cells, within stromal and tumor compartments, along with macrophage composition (CD68+CD163+ cells) at the tumor-stromal interface, were predictive of response. Treatment pressures dynamically altered the tumor microenvironment composition and activated peripheral immune cells. This study is the first to identify spatial predictors of response to this promising novel treatment approach for MSS CRC.	Illumina NovaSeq X	45
EGAD50000002247	Single-cell sequencing data from 2 HIV-1 post-intervention controllers and 2 non-controllers reported in Fisher, Garcia, Frattari, Naasz, et al. Longitudinal samples were collected before antiretroviral therapy initiation, during suppressive therapy, and after analytical treatment interruption.	Illumina NovaSeq 6000	16
EGAD50000002249	The dataset contains circulating tumor DNA (ctDNA) profiles from frozen plasma samples of 593 patients from the TRIDENT-1 clinical trial. Plasma was processed with centrifugation and automated liquid handling systems, and cell-free DNA was extracted using the Qiagen QIAsymphony platform. ctDNA libraries were constructed with the Guardant360 2.11 assay, which used targeted, hybridization-based capture and barcoding of cfDNA fragments for digital analysis, followed by high-throughput sequencing on the Illumina NextSeq 550 instrument. Sequencing data were processed and interpreted using the Bioinformatics Pipeline (BIP v3.5.2 or later) for variant calling and quantification and managed within LabVantage LIMS (v3.7 or later) to support sample tracking and workflow integration. Aligned read files are provided in bam and bai formats.	NextSeq 550	593
EGAD50000002254	This dataset contains individual-level clinical phenotype data from patients with pancreatic ductal adenocarcinoma, including CA19-9 measurements, Lewis antigen status, and relevant clinicopathological variables. Data were generated as part of an ethically approved study and are available under controlled access.		615
EGAD50000002257	This dataset contains long-read whole-genome sequencing data generated from patients with aplastic anemia and RCC. Genomic DNA was prepared for long-read sequencing and sequenced using single-molecule real-time technology on a PacBio platform.	Revio	8
EGAD50000002258	SNPs have been associated with the risk of Parkinson’s disease, but not those in regulatory regions, although such SNPs can potentially disrupt TFBS, altering gene expression and contributing to cis-regulatory variation. To study changes in A and B compartment interactions and in topologically associated domains (TADs), we sequenced: - 3D chromatin contact using LowC, 6 samples. 3 replicates for each condition: smNPC (controls) cells and differentiated neurons after 30 days. Resulting paired-end reads are provided as BAM files. In order to confirm the predicted impact of the PD-associated allele of rs144814361 on the BAG3 promoter, we proceeded with genome editing of the TH-REP1 cell line by using prime editing to insert the “T” allele at the position chr10:119651405 in the BAG3 promoter. - Chromatin accessibility using ATAC-seq, 9 samples of derived cell line TH-REP1. 3 replicates for each condition: 1) iPSC Wild Type 2) SNP-BAG3 variant in iPSC 3) SNP-BAG3 variant in smNPC. BAG3 variant: heterozygous for the SNP variant: rs144814361 (chr10 119651405). Resulting sequencing reads are provided as paired-end FASTQ files.	NextSeq 2000	15
EGAD50000002260	This dataset contains raw sequencing data (BAM/BAI files) generated from human neuroblastoma patient samples. The data were produced using targeted high-throughput next-generation sequencing and are intended to support genomic analyses of tumor-associated alterations. All files are provided under controlled access in accordance with EGA data protection requirements.	Illumina MiSeq	26
EGAD50000002261	This dataset contains biomodal sequencing data generated using the duet + modC assay from plasma-derived cell-free DNA of endometrial cancer patients and healthy donors. Sequencing was performed on an Illumina NovaSeq 6000. Dataset includes raw sequencing FASTQ files. Duet +modC output enables whole-genome profiling of both genetic variation and DNA methylation.	Illumina NovaSeq 6000	88
EGAD50000002262	This dataset contains targeted sequencing data generated using the TruSight Oncology 500 ctDNA v2 assay from plasma-derived cell-free DNA of dMMR endometrial cancer patients, at baseline and C3D1 of immunotherapy. Sequencing was performed on an Illumina NovaSeq 6000. Dataset includes aligned sequencing files in BAM format generated using the DRAGEN TruSight Oncology 500 ctDNA v2.6.0 analysis pipeline.	Illumina NovaSeq 6000	52
EGAD50000002263	This dataset contains whole-genome sequencing data generated from plasma cell-free DNA of dMMR endometrial cancer patients. Sequencing was performed on Illumina platforms to an average depth of approximately 10× and the dataset includes raw sequencing reads in FASTQ format.	Illumina NovaSeq 6000	30
EGAD50000002264	This dataset contains whole exome sequencing data of matched pairs of primary tumour and normal frozen tissue of seven osteosarcoma patients. Whole exome sequencing with a minimum coverage of 100x was performed using the Illumina Novaseq6000 platform and the Agilent SureSelect Human All Exon V7 kit.	Illumina NovaSeq 6000	14
EGAD50000002265	Single-cell RNA sequencing (scRNA-seq) was performed on cells isolated from intestinal mucosa and small intestinal neuroendocrine tumors (SI-NETs) to compare their transcriptomic profiles. Tumor and adjacent normal tissues were collected from two SI-NET patients, enzymatically dissociated, and processed into single-cell suspensions. Single-cell libraries were generated using the Chromium Single Cell 3′ Reagent Kit v3 and sequenced on an Illumina platform to produce FASTQ files for downstream analysis.	Illumina NovaSeq 6000	4
EGAD50000002266	Single-cell RNA sequencing and spatial transcriptomic sequencing data associated with manuscript 'Ectopic NMDAR expression in cancer unmasks germline-encoded autoimmunity'	NextSeq 500	1
EGAD50000002273	CTCF single-cell D&D-seq with GoT-ChA targeted genotyping of IDH2 R140Q CHIP variant from primary blood sample	Illumina NovaSeq X	4
EGAD50000002277	10X snMultiome (ATAC+GEX) sequencing of 1 human reactive tonsil sample and 1 DLBCL sample for the study of "SPEN loss drives extra-follicular diffuse large B cell lymphoma with female-specific lethality and TLR pathway therapeutic vulnerabilities"	NextSeq 550	4
EGAD50000002278	RNA sequencing was performed on untreated patient tumor tissue and matched patient-derived xenograft (PDX) models of primary head and neck squamous cell carcinoma (HNSCC). The aim of this analysis was to characterize the transcriptional landscape of the tumors in the in vivo setting. 117 total samples were sequenced with paired reads on 10 lanes from different flowcells. They were loaded as 2400 separate files to maintain the data as it was received from BGI - each sample is therefore composed of 5 separate EGAN with 4 files each, adding a 0.1 - .. - 0.5 to each BioSample ID to generate sample aliases. All R1 (and R2, separately) files referring to a BioSample ID should be concatenated to obtain fastq files corresponding to single patients or PDXs.	Illumina HiSeq X	585
EGAD50000002279	Single-cell RNA sequencing (5′ gene expression) and single-cell V(D)J sequencing (B-cell receptor and T-cell receptor) were performed on bone marrow mononuclear cells from patients with Waldenström macroglobulinemia treated with ibrutinib monotherapy. Sequential bone marrow aspirate samples were collected at baseline and after treatment. A total of 222 BAM files are included in this dataset.	Illumina NovaSeq 6000	74
EGAD50000002284	Visium HD aligned reads from patient 1D525, BAM file from 10X spaceranger output directory	NextSeq 2000	1
EGAD50000002285	The dataset includes FASTQ and BAM/CRAM files from diagnostic and matched remission (control) samples of five ETV6::RUNX1 ALL patients (ALLT-380, ALLK-114, ALLT-388, ALLT-392, ALLT-394). Sequencing libraries were made using PCR free-based WGS library preparation (PCR-), using NEBNext® Ultra™ II DNA Library Prep Kit (Cat No. E7645). Both diagnostic (90x) and remission (30x) samples were sequenced using Illumina NovaSeq 6000.	Illumina NovaSeq 6000	10
EGAD50000002286	The dataset includes FASTQ and CRAM files from second relapse and matched remission (control) samples of one ETV6::RUNX1 ALL patient (ALLK-104). Sequencing libraries were made using NEBNext® Ultra™ II DNA Library Prep Kit (Cat No. E7645). Both second relapse (90x) and remission (30x) samples were sequenced using Illumina NovaSeq 6000.	Illumina NovaSeq 6000	2
EGAD50000002287	The dataset includes FASTQ and CRAM files from diagnostic and matched remission (control) samples of two ETV6::RUNX1 ALL patients (ALLT-397, ALLK-132). Sequencing libraries were made using PCR free-based WGS library preparation (PCR-), using Novogene NGS DNA Library Prep Set (Cat No. PT004). Both diagnostic (90x) and remission (30x) samples were sequenced using Illumina NovaSeqX Plus.	Illumina NovaSeq X Plus	4
EGAD50000002289	Exome sequencing of 120 lymphoma samples and 13 normal samples for LySeqST: A targeted sequencing assay for robust genomic classification of diffuse large B-cell lymphoma.	unspecified	133
EGAD50000002290	Targeted capture sequencing of 445 lymphoma samples and 13 normal samples for LySeqST: A targeted sequencing assay for robust genomic classification of diffuse large B-cell lymphoma.	unspecified	458
EGAD50000002292	The dataset comprises whole-genome sequences from 40 primary uveal melanomas and their matched normal samples.	Illumina HiSeq X	80
EGAD50000002293	To elucidate protein activity in mRCC, Kinomics data was generated as part of the EuroTarget study	unspecified	268
EGAD50000002294	Small RNA and MicroRNA sequencing data was generated for the multiomics analysis of Eurotarget	Illumina HiSeq 2000	108
EGAD50000002295	To elucidate methylation activity in mRCC, methylation data was generated as part of the EuroTarget study	unspecified	268
EGAD50000002296	To elucidate gene activity in mRCC, RNA Sequencing data was generated as part of the EuroTarget study. Star aligner 2.7 was used to generate count files	Illumina HiSeq 2500	268
EGAD50000002299	Ex vivo differentiation experiments were performed using human CD34+ hematopoietic stem and progenitor cells along the erythropoietic trajectory at multiple timepoints (differentiation days 0, 5, 8, 18, and 24). Simultaneous profiling of transcriptome, chromatin accessibility, and GATA1 binding were performed by integrating D&D-seq with 10X Multiome.	Illumina NovaSeq X	2
EGAD50000002300	We generated long-read whole-genome sequencing (WGS) data using PacBio HiFi from two human gastric cancer cell lines exhibiting microsatellite instability (MSI). The dataset contains aligned bam (hg38) files.	Revio	2
EGAD50000002306	This dataset comprises 16S rRNA gene V4 region amplicon sequencing data generated from 1,826 human faecal samples, predominantly from the Estonian population. The dataset includes raw sequencing reads and processed amplicon sequence variant (ASV) outputs, consisting of ASV abundance tables, representative ASV sequences, and taxonomic assignments. In addition, a sample metadata table is provided, containing sample and host information, including sample identifier, sequencing run identifier, date of sampling, sex, age, height, weight, nationality, disease status, and Bristol stool type. Sequencing was performed using the Illumina MiSeq and iSeq 100 platforms. The cohort includes individuals with no reported disease (n = 889), gastrointestinal diseases (n = 217), allergies and asthma (n = 194), thyroid gland disorders (n = 117), cardiovascular diseases (n = 103), various general health conditions (n = 139), cancer (n = 28), diabetes (n = 20), gynaecological issues (n = 7). Disease status was unavailable for 112 samples.	Illumina iSeq 100 Illumina MiSeq	1826
EGAD50000002307	WGS data from biliary tract cancer samples (Holzapfel, et al., J Gastrointest Oncol,14(1):379-389, 2023). The dataset includes tumour and normal BAM files sequenced on Illumina HiSeq/NovaSeq and aligned against GRCh38 (n=20 tumour/normal pairs).		40
EGAD50000002308	WTS data for pancreatic cancer samples (PASS-01 Trial; Knox et al., Journal of Clinical Oncology, 2025). The dataset includes tumour bam files sequenced on Illumina HiSeq/NovaSeq and aligned against GRCh38 (n=115 tumours)		115
EGAD50000002309	WGS data for pancreatic cancer samples (PASS-01 Trial; Knox et al., Journal of Clinical Oncology, 2025). The dataset includes tumour and normal cram files sequenced on Illumina HiSeq/NovaSeq and aligned against GRCh38 (n=127 tumour/normal pairs)		254
EGAD50000002314	Cells previously cryopreserved in FBS-10%DMSO were thawed and resuspended in 1mL of cold lysis buffer (10 mM Tris-HCl pH 7.4, 154 mM NaCl, 0.2% BSA, 0.1% NP-40, 1 mM CaCl₂, 0.5 mM MgCl₂ in ultra-pure water) to lyse the cell membrane and release intact nuclei. The nuclei were stained with propidium iodide (10 µg/mL) and Hoechst 33258 (10µg/mL) to facilitate sorting based on cell cycle state and viability. Single nuclei from 48 cells per sample were isolated into 96-well plates containing 5µL of freeze buffer (1X PBS, 7.5% DMSO, and 40% ProFreeze freezing medium (Lonza)) using a MoFlo Astrios cell sorter (Beckman Coulter) at the Flow Cytometry Unit of the University Medical Center Groningen (UMCG), Netherlands. The plates were centrifuged at 500 g and stored at -80°C until library preparation. Single-nuclei libraries were prepared using a Bravo Automated Liquid Handling Platform (Agilent Technologies). Nuclei were lysed, and DNA was labelled with unique 10 bp dual barcodes and amplified as previously described. Sequencing was performed on an Illumina NextSeq 2000 platform with 77bp single end reads. The generated data were subsequently demultiplexed using sample-specific barcodes and converted to FASTQ files with bcl2fastq. Sequencing reads were aligned to the GRCh38/hg38 human reference genome using Bowtie2 (v2.2.4), and duplicate reads were marked with BamUtil (v1.0.3). Single-cell karyotypes were determined by performing copy number analysis using AneuFinder (v4.3.3) (https://github.com/ataudt/aneufinder). The analysis included GC-content correction and blacklisting of artifact regions identified from euploid controls. Copy number calling was conducted with the dnacopy and edivisive algorithms, employing a bin size of 1 Mb and a step size of 500 Kb. Libraries with an average of fewer than 10 reads per chromosome copy per bin or less than 95% concordance between the two algorithms were excluded from analysis. Whole-chromosome aneuploidies were identified when over 95% of bins exhibited deviation from the disomic state. Aneuploidy scores were calculated as weighted averages of absolute copy number deviations from the euploid state.	NextSeq 2000	143
EGAD50000002315	Whole exome sequencing of patient tumour samples and matched patient derived organoids.	unspecified	22
EGAD50000002316	This dataset links 9 samples from the study "Phosphoproteomics adds value to treatment recommendations in molecular tumor boards" (EGAS00001007891) to the study EGAS00001007934. Further information about the samples can be found in the respective studies.		9
EGAD50000002318	This dataset contains the BAM files from 32 Danish patients with prostate cancer, both germline and tumor tissues. These samples are part of the Pan Prostate Cancer Genome project.	Illumina NovaSeq 6000	64
EGAD50000002319	This dataset contains raw sequencing and processed data of spatially resolved transcriptomes. The data were generated with 10X Visium probe-based assay of FFPE tissues of 14 CPT tumour resections and 2 non-neoplastic reference samples. Raw data contain two-lane fastq files of the sequencing. Processed data contain aligned and filtered count matrices as well as positional files as generated by 10x SpaceRanger	NextSeq 2000	16
EGAD50000002320	This dataset contains raw sequencing and processed data of single nuclei transcriptomes. The data were generated with 10x chromium probe-based single-nucleus assay on FFPE tissues of 43 Choroid Plexus Tumour resections and 12 non-neoplastic fetal and adult ChP resections. Raw data contain BAM files of the sequencing. Processed data contain count matrices as generated by 10x CellRanger for raw and filtered running options.	Illumina NovaSeq 6000	57
EGAD50000002321	This dataset contains bulk RNA-seq transcriptomic profiles from a human induced pluripotent stem cell (hiPSC) line and three engineered derivatives: B2M knockout, EGFP overexpression, and US2 overexpression, alongside the parental line. Total RNA integrity was assessed using the Experion RNA StdSens 1K Analysis Kit on the Bio-Rad Experion Automated Electrophoresis System prior to library preparation and sequencing. Libraries were sequenced on an Illumina NovaSeq 6000 generating ~20 million reads per sample. Raw read quality was assessed with FastQC, reads were aligned to the human reference genome hg38 (Ensembl release 109) using STAR, and alignment/library QC metrics (e.g., orientation, composition, coverage) were evaluated with Picard.	Illumina NovaSeq 6000	9
EGAD50000002323	This dataset contains WGS data derived from plasma cfDNA. The cohort includes PDAC and breast cancer patients alongside matched controls. These data are intended to support research into ctDNA and genomic fragmentation patterns.	Illumina NovaSeq X	57
EGAD50000002325	WES analysis was performed on untreated patient tumor/normal mucosa tissue and matched patient-derived xenograft (PDX) models of primary head and neck squamous cell carcinoma (HNSCC) to investigate the preservation of tumor-specific genomic alterations in the in vivo context. 183 total samples were sequenced with paired reads on 8-10 lanes from different flowcells. They were loaded as 3538 separate files to maintain the data as it was received from BGI - each sample is therefore composed of 5 separate EGAN with 2-4 files each, adding a 0.1 - .. - 0.5 to each BioSample ID to generate sample aliases. All R1 (and R2, separately) files referring to a BioSample ID should be concatenated to obtain fastq files corresponding to single patients or PDXs.	Illumina HiSeq X	915
EGAD50000002326	10X genomics chromium single-cell ATAC+RNA (Multiome) was used to prepare single-nucleus RNA- and ATAC-seq libraries, sequenced on an Illumina NovaSeq 6000 platform. BAM files are provided. Samples from 6 donors were sequenced after sorting for primitive HSPCs in 5 libraries; each library comprises a single donor, except one with cells from 1 male and 1 female donor.	Illumina NovaSeq 6000	5
EGAD50000002327	10X genomics chromium single-cell ATAC+RNA (Multiome) was used to prepare single-nucleus RNA- and ATAC-seq libraries, sequenced on an Illumina NovaSeq 6000 platform. BAM files are provided. Samples from 4 donors were sequenced after sorting for primitive HSPCs in 2 libraries, each with cells from 1 male and 1 female CB.	Illumina NovaSeq 6000	2
EGAD50000002328	10X genomics chromium single-cell ATAC+RNA (Multiome) was used to prepare single-nucleus RNA- and ATAC-seq libraries for human HSPC sorted from xenografts recovered from inflammatory challenge. 10X genomics 3' kits were used to prepare single-cell RNA-seq libraries for committed progenitors and myeloid cells from the same xenografts. These were sequenced on an Illumina NovaSeq 6000 platform; BAM files are provided. Samples from 3 conditions were sequenced for each cell type; 3 chromatin accessibility runs and 9 gene expression runs are provided.	Illumina NovaSeq 6000	9
EGAD50000002331	This dataset comprises processed genomic data from 733 samples, corresponding to 733 whole-exome sequencing experiments. Data files include CRAM, gVCF (containing short variants, copy number variants, and structural variants), RoH (BED), B-allele frequencies (BW), and variant IGV visualizations (XML). Associated phenotypic and pedigree information for all samples is also included.		733
EGAD50000002332	This dataset comprises processed genomic data from 1243 samples, corresponding to 1243 whole-exome sequencing experiments. Data files include CRAM, gVCF (containing short variants, copy number variants, and structural variants), RoH (BED), B-allele frequencies (BW), and variant IGV visualizations (XML). Associated phenotypic and pedigree information for all samples is also included.		1243
EGAD50000002335	Bulk 3' mRNA-Seq raw files (FASTQ) from human kidney tubuloids derived from 4 separate donors in collagen-based dome culture and suspension culture. Samples were collected for sequencing on day 7 of passage 3, 4, and 5. Total = 24 samples.	Illumina NovaSeq 6000	24
EGAD50000002337	Single‑nucleus multiome data of 53 human fetal liver hematopoiesis spanning 5-18 post-conception-weeks. Nuclei were isolated after tissue dissociation and sorting live CD45+CD235- cells, and processed according to the Chromium Next GEM Single Cell Multiome ATAC + Gene Expression User Manual specifications. Sequencing data was processed with CellRanger Arc v2.0.2 and aligned to GRCh38-2020-A-2.0.0.	Illumina NovaSeq 6000	65
EGAD50000002339	Raw sequencing data (GEX, TCR, and CSP libraries) for longitudinal cerebrospinal fluid and PBMC samples from a multiple myeloma patient treated with cilta-cel who developed MNT.	Illumina NovaSeq 6000	10
EGAD50000002341	This dataset consists of bulk RNA sequencing profiles generated from monocytes isolated from peripheral blood mononuclear cells (PBMCs) collected from 315 patients enrolled in the ImmunAID consortium. Each sample corresponds to a unique patient, providing a resource to investigate transcriptional variation in circulating monocytes across the cohort. PBMCs were isolated from whole blood, and monocytes were subsequently enriched using standard immunological protocols. Total RNA was extracted and libraries were prepared using the SMART-Seq mRNA Library Preparation Kit with Unique Molecular Identifiers (UMIs). Sequencing was performed on an Illumina NovaSeq 6000 platform following manufacturer-recommended protocols.	Illumina NovaSeq 6000	315
EGAD50000002343	This dataset contains FASTQ files from RNA sequencing (RNA-seq) of fibroblasts derived from a healthy donor (control), a patient with Kleefstra syndrome (KSP), and a patient carrying a pathogenic variant in EHMT2 (P2). Ribosomal RNA was removed using an RNase H–based depletion method, and strand-specific libraries were prepared for paired-end sequencing on the MGI DNBSEQ PE100 platform.	unspecified	3
EGAD50000002344	This dataset contains FASTQ files from RNA sequencing (RNA-seq) of induced pluripotent stem cells (iPSCs) derived from a healthy donor (control), a patient with Kleefstra syndrome (KSP), and two patients carrying pathogenic variants in EHMT2 (P1 and P2). Poly(A)-selected RNA was used for library preparation, and strand-specific libraries were generated for paired-end sequencing on the MGI DNBSEQ PE100 platform.	unspecified	4
EGAD50000002345	This dataset contains targeted next-generation sequencing data from 76 FFPE tumor tissues focusing on IDH1 exon 4 and codon 132. Genomic DNA was isolated using the QIAamp DNA FFPE Tissue Kit and sequenced on the Illumina MiSeq platform with 200x minimum coverage. The results are provided as encrypted somatic variant call files (VCF) generated through the Nextera XT library preparation protocol. This collection represents the specific somatic mutation profiles of the intrahepatic cholangiocarcinoma cohort for controlled access.	Illumina MiSeq	76
EGAD50000002353	Endometriosis, despite its high prevalence, is underdiagnosed and poorly managed due to lack of clinically validated biomarkers and pathophysiological insight. Menstrual blood-derived stem cells (MenSCs) have been implicated in disease pathogenesis, but their diagnostic potential remains unexplored. We conducted a clinical study (n=42; 19 endometriosis, 23 controls) to assess whether DNA methylation profiles of freshly isolated MenSCs can identify disease-specific biomarkers. Whole-genome methylation sequencing revealed differentially methylated regions (DMRs) enriched in genes linked to hallmarks of endometriosis (e.g., inflammation, tissue remodelling, development). These DMRs robustly distinguished cases from controls, independent of technical and clinical variables. Machine learning models trained and validated on these DMRs achieved high diagnostic performance (specificity 83%, sensitivity 79%). Integration with an independent single-cell RNA sequencing dataset showed that the DMRs may modulate gene expression, further supporting their biological relevance. These findings position MenSC DNA methylation profiling as a promising, non-invasive approach for early endometriosis diagnosis and personalised care.	Illumina NovaSeq X Plus	42
EGAD50000002359	The dataset comprises 27 single-cell multiome (ATAC + GEX) datasets generated from 13 patients with AML, collected across different stages of disease progression. In addition, matched long-read sequencing data from 5 samples were used to enable longitudinal analyses throughout the manuscript. Finally, bulk RNA-seq and ATAC-seq datasets from the cell line used in this study are also included.	Illumina NovaSeq 6000 PromethION	39
EGAD50000002362	This dataset contains WGS from whole blood of patients II.2, II.3, III.1-4 and WGS from fibroblasts of patients III.2 and III.3. All WGS was performed on an Illumina NextSeq2000 to a minimum depth of 250 million read pairs (2x150nts).	NextSeq 2000	8
EGAD50000002363	This dataset contains the putative telomeric sequences isolated by TARPON in ubam format after using a Duplex Capture Based Enrichment Protocol for patients II.2, II.4, and III.4.	MinION	3
EGAD50000002364	This dataset contains the Ribo-depleted RNA-sequencing of patients II.3, III.1, and III.3. Libraries were sequenced to a minimum depth of 195 million read pairs (2x75nts) on an Illumina NextSeq500 Midoutput FC.	NextSeq 500	3
EGAD50000002366	10X snMultiome (ATAC+GEX) sequencing of 5 human reactive tonsil samples for the study "Non-canonical NF-κB signaling skews B cells away from germinal center to low-affinity effector fate".	NextSeq 550	10
EGAD50000002367	Long read data generated for de novo assembly (PacBio and ONT FASTQ files)	PromethION unspecified	9
EGAD50000002370	The dataset contains RNA sequencing data of 66 high-grade serous carcinoma (HGSC) patients sequenced with NovaSeq 6000. The 155 samples are fresh frozen tissue samples that have been collected from different tissues and time points before and during treatment. The files provided are paired fastq files.	Illumina NovaSeq 6000	143
EGAD50000002371	RNA sequencing data of fibroblasts from 3 controls and 3 patients with RBMX variants.	NextSeq 500	6
EGAD50000002377	SNP and indel variants were called for 1,925 samples from phase 1 of the TenK10K project. Variant calling from WGS alignments to the GRCh38 reference assembly was performed using GATK4 HaplotypeCaller in DRAGEN mode. These VCFs consist of common (>=1% minor allele frequency [MAF]) and rare (<1% MAF) variants. This dataset is comprised of autosomal variants provided as multisample compressed VCF format files. Principal component scores were derived by using gnomAD’s run_pca_with_relateds method.		1925
EGAD50000002378	Tandem repeat variants were called for 1,925 samples from phase 1 of the TenK10K project. Variant calling from WGS alignments to the GRCh38 reference assembly was completed with ExpansionHunter v5. This dataset is comprised of autosomal tandem repeat variants provided as multisample compressed VCF format files. SNV-derived principal component scores used in the manuscript (Tanudisastro et al.) are also provided in this dataset.		1925
EGAD50000002379	Single cell RNA-sequencing data for PBMCs from the TenK10K Phase 1 cohort (1925 individuals post-QC). Libraries were prepared using the 10x Genomics 3’ Chromium Next GEM Single Cell HT V3.1 kit and sequenced on the NovaSeq 6000 platform. Reads were mapped to the GRCh38 reference genome with Cellranger, and count matrices were preprocessed using Scanpy.		1925
EGAD50000002380	Fastq files from RNA sequencing and BAM files from tNGS of normal and tumor samples from 17 patients from the locally advanced prostate cancer cohort.	Illumina NovaSeq 6000 unspecified	164
EGAD50000002381	Fastq files from RNA sequencing of normal and tumor samples from 16 patients from the de novo metastatic prostate cancer cohort.	Illumina NovaSeq 6000	46
EGAD50000002389	This dataset comprises processed genomic data from 1063 samples, corresponding to 1063 whole-exome sequencing experiments. Data files include CRAM, gVCF (containing short variants, copy number variants, and structural variants), RoH (BED), B-allele frequencies (BW), and variant IGV visualizations (XML). Associated phenotypic and pedigree information for all samples is also included.		1063
EGAD50000002390	irCLIP library generated from HEK293 cells transfected with either RIG-I WT or RIG-I C268F (n=3).	Illumina NovaSeq 6000	6
EGAD50000002392	This dataset comprises matched tumor–normal cell line samples. Whole-exome sequencing data are available for three matched pairs, and targeted deep sequencing–based somatic variant calls are provided for them as well in the form of VCF files.	Illumina HiSeq 2500 Illumina MiSeq	6
EGAD50000002394	The dataset contains RNAseq profiles of 386 patients from the CheckMate-577 (CA209-577) clinical trial whose ICF allows data deposition into a public repository. The Allprep DNA/RNA FFPE kit was used to simultaneously purify genomic DNA and total RNA from formalin-fixed, paraffin embedded (FFPE) tissue sections. RNA was analyzed using the Illumina TruSeq RNA Access method for library preparation, followed by sequencing on the Illumina NovaSeq platform with a 50bp paired-end strategy and a target read depth of 50M per sample. Fastq files are included.	Illumina NovaSeq 6000	386
EGAD50000002395	The dataset contains WES profiles of 396 patients from the CheckMate-577 (CA209-577) clinical trial whose ICF allows data deposition into a public repository. The Allprep DNA/RNA FFPE kit was used to simultaneously purify genomic DNA and total RNA from formalin-fixed, paraffin embedded (FFPE) tissue sections. DNA was analyzed by Whole Exome Sequencing using Swift Biosciences Accel-NGS 2S Hyb DNA Library Kit for library preparation and hybridization using Agilent SureSelect All Exon v6, followed by sequencing on the Illumina NovaSeq platform with a 100bp paired-end strategy and a target read depth of 100M per tumor sample. Fastq files are included.	Illumina NovaSeq 6000	396
EGAD50000002396	Here we present the Indigenous American Genomic Diversity Project (IAGDP), comprising 128 newly generated high-coverage (NGS; ~44X) whole genomes from Indigenous individuals across eight Latin American countries, representing 45 populations and 28 language families. This dataset expands Indigenous representation in genomics, with emphasis on geographically and linguistically diverse populations, particularly from the South American lowlands.	BGISEQ-500	128
EGAD50000002397	We report a rare case of a young patient (VENUS 167) initially diagnosed with grade 1 endometrioid endometrial cancer, which, following endocrine treatment, presented with mixed aggressive carcinoma with three distinct histologic patterns: grade 1 endometrioid, large cell neuroendocrine, and undifferentiated carcinoma. The NGS fastq data comprise WES, RNAseq and ATAC.	Illumina NovaSeq 6000	23
EGAD50000002398	HiFi whole‑genome sequencing data from a single human saliva sample collected and stabilized using an Oragene device. Sequencing was performed on a SMRT® Cell using PacBio Revio™ SPRQ chemistry. Sequencing produced >100 Gb of HiFi reads yielding >30× genome coverage suitable for comprehensive variant and methylation analysis. Dataset includes long‑read HiFi sequences in BAM format, containing both aligned and unaligned reads.	Revio	1
EGAD50000002404	The dataset contains raw data (FASTQ) of whole transcriptome sequencing of C1498 cells (n=1 sample). Libraries were prepared with the Illumina Stranded Total RNA Prep with Ribo-Zero Plus kit (Illumina, San Diego, CA, USA) and 100bp paired-end reads were generated on the NovaSeq 6000 Sequencing System (Illumina).	Illumina NovaSeq 6000	1
EGAD50000002408	This dataset contains mRNA-seq results from eleven samples from human donors, each sample being composed of 5-10 rebulked fiber type-specific myofibers. This sample set includes: 1) three FT-I and three FT-IIa rebulked samples from control individuals; 2) three FT-I and two FT-IIa rebulked samples from NEM6 patients with a KBTBD13 R408C variant (NP_001094832.1: p.Arg408Cys).	Illumina NovaSeq X	11
EGAD50000002409	The dataset contains whole genome sequencing data of 26 high-grade serous carcinoma (HGSC) patients sequenced with NovaSeq 6000. The 86 samples are either fresh frozen tumour samples or blood samples. The files provided are paired fastq files.	Illumina NovaSeq 6000	86
EGAD50000002425	The dataset includes FASTQ and BAM files from diagnostic and matched remission (control) samples of three hyperdiploid ALL patients (ALLT-329, ALLT-347, ALLT-356). Sequencing libraries were made using NEB Next® Ultra™ DNA Library Prep Kit. Both diagnostic (60x) and remission (30x) samples were sequenced using Illumina HiSeq X.	Illumina HiSeq X	6
EGAD50000002426	The dataset includes FASTQ and BAM files from diagnostic and matched remission (control) samples of two hyperdiploid ALL patients (ALLT-315, ALLT-320). Sequencing libraries were made using PCR free-based WGS library preparation (PCR-), using TruSeq DNA PCRfree library preparation kit v1. Both diagnostic (60x) and remission (30x) samples were sequenced using Illumina HiSeq X.	Illumina HiSeq X	4
EGAD50000002427	The dataset includes FASTQ and BAM/CRAM files from diagnostic and matched remission (control) samples of eight hyperdiploid ALL patients (ALLT-364, ALLT-367, ALLT-368, ALLT-371, ALLT-377, ALLK-111, ALLT-382, ALLT-395). Sequencing libraries were made using PCR free-based WGS library preparation (PCR-), NEBNext® Ultra™ II DNA Library Prep Kit (Cat No. E7645). Both diagnostic (90x) and remission (30x) samples were sequenced using Illumina NovaSeq 6000.	Illumina NovaSeq 6000	16
EGAD50000002428	The dataset includes FASTQ and CRAM files from diagnostic and matched remission (control) samples of four hyperdiploid ALL patients (ALLT-502, ALLK-130, ALLK-137, ALLK-139). Sequencing libraries were made using PCR free-based WGS library preparation (PCR-), using Novogene NGS DNA Library Prep Set (Cat No. PT004). Both diagnostic (90x) and remission (30x) samples were sequenced using Illumina NovaSeq X Plus.	Illumina NovaSeq X Plus	8
EGAD50000002429	The dataset includes FASTQ and CRAM files from diagnostic and matched remission (control) samples of one hyperdiploid ALL patient (ALLK-134). Sequencing libraries were made using Novogene NGS DNA Library Prep Set (Cat No. PT004). Both diagnostic (90x) and remission (30x) samples were sequenced using Illumina NovaSeq X Plus.	Illumina NovaSeq X Plus	2
EGAD50000002433	The dataset includes samples from four individual donors, pooled at each cell-harvesting time point (days 21, 28, and 35). Single-cell RNA sequencing was performed on the 10x Genomics Chromium platform using GEM-X Single Cell 5′ v3 and GEM-X Single Cell V(D)J v3 for BCR profiling. For each timepoint, paired libraries were constructed: a 5′ GEX library and a V(D)J-B library. Per reaction 350,000-400,000 cells were loaded, aiming for 60,000 cells per reaction. Pooled libraries were sequenced on an Illumina NovaSeq X sequencer. 50,000 read pairs per cell for GEX and 5,000 read pairs per cell for V(D)J were targeted.	Illumina NovaSeq X	6
EGAD50000002441	Sixteen matched, male and female patient sample sets, each comprising normal DNA, tumor DNA, and RNA, were processed for storage comparison studies. Upon receipt, 10 μl quality control aliquots were prepared from each sample. DNA samples were characterized using Qubit dsDNA High Sensitivity assay for quantification, Genomic DNA TapeStation for integrity assessment (DIN scores), and NanoDrop for purity evaluation. RNA samples were analyzed using Qubit RNA High Sensitivity assay, RNA TapeStation (RIN scores), and NanoDrop. Each sample was divided equally by volume for parallel processing.	Illumina NovaSeq X Plus	1312
EGAD50000002443	RNA sequencing was performed on patient-derived xenograft (PDX)-derived organoid models of metastatic colorectal cancer (mCRC). The aim of this study was to characterize the transcriptional landscape of KRAS-mutant mCRC cells in response to therapeutic strategies targeting the EGFR and KRAS signaling pathways. Transcriptomic profiling was conducted following drug treatments to investigate molecular changes associated with single and combination therapies. A total of 24 samples were analyzed.	Illumina NovaSeq X	24
EGAD50000002451	This dataset comprises genomic data from the Jeju Genome Project (JGP). WGS libraries were prepared using TruSeq Nano DNA kits and sequenced on Illumina NovaSeq 6000, with variants called via the DRAGEN Germline pipeline (v4.0.3). Genotyping was performed using the customized Axiom_JPMI v1 Chip on the Affymetrix GeneTitan platform. Raw data were processed using APT (v2.11.4), followed by phasing with Eagle (v2.4.1) and imputation via Minimac3 (v2.0.1) using a JGP-specific reference panel to optimize accuracy for the Jeju population.		5309
EGAD50000002453	This dataset contains whole exome sequencing variant data (VCF files) from individuals with suspected Mendelian disorders, including sickle cell disease, muscular dystrophy, haemophilia, and Fanconi anemia. Variant calling was performed using the Illumina DRAGEN pipeline, with downstream variant prioritisation conducted using the Zi-Mendelian bioinformatics pipeline. The dataset supports research into rare genetic diseases and genomic variation in African populations.	NextSeq 2000	10
EGAD50000002466	Whole genome sequencing was performed for 1,925 samples from phase 1 of the TenK10K project. Sequencing was done with Illumina 2 x 150bp chemistry on a NovaSeq 6000 instrument to achieve mean 30x coverage. Sequence reads were aligned to the GRCh38 reference assembly with a fork of DRAGMAP v1.3.1. This dataset is comprised of 1,925 alignment files in cram format, and their corresponding index files.		1925

13359 datasets