Click on a Dataset ID in the table below to learn more, and to find
out who to contact about access to these data.
Dataset ID
Description
Technology
Samples
EGAD50000002727
This dataset contains gene usage matrices calculated from the processed productively rearranged bulk BCR sequences from peripheral blood B cells from 102 subjects with celiac disease and 102 control donors. The files contain per sample V, D and J gene usage frequencies for IGH, IGK and IGL sequences. Unique rearrangements were quantified with Alakazam in gene mode. More details about processing steps are found in the linked publication. The dataset also contains an accompanying metadata file (csv).
Illumina MiSeq
204
EGAD50000002729
This dataset contains genotype information (vcf files) for genotyped SNPs within or near the BCR loci of 187 individuals (97 controls and 88 subjects with celiac disease). SNPs were genotyped using an Axiom Human Genotyping SARS-CoV-2 Research Array (Thermo Fisher) and the data were processed with the Axiom Analysis Suite using the “Best Practices Workflow” pipeline. The genotyped SNPs include 1241 SNPs within/near IGH on Chr14, 157 SNPs within/near IGK on Chr2, and 608 SNPs within/near IGL on Chr22 passing quality control. The dataset also contains a metadata file (csv) describing sample phenotypes and providing links to genotype sample names.
205
EGAD50000002730
This dataset contains processed productively rearranged BCR sequences (tsv file format) from bulk AIRR-seq of naive peripheral blood B cells from 102 subjects with celiac disease and 102 control donors. Raw sequences have been processed with presto and Change-O as detailed in the linked publication in order to filter by quality, create consensus sequences and collapse identical rearrangements, and analysed with IgBlast. The dataset also contains an accompanying metadata file (csv).
Illumina MiSeq
204
EGAD50000002731
This dataset contains processed productively and non-productively rearranged TCR sequences (tsv file format) from bulk AIRR-seq of naive peripheral blood CD4+ T cells from 103 subjects with celiac disease and 103 control donors. Raw sequences have been processed with presto and Change-O as detailed in the linked publication in order to filter by quality, create consensus sequences and collapse identical rearrangements, and analysed with IgBlast. The dataset also contains an accompanying metadata file (csv).
Illumina MiSeq
Illumina NovaSeq 6000
206
EGAD50000002732
This dataset contains raw bulk adaptive immune receptor repertoire sequencing (AIRR-seq) data targeting the B cell receptor heavy (IGH), kappa light (IGK) and lambda light (IGL) chains in peripheral blood naïve B cells. The dataset contains raw demultiplexed fastq files from a total of 102 subjects with celiac disease and 102 controls, split into 15 sequencing libraries. The dataset also contains an accompanying metadata file (csv).
Peripheral blood mononuclear cells were obtained from study participants, and naïve B cells were FACS-sorted for RNA extraction and library preparation. Libraries were generated using a 5' RACE strategy incorporating unique molecular identifiers. Separate IGH, IGK and IGL libraries from each individual were constructed using combinatorial dual indexes, and pooled before sequencing. Pooled libraries were sequenced on an Illumina MiSeq platform using 300 bp paired-end reads.
Illumina MiSeq
204
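
As a rough illustration of what a per-sample gene usage frequency (as in EGAD50000002727) means, the sketch below counts gene assignments in tab-separated rearrangement records and converts them to fractions. It is not the Alakazam implementation used by the authors. The column names `v_call` and `j_call` follow the AIRR Rearrangement schema and are only assumed to be present in the tsv files; the helper `gene_usage` and the toy records are invented for this example, and keeping only the first of several ambiguous calls is a simplification.

```python
import csv
import io
from collections import Counter


def gene_usage(rows, column="v_call"):
    """Return {gene: fraction of rows} for one gene-call column.

    Hypothetical helper: alleles are collapsed to the gene level by
    dropping everything after '*', and only the first call is kept
    when several comma-separated calls are listed.
    """
    counts = Counter()
    for row in rows:
        call = row.get(column) or ""
        if not call:
            continue  # skip rearrangements without an assignment
        first_call = call.split(",")[0]
        gene = first_call.split("*")[0]
        counts[gene] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gene: n / total for gene, n in counts.items()}


# Toy records (invented, not taken from the datasets above).
example_tsv = (
    "sequence_id\tv_call\tj_call\n"
    "s1\tIGHV1-69*01\tIGHJ4*02\n"
    "s2\tIGHV1-69*02,IGHV1-69D*01\tIGHJ4*02\n"
    "s3\tIGHV3-23*01\tIGHJ6*02\n"
    "s4\tIGHV3-23*04\tIGHJ4*02\n"
)

rows = list(csv.DictReader(io.StringIO(example_tsv), delimiter="\t"))
v_usage = gene_usage(rows, "v_call")
j_usage = gene_usage(rows, "j_call")
```

For a real file, the same function could be fed `csv.DictReader(open(path), delimiter="\t")`, assuming the file carries those column names.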