ProjectMinE: Illumina HiSeqX and HiSeq 2000 whole genome sequence data on 3,001 ALS samples including 212 with known C9orf72 repeat expansions

Whole genome sequence (WGS) data were generated on 3,001 samples previously assessed for the presence of the C9orf72 repeat expansion (212 expanded and 2,789 wild type). These samples were sequenced using a combination of 2x100bp reads on an Illumina HiSeq2000 and 2x150bp reads on an Illumina HiSeqX sequencer. The repeat expansions were called using ExpansionHunter to demonstrate the ability to call large repeats from high-throughput WGS data. Provided here are all read pairs where at least one of the reads aligns within 1kb of the C9orf72 repeat expansion, in addition to reads aligned to pre-determined off-target locations where aligners are known to misalign reads.
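
The read selection described above can be illustrated with a minimal Python sketch. This is not the pipeline used to produce the files; it only shows the stated criterion (keep a pair if either read aligns within 1kb of the repeat, or falls in an off-target region). The `Region` and `Alignment` types, the `keep_pair` function, and all coordinates are hypothetical placeholders, and applying the off-target rule at the level of read pairs is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Region:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


@dataclass(frozen=True)
class Alignment:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def overlaps(aln: Alignment, region: Region, flank: int = 0) -> bool:
    """True if the alignment overlaps the region extended by `flank` bases."""
    return (
        aln.chrom == region.chrom
        and aln.start < region.end + flank
        and aln.end > region.start - flank
    )


def keep_pair(
    read1: Optional[Alignment],
    read2: Optional[Alignment],
    repeat: Region,
    off_targets: Sequence[Region],
    flank: int = 1000,
) -> bool:
    """Keep the pair if either read is within `flank` of the repeat,
    or (assumption) if either read lies in an off-target region."""
    for aln in (read1, read2):
        if aln is None:  # unaligned mate
            continue
        if overlaps(aln, repeat, flank):
            return True
        if any(overlaps(aln, region) for region in off_targets):
            return True
    return False


# Placeholder coordinates, NOT the real C9orf72 repeat or off-target loci.
REPEAT = Region("chr9", 1_000_000, 1_000_060)
OFF_TARGETS = [Region("chr1", 500, 600)]

near = keep_pair(
    Alignment("chr9", 999_200, 999_350),
    Alignment("chr9", 998_000, 998_150),
    REPEAT,
    OFF_TARGETS,
)
far = keep_pair(
    Alignment("chr9", 990_000, 990_150),
    Alignment("chr9", 990_300, 990_450),
    REPEAT,
    OFF_TARGETS,
)
off = keep_pair(Alignment("chr1", 550, 700), None, REPEAT, OFF_TARGETS)
```

In this sketch `near` and `off` are retained and `far` is dropped; the actual repeat coordinates and the 29 off-target locations would have to come from the dataset documentation.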

Type: Other
Archiver: European Genome-phenome Archive (EGA)

1 Dataset

Click on a Dataset ID in the table below to learn more, and to find out who to contact about access to these data

Dataset ID	Description	Technology	Samples
EGAD00001004834	This dataset includes CRAM files from 3,001 samples. These CRAM files include all read pairs where at least one of the reads aligns within 1kb of the C9orf72 repeat expansion. Additionally, they contain reads aligned to any of 29 pre-determined off-target locations where aligners are known to misalign reads associated with this repeat expansion. These samples were sequenced using a combination of 2x100bp reads on an Illumina HiSeq2000 and 2x150bp reads on an Illumina HiSeqX sequencer and aligned using the Isaac aligner.	HiSeq X Ten, Illumina HiSeq 2000	3001