Study

ProjectMinE: Illumina HiSeqX and HiSeq 2000 whole genome sequence data on 3,001 ALS samples including 212 with known C9orf72 repeat expansions

Study ID Alternative Stable ID Type
EGAS00001003383 Other

Study Description

Whole genome sequence (WGS) data were generated on 3,001 samples previously tested for the presence of the C9orf72 repeat expansion (212 expanded and 2,789 wild type). These samples were sequenced using a combination of 2x100bp reads on an Illumina HiSeq2000 and 2x150bp reads on an Illumina HiSeqX sequencer. The repeat expansions were called using ExpansionHunter to demonstrate the ability to call large repeats from high-throughput WGS data. Provided here are all read pairs where at least one of the reads aligns within 1kb of the C9orf72 repeat expansion, together with reads aligned to pre-determined off-target locations where the aligners are known to mis-align reads.
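The selection rule described above (keep a whole read pair if either mate falls near the repeat or in an off-target region) can be sketched as follows. This is only an illustration, not the procedure used to build the dataset: the Interval and Read types, the select_pairs function, the interpretation of "within 1kb" as overlap with a window padded by 1,000 bases, and all coordinates are assumptions made for the example. The coordinates in particular are placeholders, not the real C9orf72 or off-target positions.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


class Read(NamedTuple):
    name: str   # read-pair name shared by both mates
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def overlaps(read: Read, region: Interval) -> bool:
    """True if the aligned read shares at least one base with the region."""
    return (read.chrom == region.chrom
            and read.start < region.end
            and read.end > region.start)


def pad(region: Interval, flank: int) -> Interval:
    """Extend a region by `flank` bases on each side (clamped at 0)."""
    return Interval(region.chrom, max(0, region.start - flank), region.end + flank)


def select_pairs(reads: List[Read], repeat: Interval,
                 off_target: List[Interval], flank: int = 1000) -> List[Read]:
    """Keep both mates of every pair where at least one mate overlaps the
    padded repeat window or any off-target region."""
    regions = [pad(repeat, flank)] + list(off_target)
    keep = {r.name for r in reads if any(overlaps(r, reg) for reg in regions)}
    return [r for r in reads if r.name in keep]


# Placeholder coordinates for illustration only (hypothetical values).
REPEAT = Interval("chr9", 5000, 5060)
OFF_TARGET = [Interval("chr2", 100, 200)]

reads = [
    Read("pairA", "chr9", 4100, 4250),  # inside the padded window
    Read("pairA", "chr9", 3500, 3650),  # outside, kept because its mate qualifies
    Read("pairB", "chr9", 9000, 9150),  # neither mate qualifies
    Read("pairB", "chr9", 9300, 9450),
    Read("pairC", "chr2", 150, 300),    # overlaps an off-target region
    Read("pairC", "chr5", 10, 160),     # kept because its mate qualifies
]

selected = select_pairs(reads, REPEAT, OFF_TARGET)
```

In this toy input, both mates of pairA and pairC are retained and pairB is dropped. On real alignment files the same filter would normally be applied with an alignment library or command-line tool rather than hand-built records.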

Study Datasets 1 dataset.

Click on a Dataset ID in the table below to learn more, and to find out who to contact about access to these data

Dataset ID Description Technology Samples
EGAD00001004834
This dataset includes CRAM files from 3,001 samples. These CRAM files include all read pairs where at least one of the reads aligns within 1kb of the C9orf72 repeat expansion. Additionally, these CRAM files contain reads that are aligned to any of 29 pre-determined off-target locations where the aligners are known to mis-align reads associated with this repeat expansion. These samples were sequenced using a combination of 2x100bp reads on an Illumina HiSeq2000 and 2x150bp reads on an ...
HiSeq X Ten, Illumina HiSeq 2000 3001


There are no publications available