Genome of the Netherlands
Study Datasets 5 datasets.
These files contain a total of 20.4M SNVs and the complete information output by the GATK UnifiedGenotyper v1.4 on all 767 GoNL samples. These calls are not trio-aware and all genotypes were reported regardless of their quality. Both filtered and passing calls are reported in these files. Filtered calls include (1) calls failing our VQSR threshold and (2) calls in the GoNL inaccessible genome.
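Because both passing and filtered calls are present in these files, a reader will usually want to separate them first. The following is a minimal sketch, assuming the files follow the standard VCF layout in which the seventh tab-separated column (FILTER) holds `PASS` for passing calls. The function name `count_by_filter` and the filter labels in the example records (`VQSRFail`, `Inaccessible`) are hypothetical and are not taken from the GoNL release.

```python
from typing import Iterable, Tuple


def count_by_filter(vcf_lines: Iterable[str]) -> Tuple[int, int]:
    """Count passing and non-passing records in VCF text lines.

    Assumes the standard VCF column order, where FILTER is column 7.
    Any FILTER value other than "PASS" is counted as non-passing.
    """
    passing = 0
    filtered = 0
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if fields[6] == "PASS":
            passing += 1
        else:
            filtered += 1
    return passing, filtered


# Made-up records for illustration only; the filter labels are hypothetical.
example_lines = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t50\tPASS\t.",
    "1\t2000\t.\tC\tT\t12\tVQSRFail\t.",
    "1\t3000\t.\tG\tA\t30\tInaccessible\t.",
]
counts = count_by_filter(example_lines)
```

For real files, the same function can be fed the lines of an opened (and, if needed, decompressed) VCF file instead of the example list.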
The samples in this panel come from 250 families: 248 parent-child trios and 2 parent-child duos. As the children do not provide additional haplotypes or population information, they were excluded from the panel. The samples present in the release are composed of 248 couples, 2 single individuals and 1 sample composed from the 2 haplotypes from the duos' children transmitted by their missing parent. The composed sample is named gonl-220c_223c. The files contain a total of 18.9M SNVs and 1.1M ...
Raw sequencing data for all samples in fastq format.
|Illumina HiSeq 2000||767|
We mapped the data to the UCSC human reference genome build 37 using BWA 0.5.9-r16. We first mapped each read pair separately using bwa aln. Then we used bwa sampe to map the paired reads together to a BAM file. The BAM file was then sorted by genomic position and indexed using PicardTools-1.32 SortSam. To prevent PCR artifacts from influencing the downstream analysis of our data, we used Picard to mark the duplicate reads, which were ignored in downstream analysis. We used GATK IndelRealigner ...
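The alignment steps named above (bwa aln on each mate, bwa sampe, coordinate sorting with indexing, and duplicate marking) can be sketched as a list of shell commands. This is an illustration under stated assumptions, not the GoNL pipeline itself: the source does not give the exact options, so only basic invocations are shown. The function name, the file-naming scheme, and the choice to write SAM from bwa sampe and let SortSam produce the BAM are all assumptions.

```python
from typing import List


def build_alignment_commands(ref: str, fq1: str, fq2: str, prefix: str) -> List[str]:
    """Build shell command strings for a paired-end BWA + Picard workflow.

    All output file names are derived from `prefix` and are hypothetical.
    The commands are returned as strings and are not executed here.
    """
    sai1 = f"{prefix}_1.sai"
    sai2 = f"{prefix}_2.sai"
    sam = f"{prefix}.sam"
    sorted_bam = f"{prefix}.sorted.bam"
    dedup_bam = f"{prefix}.dedup.bam"
    metrics = f"{prefix}.dup_metrics.txt"
    return [
        # Align each mate of the read pair separately.
        f"bwa aln {ref} {fq1} > {sai1}",
        f"bwa aln {ref} {fq2} > {sai2}",
        # Combine the two alignments into paired-end output.
        f"bwa sampe {ref} {sai1} {sai2} {fq1} {fq2} > {sam}",
        # Sort by genomic position and write an index alongside the BAM.
        f"java -jar SortSam.jar INPUT={sam} OUTPUT={sorted_bam} "
        "SORT_ORDER=coordinate CREATE_INDEX=true",
        # Flag duplicate reads so downstream tools can ignore them.
        f"java -jar MarkDuplicates.jar INPUT={sorted_bam} OUTPUT={dedup_bam} "
        f"METRICS_FILE={metrics}",
    ]


# Hypothetical file names for illustration only.
commands = build_alignment_commands("ref.fa", "s_1.fq", "s_2.fq", "sample")
```

Returning the commands as strings keeps the sketch inspectable without requiring BWA or Picard to be installed; each string could then be passed to a shell or a workflow manager.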
These files contain indels and structural variants on 769 GoNL samples (SV release 6, 2016-05-25).
|Illumina HiSeq 2000||769|