Study

Genome of the Netherlands

Study ID: EGAS00001000644
Alternative Stable ID: (none listed)
Type: Other

Study Description

The Genome of the Netherlands (GoNL) Project characterizes DNA sequence variation, common and rare, for SNVs, short insertions and deletions (indels) and larger deletions in 769 individuals of Dutch ancestry selected from five biobanks under the auspices of the Dutch hub of the Biobanking and Biomolecular Research Infrastructure (BBMRI-NL). The individuals come from a representative sample of 250 trio-families from all provinces in the Netherlands. The parent-offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910-1994. Sequencing was done on blood-derived DNA from uncultured cells, and the achieved coverage was 14-15x. Samples were contributed by LifeLines (http://lifelines.nl/lifelines-research/general), the Leiden Longevity Study (http://www.healthy-ageing.nl; http://www.langleven.net), the Netherlands Twin Registry (NTR: http://www.tweelingenregister.org), the Rotterdam studies (http://www.erasmus-epidemiology.nl/rotterdamstudy) and the Genetic Research in Isolated Populations program ...

Study Datasets 5 datasets.

Click on a Dataset ID in the table below to learn more, and to find out who to contact about access to these data

Dataset ID: EGAD00001000743
Description: These files contain a total of 20.4M SNVs and the complete information output by the GATK UnifiedGenotyper v1.4 on all 767 GoNL samples. These calls are not trio-aware and all genotypes were reported regardless of their quality. Both filtered and passing calls are reported in these files. Filtered calls include (1) calls failing our VQSR threshold and (2) calls in the GoNL inaccessible genome.
Samples: 767

Dataset ID: EGAD00001000744
Description: The samples in this panel come from 250 families: 248 parent-child trios and 2 parent-child duos. As the children do not provide additional haplotypes or population information, they were excluded from the panel. The release comprises 248 couples, 2 single individuals and 1 sample composed from the 2 haplotypes that the duos' children received from their missing parents. The composed sample is named gonl-220c_223c. The files contain a total of 18.9M SNVs and 1.1M ...
Samples: 499

Dataset ID: EGAD00001000821
Description: Raw sequencing data for all samples in FASTQ format.
Technology: Illumina HiSeq 2000
Samples: 767

Dataset ID: EGAD00001001038
Description: We mapped the data to the UCSC human reference genome build 37 using BWA 0.5.9-r16. We first mapped each read pair separately using bwa aln. Then we used bwa sampe to map the paired reads together to a BAM file. The BAM file was then sorted by genomic position and indexed using PicardTools-1.32 SortSam. To prevent PCR artifacts from influencing the downstream analysis of our data, we used Picard to mark the duplicate reads, which were ignored in downstream analysis. We used GATK IndelRealigner ...
Samples: 769

Dataset ID: EGAD00001002261
Description: These files contain indels and structural variants on 769 GoNL samples (SV release 6, 2016-05-25).
Technology: Illumina HiSeq 2000
Samples: 769
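
The sample count of 499 given for EGAD00001000744 follows from the panel composition stated in that dataset's description. The short check below is only an illustrative sketch: the variable names are made up for this example, and the haplotype count assumes two haplotypes per sample, which the description implies but does not state as a total.

```python
# Panel composition as stated in the EGAD00001000744 description.
couples = 248             # parent pairs kept from the 248 trios
single_individuals = 2    # the 2 single individuals in the release
composed_samples = 1      # gonl-220c_223c, built from transmitted haplotypes

# Each couple contributes two samples; children are excluded from the panel.
panel_samples = couples * 2 + single_individuals + composed_samples

# Assumption for illustration: every sample carries two haplotypes.
panel_haplotypes = panel_samples * 2

print(panel_samples)      # 499
print(panel_haplotypes)   # 998
```

The first printed value matches the Samples figure listed for this dataset.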
