Genome of the Netherlands

The Genome of the Netherlands (GoNL) Project characterizes DNA sequence variation, both common and rare, for SNVs, short insertions and deletions (indels) and larger deletions in 769 individuals of Dutch ancestry selected from five biobanks under the auspices of the Dutch hub of the Biobanking and Biomolecular Research Infrastructure (BBMRI-NL). The individuals come from a representative set of 250 trio-families from all provinces in the Netherlands. The parent-offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910-1994. Sequencing was done on blood-derived DNA from uncultured cells, and the achieved coverage was 14-15x. Samples were contributed by LifeLines (http://lifelines.nl/lifelines-research/general), the Leiden Longevity Study (http://www.healthy-ageing.nl; http://www.langleven.net), the Netherlands Twin Registry (NTR: http://www.tweelingenregister.org), the Rotterdam studies (http://www.erasmus-epidemiology.nl/rotterdamstudy) and the Genetic Research in Isolated Populations program (http://www.epib.nl/research/geneticepi/research.html#gip). The sequencing was carried out in collaboration with the Beijing Institute for Genomics (BGI). The analysis was done by a consortium led by UMCG, LUMC, Erasmus MC, VU University and UMCU; see http://www.nlgenome.nl. Funding for the project was provided by the Netherlands Organization for Scientific Research under award number 184021007, dated July 9, 2009, and made available as a Rainbow Project of the Biobanking and Biomolecular Research Infrastructure Netherlands (BBMRI-NL).

Type: Other
Archiver: European Genome-phenome Archive (EGA)

5 Datasets 10 Publications

Click on a Dataset ID in the table below to learn more, and to find out who to contact about access to these data

Dataset ID	Description	Technology	Samples
EGAD00001000743	These files contain a total of 20.4M SNVs and the complete information output by the GATK UnifiedGenotyper v1.4 on all 767 GoNL samples. These calls are not trio-aware and all genotypes were reported regardless of their quality. Both filtered and passing calls are reported in these files. Filtered calls include (1) calls failing our VQSR threshold and (2) calls in the GoNL inaccessible genome.		-
EGAD00001000744	The samples in this panel come from 250 families: 248 parents-child trios and 2 parent-child duos. As the children do not provide additional haplotypes or population information, they were excluded from the panel. The samples present in the release are composed of 248 couples, 2 single individuals and 1 sample composed from the 2 haplotypes from the duo's children transmitted by their missing parent. The composed sample is named gonl-220c_223c. The files contain a total of 18.9M SNVs and 1.1M INDELs in autosomal chromosomes. They were generated by phasing/imputing the SNVs (a) and INDELs (b) using MVNCall. Only sites passing filters are reported. Sites filtered as part of the GoNL inaccessible genome were kept but flagged as filtered; they may still contain true positive calls but should be used with care, as they are located in parts of the genome that are less well captured (systematically under- or over-covered, or of low mapping quality).		-
EGAD00001000821	Raw sequencing data for all samples in fastq format.	Illumina HiSeq 2000	767
EGAD00001001038	We mapped the data to the UCSC human reference genome build 37 using BWA 0.5.9-r16. We first mapped each read of a pair separately using bwa aln. Then we used bwa sampe to map the paired reads together and wrote the result to a BAM file. The BAM file was then sorted by genomic position and indexed using PicardTools-1.32 SortSam. To prevent PCR artifacts from influencing the downstream analysis of our data, we used Picard to mark the duplicate reads, which were ignored in downstream analysis. We used GATK IndelRealigner on our data around known indels (from 1KG Pilot). The IndelRealigner creates all possible read alignments using the source and computes the likelihood of the data containing the indel based on the read pileup. Whenever the maximum-likelihood alignment contains an indel, the reads are realigned accordingly. Each base is associated with a Phred-scaled base quality score. Calibration of Phred scores is crucial, as they are used in some of the downstream analysis models. We used GATK to recalibrate the base qualities with respect to (i) the base cycle, (ii) the original quality score, and (iii) the dinucleotide context. To minimize issues stemming from mapping problems around indels, we performed a second round of indel realignment using the GATK IndelRealigner by family rather than by individual. For this second round, we considered two sources of possible indels: 1KG Phase 1 indels and indels aligned by BWA in the GoNL data.		-
EGAD00001002261	These files contain indels and structural variants on 769 GoNL samples (SV release 6, 2016-05-25).	Illumina HiSeq 2000	-
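
The description of EGAD00001001038 notes that each base carries a Phred-scaled quality score. As a minimal sketch of what that scale means (this is not the GATK recalibration step itself), the Python snippet below converts between Phred scores and error probabilities using the standard relation Q = -10 * log10(P). The function names are hypothetical, and the Phred+33 ASCII offset is an assumption: the descriptions above do not state which quality encoding the fastq files use.

```python
import math

# Assumption: qualities are ASCII-encoded with an offset of 33 (Phred+33).
# The dataset descriptions do not state the encoding; check the files first.
PHRED_OFFSET = 33


def phred_to_error_prob(q):
    """Return the error probability P = 10^(-Q/10) for a Phred score Q."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Return the Phred score Q = -10 * log10(P) for an error probability P."""
    return -10 * math.log10(p)


def decode_quality_string(qual, offset=PHRED_OFFSET):
    """Decode a FASTQ quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in qual]


if __name__ == "__main__":
    # Hypothetical quality string, used only to illustrate the conversion.
    for q in decode_quality_string("I5+!"):
        print(q, phred_to_error_prob(q))
```

On this scale, a score of 20 corresponds to an estimated 1 in 100 chance that the base call is wrong, and a score of 30 to 1 in 1000, which is why miscalibrated scores can mislead downstream models that use them.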

Publications	Citations
The Genome of the Netherlands: design, and project goals. Boomsma DI, Wijmenga C, Slagboom EP, Swertz MA, Karssen LC, Abdellaoui A, Ye K, Guryev V, Vermaat M, van Dijk F, Francioli LC, Hottenga JJ, Laros JF, Li Q, Li Y, Cao H, Chen R, Du Y, Li N, Cao S, van Setten J, Menelaou A, Pulit SL, Hehir-Kwa JY, Beekman M, Elbers CC, Byelas H, de Craen AJ, Deelen P, Dijkstra M, den Dunnen JT, de Knijff P, Houwing-Duistermaat J, Koval V, Estrada K, Hofman A, Kanterakis A, Enckevort Dv, Mai H, Kattenberg M, van Leeuwen EM, Neerincx PB, Oostra B, Rivadeneira F, Suchiman EH, Uitterlinden AG, Willemsen G, Wolffenbuttel BH, Wang J, Wang J, de Bakker PI, van Ommen GJ, van Duijn CM. Eur J Hum Genet 22: 2014 221-227	208
Improved imputation quality of low-frequency and rare variants in European samples using the 'Genome of The Netherlands'. Deelen P, Menelaou A, van Leeuwen EM, Kanterakis A, van Dijk F, Medina-Gomez C, Francioli LC, Hottenga JJ, Karssen LC, Estrada K, Kreiner-Møller E, Rivadeneira F, van Setten J, Gutierrez-Achury J, Westra HJ, Franke L, van Enckevort D, Dijkstra M, Byelas H, van Duijn CM, Genome of Netherlands Consortium, de Bakker PI, Wijmenga C, Swertz MA. Eur J Hum Genet 22: 2014 1321-1326	83
Whole-genome sequence variation, population structure and demographic history of the Dutch population. Genome of the Netherlands Consortium. Nat Genet 46: 2014 818-825	546
Characteristics of de novo structural changes in the human genome. Kloosterman WP, Francioli LC, Hormozdiari F, Marschall T, Hehir-Kwa JY, Abdellaoui A, Lameijer EW, Moed MH, Koval V, Renkens I, van Roosmalen MJ, Arp P, Karssen LC, Coe BP, Handsaker RE, Suchiman ED, Cuppen E, Thung DT, McVey M, Wendl MC, Genome of Netherlands Consortium, Uitterlinden A, van Duijn CM, Swertz MA, Wijmenga C, van Ommen GB, Slagboom PE, Boomsma DI, Schönhuth A, Eichler EE, de Bakker PI, Ye K, Guryev V. Genome Res 25: 2015 792-801	101
Genome-wide patterns and properties of de novo mutations in humans. Francioli LC, Polak PP, Koren A, Menelaou A, Chun S, Renkens I, Genome of the Netherlands Consortium, van Duijn CM, Swertz M, Wijmenga C, van Ommen G, Slagboom PE, Boomsma DI, Ye K, Guryev V, Arndt PF, Kloosterman WP, de Bakker PIW, Sunyaev SR. Nat Genet 47: 2015 822-826	307
Transmission of human mtDNA heteroplasmy in the Genome of the Netherlands families: support for a variable-size bottleneck. Li M, Rothwell R, Vermaat M, Wachsmuth M, Schröder R, Laros JF, van Oven M, de Bakker PI, Bovenberg JA, van Duijn CM, van Ommen GJ, Slagboom PE, Swertz MA, Wijmenga C, Genome of Netherlands Consortium, Kayser M, Boomsma DI, Zöllner S, de Knijff P, Stoneking M. Genome Res 26: 2016 417-426	69
A high-quality human reference panel reveals the complexity and distribution of genomic structural variants. Hehir-Kwa JY, Marschall T, Kloosterman WP, Francioli LC, Baaijens JA, Dijkstra LJ, Abdellaoui A, Koval V, Thung DT, Wardenaar R, Renkens I, Coe BP, Deelen P, de Ligt J, Lameijer EW, van Dijk F, Hormozdiari F, Genome of the Netherlands Consortium, Uitterlinden AG, van Duijn CM, Eichler EE, de Bakker PI, Swertz MA, Wijmenga C, van Ommen GB, Slagboom PE, Boomsma DI, Schönhuth A, Ye K, Guryev V. Nat Commun 7: 2016 12989	91
A SNP panel for identification of DNA and RNA specimens. Yousefi S, Abbassi-Daloii T, Kraaijenbrink T, Vermaat M, Mei H, van 't Hof P, van Iterson M, Zhernakova DV, Claringbould A, Franke L, 't Hart LM, Slieker RC, van der Heijden A, de Knijff P, BIOS consortium, 't Hoen PAC. BMC Genomics 19: 2018 90	38
RNA-Seq in 296 phased trios provides a high-resolution map of genomic imprinting. Jadhav B, Monajemi R, Gagalova KK, Ho D, Draisma HHM, van de Wiel MA, Franke L, Heijmans BT, van Meurs J, Jansen R, GoNL Consortium, BIOS Consortium, 't Hoen PAC, Sharp AJ, Kiełbasa SM. BMC Biol 17: 2019 50	29
WGS-based telomere length analysis in Dutch family trios implicates stronger maternal inheritance and a role for RRM1 gene. Nersisyan L, Nikoghosyan M, Genome of the Netherlands consortium, Arakelyan A. Sci Rep 9: 2019 18758	7