Study

Synthetic data - Genome in a Bottle

Study ID: EGAS00001005591
Alternative Stable ID: not listed
Type: Other

Study Description

In May, the National Institute of Standards and Technology (NIST) released its first genome in a bottle, a reference sample of DNA for validating human genome sequences. This so-called truth sequence comes from a decades-old sample donated by a Utah woman for other research purposes (the NA12878 cell line), which over the years has become one of the most studied, and hence best-characterized, human samples. Seeing genomic medicine move toward mainstream healthcare, researchers at NIST recognized the need for a reference human genome and assembled a private-public consortium in 2012 to create one. As detailed in a 2014 Nature Biotechnology paper (Nat. Biotechnol. 32, 246–251, 2014), the group integrated and arbitrated among sequences from 14 data sets, five sequencing technologies, seven read mappers and three variant callers.

Study Datasets

3 datasets.

Click on a Dataset ID in the table below to learn more, and to find out who to contact about access to these data

Dataset ID: EGAD00001008095
Description: This dataset contains whole genome sequencing data, provided as BAM files, for the three members of a trio. The BAM files cover chromosomes 21, X and Y and the mitochondrial genome.
Technology: not listed
Samples: 3

Dataset ID: EGAD00001008096
Description: This dataset contains whole genome sequencing data, provided as paired-end FASTQ files, for the three members of a trio.
Technology: Illumina HiSeq 2500
Samples: 3

Dataset ID: EGAD00001008097
Description: This dataset contains whole genome sequencing data, provided as VCF files, for the three members of a trio.
Technology: not listed
Samples: 3

