Study
Analysis of error profiles in deep next-generation sequencing data
Study ID | Alternative Stable ID | Type |
---|---|---|
EGAS00001003444 | | Other |
Study Description
Background: Sequencing errors are key confounding factors for detecting low-frequency genetic variants that are important for cancer molecular diagnosis, treatment, and surveillance using deep next-generation sequencing (NGS). However, there is a lack of comprehensive understanding of errors introduced at various steps of a conventional NGS workflow, such as sample handling, library preparation, PCR enrichment, and sequencing. In this study, we use current NGS technology to systematically investigate these questions.

Results: By evaluating read-specific error distributions, we discover that the substitution error rate can be computationally suppressed to 10⁻⁵ to 10⁻⁴, which is 10- to 100-fold lower than generally considered achievable (10⁻³) in the current literature. We then quantify substitution errors attributable to sample handling, library preparation, enrichment PCR, and sequencing by using multiple deep sequencing datasets. We find that error rates differ by nucleotide substitution types, ranging from 10⁻⁵ for A>C/T>G, C>A/G>T, and C>G/G>C changes to 10⁻⁴ for ...
Study Datasets 1 dataset.
Click on a Dataset ID in the table below to learn more, and to find out who to contact about access to these data
Dataset ID | Description | Technology | Samples |
---|---|---|---|
EGAD00001004595 | VALCAP files for Ma et al. (2019) Genome Biology (accepted) titled “Analysis of error profiles in deep next-generation sequencing data” | HiSeq X Ten | 47 |