SeqControl: Process Control for DNA Sequencing

As next-generation sequencing (NGS) continues to increase in speed and throughput, routine clinical and industrial application draws steadily closer. These “production” uses of NGS will require enhanced quality monitoring and quality control to optimize output and reduce costs. We therefore developed a framework called SeqControl for predicting sequencing quality and coverage using a set of 15 metrics describing overall coverage, coverage distribution, base-wise coverage and base-wise quality. Using whole-genome sequences of 27 prostate cancers and 26 normal references, we derive multivariate models that predict sequencing quality and depth. SeqControl robustly predicts how much sequencing is required to reach a given coverage depth (AUC = 0.993), accurately classifies clinically relevant formalin-fixed paraffin-embedded samples, and makes predictions from as little as 1/8 of a lane of sequencing data (AUC = 0.967). These techniques can be immediately incorporated into existing NGS pipelines to monitor data quality in real time. SeqControl represents a first step towards statistical process control for NGS.
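To make the idea of coverage metrics concrete, the following is a minimal Python sketch under stated assumptions. It is not the SeqControl implementation: the function names `coverage_summary` and `lanes_needed` are hypothetical, the three metrics shown are illustrative stand-ins and not the 15 metrics described above, and the lane estimate assumes depth scales linearly with the number of lanes, a simplification in place of the multivariate models. The depth values are made-up toy numbers.

```python
from math import ceil
from statistics import mean, median


def coverage_summary(depths, min_depth=10):
    """Summarise per-base read depths with a few illustrative metrics.

    `depths` is a sequence of per-base coverage values; `min_depth` is a
    hypothetical threshold for calling a base adequately covered.
    """
    if not depths:
        raise ValueError("depths must be non-empty")
    covered = sum(d >= min_depth for d in depths)
    return {
        "mean_depth": mean(depths),
        "median_depth": median(depths),
        "fraction_at_min_depth": covered / len(depths),
    }


def lanes_needed(mean_depth_per_lane, target_depth):
    """Naive estimate of lanes required to reach `target_depth`.

    Assumes depth grows linearly with the number of lanes (an assumption
    made for this sketch only).
    """
    if mean_depth_per_lane <= 0:
        raise ValueError("mean_depth_per_lane must be positive")
    return ceil(target_depth / mean_depth_per_lane)


# Toy per-base depths from a single hypothetical lane.
depths = [12, 8, 15, 10, 0, 20, 11, 9]
summary = coverage_summary(depths, min_depth=10)
lanes = lanes_needed(summary["mean_depth"], target_depth=50)
print(summary, lanes)
```

In a real pipeline the per-base depths would come from an alignment file instead of a hand-written list, and a fitted model would replace the linear extrapolation.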

Click on a Dataset ID in the table below to learn more and to find out whom to contact about access to these data.

Dataset ID | Description | Technology | Samples
EGAD00001001115 | | Illumina HiSeq 2500 | 54