Comparison of structural variations from 10X Genomics linked-reads and conventional Illumina short-reads sequencing

Study ID: EGAS00001004093
Type: Other

Study Description

Structural variations (SVs) are large genomic rearrangements that can drive many diseases. Conventional short-read whole genome sequencing (cWGS) allows their identification at base-pair resolution, but suffers from a high false discovery rate. cWGS taps into short-range information from short reads, while linked-read sequencing (10XWGS) utilizes long-range information. 10XWGS links short reads originating from the same large DNA molecule through a unique barcode captured in a gel bead in emulsion. This mitigates the alignment-based artefacts of cWGS, especially in repetitive regions. However, the false discovery rate of this technology is unclear. In this study, we performed a comprehensive analysis of the different types and sizes of SVs predicted by these two technologies. The SVs common to both technologies were found to be highly specific by PCR and Sanger sequencing, while the validation rate dropped for events not shared between them. Further, we propose a novel enrichment approach for filtering out false positive calls from each technology independently. To this end, we trained a ...

Study Datasets 1 dataset.


Dataset ID Description Technology Samples
This dataset contains whole genome sequencing data from Illumina short-read sequencing (2×150 bp) and 10X Genomics linked-read sequencing. Both technologies were used to sequence the MCF7 cell line and a primary triple-negative breast cancer sample. Paired-end FASTQ files are available for both samples from both technologies.
Illumina NovaSeq 6000 4
