Transcriptome studies in patients with rare genetic diseases can potentially aid in the interpretation of likely causal genetic variation through identification of altered transcript abundance and/or structure. RNA-Seq is the most sensitive assay for investigating both transcript structure and abundance.

The primary aim of this pilot project is to investigate to what degree integrating exome-Seq and RNA-Seq data on the same individual can accelerate the identification of causal alleles for rare genetic diseases. There are two main strands to this: (i) identifying which variants discovered in exome-Seq appear to have a functional impact on transcripts, and (ii) identifying transcript outliers, especially among known causal genes, that may not necessarily have a causal variant identified from exome sequencing. The latter may identify the presence of causal variants that lie far from coding regions (e.g. the formation of cryptic splice sites deep within introns, or loss of long-range regulatory elements), which can be confirmed with further targeted genetic assays. Just over 50% of all disease-causing variants recorded in the Human Gene Mutation Database (HGMD) affect transcript structure and abundance (e.g. nonsense SNVs, essential splice site SNVs, frameshifting indels, CNVs).

This pilot project will study RNA from lymphoblastoid cell lines from 12 patients with primordial dwarfism syndromes. For 10 of these samples we have previously generated exome data as part of our collaboration with the group of Prof Andrew Jackson. The two remaining samples are positive controls where the causal mutation is known, and is known to affect transcript structure and/or abundance.

Primordial dwarfism is a prime candidate for these RNA-Seq studies because all causal genes identified to date have key roles in DNA replication and thus, unsurprisingly, the products of the causal genes are typically ubiquitously expressed.

Each RNA will be sequenced with two technical replicates (independent RT-PCR and libraries) per sample, with each replicate run in 1/2 of a HiSeq lane using 100bp paired reads.

Sample preparation was as follows: the cells were grown to confluency, then pellets were frozen at -80 °C. RNA samples were prepared using the Qiagen RNeasy kit, then measured on a NanoDrop and analyzed using the Bioanalyzer to determine concentration and purity.

This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see

Click on a Dataset ID in the table below to learn more, and to find out who to contact about access to these data

Dataset ID       | Description | Technology          | Samples
EGAD00001000640  |             | Illumina HiSeq 2000 | 24
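
As a quick consistency check on the sequencing design described above, the arithmetic can be sketched as follows. The figures (12 patients, two technical replicates per sample, half a HiSeq lane per replicate) come from the study description; the function and variable names are illustrative only and are not part of any pipeline used by the study. The resulting library count appears to match the 24 samples listed for the dataset.

```python
from fractions import Fraction

# Values taken from the study description; names are hypothetical.
N_PATIENTS = 12
REPLICATES_PER_SAMPLE = 2
LANE_FRACTION_PER_REPLICATE = Fraction(1, 2)  # each replicate uses 1/2 of a lane


def sequencing_plan(n_patients, replicates, lane_fraction):
    """Return (number of libraries, number of HiSeq lanes) for the design."""
    libraries = n_patients * replicates
    lanes = libraries * lane_fraction
    return libraries, lanes


libraries, lanes = sequencing_plan(
    N_PATIENTS, REPLICATES_PER_SAMPLE, LANE_FRACTION_PER_REPLICATE
)
print(f"{libraries} libraries across {lanes} lanes")
```

Using `Fraction` keeps the half-lane allocation exact rather than relying on floating-point arithmetic.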