RNAseq data of polyA+ RNA from leukocytes of 624 individuals of the ProgeNIA cohort.

Dataset ID       Technology           Samples
EGAD00001003102  Illumina HiSeq 2000  624

Dataset Description

We sequenced the polyA+ RNA fraction of leukocytes from 624 Sardinian individuals by RNAseq. Prior to library preparation, we added ERCC ExFold RNA Spike-In controls. An average of 60M 51 bp paired-end reads per sample was generated on a HiSeq 2000 (Illumina). Sequencing reads were then aligned using STAR-2.2.0c2 to the h37d5 reference genome supplemented with the ERCC spike-in sequences. We also provided the aligner with an exon-exon junction database generated from the GENCODE v14 annotation. To remove contamination from a parallel experiment, we discarded all reads that mapped to the genomic regions of CBLB (chr3:105370773-105592330) and BCL11A (chr2:60672555-60784156). The filtered aligned reads are shared in BAM format.
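
The region-based filtering described above can be illustrated with a minimal sketch. This is not the pipeline used to produce the dataset. It assumes that the coordinates are 1-based and inclusive, that a read is dropped whenever its aligned interval overlaps either region, and that each alignment is judged independently of its mate. The names EXCLUDED_REGIONS, normalize_chrom, overlaps_excluded and filter_alignments are hypothetical. Chromosome names are compared with any "chr" prefix removed, since the reference may name chromosomes without it.

```python
# Regions named in the dataset description.
# Assumption: coordinates are 1-based and inclusive.
EXCLUDED_REGIONS = {
    "CBLB": ("3", 105370773, 105592330),
    "BCL11A": ("2", 60672555, 60784156),
}


def normalize_chrom(name):
    """Strip an optional 'chr' prefix so that 'chr3' and '3' compare equal."""
    return name[3:] if name.startswith("chr") else name


def overlaps_excluded(chrom, start, end):
    """Return True if the inclusive interval [start, end] overlaps an excluded region."""
    chrom = normalize_chrom(chrom)
    for reg_chrom, reg_start, reg_end in EXCLUDED_REGIONS.values():
        if chrom == reg_chrom and start <= reg_end and end >= reg_start:
            return True
    return False


def filter_alignments(alignments):
    """Keep only (read_name, chrom, start, end) tuples outside the excluded regions."""
    return [a for a in alignments if not overlaps_excluded(a[1], a[2], a[3])]


# Toy alignments (hypothetical read names and positions).
toy = [
    ("read1", "3", 105400000, 105400050),  # inside the CBLB region
    ("read2", "2", 60000000, 60000050),    # outside both regions
    ("read3", "1", 60700000, 60700050),    # same coordinates as BCL11A, other chromosome
]
kept = filter_alignments(toy)
```

In this toy input, only read1 overlaps an excluded region, so kept contains read2 and read3. A real workflow would read the alignment intervals from the BAM records and would need to decide how to treat the mate of a discarded read, which the description does not specify.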

Who controls access to this dataset

For each dataset that requires controlled access, there is a corresponding Data Access Committee (DAC) that determines access permissions. Access to the actual data files is not managed by the EGA. To request access to this dataset, please contact:

DAC for Sardinia Leukocytes polyA RNAseq 624 project
Contact person: Francesco Cucca
Email: francesco [dot] cucca [at] irgb [dot] cnr [dot] it
More details: EGAC00001000561


