RNA sequencing (RNA-seq) was performed with RNA extracted from fresh-frozen human tumor tissue samples. cDNA libraries were prepared from poly-A-selected RNA using the Illumina TruSeq protocol for mRNA. The libraries were then sequenced with a 2 x 100 bp paired-end protocol to a minimum mean coverage of 30x of the annotated transcriptome.
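As a rough illustration of what the 30x target implies, mean coverage can be approximated as the number of sequenced bases divided by the length of the target. The sketch below is an assumption-laden estimate, not part of the original protocol: the transcriptome size of 100 Mb is a hypothetical placeholder (the description does not state it), and the calculation ignores unmapped, duplicate, and unevenly distributed reads.

```python
import math


def read_pairs_for_coverage(target_coverage, transcriptome_bp, read_length=100):
    """Return the minimum number of read pairs whose total bases reach
    the requested mean coverage over a target of transcriptome_bp bases."""
    bases_per_pair = 2 * read_length  # 2 x 100 bp paired-end
    return math.ceil(target_coverage * transcriptome_bp / bases_per_pair)


# Hypothetical transcriptome size; NOT taken from the dataset description.
ASSUMED_TRANSCRIPTOME_BP = 100_000_000

pairs = read_pairs_for_coverage(30, ASSUMED_TRANSCRIPTOME_BP)
print(f"read pairs needed under these assumptions: {pairs:,}")
```

Because expression levels vary widely between transcripts, a mean of 30x says little about coverage of weakly expressed genes; the figure is only a lower bound on sequencing depth.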
The Finland-United States Investigation of NIDDM Genetics (FUSION) study is a long-term effort to identify genetic variants that predispose to type 2 diabetes (T2D) or that impact the variability of T2D-related quantitative traits (QTs). Most of the variants associated with T2D and related traits (glucose and insulin, anthropometrics, lipids) through genome-wide association studies (GWAS) occur in non-coding regions, suggesting a strong regulatory component to disease susceptibility. Regulatory element activity is often tissue-specific, which further complicates discovery of the causal/functional variation. Therefore, there is a critical need to identify the appropriate cell type, regulatory elements, target genes, and causal variant(s) in T2D-relevant tissues. We hypothesize that a subset of T2D and related variants alter gene expression regulation in skeletal muscle and adipose tissue, two major insulin target tissues that play key roles in insulin resistance. To that end, our study contains a comprehensive survey of genomics, epigenomics, and transcriptomics in skeletal muscle and adipose tissue from individuals with glucose tolerance categories ranging from normal to T2D. For this FUSION Tissue Biopsy Study, we obtained RNA-Seq, microRNA (miRNA)-Seq, and DNA methylation (methyl)-Seq data on biopsy samples from 331 individuals from across the range of glucose tolerance: 124 with normal glucose tolerance (NGT), 77 with impaired glucose tolerance (IGT), 44 with impaired fasting glucose (IFG), and 86 with newly diagnosed T2D. Participants completed two study visits, two weeks apart. The first visit comprised most of the clinical phenotyping, including a four-point OGTT (fasting, and 30, 60, and 120 minutes post-load); BMI; WHR; lipids; blood pressure; and many other variables. Participants also completed FUSION health history, medication, and lifestyle questionnaires.
On the second visit, we obtained ~250 mg of vastus lateralis skeletal muscle, ~750 mg of abdominal subcutaneous adipose tissue, and a ~5x15 mm section of abdominal skin. Visits were completed in March 2013. RNA isolation is ongoing in the Collins laboratory at the NIH, RNA and miRNA sequencing at the NIH Intramural Sequencing Center (NISC), and genotyping at the Center for Inherited Disease Research (CIDR). Individual-level data is available here for the 306 individuals who consented to data deposit. To focus on evaluation of gene expression and its regulation in skeletal muscle, we analyzed mRNA extracted from vastus lateralis skeletal muscle obtained from 271 of the 331 individual subjects from Finland, along with genome-wide genotypes. Individual-level data is available here for the 250 subjects who consented to the use of their data. Release phs001048.v2.p1 adds muscle data for an additional 42 subjects and data from adipose tissue for 276 subjects. Total RNA was isolated using Trizol extraction in the Collins laboratory at the NIH. The mRNA was poly-A selected, 24-plex libraries were generated using the Illumina TruSeq directional mRNA-seq library protocol, and RNA sequencing was performed on HiSeq2000 sequencers using 101 bp paired-end reads at NISC. miRNA libraries were prepared from total RNA from 296 muscle and 270 adipose samples, pooled, and sequenced as 50 bp single-end reads on an Illumina HiSeq2500. Data for 272 muscle and 251 adipose samples are available here for individuals with consent for data deposit. DNA was extracted from blood in the Collins laboratory, and genotyping on the Illumina Omni2.5M array was performed at CIDR. Genotypes were imputed using the HRC 2016 reference panel.
To assess regions of open chromatin in skeletal muscle, we obtained muscle tissue from a commercial provider to perform ATAC-seq; these samples were sequenced at the University of Michigan DNA Sequencing Core. Release phs001048.v3.p1 adds single-nucleus (sn) RNA-seq and ATAC-seq data in 287 skeletal muscle samples out of the original 331 individuals. Individual-level data is available here for the 265 subjects who consented to the use of their data. The frozen tissue biopsy samples were processed in ten batches, each consisting of 40-41 samples. These batches were organized using a randomized block design to avoid confounding batch with experimental contrasts of interest, including cohort, age, sex, and BMI, among others. Samples in each batch were pulverized and pooled together, followed by nuclei isolation. The nuclei were processed on the 10X Genomics Chromium platform separately for snATAC-seq and snRNA-seq (v. 3.1 chemistry for snRNA-seq). Release phs001048.v4.p1 adds phenotypes and corrects several previously released phenotypes.
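The randomized block design mentioned above can be illustrated with a small sketch. This is not the FUSION investigators' procedure, whose details are not given here; it is one simple way to randomize samples into batches while keeping categorical covariates balanced. The function name `assign_batches`, the toy sample records, and the cohort labels "A" and "B" are all hypothetical. Continuous covariates such as age or BMI would first need to be binned into categories.

```python
import random
from collections import defaultdict


def assign_batches(samples, n_batches, strata_keys, seed=0):
    """Randomly assign samples to batches so that every stratum
    (combination of the covariates in strata_keys) is spread evenly.

    samples: list of dicts, each with an "id" and the covariate fields.
    Returns a dict mapping batch index -> list of sample ids.
    """
    rng = random.Random(seed)

    # Group samples by their covariate combination.
    strata = defaultdict(list)
    for sample in samples:
        strata[tuple(sample[k] for k in strata_keys)].append(sample)

    # Shuffle within each stratum, then deal samples to batches round-robin.
    # The batch pointer carries over between strata, so batch sizes differ
    # by at most one and each stratum is split as evenly as possible.
    batches = {b: [] for b in range(n_batches)}
    next_batch = 0
    for key in sorted(strata):
        members = strata[key]
        rng.shuffle(members)
        for sample in members:
            batches[next_batch].append(sample["id"])
            next_batch = (next_batch + 1) % n_batches
    return batches


# Toy data (hypothetical): 41 samples with two categorical covariates.
toy_samples = [
    {"id": f"S{i:03d}", "cohort": "A" if i % 3 else "B", "sex": "F" if (i // 4) % 2 else "M"}
    for i in range(41)
]
batches = assign_batches(toy_samples, n_batches=4, strata_keys=("cohort", "sex"))
for batch, ids in batches.items():
    print(batch, len(ids))
```

Fixing the seed makes the assignment reproducible, which is useful when the batch layout has to be documented alongside the data.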
ChIP-seq samples from 20 colorectal patients, with paired adjacent normal mucosa, to characterize H3K27ac and H3K4me1.
Whole-blood RNA sequencing data generated from human samples collected as part of the BCG-Flu Challenge study. The dataset includes 746 samples. Sequencing was performed on an Illumina NovaSeq 6000 platform in a 150 bp paired-end configuration. Raw FASTQ files are provided.
This dataset contains 10x Genomics single-nucleus RNA sequencing data from postoperative brain tissues of patients with focal cortical dysplasia type II (FCD II). The samples were processed using the Chromium platform.
Raw RNA-seq data from blood plasma of patients diagnosed with liver disease. Total RNA libraries were prepared using the SMARTer Stranded Total RNA-Seq Kit v3 - Pico Input Mammalian (Takara Bio). Libraries were paired-end sequenced (2x100) on a NovaSeq 6000 instrument using a NovaSeq S2 or S1 kit (Illumina).
This dataset consists of SOLiD small RNA-seq of 250 colorectal samples: 100 tumor tissue samples, 100 normal tissue samples (adjacent to tumor sites), and 50 matched control samples from healthy individuals. CSfasta and qual files were converted to single FASTQ files prior to uploading.
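For readers unfamiliar with the conversion step, the sketch below shows one way a csfasta file and its matching qual file can be merged into a single FASTQ record stream. It is not the submitters' tool, and several choices are assumptions made for illustration: reads stay in color space, each record occupies one header line and one data line, the two files list reads in the same order, missing quality values (-1) are clamped to 0, qualities are written as Phred+33, and the leading primer base is kept with a placeholder quality of `!` so that sequence and quality lines have equal length. Other converters follow different conventions.

```python
import io


def read_records(handle):
    """Yield (name, payload) pairs from a csfasta or qual stream,
    skipping blank lines and '#' comment lines."""
    name = None
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(">"):
            name = line[1:]
        else:
            yield name, line


def csfasta_qual_to_fastq(csfasta, qual, out):
    """Write one FASTQ record per read, pairing records by position."""
    for (seq_name, colors), (qual_name, scores) in zip(
        read_records(csfasta), read_records(qual)
    ):
        if seq_name != qual_name:
            raise ValueError(f"read order mismatch: {seq_name} vs {qual_name}")
        values = [max(int(q), 0) for q in scores.split()]  # clamp -1 to 0
        if len(values) != len(colors) - 1:  # colors include the primer base
            raise ValueError(f"length mismatch for read {seq_name}")
        # Assumed convention: placeholder '!' for the primer base, then Phred+33.
        quality = "!" + "".join(chr(q + 33) for q in values)
        out.write(f"@{seq_name}\n{colors}\n+\n{quality}\n")


# Tiny in-memory example with a made-up read.
example_csfasta = io.StringIO("# comment\n>read1\nT0123\n")
example_qual = io.StringIO("# comment\n>read1\n10 20 -1 30\n")
example_out = io.StringIO()
csfasta_qual_to_fastq(example_csfasta, example_qual, example_out)
print(example_out.getvalue(), end="")
```

Anyone re-analyzing the uploaded FASTQ files should inspect a few records first to see which convention was actually used, since color-space aligners differ in what they expect.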
37 transcriptomes derived from fresh-frozen glioblastoma tumor samples. These transcriptomes have been produced for validation purposes and match the corresponding RRBS and WGS profiles in that DNA and RNA were extracted from the same tumor samples.
43 low-coverage genomes derived from fresh-frozen glioblastoma tumor samples. These genomes have been produced for validation purposes and match the corresponding RRBS and RNA-seq profiles in that DNA and RNA were extracted from the same tumor samples.
Chromatin immunoprecipitation (ChIP) was carried out with antibodies against H3K36me3 and RNA polymerase II using the HistonePath and TranscriptionPath assays by ActiveMotif. Whole genome sequencing was carried out using an Illumina HiSeq2000, and data is provided as 6 BAM files: H3K36me3 ChIP-seq, RNA polymerase II ChIP-seq, and input coverage for each cell line.