The Finland-United States Investigation of NIDDM Genetics (FUSION) Study - Islet Expression and Regulation by RNAseq and ATACseq

This study evaluates gene expression and its regulation in human pancreatic islets, a tissue relevant to the study of genetic risk factors contributing to diabetes. We obtained islets from deceased donors and generated data from genome-wide SNP chip, bulk RNA-seq, microRNA (miRNA)-seq, whole genome sequence, DNA methylation (methyl)-seq, transcription initiation profiles using cap analysis of gene expression (CAGE)-seq, single-cell RNA-seq, and single-nuclei ATAC-seq. These data include ATAC-seq of two islet subjects, RNA-seq of 31 additional subjects, genome-wide chip genotypes, and imputed genotypes of the 33 subjects released with phs001188.v1.

For genotyping, 500-1000 islet equivalents (IEQ) were cultured as in Gershengorn (Science, 2004, PMID: 15564314), and genomic DNA was isolated from the islet cultures. Genotyping on the Illumina Omni2.5M array was performed at the NHGRI Genomics Core facility. Genotypes were imputed using the HRC.r1.1.2016 reference panel.

For RNA analyses, 2500-5000 IEQ from each islet source were used for bulk or single-cell RNA isolation. Messenger RNA was isolated by TRIzol extraction, and 12-plex libraries were generated using the Illumina TruSeq directional mRNA-seq library protocol. Bulk RNA sequencing was performed on HiSeq2000/HiSeq2500 sequencers using paired-end reads at the NIH Intramural Sequencing Center (NISC). miRNA libraries were prepared from total RNA from 68 samples, pooled, and sequenced as 50 bp single-end reads on an Illumina HiSeq2500. CAGE libraries were prepared from total RNA samples using the nAnT-iCAGE protocol at DNAFORM, Japan, and sequenced at NISC on the HiSeq2000 sequencer. scRNA-seq libraries were generated using the 10X Genomics platform and sequenced on an Illumina HiSeq3000 at the Genomics Technology Core of the National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS).

To assess regions of open chromatin in islets, we performed bulk ATAC-seq on HiSeq2000 sequencers using paired-end reads at NISC. Single-nuclei ATAC-seq libraries were prepared using the single-cell combinatorial indexing (sci-) ATAC-seq protocol and sequenced on an Illumina NextSeq using paired-end reads.

More than 90% of the loci associated with type 2 diabetes (T2D) through genome-wide association studies occur in non-coding regions, suggesting a strong regulatory component to disease susceptibility. There is therefore a critical need to understand the full spectrum of genetic variation and regulatory element usage in T2D-relevant tissues. To that end, this study contains whole genome sequence and whole genome bisulfite sequence and/or Illumina MethylationEPIC Array data, providing a comprehensive survey of both individual genetic variation and DNA methylation across different tissues from multiple individuals. In addition, we sequenced single-cell RNA (two subjects) and single nuclei (one subject) to characterize gene expression and chromatin accessibility in islets.