Oxidative bisulfite sequencing (oxBS-Seq) for APL
Whole genome bisulfite sequencing (WGBS) for APL
10x Genomics 5' library scRNA-seq data for 4 iAMP21 patients
RNA-Seq transcriptome data is only for academic use.
Dataset for "Intratumoral Heterogeneity and Clonal Evolution Induced by HPV Integration" (ONT)
Dataset for "Intratumoral Heterogeneity and Clonal Evolution Induced by HPV Integration" (PacBio)
Dataset for Synchronous Endometrial and Ovarian Cancer
Merged BAM files for PACA-CA Whole Genome Sequencing, for DCC release 25
Cohort Description: In 1948, the researchers recruited 5,209 men and women between the ages of 30 and 62 from the town of Framingham, Massachusetts, and began the first round of extensive physical examinations and lifestyle interviews that they would later analyze for common patterns related to CVD development. Since 1948, the subjects have returned to the study every two years for an examination consisting of a detailed medical history, physical examination, and laboratory tests. In 1971, the study enrolled a second-generation (Offspring) cohort, 5,124 of the original participants' adult children and their spouses, to participate in similar examinations. The second examination of the Offspring cohort occurred eight years after the first examination, and subsequent examinations have occurred approximately every four years thereafter. In 1994, the need to establish a new study reflecting a more diverse community of Framingham was recognized, and the first Omni cohort (OMNI1) of the Framingham Heart Study, consisting of 506 participants, was enrolled. In April 2002, 4,095 third-generation participants (GEN3), the grandchildren of the original cohort, were added. In 2003, 103 spouses of Offspring cohort participants (NOS) and a second group of 410 Omni participants (OMNI2) were enrolled. Through 2019, the original cohort has completed a total of 32 exams, the Offspring cohort 9 exams, the OMNI1 cohort 4 exams, and the GEN3, NOS, and OMNI2 cohorts have each completed 3 exams. The FHS is a joint project of the National Heart, Lung, and Blood Institute and Boston University. Data Being Submitted: Wave 1 questionnaire data includes 3967 variables for up to 3112 FHS participants in C4R. Wave 2 questionnaire data includes 448 variables for up to 2337 FHS participants in C4R. Dried Blood Spot/Serosurvey data includes 7 variables for up to 2189 FHS participants in C4R. Derived data includes 43 variables for up to 3151 FHS participants in C4R. Phenotype data includes 113 variables for up to 3151 FHS participants in C4R.
Recent advances in throughput and accuracy mean that the Oxford Nanopore Technologies (ONT) PromethION platform is now a viable solution for WGS. New bioinformatic methods have been developed to take advantage of this long-read data; however, much of the validation of these tools has focussed on calling germline variants (both SNVs and structural variants). Somatic variants are outnumbered many-fold by germline variants, and their detection is further complicated because their frequency varies with tumour purity and subclonality. Here, we evaluate the extent to which Nanopore WGS enables genome-wide detection and analysis of somatic variation. We do this by sequencing tumour and germline genomes for a patient with diffuse B-cell lymphoma. We examine the capability of currently available tools for calling somatic variants in ONT data by comparing the data with results from 150 bp short-read sequencing of the same samples. We then conduct a detailed analysis of the performance of multiple long-read mappers and structural variant callers for calling large somatic structural variants (SVs) in ONT data. Our protocol achieved yields of up to 96 mapped Gb per PromethION flow cell with average read lengths of ~5 kb. Calling germline SNVs from these data achieved good specificity and sensitivity. However, the results of somatic SNV calling highlight the need for the development of specialized joint calling algorithms. Our analysis of structural variants shows that the comparative performance of different tools varies significantly between SV types, and suggests that long reads are especially advantageous for calling large somatic deletions and duplications. Finally, we highlight the utility of long reads for phasing clinically relevant variants by using the ONT data to confirm that a somatic 1.6 Mb deletion and a p.(Arg249Met) mutation involving TP53 are oriented in trans.