The Cleveland Family Study is the largest family-based study of sleep apnea worldwide, consisting of 2284 individuals (46% African American) from 361 families studied on up to 4 occasions over a period of 16 years. The study was begun in 1990 with the initial aim of quantifying the familial aggregation of sleep apnea. NIH renewals provided expansion of the original cohort (including increased minority recruitment) and longitudinal follow-up, with the last exam occurring in February 2006. Index probands (n=275) were recruited from 3 area hospital sleep labs if they had a confirmed diagnosis of sleep apnea and at least 2 first-degree relatives available to be studied. In the first 5 years of the study, neighborhood control probands (n=87) with at least 2 living relatives available for study were selected at random from a list provided by the index family and also studied. All available first-degree relatives and spouses of the case and control probands were also recruited. Second-degree relatives, including half-sibs, aunts, uncles and grandparents, were also included if they lived near the first-degree relatives (cases or controls), or if the family had been found to have two or more relatives with sleep apnea. Blood was sampled and DNA isolated for participants seen in the last two exam cycles (n=1447). The sample, which is enriched for individuals with sleep apnea, also contains a high prevalence of individuals with sleep apnea-related traits, including obesity, impaired glucose tolerance, and hypertension. Phenotyping data have been collected over 4 exam cycles, occurring approximately every 4 years. The last three exams targeted all subjects who had been studied at earlier exams, as well as new minority families and family members of previously studied probands who had been unavailable at prior exams. Data from one, two, three and four visits are available for 412, 630, 329 and 67 participants, respectively.
In the first 3 exams, participants underwent overnight in-home sleep studies, allowing determination of the number and duration of hypopneas and apneas, sleep period, heart rate, and oxygen saturation levels; anthropometry (weight, height, and waist, hip, and neck circumferences); resting blood pressure; spirometry; standardized questionnaire evaluation of symptoms, medications, sleep patterns, quality of life, daytime sleepiness measures and health history; venipuncture and measurement of total and HDL cholesterol. The 4th exam (2001-2006) was designed to collect more detailed measurements of sleep, metabolic and CVD phenotypes and included state-of-the-art polysomnography, with blood collection and blood pressure measurement both before and after sleep, and with anthropometry, upper airway assessments, spirometry, exhaled nitric oxide, and ECG performed the morning after the sleep study. Data have been collected by trained research assistants or GCRC nurses who followed written Manuals of Procedures and were certified using standard approaches for each study procedure. Data quality, including assessment of within- or between-individual drift, has been monitored on an ongoing basis using statistical techniques as well as regular re-certification procedures. Between- and within-scorer reliabilities for key sleep apnea indices have been excellent, with intra-class correlation coefficients (ICCs) exceeding 0.92 for the apnea-hypopnea index (AHI). Sleep staging, assessed with epoch-specific comparisons, also demonstrates excellent reliability for stage identification (kappas > 0.82). There has been no evidence of significant time trends, between or within scorers, for the AHI variables. We have also evaluated the night-to-night variability of the AHI and other sleep variables in 91 subjects, with measurements made 1-3 months apart.
There is high night-to-night consistency for the AHI (ICC: 0.80), the arousal index (0.76), and the % sleep time in slow-wave sleep (0.73). We have demonstrated the comparability of the apnea estimates (AHI) determined from limited-channel studies obtained in in-home settings with those from full in-laboratory polysomnography. In addition to our published validation study, we more recently compared the AHI in 169 Cleveland Family Study participants who underwent both assessments (in-home and in-laboratory) within one week of each other. These showed excellent levels of agreement (ICC=0.83), demonstrating the feasibility of examining data from either in-home or in-laboratory studies for apnea phenotyping. Data collected in the GCRC were obtained, when possible, with techniques comparable, if not identical, to those used for the same measures at prior exams performed in the participants' homes. To address the comparability of data collected over different exams, we calculated crude age-adjusted within-individual correlations, over an interval of approximately 3 years, between measures made in the most recent GCRC exam and measures made in a prior exam, and demonstrated excellent levels of agreement for BMI (r=0.91); waist circumference (0.91); FVC (0.88); and FEV1 (0.86). As expected due to higher biological and measurement variability, somewhat lower 3-year correlations were demonstrated for systolic BP (0.56); diastolic BP (0.48); AHI (0.62); and nocturnal oxygen desaturation (0.60).
NHLBI Candidate-gene Association Resource. The NHLBI initiated the Candidate-gene Association Resource (CARe) to create a shared genotype/phenotype resource for analyses of the association of genotypes with phenotypes relevant to the mission of the NHLBI.
The resource comprises nine cohort studies funded by the NHLBI: Atherosclerosis Risk in Communities (ARIC), Cardiovascular Health Study (CHS), Cleveland Family Study (CFS), Coronary Artery Risk Development in Young Adults (CARDIA), Cooperative Study of Sickle Cell Disease (CSSCD), Framingham Heart Study (FHS), Jackson Heart Study (JHS), Multi-Ethnic Study of Atherosclerosis (MESA), and the Sleep Heart Health Study (SHHS). A database of genotype and phenotype data will be created that includes records for approximately 50,000 study participants with approximately 50,000 SNPs from more than 1,200 selected candidate genes. In addition, a genome-wide association study using a 1,000K SNP Chip will be conducted on approximately 9,500 African American participants drawn from the 50,000 participants in the nine cohorts. Some relevant CARe publications: CARe Study (PMID 20400780); CVD Chip Design (PMID 18974833).
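The reliability figures quoted above for the Cleveland Family Study (between- and within-scorer agreement, night-to-night consistency, in-home versus in-laboratory agreement) are reported as intra-class correlation coefficients. The description does not state which ICC form was used, so the following is only a minimal sketch that assumes the one-way random-effects form, ICC(1,1). The function name `icc_oneway` and the example AHI values are hypothetical and are not study data.

```python
from statistics import mean


def icc_oneway(ratings):
    """One-way random-effects ICC(1,1).

    `ratings` is a list of rows, one row per subject, each row holding the
    same number k of repeated measurements (e.g. AHI on two nights).
    Assumption: this ICC form is chosen for illustration only.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 subjects, each with the same k >= 2 measurements")

    grand_mean = mean(x for row in ratings for x in row)
    subject_means = [mean(row) for row in ratings]

    # Between-subject mean square (n - 1 degrees of freedom).
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square (n * (k - 1) degrees of freedom).
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, subject_means) for x in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


if __name__ == "__main__":
    # Hypothetical AHI values (events/hour) for five subjects on two nights.
    example_ahi = [[4.0, 6.0], [12.0, 15.0], [30.0, 26.0], [8.0, 7.0], [55.0, 49.0]]
    print(f"ICC(1,1) = {icc_oneway(example_ahi):.2f}")
```

With this form, values near 1 indicate that most of the variance lies between subjects rather than between repeated measurements on the same subject. A two-way model may be more appropriate when the same scorers rate every study.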
Circle-Seq experiment.
BMI1 ChIP-seq on human K562
The dataset represents a total of 18 DNA samples from 6 male and 3 female pediatric patients affected with central or peripheral nervous system tumors not classified as embryonal central nervous system tumors, nor gliomas, glioneuronal, or neuronal tumors. One tumor tissue sample and one peripheral blood sample from each patient were subjected to whole genome sequencing (WGS) and were sequenced 2x150 bp paired-end on an Illumina HiSeqX v2.5 instrument. The FASTQ files generated were aligned to the human reference genome sequence GRCh38/hg38 using bwa-mem, with the ALT-aware option turned on. Sorting of reads and marking of PCR duplicates were performed with GATK. Base quality score recalibration and joint realignment of reads around insertions and deletions (indels) were conducted using GATK tools. The dataset consists of 18 files in the CRAM format (lossless compression) with a total file size of ~3.4 TB. Additional genomic and molecular data (FASTQ, BAM, IDAT, and VCF files) and limited clinical data can be requested by ethically approved projects conducting research in the field of pediatric cancer.
This study investigates the clinical utility of serum CA19-9 levels in pancreatic ductal adenocarcinoma (PDAC), with a focus on identifying patients with Lewis antigen–negative status due to germline FUT3 variants. Using data from multicenter prospective cohorts in Taiwan, we analyzed CA19-9 classification, FUT2/FUT3 genotypes, and clinicopathological annotations. The goal of this study is to refine prognostic stratification and reduce false-negative interpretation of CA19-9 in PDAC patients. All participants provided informed consent, and the study was approved by the relevant institutional review boards.
To capture the full heterogeneity of the cellular types and states during haematopoietic stem and progenitor cell (HSPC) differentiation, haematopoietic progenitors will be obtained through column-enrichment of cells expressing CD34, a pan-progenitor marker for HSPCs. The samples from tumour and non-tumour lung tissue from different patients will be processed using the 10x Genomics platform. These data are part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/
In this study, a total of 300 patients with MIBC receiving chemotherapy were included; 62 received NAC before cystectomy and 245 received first-line chemotherapy upon detection of locally advanced (T4b) or metastatic disease. Treatment response was defined as pathological downstaging (< pTa,CIS,N0) after NAC, or as complete or partial response after first-line treatment (RECIST criteria). WES was performed using DNA from 165 tumors (76x median coverage) and associated germline DNA (46x median coverage). Data provided here consist of 5,828 FASTQ files for WES.
In this study, we explore the potential of classifying pediatric brain tumors based on methylation profiling of the cell-free DNA in cerebrospinal fluid (CSF). For this proof-of-concept study, we collected 20 CSF samples from pediatric brain cancer patients via a ventricular drain placed because of increased intracranial pressure. For 11 patients in this study we collected matched tumor DNA. This dataset contains FASTQ files of cfRRBS data from these samples.
Lifestyle, environmental and other exposures to exogenous mutagens generate somatic mutations in normal human cells in vivo and increase cancer risk. However, the global repertoire of exogenous mutagen exposures is uncertain. Using single-molecule duplex sequencing of normal kidney (n=319) and blood (n=272) samples from 10 countries, we show that kidney proximal tubule cells exhibit higher mutation rates than most normal cell types despite low cell division rates. Compared to cells from kidney glomeruli, medulla, distal tubules, or peripheral blood, proximal tubule cells show marked enrichment of mutational signatures attributable to the exogenous carcinogenic mutagens aristolochic acids, and of several signatures of unknown cause. The results suggest the existence of multiple, common, systemically circulated mutagens affecting human populations and indicate that the genomes of kidney proximal tubule cells report such exposures with high sensitivity.