The Cleveland Family Study is the largest family-based study of sleep apnea worldwide, consisting of 2284 individuals (46% African American) from 361 families studied on up to 4 occasions over a period of 16 years. The study was begun in 1990 with the initial aim of quantifying the familial aggregation of sleep apnea. NIH renewals provided expansion of the original cohort (including increased minority recruitment) and longitudinal follow-up, with the last exam occurring in February 2006. Index probands (n=275) were recruited from 3 area hospital sleep labs if they had a confirmed diagnosis of sleep apnea and at least 2 first-degree relatives available to be studied. In the first 5 years of the study, neighborhood control probands (n=87) with at least 2 living relatives available for study were selected at random from a list provided by the index family and also studied. All available first-degree relatives and spouses of the case and control probands were also recruited. Second-degree relatives, including half-sibs, aunts, uncles and grandparents, were also included if they lived near the first-degree relatives (cases or controls), or if the family had been found to have two or more relatives with sleep apnea. Blood was sampled and DNA isolated for participants seen in the last two exam cycles (n=1447). The sample, which is enriched with individuals with sleep apnea, also contains a high prevalence of individuals with sleep apnea-related traits, including obesity, impaired glucose tolerance, and hypertension. Phenotyping data have been collected over 4 exam cycles, spaced approximately 4 years apart. The last three exams targeted all subjects who had been studied at earlier exams, as well as new minority families and family members of previously studied probands who had been unavailable at prior exams. Data from one, two, three and four visits are available for 412, 630, 329 and 67 participants, respectively.
In the first 3 exams, participants underwent overnight in-home sleep studies, allowing determination of the number and duration of hypopneas and apneas, sleep period, heart rate, and oxygen saturation levels; anthropometry (weight, height, and waist, hip, and neck circumferences); resting blood pressure; spirometry; standardized questionnaire evaluation of symptoms, medications, sleep patterns, quality of life, daytime sleepiness measures and health history; venipuncture and measurement of total and HDL cholesterol. The 4th exam (2001-2006) was designed to collect more detailed measurements of sleep, metabolic and CVD phenotypes and included state-of-the-art polysomnography, with both collection of blood and measurement of blood pressure before and after sleep, and anthropometry, upper airway assessments, spirometry, exhaled nitric oxide, and ECG performed the morning after the sleep study. Data have been collected by trained research assistants or GCRC nurses who followed written Manuals of Procedures and were certified using standard approaches for each study procedure. Data quality, including assessment of within- or between-individual drift, has been monitored on an ongoing basis using statistical techniques as well as regular re-certification procedures. Between- and within-scorer reliabilities for key sleep apnea indices have been excellent, with intra-class correlation coefficients (ICCs) exceeding 0.92 for the apnea-hypopnea index (AHI). Sleep staging, assessed with epoch-specific comparisons, also demonstrates excellent reliability for stage identification (kappas>0.82). There has been no evidence of significant time trends, between or within scorers, for the AHI variables. We have also evaluated the night-to-night variability of the AHI and other sleep variables in 91 subjects, with each measurement made 1-3 months apart.
There is high night-to-night consistency for the AHI (ICC: 0.80), the arousal index (0.76), and the % sleep time in slow-wave sleep (0.73). We have demonstrated the comparability of the apnea estimates (AHI) determined from limited-channel studies obtained in in-home settings with those from full in-laboratory polysomnography. In addition to our published validation study, we more recently compared the AHI in 169 Cleveland Family Study participants undergoing both assessments (in-home and in-laboratory) within one week of each other. These showed excellent levels of agreement (ICC=0.83), demonstrating the feasibility of examining data from either in-home or in-laboratory studies for apnea phenotyping. Data collected in the GCRC were obtained, when possible, with techniques comparable, if not identical, to those used for the same measures at prior exams performed in the participants' homes. To address the comparability of data collected over different exams, we calculated crude age-adjusted within-individual correlations, over approximately 3 years, between measures made in the most recent GCRC exam and measures made in a prior exam, and demonstrated excellent levels of agreement for BMI (r=0.91); waist circumference (0.91); FVC (0.88); and FEV1 (0.86). As expected due to higher biological and measurement variability, somewhat lower 3-year correlations were demonstrated for systolic BP (0.56); diastolic BP (0.48); AHI (0.62); and nocturnal oxygen desaturation (0.60).
NHLBI Candidate-gene Association Resource.
The NHLBI initiated the Candidate-gene Association Resource (CARe) to create a shared genotype/phenotype resource for analyses of the association of genotypes with phenotypes relevant to the mission of the NHLBI.
The resource comprises nine cohort studies funded by the NHLBI: Atherosclerosis Risk in Communities (ARIC), Cardiovascular Health Study (CHS), Cleveland Family Study (CFS), Coronary Artery Risk Development in Young Adults (CARDIA), Cooperative Study of Sickle Cell Disease (CSSCD), Framingham Heart Study (FHS), Jackson Heart Study (JHS), Multi-Ethnic Study of Atherosclerosis (MESA), and the Sleep Heart Health Study (SHHS). A database of genotype and phenotype data will be created that includes records for approximately 50,000 study participants with approximately 50,000 SNPs from more than 1,200 selected candidate genes. In addition, a genome-wide association study using a 1,000K SNP Chip will be conducted on approximately 9,500 African American participants drawn from the 50,000 participants in the nine cohorts.
Some relevant CARe publications: CARe Study (PMID 20400780); CVD Chip Design (PMID 18974833).
A genomewide study of lung cancer in never smokers
Abstract and specific aims
In the United States, lung cancer incidence and mortality rates have been steadily declining over the past decade, following a decline in the prevalence of tobacco smoking. However, lung cancer remains the leading cause of cancer death, killing more patients than breast, colon, and prostate cancers combined. Although tobacco smoke is the predominant risk factor for development of lung cancer, some patients develop the disease without a history of tobacco smoking. About 10-15% of all lung cancers occur in lifetime never smokers. This figure will increase as the proportion of never smokers increases in the population. Even at present rates, lung cancer in never smokers, if considered a separate disease, is the 6th to 8th leading cause of cancer death. The growing number of never smokers in the USA and other countries emphasizes the importance of understanding the epidemiology and biology underlying lung cancer in this group. Genetic polymorphisms associated with the risk of lung cancer in never smokers are expected to overlap only partially with those associated with the risk of lung cancer in ever smokers. Epidemiological, molecular and clinical data suggest that the molecular mechanisms of lung cancer may differ in smokers and non-smokers, implying that lung cancer in never smokers is a different disease from lung cancer in smokers. One can expect a stronger genetic component in the control of lung cancer in never smokers, because the effects of genetic factors in never smokers are unmasked by the lack of tobacco smoke exposure. The genetic epidemiology of lung cancer in never smokers has not been well explored, largely because of difficulties in accruing the sample size needed for association studies.
We propose a multicenter (14 sites in total from the US and Europe) genomewide association study of lung cancer in never smokers with the following specific aims: Aim 1: To identify candidate SNPs influencing risk for lung cancer in never smokers using the Discovery sample. In the Discovery phase we will genotype 1256 Caucasian cases and 1365 age- and gender-matched never smoker controls using the Illumina Human660W-Quad platform. In addition, we will include in the analysis 284 cases and 175 matched controls already genotyped on the 610Quad platform. In this phase we will only include the study sites that have collected blood specimens (MDACC, Mayo Clinic, Karmanos Cancer Institute, The University of Liverpool Cancer Research Centre, Institute of Cancer Research in Sutton, and Lunenfeld Research Institute in Toronto, Canada). All the samples will be sent to an independent lab for genotyping, to reduce site-specific technical artifacts. The final sample will consist of 1540 cases and 1540 controls matched by study site. Aim 2: To perform the second phase (validation) analysis of significant SNPs identified in Aim 1 using an independent set of cases and controls. SNPs associated with risk at a significance level of 0.01 or below in the discovery set will be included in the replication phase. The proposed threshold guarantees adequate power to retain SNPs with the typical effect size of 1.3. We plan to carry 6000-7000 SNPs forward for validation. The independent replication set will include 800 cases and 800 controls, mostly from sites that collected tissue (Mayo Clinic, Karmanos Cancer Institute, UT Southwestern) or buccal specimens (UCLA), but also blood samples (Imperial College London; University of Pennsylvania; German Cancer Research Center, Heidelberg; National Research Center for Environment and Health, Neuherberg; Carmel Medical Center, Haifa).
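The sample counts and the 0.01 carry-forward threshold above can be checked with simple arithmetic. The sketch below is illustrative only: the case and control counts come from the text, while the number of markers passing quality control (600,000 to 700,000) is an assumption introduced here, not a figure stated in the proposal. Under that assumption, the number of SNPs expected to pass p <= 0.01 by chance alone is of the same order as the 6000-7000 SNPs planned for validation.

```python
# Counts quoted in the text for the discovery and replication phases.
NEW_CASES, NEW_CONTROLS = 1256, 1365      # to be genotyped on Human660W-Quad
PRIOR_CASES, PRIOR_CONTROLS = 284, 175    # already genotyped on 610Quad
REPLICATION_CASES = REPLICATION_CONTROLS = 800


def expected_null_hits(n_markers: int, alpha: float) -> float:
    """Expected number of markers with p <= alpha if no marker is associated."""
    return n_markers * alpha


discovery_cases = NEW_CASES + PRIOR_CASES
discovery_controls = NEW_CONTROLS + PRIOR_CONTROLS
joint_cases = discovery_cases + REPLICATION_CASES
joint_controls = discovery_controls + REPLICATION_CONTROLS

print("discovery:", discovery_cases, "cases,", discovery_controls, "controls")
print("joint:", joint_cases, "cases,", joint_controls, "controls")

# ASSUMPTION (not from the source): 600,000 to 700,000 markers survive QC.
for n_markers in (600_000, 700_000):
    print(n_markers, "markers ->", round(expected_null_hits(n_markers, 0.01)))
```

The totals reproduce the 1540 cases and 1540 controls stated for the discovery sample; adding the 800 replication cases and controls gives 2340 of each.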
We will then perform a joint analysis to test the significance of the SNPs identified in the first stage using a stringent critical p-value of 10^-7. There will be 2340 cases and 2340 controls in the joint set. Based on our experience with GWAS in smokers, and assuming that the genetic component of lung cancer risk in never smokers can be higher than that in smokers, we expect to identify about 5-10 candidate regions associated with lung cancer risk in never smokers. Aim 3: To identify and explore pathways associated with the risk of lung cancer in never smokers. Results of a number of studies on molecular mechanisms and drug response suggest that lung cancer in never smokers is a different disease and that different pathways will be associated with lung cancer risk in non-smokers and smokers. To identify pathways and molecular functions associated with lung cancer risk in never smokers we will apply the Ingenuity and DAVID bioinformatics tools. We will use at least 300 top candidate genes identified in the joint and discovery analyses. We select a rather large number of candidate genes for functional annotation for two reasons: 1. Both algorithms look for enrichment of pathways and functions among the most significant genes, and they produce statistically robust results only when the number of genes is relatively high. 2. Although this study will be the largest possible for never smokers, we are still underpowered to detect SNPs with relatively small effect sizes. Although those SNPs will not reach a genome-wide level of significance, they will tend to be near the top of the list. In other words, genes from the gray zone (significant at the individual level but not at the genome-wide level) are expected to be enriched for true discoveries. True discoveries are likely to be associated with a limited number of pathways or functions, while false positives are expected to be uniformly distributed across functions and pathways.
Therefore, significant clustering of the genes to a given function will suggest that those genes are true discoveries. This is the first GWAS aimed at identifying the genetic control of susceptibility to lung cancer in Caucasian never smokers. We will combine the available resources from multiple sites to achieve a sample size sufficient for this study. The study will identify the genetic architecture of predisposition to lung cancer in never smokers.
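The enrichment reasoning in Aim 3 can be illustrated with a one-sided hypergeometric test: if the selected genes were a random draw from the background, how likely is it to see at least the observed number of them in one pathway? This is a minimal sketch of the general idea, not the algorithm used by Ingenuity or DAVID, and the background size, pathway size, and overlap in the usage lines are hypothetical numbers chosen for illustration.

```python
from math import comb


def hypergeom_enrichment_p(n_background: int, n_in_pathway: int,
                           n_selected: int, n_overlap: int) -> float:
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_background genes, of which n_in_pathway belong to the pathway."""
    max_overlap = min(n_in_pathway, n_selected)
    total = comb(n_background, n_selected)
    # Sum the hypergeometric probabilities over the upper tail.
    tail = sum(
        comb(n_in_pathway, k) * comb(n_background - n_in_pathway, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# HYPOTHETICAL numbers: 20,000 background genes, a 100-gene pathway,
# 300 selected candidate genes (the minimum named in the text).
expected_by_chance = 300 * 100 / 20_000   # 1.5 genes expected under the null
p_six = hypergeom_enrichment_p(20_000, 100, 300, 6)
p_two = hypergeom_enrichment_p(20_000, 100, 300, 2)
print(expected_by_chance, p_six, p_two)
```

A small tail probability for a pathway, after correction for the number of pathways tested, is the kind of clustering that the text treats as evidence for true discoveries; false positives spread uniformly across pathways would rarely produce it.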
In this study we will sequence the transcriptome of Verified Cancer Cell lines. This will be married up to whole exome and whole genome sequencing data to establish a full catalog of the variations and mutations found.
Genotyping data for ACE2 (rs2285666), MX1 (rs469390) and TMPRSS2 (rs2070788) variants. Patients are classified as mild (n=34) and severe (n=32). DNA genotyping was performed using the TaqMan® Genotyping Master Mix (Applied Biosystems). Allelic discrimination assays were performed on a 7900HT Fast Real-Time PCR System (Applied Biosystems).
Paired RNA-Seq data from 16 samples of different tumors CD8+ T cells added to the study "Proteogenomic analysis reveals RNA as a source for tumor-agnostic neoantigen identification (H021)". Sequencing was performed on Illumina NextSeq 500. The sequencing was always paired.
The data published here contain bulk RNA-sequencing (RNAseq) data obtained from monocyte-derived dendritic cells treated with/without LPS and with/without CESi (WWL113). Sequencing was performed in a paired-end fashion on the NovaSeq6000.
This dataset contains 75 Nanopore sequencing experiments using a MinION sequencer and R9 flow cells from 51 patient biopsies. Gzipped tar files containing all fast5 files per sample are provided.
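Because each sample is delivered as a gzipped tar archive of fast5 files, a quick inventory of an archive can be taken without unpacking it. The sketch below uses only the Python standard library; the archive name in the usage note is hypothetical, and the internal directory layout of the archives is not described in the dataset text.

```python
import sys
import tarfile


def list_fast5_members(archive_path: str) -> list[str]:
    """Return the names of all .fast5 files inside a gzipped tar archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return [
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(".fast5")
        ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python list_fast5.py <archive.tar.gz>
    names = list_fast5_members(sys.argv[1])
    print(f"{len(names)} fast5 files in {sys.argv[1]}")
```

For example, `list_fast5_members("sample_01.tar.gz")` (a hypothetical file name) returns the member paths, which can then be compared against the expected read files for that sequencing experiment.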
Single-cell sequencing and genotyping for Cambridge samples analyzed as part of a project evaluating single-cell gene expression & lymphocyte receptor sequences in CSF and PBMC of MS and other neuroinflammatory disorders.
Whole-genome sequencing (WGS) using nanopore long-read sequencing. These data are used for the diagnosis of patients with inherited retinal dystrophies.
This is a DAC for data deposited into EGA generated by Cancer Discover Hub, NCCS. The datasets in this portal are solely for research and academic use; uses other than these have to be approved by the relevant DAC members.