The impact of genetic variants on the molecular pathways that give rise to neurodegenerative diseases such as Alzheimer's and Parkinson's is best elucidated in the appropriate cell types and molecular contexts. Existing studies have focused on bulk profiling of mixed cell types and have not assayed genetic effects across development and cell differentiation. At the core of this proposal is the idea of using single-cell assays to study genetic effects during the differentiation of dopaminergic and cortical neurons, in order to identify the sequence of molecular events leading from variants to healthy and diseased cell states in a cell-specific manner. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute, please see http://www.sanger.ac.uk/datasharing/. This dataset contains all the data available for this study on 2020-05-18.
Cancer is a genetic disease caused by an accumulation of mutations; however, many of these mutations have also been identified in pathologically normal tissue. We aim to use laser-capture microdissection (LCM) to sample individual clones from the lung tissue of individuals with a variety of lung diseases (COPD, UIP, IPF, emphysema, pulmonary hypertension). This will allow us to identify whether cancer-associated mutations appear in this normal tissue, assess the mutational burden present, and identify the mutational processes causing these mutations. Smoking is a major risk factor for developing many of these lung diseases, so we are particularly keen to determine whether there is evidence of a smoking signature in these patients. This dataset contains all the data available for this study on 2020-01-15.
T-cell lymphoblastic lymphoma (T-LBL) is a common pediatric malignancy, accounting for approximately 20% of non-Hodgkin lymphomas during childhood. Survival rates of T-LBL are ~80%, but outcome after relapse is dismal, with salvage rates reaching only ~15%. Considering the extremely poor prognosis after relapse and the absence of clinically relevant high-risk genetics, there is an urgent need to identify molecular risk factors, new prognostic biomarkers, and new therapeutic strategies in T-LBL. In this study we present a novel entity of high-risk pediatric T-LBL patients characterized by previously unknown NOTCH1 gene fusions and highly elevated blood TARC levels.
Diffuse large B-cell lymphoma (DLBCL) is the most common non-Hodgkin lymphoma (NHL), comprising 25-30% of all NHL in developed countries, with an annual incidence in the USA of 7 cases per 100,000 persons. Collectively, DLBCL is classified based on a common morphological appearance of diffuse growth of large transformed B-cells, immunophenotype, high proliferation rate and aggressive behaviour. Despite these similarities, DLBCLs are a heterogeneous collection of malignancies with distinct clinical and molecular characteristics that do not always correlate with immunohistological features. This gene expression dataset includes transcriptomes of ABC-DLBCLs and GCB-DLBCLs, where cell of origin is determined by the HTG-EdgeSeq quantitative nuclease protection assay. Also included are clonality results from BCR profiling of high-grade B-cell lymphomas sequenced using a NOVA sequencer.
Knowledge about abnormal organ development is important for understanding pathology and for developing novel treatment approaches for individuals with congenital and acquired disease. Most of our current understanding is based on examination of tissues from the embryo and early foetus, collected from women undergoing termination of pregnancy in the first trimester (first third) of pregnancy. Very little is known about normal and abnormal organ development during the crucial last two-thirds of pregnancy, when much remodelling of foetal tissues occurs. This study will generate a single-cell atlas of late-foetal lungs, blood, heart, bone and immune organs. This dataset contains all the data available for this study on 2025-10-14.
This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/. We performed exome sequencing on serial samples from a patient with CMML who progressed to AML. The exome sequencing suggests that NPM1, TET2 and DNMT3A mutations were present in the dominant clone in the CMML sample and that NRAS is a new subclonal mutation in the AML sample. Diagnostic data shows the presence of a FLT3-ITD mutation in the AML sample, which is likely to have driven progression. Here we are re-sequencing the putative driver mutations and some passenger mutations that appear to be in the same clone, to validate these mutations and to verify the relative quantification of these abnormalities.
This study includes 1,220 cases with young onset stroke (stroke before age 60 years) who are participants in the larger RACE study. Risk Assessment of Cerebrovascular Events (RACE) is an ongoing case-control study of stroke now involving over 5,000 imaging-confirmed cases of stroke and 5,000 controls, recruited from seven centers in Pakistan. The study aims to investigate the genetic, biomarker and lifestyle determinants of stroke and its subtypes. Cases are eligible for inclusion in the study if they: (i) are aged at least 18 years; (ii) present with a sudden onset of neurological deficit respecting a vascular territory with sustained deficit at 24 hours verified by medical attention within 72 hours after onset (onset is defined by when the patient was last seen normal and not when found with deficit); (iii) have a diagnosis supported by CT/MRI; and (iv) present with a Modified Rankin Score < 2 prior to the stroke. Findings from patient's history, 12-lead ECG and CT or MRI of the brain. The mandatory procedures for inclusion in this investigation are: (i) clinical verification of cerebrovascular event within 72 hours of onset; (ii) neuroimaging by CT (non-contrast) or MRI (MRI is not a mandatory investigation but is recorded whenever ordered by the attending physician); and (iii) 12-lead ECG. All other ancillary investigations ordered by the attending physician are recorded as well. The TOAST classification method is used to classify ischemic stroke based on aetiology, whereas the Oxfordshire classification is used to classify stroke neuro-anatomically. Control participants for this subset of young onset stroke were individuals enrolled in the Pakistan Risk of Myocardial Infarction Study (PROMIS), a case-control study of acute MI based in Pakistan. RACE capitalizes on the genetic data (including information on GWAS) that has already been collected from the healthy participants enrolled in PROMIS. RACE and PROMIS share a similar recruitment methodology.
Participants from both these investigations are derived from similar catchment areas, providing an attractive opportunity for RACE to use PROMIS controls as common controls for genetic investigations. Controls in PROMIS were recruited following the procedures and inclusion criteria adopted for RACE cases. In order to minimize any potential selection biases, PROMIS controls selected for this stroke substudy were frequency matched to RACE cases based on age and gender and were recruited in the following order of priority: (1) non-blood related or blood related visitors of patients of the out-patient department; (2) non-blood related visitors of stroke patients; (3) patients of the out-patient department presenting with minor complaints (e.g. back pain, minor gastric complaints). Control subjects from the PROMIS study were genotyped at the Wellcome Trust Sanger Institute on the Illumina 660W Quad array. The Center for Non-Communicable Diseases, Pakistan, serves as the coordinating center for both RACE and PROMIS. More information on these research investigations can be found at www.cncdpk.com. This young onset stroke component of the RACE study was funded through the Gene Environment Association Studies initiative (GENEVA, www.genevastudy.org) as one of three studies designed to assess the genetics of young onset stroke and the modification of genetic effects by smoking. GENEVA is part of the trans-NIH Genes, Environment, and Health Initiative (GEI). Genotyping of 1,220 young onset stroke cases was performed at the Johns Hopkins University Center for Inherited Disease Research (CIDR). Data cleaning and harmonization were done at the GEI-funded GENEVA Coordinating Center at the University of Washington. This study is part of the Gene Environment Association Studies initiative (GENEVA, http://www.genevastudy.org) funded by the trans-NIH Genes, Environment, and Health Initiative (GEI).
The overarching goal is to identify novel genetic factors that contribute to stroke through large-scale genome-wide association studies of cases and controls recruited within Pakistan.
The data is for non-commercial use only.
This dataset represents two combined study populations. Serrated Colorectal Cancer: An Emerging Disease Subtype (called the Advanced Colorectal Cancer of Serrated Subtype Study or ACCESS Study) was a grant awarded to investigate a newly-recognized, biologically-distinct subtype of colorectal cancer (CRC) called “serrated CRC.” The objective of this project was to characterize factors related to the genetic predisposition, clinical presentation, and prognosis of serrated CRC. The study recruited incident invasive CRC cases diagnosed between April 2016 and December 2018, aged 20-74 years at diagnosis. Cases were identified through the Surveillance, Epidemiology and End Results (SEER) cancer registry serving 13 counties in western Washington State. Eligibility for all individuals was limited to those who were English-speaking and could consent. Participation included completing a baseline epidemiologic questionnaire shortly after diagnosis, optional donation of a saliva sample for genetic analysis, and optional consent to release of medical records and tissue specimens related to their diagnosis. Tumor specimens were tested for serrated CRC-defining molecular characteristics. Further, we have vital status for all participants and cause of death for those who have died since enrollment. Hormones and Colon Cancer: Epigenetic Subtypes, Risks, and Survival (called the Post-Menopausal Hormones Study or PMH Study) was a grant awarded to investigate the impact of post-menopausal hormone use on colon cancer risk, tumor molecular characteristics, and outcomes. Eligible cases were females, newly diagnosed with invasive colorectal adenocarcinoma between October 1998 and February 2002, aged 50 to 74 years. Cases were residents of 10 out of the 13 counties in western Washington State served by the Surveillance, Epidemiology and End Results (SEER) cancer registry.
Eligibility for all individuals was limited to those who were English-speaking and had an available telephone number at which they could be contacted. Unrelated population-based controls were randomly selected according to the age distribution (in 5-year age intervals) of the eligible cases, using lists of licensed drivers from the Washington State Department of Licensing (for individuals aged 50 to 64 years) and rosters from the Health Care Financing Administration (now the Centers for Medicare and Medicaid Services, for individuals older than 64 years). Participation included completing a baseline epidemiologic questionnaire, optional donation of a saliva sample for genetic analysis, and (for cases only) optional consent to release of medical records and tissue specimens related to their diagnosis. Tumor specimens were tested for epigenetic and other molecular characteristics. The ACCESS study was supported by funding from the National Cancer Institute of the National Institutes of Health (NCI/NIH) (R01CA196337, PI: Newcomb, PA), as was the PMH Study (R01CA076366, PI: Newcomb, PA). Additional support for the PMH Study came from the Seattle site of the Colon Cancer Family Registry (SCCFR) (U01CA167551, PI: Jenkins, M, and U01/U24CA074794, PI: Newcomb, PA). Additional support for case ascertainment was provided by the Cancer Surveillance System of the Fred Hutchinson Cancer Center, which is funded by Contract Number HHSN261201300012I; NCI Control Number: N01 PC-2013-00012; Contract Number HHSN261201800004I; and NCI Control Number: N01 PC-2018-00004 from the Surveillance, Epidemiology and End Results (SEER) Program of the National Cancer Institute with additional support from the Fred Hutchinson Cancer Center and the State of Washington.
This research was also supported by the Genomics and Bioinformatics, Comparative Medicine, Specialized Pathology, Collaborative Data Services, and Experimental Histopathology Shared Resources of the Fred Hutch/University of Washington Cancer Consortium (P30 CA015704). Tumor marker testing was performed using formalin-fixed paraffin-embedded diagnostic tumor tissue specimens, and DNA extracted from those specimens. Testing for microsatellite instability (MSI) was based on either a 10-gene panel (BAT25, BAT26, BAT40, MYCL, D5S346, D17S250, ACTC, D18S55, D10S197, BAT34C4) or a 4-marker immunohistochemistry panel of DNA mismatch repair proteins (MLH1, MSH2, MSH6, PMS2). CpG island methylator phenotype (CIMP) testing was based on a validated quantitative DNA methylation assay using a five-gene panel (CACNA1G, IGF2, NEUROG1, RUNX3, SOCS1) or eight-gene panel (CACNA1G, IGF2, NEUROG1, RUNX3, SOCS1, MLH1, CRABP1, CDKN2A). Somatic p.V600E BRAF mutation status was tested for using a fluorescent allele-specific PCR assay. KRAS mutations in codons 12 and 13 were also assessed through forward and reverse sequencing of amplified tumor DNA. DNA was extracted from blood/saliva samples using conventional methods. The genotyping panel completed was the Build37 OncoArray500K-C, including 1%-6% blinded duplicates to monitor the quality of the genotyping. Quality control procedures were performed to 1) make sure that there were no patterns of missing data by batch, study, or plate, 2) check for gender discrepancies and kinship, 3) complete Principal Component Analysis, and 4) test for Hardy-Weinberg equilibrium (HWE). Samples were excluded based on call rate, heterozygosity, unexpected duplicates, gender discrepancy, and unexpectedly high identity-by-descent or unexpected genotypic concordance (>65%) with another individual. In addition, variants were excluded based on call rate (98%), lack of HWE in controls (P