Congenital heart defects (CHDs) are the most common serious birth defect and a leading cause of infant mortality. Affected individuals who survive infancy require substantial medical care and experience lifelong morbidity and early mortality. Despite their prevalence and impact upon public health, the etiology of CHDs is poorly understood. Our long-term goal is to better define the genetic basis of CHDs and using this information, develop preventive measures and provide precise medical management based on genotype. As CHDs are a heterogeneous group of conditions, we have focused our studies on cases with conotruncal heart defects (CTDs) that comprise at least 36% of all CHDs and are a prominent feature of the 22q11.2 deletion syndrome (22q11DS). We have conducted both SNP- and CNV-based genome-wide association studies as part of our previous and current P01 (HD070454). Our current analyses of genes and gene-sets provide new insights regarding the contribution of common and rare inherited genetic variants to CTDs. Specifically, based on the results from our studies as well as recently published findings, we hypothesize that: (1) CTDs are influenced by the full range of genetic variability (rare to common; de novo and inherited) in relevant gene-sets; (2) the number of disease-related variants within these gene-sets varies directly with CTD severity; and (3) gene-sets associated with CTDs in nonsyndromic individuals are also associated with CTDs in individuals with the 22q11DS. To augment our existing cohorts and thus substantially improve the statistical power of our proposed analyses, application was made to complete array genotyping of a set of patients with CTDs and in particular, those with mild disease represented by isolated aortic arch anomalies. 
The new cases with CTD and some with an isolated aortic arch anomaly were ascertained at the Children's Hospital of Philadelphia (CHOP) while some of the cases with isolated aortic arch anomalies were ascertained by the Pediatric Cardiac Genomics Consortium (PCGC). A small group of cases with left-sided obstructive lesions from CHOP were included to combine with our existing datasets to serve as a comparison to those with CTD. The additional genotype data will augment the statistical power of the proposed common variant analyses. We anticipate that this work will reduce the gaps in our understanding of CTDs by identifying an explicit set of genes associated with CTD-risk by way of common variants. This work will also help define the genetic architecture for the full clinical spectrum of these common birth defects.
Analysis of the chronic lymphocytic leukemia coding genome: role of NOTCH1 mutational activation
The pathogenesis of chronic lymphocytic leukemia (CLL), the most common leukemia in adults, is still largely unknown since the full spectrum of genetic lesions that are present in the CLL genome, and therefore the number and identity of dysregulated cellular pathways, have not been identified. By combining next-generation sequencing and copy number analysis, we show here that the typical CLL coding genome contains fewer than 20 clonally represented gene alterations/case, including predominantly non-silent mutations and fewer copy number aberrations. These analyses led to the discovery of several genes not previously known to be altered in CLL. While most of these genes were affected at low frequency in an expanded CLL screening cohort, mutational activation of NOTCH1, observed in 8.3% of CLL at diagnosis, was detected at significantly higher frequency during disease progression toward Richter transformation (31.0%) as well as in chemorefractory CLL (20.8%). Consistent with the association of NOTCH1 mutations with clinically aggressive forms of the disease, NOTCH1 activation at CLL diagnosis emerged as an independent predictor of poor survival. These results provide initial data on the complexity of the CLL coding genome and identify a dysregulated pathway of diagnostic and therapeutic relevance.
Genetic Lesions associated with Chronic Lymphocytic Leukemia transformation to Richter Syndrome
Richter syndrome (RS) derives from the rare transformation of chronic lymphocytic leukemia (CLL) into an aggressive lymphoma, most commonly of the diffuse large B cell type (DLBCL). The molecular pathogenesis of RS is only partially understood.
By combining whole-exome sequencing and copy-number analysis of 9 CLL-RS pairs and of an extended panel of 43 RS cases, we show that this aggressive disease typically arises from the predominant CLL clone by acquiring an average of ~20 genetic lesions/case. RS lesions are heterogeneous in terms of load and spectrum among patients, and include those involved in CLL progression and chemorefractoriness (TP53 disruption and NOTCH1 activation) as well as some not previously implicated in CLL or RS pathogenesis. In particular, disruption of the CDKN2A/B cell cycle regulator locus is associated with ~30% of RS cases. Finally, we report that the genomic landscape of RS is significantly different from that of de novo DLBCL, suggesting that they represent distinct disease entities. These results provide insights into RS pathogenesis, and identify dysregulated pathways of potential diagnostic and therapeutic relevance.
Dynamic approaches that integrate population-based research and molecular biology are needed to explain the mechanisms underlying pediatric rhabdomyosarcoma (RMS) and to determine novel prevention strategies. RMS, the most common soft-tissue sarcoma in children and adolescents, has one of the poorest 5-year survival rates among all pediatric cancers (less than 65%). One of the strongest risk factors for RMS is having a cancer predisposition syndrome. The syndromes that are most commonly seen among those with RMS are Li-Fraumeni, neurofibromatosis type 1, Costello, Noonan, and DICER1. Based on smaller clinic-based studies, only about 7% of RMS cases are thought to be associated with the genes responsible for these syndromes. However, there have been no population-based assessments to support this estimate. Even in the most recent large-scale evaluations of germline mutations in predisposition genes among children with cancer, very few RMS cases were included (43 cases). Furthermore, no distinctions were made between the major histologic subtypes of RMS: embryonal (eRMS) and alveolar (aRMS), which display differences in terms of age distribution, incidence, and cytogenetics. For instance, nearly 80% of alveolar cases are driven by a chromosomal translocation between either PAX3 or PAX7 and FOXO1, whereas these fusions are not seen in embryonal cases. In fact, RMS research is shifting from categorization based on histology to fusion status (eRMS is overwhelmingly fusion-negative). Another limitation in previous studies has been the inability to evaluate the frequency of de novo germline mutations (DNMs) in cancer predisposition genes due to the absence of any well-characterized cohorts of RMS case-parent trios. 
Therefore, a major gap in our understanding of the role of cancer predisposition in pediatric RMS is that there have been no population-based assessments to determine the true impact of these mutations. This gap limits translational impact, in particular the development of clinical sequencing guidelines and surveillance protocols for these children. Overall Project Strategy: The objective of this project is to advance our understanding of the relationship between cancer predisposition genes and pediatric RMS. Our central hypotheses are: 1) mutations in cancer predisposition genes are more common than expected in children with RMS; and 2) children with fusion-negative tumors have a higher burden of germline mutations than those with fusion-positive tumors. The framework for this study relies on >600 well-annotated samples collected from newly diagnosed RMS patients and stored in the Children’s Oncology Group (COG) Biopathology Center.
Background: Understanding the cancer genome is seen as a key step in improving outcomes for cancer patients. Genomic assays are emerging as a possible avenue to personalised medicine in breast cancer. The majority of work in this area has targeted primary tumours, however, and very few studies have performed comprehensive profiling of advanced disease. Evolution of the cancer genome during the natural history of breast cancer is largely unknown, as is the profile of disease at death. We sought to study in detail these aspects of advanced breast cancers that have resulted in lethal disease. Methods and Findings: Three patients with ER-positive, HER2-negative breast cancer and one patient with triple-negative breast cancer underwent rapid autopsy as part of an institutional prospective community-based rapid autopsy program. Cases represented a range of management problems in breast cancer, including late relapse after early stage disease; de novo metastatic disease; discordant disease response and disease refractory to treatment. Between 5 and 12 metastatic sites were collected at autopsy together with available primary tumours and longitudinal metastatic biopsies taken during life. Samples underwent paired tumour-normal whole exome sequencing and single nucleotide polymorphism arrays. Subclonal architectures were inferred by jointly analysing all samples from each patient. Mutations were validated using high-depth amplicon sequencing. Between cases, there were significant differences in mutational burden, driver mutations, mutational processes and copy number variation. Within each case, we found dramatic heterogeneity in subclonal structure from primary to metastatic disease and between metastatic sites, such that no single lesion captured the breadth of disease. Metastatic cross seeding was found in each case and treatment drove subclonal diversification. 
Subclones displayed parallel evolution of treatment resistance in some cases, and apparent augmentation of key oncogenic drivers as an alternative resistance mechanism. We also observed the key role of mutational processes in subclonal evolution. Limitations of this study include the potential for bias introduced by joint analysis of formalin-fixed archival specimens with fresh specimens, and the difficulties in resolving subclones with whole exome sequencing. Other alterations that could define subclones, such as structural variants or epigenetic modifications, were not assessed. Conclusions: This study highlights the variety of mechanisms that shape the genome of metastatic breast cancer, and the value of studying advanced disease in detail. Treatment drives significant genomic heterogeneity in breast cancers, which has implications for disease monitoring and treatment selection in the personalised medicine paradigm.
The role of DNA sequence in determining replication timing (RT) and higher-order chromatin organization remains elusive. To address this question, we have developed an extra-chromosomal replication system consisting of ~200 kb human bacterial artificial chromosomes (BACs) modified with Epstein-Barr virus (EBV) replication origin elements (E-BACs). E-BACs were stably maintained as autonomous mini-chromosomes in both HeLa and human induced pluripotent stem cells (hiPSCs) and established distinct RT patterns. An E-BAC harboring an early replicating chromosomal region replicated early during S phase, while E-BACs derived from RT transition regions (TTRs) and late replicating regions replicated in mid to late S phase. Analysis of E-BAC interactions with cellular chromatin (4C-seq) revealed that the early replicating E-BAC interacted broadly throughout the genome and preferentially with the early replicating compartment of the nucleus. In contrast, mid- to late-replicating E-BACs interacted with more specific late replicating chromosomal segments, some of which were shared between different E-BACs. Together, we describe a versatile system in which to study the structure and function of chromosomal segments that are stably maintained separately from the influence of cellular chromosome context.
The ELLIPSE Consortium is an international effort to discover risk loci for prostate cancer. It includes the meta-analysis of existing GWAS data as well as novel GWAS, exome, and iCOGS genotyping. The GWAS meta-analysis includes the following cases and controls from studies of European ancestry: UK GWAS stage 1 (Illumina Infinium HumanHap 550 Array: 1854 cases and 1894 controls), UK GWAS stage 2 (Illumina iSELECT: 3706 cases and 3884 controls), CAPS1 (Affymetrix GeneChip 500K: 474 cases and 482 controls), CAPS2 (Affymetrix GeneChip 5.0K: 1458 cases and 512 controls), BPC3 (Illumina Human610 Illumina: 2068 cases and 3011 controls), PEGASUS (HumanOmni2.5: 4600 cases and 2941 controls). The OMNI 2.5M genotyping was conducted for 977 prostate cancer cases from UKGPCS. The Exome SNP array genotyping was conducted for 4741 subjects from UKGPCS. The iCOGs genotyping was conducted for 10366 subjects which includes the Multiethnic Cohort (n=1648) and UKGPCS (n=8718). Below is a description of each study that contributed to the meta-analysis of men of European ancestry. Information about the studies that contributed to the multiethnic meta-analysis can be found on the associated study page and also in Conti et al (Nature Genetics, PMID:33398198). UK GWAS Stage 1 (UK1) and Stage 2 (UK2): The UK Genetic Prostate Cancer Study (UKGPCS) was first established in 1993 and is the largest prostate cancer study of its kind in the UK, involving nearly 189 hospitals. We are based at The Institute of Cancer Research in Sutton, Surrey, and collaborate with the Royal Marsden NHS Foundation Trust. Our aim is to find genetic changes which are associated with prostate cancer risk. Our target is to recruit 26,000 gentlemen into the UKGPCS by 2017. Men are eligible to take part if they fit into at least one of the following groups: They have been diagnosed with prostate cancer at 60 years of age or under (up to their 61st birthday). 
They have been diagnosed with prostate cancer and have a first, second or third degree relative with prostate cancer, where at least one of these men was diagnosed at 65 years of age or under. They are affected and have 3 or more cases of prostate cancer on one side of their family. They are a prostate cancer patient at the Royal Marsden NHS Foundation Trust. We have to date recruited around 16,000 men on whom we have germline DNA and clinical data at diagnosis. The UK GWAS is based on genotyping of 541,129 SNPs in 1,854 individuals with clinically detected (non-PSA-screened) prostate cancer (cases) and 1,894 controls. 43,671 SNPs showing strong evidence of association in stage 1 were followed up by genotyping a further 3,268 cases and 3,366 controls from UK and Melbourne in stage 2. CAPS1 and CAPS2: The CAPS (Cancer of the Prostate in Sweden) study represents a large Swedish population-based cancer study, comprising 3,161 cases and 2,149 controls, recruited between 2001 and 2003. Biopsy-confirmed prostate cancer cases were identified and recruited from four out of six regional cancer registries in Sweden, diagnosed between July 2001 and October 2003. Clinical data including TNM stage, Gleason grade and PSA levels at the time of diagnosis were retrieved through record linkage to the National Prostate Cancer Registry. Control subjects, who were recruited concurrently with case subjects, were randomly selected from the Swedish Population Registry and matched according to the expected age distribution of cases (groups of 5-year intervals) and geographic region. Whole blood was collected from all individuals for extraction of genomic DNA. A GWAS was conducted in two parts. In the first phase (CAPS1) 498 cases and 502 controls were genotyped; in the second phase (CAPS2) 1,483 cases and 519 controls were genotyped. Genotyping was performed using the GeneChip Human Mapping 500K (CAPS1) and 5.0K (CAPS2) Array Set from Affymetrix (Santa Clara, CA). 
The National Cancer Institute Breast and Prostate Cancer Cohort Consortium, BPC3: BPC3 was a consortium of prospective cohort studies investigating genetic and gene-environmental risk factors for breast and prostate cancer. Each study selected cases and controls for this study as described below. The clinical criteria defining advanced prostate cancer (Gleason = 8 or stage C/D) were either obtained from medical records or cancer registries. The Gleason score source was either surgical specimens (radical prostatectomy or autopsy) or the diagnostic biopsy (needle biopsy or TURP). When multiple Gleason scores were available the surgical value was used. PLCO was removed from the analysis as the samples were included in the Pegasus GWAS described below. In total 2,473 advanced prostate cancer cases and 3,534 controls were included in the analysis following QC. ATBC, Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study: ATBC was a randomized, placebo-controlled primary prevention trial to investigate whether α-tocopherol or ß-carotene supplementation reduced the incidence of lung or other cancers in male smokers. Between 1985 and 1988, 29,133 men ages 50 to 69 years were enrolled in the trial from Finland and randomized to supplementation (50 mg α-tocopherol, 20mg ß-carotene, or both) or placebo. Men with a prior history of cancer, other than non-melanoma skin cancer or carcinoma in situ, were excluded from participating. Incident cancer cases are identified through linkage with the Finnish Cancer Registry, which has ~100% ascertainment of cancer cases nationwide. Cases included 249 men diagnosed with advanced prostate cancer (Gleason = 8 or stage C/D) from 1985 to 2003 with DNA available. Controls were 1,271 men selected previously for a GWAS of lung cancer in ATBC without a diagnosis of prostate cancer. 
CPSII, Cancer Prevention Study II: CPSII is a cohort study started in 1982 to investigate the relationship between dietary, lifestyle and other etiologic factors and cancer mortality. Approximately 1.2 million men and women enrolled in the study from 50 states in the U.S. In 1992, a subset of these participants (n= ~184,000) were enrolled in the CPSII Nutrition Cohort to examine the relationship between dietary and other exposures and cancer incidence. Blood samples were drawn from approximately 39,376 members of the Nutritional Cohort from 1998 to 2001, and buccal cells were collected from 69,467 members from 2001 to 2002. Cancer cases are identified by self-report through follow-up questionnaires followed by verification through medical records and/or linkage to state cancer registries as well as death certificates. A total of 660 advanced prostate cancer cases (Gleason = 8 or stage III/IV) with a source of DNA were identified for this study. Controls were 660 men matched on ethnicity, date of birth, sample collection date and DNA type. EPIC, European Prospective Investigation into Cancer and Nutrition: EPIC is a prospective study designed to investigate both genetic and non-genetic risk factors for different forms of cancer. Study participants were almost all white Europeans. Approximately 500,000 individuals (150,000 men) in EPIC were recruited between 1992 and 2000, from 23 centers in 10 European countries. Overall approximately 400,000 subjects also provided a blood sample at recruitment. The methods of recruitment and details of the study design are described in detail elsewhere. In brief, study participants completed an extensive questionnaire on both dietary and nondietary data at recruitment. The present study includes subjects from advanced prostate cancer cases (Gleason = 8 or stage III/IV) matched to controls based on study center, length of follow-up, age at enrollment (± 6 months), fasting and time of day of blood collection (± 1 hour). 
The advanced prostate cancer subjects were from 8 of the 10 participating countries: Denmark, Germany, Greece, Italy, the Netherlands, Spain, Sweden and the United Kingdom (UK). France and Norway were not included in the current study because these cohorts only included female subjects. All participants gave written consent for the research and approval for the study was obtained from the ethical review board from all local institutions in the regions where participants had been recruited for the EPIC study. HPFS, Health Professionals Follow-up Study: HPFS began in 1986 and is an ongoing prospective cohort study of 51,529 United States male dentists, optometrists, osteopaths, podiatrists, pharmacists, and veterinarians 40 to 75 years of age. The baseline questionnaire provided information on age, marital status, height and weight, ancestry, medications, smoking history, disease history, physical activity, and diet. At baseline the cohort was 97% white, 2% Asian American, and 1% African American. The median follow-up through 2005 was 10.5 years (range 2-19 years). Self-reported prostate cancer diagnoses were confirmed by obtaining medical and/or pathology records. Prostate cancer deaths are either reported by family members in response to follow-up questionnaires, discovered by the postal system, or the National Death Index. Questionnaires are sent every two years to surviving men to update exposure and medical history. In 1993 and 1994, a blood specimen was collected from 18,018 men without a prior diagnosis of cancer. Prostate cancer cases are matched to controls on birth year (+/-1) and ethnicity. Controls are selected from those who are cancer-free at the time of the case’s diagnosis, and had a prostate-specific antigen test after the date of blood draw. 
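The HPFS control-selection rule described above (birth year within one year, same ethnicity, cancer-free at the time of the case's diagnosis, and a PSA test after the blood draw) can be illustrated with a minimal sketch. The record layout, field names, and the use of calendar years rather than exact dates are assumptions made for illustration only and are not taken from the study's actual data or software.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Participant:
    """Hypothetical participant record; all field names are illustrative."""
    pid: str
    birth_year: int
    ethnicity: str
    cancer_dx_year: Optional[int]   # None if never diagnosed with cancer
    psa_after_blood_draw: bool      # had a PSA test after the blood draw


def eligible_controls(case: Participant, cohort: List[Participant]) -> List[Participant]:
    """Return cohort members who satisfy the matching rule for one case.

    Assumes case.cancer_dx_year is set; dates are simplified to years.
    """
    matches = []
    for p in cohort:
        if p.pid == case.pid:
            continue  # a case cannot serve as its own control
        # Cancer-free at the time of the case's diagnosis.
        cancer_free = p.cancer_dx_year is None or p.cancer_dx_year > case.cancer_dx_year
        if (abs(p.birth_year - case.birth_year) <= 1
                and p.ethnicity == case.ethnicity
                and cancer_free
                and p.psa_after_blood_draw):
            matches.append(p)
    return matches
```

In practice a study would typically sample one or more controls per case from the eligible set; that sampling step is not specified in the text above and is left out of the sketch.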
MEC, Multiethnic Cohort: The Multiethnic Cohort Study is a population-based prospective cohort study that was initiated between 1993 and 1996 and includes subjects from various ethnic groups - African Americans and Latinos primarily from California (greater Los Angeles area) and Native Hawaiians, Japanese-Americans, and European Americans primarily from Hawaii. State drivers’ license files were the primary sources used to identify study subjects in Hawaii and California. Additionally, in Hawaii, state voter’s registration files were used, and, in California, Health Care Financing Administration (HCFA) files were used to identify additional African American men. All participants (n=215,251) returned a 26-page self-administered baseline questionnaire that obtained general demographic, medical and risk factor information. In the cohort, incident cancer cases are identified annually through cohort linkage to population-based cancer Surveillance, Epidemiology, and End Results (SEER) registries in Hawaii and Los Angeles County as well as to the California State cancer registry. Information on stage and grade of disease is also obtained through the SEER registries. Blood sample collection in the MEC began in 1994 and targeted incident prostate cancer cases and a random sample of study participants to serve as controls for genetic analyses. PHS, Physicians Health Study: PHS was a randomized trial of aspirin and ß carotene for cardiovascular disease and cancer among 22,071 U.S. male physicians ages 40-84 years at randomization; none had a cancer diagnosis at baseline. The original trial has ended, but the men continue to be followed. From 1982 to 1984, blood samples were collected from 14,916 physicians before randomization. Participants are sent yearly questionnaires to ascertain endpoints. Whenever a physician reports cancer, we request permission to obtain the medical records, and cancers are confirmed by pathology report. 
We obtain death certificates and pertinent medical records for all deaths. Follow-up for nonfatal outcomes in PHS is over 97% complete, and for mortality, over 99%. PLCO, Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial: PLCO is a multicenter, randomized trial to evaluate screening methods for the early detection of prostate, lung, colorectal and ovarian cancer. Between 1993 and 2001, over 150,000 men and women ages 55-74 years were recruited from ten centers in the United States (Birmingham, AL; Denver, CO; Detroit, MI; Honolulu, HI; Marshfield, WI; Minneapolis, MN; Pittsburgh, PA; Salt Lake City, UT; St. Louis, MO; and Washington, D.C.). Men randomized to the screening arm underwent prostate cancer screening with prostate-specific antigen (PSA) annually for six years and digital rectal exam annually for four years. Blood specimens were collected from participants randomized to the screening arm of the trial, and buccal cell specimens were obtained from participants randomized to the control arm. Cases included 754 men diagnosed with advanced prostate cancer (Gleason = 8 or stage III/IV) from either arm of the trial. Of these cases, 317 were genotyped previously as part of Cancer Genetic Markers of Susceptibility (CGEMS), a GWAS for prostate cancer. Controls included 1,491 men without a diagnosis of prostate cancer from the screening arm of the PLCO trial. All subjects provided informed consent to participate in genetic etiology studies of cancer and other traits. This study was approved by the institutional review boards at the ten centers and the National Cancer Institute. PLCO was removed from the meta-analysis of the BPC3 studies because its samples are included in PEGASUS, described below. PEGASUS, Prostate cancer Genome-wide Association Study of Uncommon Susceptibility loci: PEGASUS is a genome-wide association study nested within the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial. 
PLCO is a multicenter, randomized trial to evaluate screening methods for the early detection of prostate, lung, colorectal and ovarian cancer. Between 1993 and 2001, over 150,000 men and women ages 55-74 years were recruited from ten centers in the United States (Birmingham, AL; Denver, CO; Detroit, MI; Honolulu, HI; Marshfield, WI; Minneapolis, MN; Pittsburgh, PA; Salt Lake City, UT; St. Louis, MO; and Washington, D.C.). Men randomized to the screening arm underwent prostate cancer screening with prostate-specific antigen annually for six years and digital rectal exam annually for four years. Blood specimens were collected from participants randomized to the screening arm of the trial, and buccal cell specimens were obtained from participants randomized to the control arm. Cases included 4,598 men of European ancestry diagnosed with prostate cancer from either arm of the trial and controls included 2,941 men of European ancestry without a diagnosis of cancer from the screening arm, matched on age and year of randomization. All subjects provided informed consent, and the study was approved by the institutional review board at the National Cancer Institute. Funding: This work was supported by the GAME-ON U19 initiative for prostate cancer (ELLIPSE): U19 CA148537. The BPC3 was supported by the U.S. National Institutes of Health, National Cancer Institute (cooperative agreements U01-CA98233, U01-CA98710, U01-CA98216, and U01-CA98758, and Intramural Research Program of NIH/National Cancer Institute, Division of Cancer Epidemiology and Genetics). The ATBC study and PEGASUS were supported in part by the Intramural Research Program of the NIH and the National Cancer Institute. Additionally, this research was supported by U.S. Public Health Service contracts N01-CN-45165, N01-RC-45035, N01-RC-37004 and HHSN261201000006C from the National Cancer Institute, Department of Health and Human Services. 
CAPS: The Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Stockholm, Sweden was supported by the Cancer Risk Prediction Center (CRisP; www.crispcenter.org), a Linneus Centre (Contract ID 70867902) financed by the Swedish Research Council, Swedish Research Council (grant: K2010-70X-20430-04-3), the Swedish Cancer Foundation (grant: 09-0677), the Hedlund Foundation, the Söderberg Foundation, the Enqvist Foundation, ALF funds from the Stockholm County Council. Stiftelsen Johanna Hagstrand och Sigfrid Linnér’s Minne, Karlsson’s Fund for urological and surgical research. We thank and acknowledge all of the participants in the Stockholm-1 study. We thank Carin Cavalli-Björkman and Ami Rönnberg Karlsson for their dedicated work in the collection of data. Michael Broms is acknowledged for his skillful work with the databases. KI Biobank is acknowledged for handling the samples and for DNA extraction. Hans Wallinder at Aleris Medilab and Sven Gustafsson at Karolinska University Laboratory are thanked for their good cooperation in providing historical laboratory results. UKGPCS would like to acknowledge the NCRN nurses and Consultants for their work in the UKGPCS study. We thank all the patients who took part in this study. This work was supported by Cancer Research UK (grants: C5047/A7357, C1287/A10118, C1287/A5260, C5047/A3354, C5047/A10692, C16913/A6135 and C16913/A6835). We would also like to thank the following for funding support: Prostate Research Campaign UK (now Prostate Cancer UK), The Institute of Cancer Research and The Everyman Campaign, The National Cancer Research Network UK, The National Cancer Research Institute (NCRI) UK. We are grateful for support of NIHR funding to the NIHR Biomedical Research Centre at The Institute of Cancer Research and The Royal Marsden NHS Foundation Trust. The MEC was supported by NIH grants CA63464, CA54281 and CA098758.
Age-related Macular Degeneration (AMD) is a leading cause of incurable blindness in people over the age of 65. AMD is a late-onset multi-factorial neurodegenerative disease and its pathogenesis involves interaction of genetic and environmental factors. Several chromosomal regions have been associated with AMD susceptibility through linkage analysis (Swaroop et al., 2009). More recent studies provide strong evidence that variants within the CFH gene cluster on chromosome 1 and at/near LOC387715/ARMS2 on chromosome 10 are strongly associated with the disease. Variants at other genes including C2/BF, C3, CFI and APOE4, also contribute to AMD susceptibility. Our primary goals are to identify genetic variants and haplotypes that are associated with AMD. The underlying hypothesis is that DNA variation(s) in multiple genetic susceptibility loci will predispose individuals to AMD pathogenesis, and comparison of DNA of cases and controls should identify these susceptibility variants. Our studies are focused on the genetic analysis of advanced AMD and should provide novel insights into disease diagnosis, progression and pathology. We have assembled a collaborative group of researchers from the University of Michigan, Mayo Clinic, University of Pennsylvania, and the AREDS group including National Eye Institute intramural investigators, who collected clinical data and DNA from a large number of patients affected with AMD and from unaffected controls. The primary source of funding was National Eye Institute. Study 1: To identify genetic variants and haplotypes that are associated with AMD, we submitted and obtained usable genotyping data on 2185 patients and 1155 controls from the Center for Inherited Disease Research (CIDR). Study 2: To identify rare coding variants associated with a large increase in risk of AMD, 10 candidate loci spanning 57 genes were sequenced in 2,335 cases and 789 controls. 
Probes were designed to capture 96.5% of the coding sequence and 35% of total locus sequence, generating an average 123Mb of on-target sequence per individual at 127x average depth. Substudies:
phs000182 AMD-MMAP Cohort Study: A Joint Genome-Wide Association Study
phs000246 Fuchs' Corneal Dystrophy GWAS
phs000457 MMAP Methylation in AMD
phs000685 Age-Related Macular Degeneration Targeted Sequencing Study
Sickle cell disease (SCD) is caused by homozygosity for a single mutation of the beta hemoglobin gene. Despite the constancy of this genetic abnormality, the clinical course of patients with SCD is remarkably variable. SCD can affect the function and cause the failure of multiple organ systems through the pathophysiologic processes of vaso-occlusion and hemolysis. These pathophysiological processes are complex and expected to impact multiple organ systems in a variety of ways. This study, therefore, was designed to identify genetic factors that predispose SCD patients to develop specific end-organ complications and to experience more or less severe clinical courses. We enrolled > 700 patients with Hb SS, Hb S-beta0 thalassemia and HbSC being followed primarily at three southeastern U.S. regional institutions (Duke University Medical Center, University of North Carolina Medical Center, and Emory University Medical Center). Medical information obtained included the presence or absence of specific targeted outcomes (overall disease severity as well as specific types of end organ damage). Clinical data include medical status (history, physical, examination, and laboratory results) and information regarding potentially confounding environmental factors. Limited plasma samples are available for correlative studies (e.g. of cytokine levels, coagulation activation). Targeted SNP for candidate gene analysis as well as GWAS has been performed on most samples. Whole genome sequencing has been conducted through the TOPMed Consortium. The subjects in this analysis were collected as part of a larger study, "Outcome Modifying Genes in Sickle Cell Disease" (OMG-SCD) aimed at identifying genetic modifiers for sickle cell disease. More information about the study can be found in Elmariah et al. (2014), PMID: 24478166. 
Clinical and genetic data have been used to identify genetic characteristics predisposing patients with SCD to a more or less severe overall clinical course as well as to individual organ-specific complications. It is anticipated that identification of such genetic factors will reveal new therapeutic targets individualized to specific complications of SCD, leading to improved outcomes and increased life expectancy for patients with SCD.
A whole-exome sequencing (WES) study was conducted in 3,233 cases diagnosed with multiple primary cancers and 3,229 matched cancer-free controls (90% non-Hispanic white, 3% African-American, 3% East Asian, and 4% Latino) selected from individuals in the Kaiser Permanente Research Bank (KPRB) who were members of the Kaiser Permanente Northern California (KPNC) health plan. Cancer-free controls were matched to cases on age at specimen collection (within 2 years), sex, genotyping array (which matched on self-reported race/ethnicity), closest distance using the first two principal components for genetic ancestry, and reagent kit. Cases and controls were drawn from two prospective KPRB cohorts: the Research Program on Genes, Environment and Health (RPGEH) and the ProHealth study. Participants were sequenced by the Regeneron Genetics Center using the Illumina NovaSeq 6000 platform, and sample preparation and quality control were performed using a high-throughput, fully automated system [PMID: 33087929]. Reads were aligned to the GRCh38 reference genome, and variants were called using WeCall [PMID: 33087929]. Participants with sex discordance, 20x coverage at less than 80% of targeted sites, and/or contamination greater than 5% were excluded. After quality control, we retained n = 6,247 (3,111 cases, 3,136 controls) individuals for downstream analyses. Among participants selected for this WES study, n = 5,432 (2,299 cases; 3,133 controls) consented to deposition of data to the National Institutes of Health (NIH). Further quality control was applied to filter low-quality variants. Genotype calls with low depth of coverage (DP) were set to missing (DP < 7 for SNPs and DP < 10 for indels), after which sites with low allele balance (AB), defined as variants without at least one sample having AB ≥ 15% for SNPs or AB ≥ 20% for indels, were removed. Lastly, variants with missingness > 10% and Hardy-Weinberg equilibrium p-value < 10^-15 were excluded.
Further description of quality control and downstream single-variant and gene-based analyses is available in Cavazos et al., 2022 [medRxiv].
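The variant-level filters described above (depth masking, allele balance, missingness, Hardy-Weinberg equilibrium) can be illustrated with a minimal sketch. This is not the study's pipeline: the function names, the per-sample tuple layout, the treatment of AB as the alt-read fraction evaluated over non-missing calls, the use of a 1-df chi-square test for HWE, and the reading of the last step as excluding a variant that fails either the missingness or the HWE criterion are all assumptions made for illustration. Only the numeric thresholds come from the description.

```python
import math

# Thresholds quoted in the study description.
MIN_DP = {"snp": 7, "indel": 10}        # minimum depth per genotype call
MIN_AB = {"snp": 0.15, "indel": 0.20}   # minimum allele balance per site
MAX_MISSINGNESS = 0.10
MIN_HWE_P = 1e-15


def hwe_chisq_pvalue(n_hom_ref, n_het, n_hom_alt):
    """1-df chi-square HWE p-value (assumption: the study may use an exact test)."""
    n = n_hom_ref + n_het + n_hom_alt
    if n == 0:
        return 1.0
    p = (2 * n_hom_ref + n_het) / (2 * n)  # reference allele frequency
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic site: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_hom_ref, n_het, n_hom_alt)
    chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chisq / 2.0))


def filter_variant(kind, calls):
    """Return (keep, masked_genotypes) for one variant.

    kind  : "snp" or "indel"
    calls : list of (genotype, depth, allele_balance) per sample, where
            genotype is the alt-allele count (0, 1, 2) or None if missing.
    """
    # Step 1: set low-depth genotype calls to missing.
    masked = [(g if g is not None and dp >= MIN_DP[kind] else None, ab)
              for g, dp, ab in calls]
    genotypes = [g for g, _ in masked]
    # Step 2: drop the site unless at least one remaining call meets the AB threshold.
    if not any(g is not None and ab >= MIN_AB[kind] for g, ab in masked):
        return False, genotypes
    # Step 3: drop the site if too many calls are missing.
    missing = sum(g is None for g in genotypes) / len(genotypes)
    if missing > MAX_MISSINGNESS:
        return False, genotypes
    # Step 4: drop the site if it departs extremely from HWE.
    counts = [sum(g == k for g in genotypes) for k in (0, 1, 2)]
    if hwe_chisq_pvalue(*counts) < MIN_HWE_P:
        return False, genotypes
    return True, genotypes
```

For example, `filter_variant("snp", calls)` takes one variant's per-sample calls and reports whether the site survives all four steps, along with the genotypes after depth masking. A real pipeline would more likely operate on VCF records with a dedicated library; the plain-tuple representation here only keeps the sketch self-contained.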
Glioblastoma is the most common brain tumour. It is characterised by a poor prognosis and by recurrence after multimodal treatment, and the search for preventable risk factors has so far been largely inconclusive. Recently, merging datasets deposited at the EGA allowed Aaron Diaz’s team to discover that glioblastoma cells shift toward a mesenchymal phenotype when the tumour recurs.

Challenges in glioblastoma research

As for virtually all cancer types, many efforts have been made to unveil the molecular features responsible for the disease, and several cellular pathways have indeed been identified as frequently mutated in glioblastoma. Nonetheless, targeted therapies based on the identified genes have so far failed to improve outcomes, so survival still relies mostly on a standard treatment that has been unchanged since 2005. This is a frustrating situation for both the scientific and medical communities, and above all for the patients, who still face a dreadful path. Some researchers have hypothesised that this may be due to the inability to efficiently target cancer stem cells, which give rise to the other cell types and thereby drive cancer relapse. In 2019 Charles P. Couturier and colleagues sequenced RNA from single cells of freshly excised glioblastomas from 16 patients, and demonstrated that glioblastoma cells recapitulate normal brain cell development, with a conserved neural cancer cell hierarchy centred around glial progenitor-like cells. In this way, they helped identify possible target cells for improving the efficacy and durability of treatment.

Data upcycling at the EGA: the glioblastoma case study

Single-cell RNA sequencing data are generated with a laborious and expensive protocol. The quality of the starting material is crucial (cell samples freshly extracted from patients), and several attempts are often needed before sequencing results of reliable quality are produced. Recruiting large numbers of patients is also challenging.
The sequencing data produced by Kevin Petrecca’s group in Montreal, Canada and deposited at the EGA (EGAS00001004422) were recently upcycled by Lin Wang in San Francisco, California, pooled with the team’s own freshly produced data, and then deposited at the EGA as Dataset EGAS00001004909. The data merge allowed Aaron Diaz’s team to discover that glioblastoma cells shift toward a mesenchymal phenotype when the tumour recurs. They profiled 86 primary-recurrent, patient-matched paired glioblastoma samples with single-nucleus RNA sequencing, among other techniques. These very comprehensive results led the team to challenge the findings from several other cancer fields, where standard chemo-radiation therapy exerts selection pressure at the level of genomic alterations; this is not the case for glioblastoma, where the pressure results in a phenotypic transition between cellular states. Several technical controls were performed to ensure that merging the data did not introduce a bias in the results, and an inter-table analysis demonstrated a nearly equal contribution to overall variance from each of the studies included, indicating that the findings were not due to inter-laboratory technical effects. A novel principal-component analysis showed that the largest contribution to variation in primary glioblastoma neoplastic cells was an axis between MES (mesenchymal) and proneural expression programs. In summary, "treating glioblastoma often makes a MES", as Lucy Stead commented on this remarkable work. See the Nature Cancer paper from Aaron Diaz’s team: strong technical tools, state-of-the-art analysis, and data from different sources converging on the same outcomes made possible a significant step toward better handling of a frightful cancer.

References

Couturier, C.P., Ayyadhury, S., Le, P.U. et al. Single-cell RNA-seq reveals that glioblastoma recapitulates a normal neurodevelopmental hierarchy. Nat Commun 11, 3406 (2020).
Stead, L.F. Treating glioblastoma often makes a MES. Nat Cancer 3, 1446–1448 (2022).
Wang, L., Jung, J., Babikir, H. et al. A single-cell atlas of glioblastoma evolution under therapy reveals cell-intrinsic and cell-extrinsic therapeutic targets. Nat Cancer 3, 1534–1552 (2022).

Related links:
Kevin Petrecca’s group dataset deposited at the EGA: EGAS00001004422
Lin Wang dataset deposited at the EGA: EGAS00001004909