Original description of the study: From ELLIPSE (linked to the PRACTICAL consortium), we contributed ~78,000 SNPs to the OncoArray. A large fraction of the content was derived from the GWAS meta-analyses in European ancestry populations (overall and aggressive disease; ~27K SNPs). We also selected just over 10,000 SNPs from the meta-analyses in the non-European populations, with a majority of these SNPs coming from the analysis of overall prostate cancer in African ancestry populations as well as from the multiethnic meta-analysis. A substantial fraction of SNPs (~28,000) were also selected for fine-mapping of 53 loci not included in the common fine-mapping regions (tagging at r2>0.9 across ±500kb regions). We also selected a few thousand SNPs associated with PSA levels and/or disease survival, SNPs from candidate lists provided by study collaborators, and SNPs from meta-analyses of exome SNP chip data from the Multiethnic Cohort and UK studies. The Contributing Studies: Aarhus: Hospital-based, Retrospective, Observational. Source of cases: Patients treated for prostate adenocarcinoma at Department of Urology, Aarhus University Hospital, Skejby (Aarhus, Denmark). Source of controls: Age-matched males treated for myocardial infarction or undergoing coronary angioplasty, but with no prostate cancer diagnosis based on information retrieved from the Danish Cancer Register and the Danish Cause of Death Register. AHS: Nested case-control study within prospective cohort. Source of cases: linkage to cancer registries in study states. Source of controls: matched controls from cohort ATBC: Prospective, nested case-control. Source of cases: Finnish male smokers aged 50-69 years at baseline. Source of controls: Finnish male smokers aged 50-69 years at baseline BioVu: Cases identified in a biobank linked to electronic health records. 
Source of cases: A total of 214 cases were identified in the VUMC de-identified electronic health records database (the Synthetic Derivative) and shipped to USC for genotyping in April 2014. The following criteria were used to identify cases: Age 18 or greater; male; African Americans (Black) only. Note that African ancestry is not self-identified; it is administratively or third-party assigned (which has been shown to be highly correlated with genetic ancestry for African Americans in BioVU; see references). Source of controls: Controls were identified in the de-identified electronic health record. Unfortunately, they were not age matched to the cases, and therefore cannot be used for this study. Canary PASS: Prospective, Multi-site, Observational Active Surveillance Study. Source of cases: clinic based from Beth Israel Deaconess Medical Center, Eastern Virginia Medical School, University of California at San Francisco, University of Texas Health Sciences Center San Antonio, University of Washington, VA Puget Sound. Source of controls: N/A CCI: Case series, Hospital-based. Source of cases: Cases identified through clinics at the Cross Cancer Institute. Source of controls: N/A CerePP French Prostate Cancer Case-Control Study (ProGene): Case-Control, Prospective, Observational, Hospital-based. Source of cases: Patients, treated in French departments of Urology, who had histologically confirmed prostate cancer. Source of controls: Controls were recruited as participating in a systematic health screening program and found unaffected (normal digital rectal examination and total PSA < 4 ng/ml, or negative biopsy if PSA > 4 ng/ml). COH: hospital-based cases and controls from outside. Source of cases: Consented prostate cancer cases at City of Hope. Source of controls: Consented unaffected males that were part of other studies where they consented to have their DNA used for other research studies. COSM: Population-based cohort. Source of cases: General population. 
Source of controls: General population CPCS1: Case-control - Denmark. Source of cases: Hospital referrals. Source of controls: Copenhagen General Population Study CPCS2: Source of cases: Hospital referrals. Source of controls: Copenhagen General Population Study CPDR: Retrospective cohort. Source of cases: Walter Reed National Military Medical Center. Source of controls: Walter Reed National Military Medical Center ACS_CPS-II: Nested case-control derived from a prospective cohort study. Source of cases: Identified through self-report on follow-up questionnaires and verified through medical records or cancer registries, identified through cancer registries or the National Death Index (with prostate cancer as the primary cause of death). Source of controls: Cohort participants who were cancer-free at the time of diagnosis of the matched case, also matched on age (±6 mo) and date of biospecimen donation (±6 mo). EPIC: Case-control - Germany, Greece, Italy, Netherlands, Spain, Sweden, UK. Source of cases: Identified through record linkage with population-based cancer registries in Italy, the Netherlands, Spain, Sweden and UK. In Germany and Greece, follow-up is active and achieved through checks of insurance records and cancer and pathology registries as well as via self-reported questionnaires; self-reported incident cancers are verified through medical records. Source of controls: Cohort participants without a diagnosis of cancer EPICAP: Case-control, Population-based, ages less than 75 years at diagnosis, Hérault, France. Source of cases: Prostate cancer cases in all public hospitals and private urology clinics of the département of Hérault in France. Case validation by the Hérault Cancer Registry. Source of controls: Population-based controls, frequency age matched (5-year groups). Quotas by socio-economic status (SES) were used to obtain a distribution by SES among controls identical to the SES distribution among general population men, conditional on age. 
ERSPC: Population-based randomized trial. Source of cases: Men with PrCa from screening arm ERSPC Rotterdam. Source of controls: Men without PrCa from screening arm ERSPC Rotterdam ESTHER: Case-control, Prospective, Observational, Population-based. Source of cases: Prostate cancer cases in all hospitals in the state of Saarland, from 2001-2003. Source of controls: Random sample of participants from routine health check-up in Saarland, in 2000-2002 FHCRC: Population-based, case-control, ages 35-74 years at diagnosis, King County, WA, USA. Source of cases: Identified through the Seattle-Puget Sound SEER cancer registry. Source of controls: Randomly selected, age-frequency matched residents from the same county as cases Gene-PARE: Hospital-based. Source of cases: Patients that received radiotherapy for treatment of prostate cancer. Source of controls: n/a Hamburg-Zagreb: Hospital-based, Prospective. Source of cases: Prostate cancer cases seen at the Department of Oncology, University Hospital Center Zagreb, Croatia. Source of controls: Population-based (Croatia), healthy men, older than 50, with no medical record of cancer, and no family history of cancer (1st & 2nd degree relatives) HPFS: Nested case-control. Source of cases: Participants of the HPFS cohort. Source of controls: Participants of the HPFS cohort IMPACT: Observational. Source of cases: Carriers and non-carriers (with a known mutation in the family) of the BRCA1 and BRCA2 genes, aged between 40 and 69, who are undergoing prostate screening with annual PSA testing. This cohort has been diagnosed with prostate cancer during the study. Source of controls: Carriers and non-carriers (with a known mutation in the family) of the BRCA1 and BRCA2 genes, aged between 40 and 69, who are undergoing prostate screening with annual PSA testing. This cohort has not been diagnosed with prostate cancer during the study. IPO-Porto: Hospital-based. Source of cases: Early onset and/or familial prostate cancer. 
Source of controls: Blood donors Karuprostate: Case-control, Retrospective, Population-based. Source of cases: From FWI (Guadeloupe): 237 consecutive incident patients with histologically confirmed prostate cancer attending public and private urology clinics; From Democratic Republic of Congo: 148 consecutive incident patients with histologically confirmed prostate cancer attending the University Clinic of Kinshasa. Source of controls: From FWI (Guadeloupe): 277 controls recruited from men participating in a free systematic health screening program open to the general population; From Democratic Republic of Congo: 134 controls recruited from subjects attending the University Clinic of Kinshasa KULEUVEN: Hospital-based, Prospective, Observational. Source of cases: Prostate cancer cases recruited at the University Hospital Leuven. Source of controls: Healthy males with no history of prostate cancer recruited at the University Hospitals, Leuven. LAAPC: Subjects were participants in a population-based case-control study of aggressive prostate cancer conducted in Los Angeles County. Cases were identified through the Los Angeles County Cancer Surveillance Program rapid case ascertainment system. Eligible cases included African American, Hispanic, and non-Hispanic White men diagnosed with a first primary prostate cancer between January 1, 1999 and December 31, 2003. Eligible cases also had (a) prostatectomy with documented tumor extension outside the prostate, (b) metastatic prostate cancer in sites other than prostate, (c) needle biopsy of the prostate with Gleason grade ≥8, or (d) needle biopsy with Gleason grade 7 and tumor in more than two thirds of the biopsy cores. Eligible controls were men never diagnosed with prostate cancer, living in the same neighborhood as a case, and were frequency matched to cases on age (± 5 y) and race/ethnicity. 
Controls were identified by a neighborhood walk algorithm, which proceeds through an obligatory sequence of adjacent houses or residential units beginning at a specific residence that has a specific geographic relationship to the residence where the case lived at diagnosis. Malaysia: Case-control. Source of cases: Patients attending the outpatient urology or uro-onco clinic at University Malaya Medical Center. Source of controls: Population-based, age matched (5-year groups), ascertained through electoral register, Subang Jaya, Selangor, Malaysia MCC-Spain: Case-control. Source of cases: Identified through the urology departments of the participating hospitals. Source of controls: Population-based, frequency age and region matched, ascertained through the rosters of the primary health care centers MCCS: Nested case-control, Melbourne, Victoria. Source of cases: Identified by linkage to the Victorian Cancer Registry. Source of controls: Cohort participants without a diagnosis of cancer MD Anderson: Participants in this study were identified from epidemiological prostate cancer studies conducted at the University of Texas MD Anderson Cancer Center in the Houston Metropolitan area. Cases were accrued in the Houston Medical Center and were not restricted with respect to Gleason score, stage or PSA. Controls were identified via random-digit-dialing or among hospital visitors and they were frequency matched to cases on age and race. Lifestyle, demographic, and family history data were collected using a standardized questionnaire. MDACC_AS: A prospective cohort study. Source of cases: Men with clinically organ-confined prostate cancer meeting eligibility criteria for a prospective cohort study of active surveillance at MD Anderson Cancer Center. Source of controls: N/A MEC: The Multiethnic Cohort (MEC) is comprised of over 215,000 men and women recruited from Hawaii and the Los Angeles area between 1993 and 1996. 
Between 1995 and 2006, over 65,000 blood samples were collected from participants for genetic analyses. To identify incident cancer cases, the MEC was cross-linked with the population-based Surveillance, Epidemiology and End Results (SEER) registries in California and Hawaii, and unaffected cohort participants with blood samples were selected as controls MIAMI (WFPCS): Prostate cancer cases and controls were recruited from the Departments of Urology and Internal Medicine of the Wake Forest University School of Medicine using sequential patient populations as described previously (PMID:15342424). All study subjects received a detailed description of the study protocol and signed their informed consent, as approved by the medical center's Institutional Review Board. The general eligibility criteria were (i) able to comprehend informed consent and (ii) without previously diagnosed cancer. The exclusion criteria were (i) clinical diagnosis of autoimmune diseases; (ii) chronic inflammatory conditions; and (iii) infections within the past 6 weeks. Blood samples were collected from all subjects. MOFFITT: Hospital-based. Source of cases: clinic based from Moffitt Cancer Center. Source of controls: Moffitt Cancer Center affiliated Lifetime cancer screening center NMHS: Case-control, clinic based, Nashville TN. Source of cases: All urology clinics in Nashville, TN. Source of controls: Men without prostate cancer at prostate biopsy. PCaP: The North Carolina-Louisiana Prostate Cancer Project (PCaP) is a multidisciplinary population-based case-only study designed to address racial differences in prostate cancer through a comprehensive evaluation of social, individual and tumor level influences on prostate cancer aggressiveness. PCaP enrolled approximately equal numbers of African Americans and Caucasian Americans with newly-diagnosed prostate cancer from North Carolina (42 counties) and Louisiana (30 parishes) identified through state tumor registries. 
African American PCaP subjects with DNA, who agreed to future use of specimens for research, participated in OncoArray analysis. PCMUS: Case-control - Sofia, Bulgaria. Source of cases: Patients of Clinic of Urology, Alexandrovska University Hospital, Sofia, Bulgaria, PrCa histopathologically confirmed. Source of controls: 72 patients with verified BPH and PSA<3,5; 78 healthy controls from the MMC Biobank, no history of PrCa PHS: Nested case-control. Source of cases: Participants of the PHS1 trial/cohort. Source of controls: Participants of the PHS1 trial/cohort PLCO: Nested case-control. Source of cases: Men with a confirmed diagnosis of prostate cancer from the PLCO Cancer Screening Trial. Source of controls: Controls were men enrolled in the PLCO Cancer Screening Trial without a diagnosis of cancer at the time of case ascertainment. Poland: Case-control. Source of cases: men with unselected prostate cancer, diagnosed in north-western Poland at the University Hospital in Szczecin. Source of controls: cancer-free men from the same population, taken from the healthy adult patients of family doctors in the Szczecin region PROCAP: Population-based, Retrospective, Observational. Source of cases: Cases were ascertained from the National Prostate Cancer Register of Sweden Follow-Up Study, a retrospective nationwide cohort study of patients with localized prostate cancer. Source of controls: Controls were selected among men referred for PSA testing in laboratories in Stockholm County, Sweden, between 2010 and 2012. PROGReSS: Hospital-based, Prospective, Observational. Source of cases: Prostate cancer cases from the Hospital Clínico Universitario de Santiago de Compostela, Galicia, Spain. Source of controls: Cancer-free men from the same population ProMPT: A study to collect samples and data from subjects with and without prostate cancer. Retrospective, Experimental. Source of cases: Subjects attending outpatient clinics in hospitals. 
Source of controls: Subjects attending outpatient clinics in hospitals ProtecT: Trial of treatment. Samples taken from subjects invited for PSA testing from the community at nine centers across the United Kingdom. Source of cases: Subjects who have a proven diagnosis of prostate cancer following testing. Source of controls: Identified through invitation of subjects in the community. PROtEuS: Case-control, population-based. Source of cases: All new histologically-confirmed cases, aged 75 years or younger, diagnosed between 2005 and 2009, actively ascertained across Montreal French hospitals. Source of controls: Randomly selected from the Provincial electoral list of French-speaking men between 2005 and 2009, from the same area of residence as cases and frequency-matched on age. QLD: Case-control. Source of cases: A longitudinal cohort study (Prostate Cancer Supportive Care and Patient Outcomes Project: ProsCan) conducted in Queensland, through which men newly diagnosed with prostate cancer from 26 private practices and 10 public hospitals were directly referred to ProsCan at the time of diagnosis by their treating clinician (age range 43-88 years). All cases had histopathologically confirmed prostate cancer, following presentation with an abnormal serum PSA and/or lower urinary tract symptoms. Source of controls: Controls comprised healthy male blood donors with no personal history of prostate cancer, recruited through (i) the Australian Red Cross Blood Services in Brisbane (age range 19-76 years) and (ii) the Australian Electoral Commission (AEC) (age and post-code/ area matched to ProsCan, age range 54-90 years). RAPPER: Multi-centre, hospital based blood sample collection study in patients enrolled in clinical trials with prospective collection of radiotherapy toxicity data. Source of cases: Prostate cancer patients enrolled in radiotherapy trials: CHHiP, RT01, Dose Escalation, RADICALS, Pelvic IMRT, PIVOTAL. 
Source of controls: N/A SABOR: Prostate Cancer Screening Cohort. Source of cases: Men >45 yrs of age participating in annual PSA screening. Source of controls: Males participating in annual PSA prostate cancer risk evaluations (funded by NCI biomarkers discovery and validation grant), recruited through University of Texas Health Science Center at San Antonio and affiliated sites or through study advertisements, enrolment open to the community SCCS: Case-control in cohort, Southeastern USA. Prospective, Observational, Population-based. Source of cases: SCCS entry population. Source of controls: SCCS entry population SCPCS: Population-based, Retrospective, Observational. Source of cases: South Carolina Central Cancer Registry. Source of controls: Health Care Financing Administration beneficiary file SEARCH: Case-control - East Anglia, UK. Source of cases: Men < 70 years of age registered with prostate cancer at the population-based cancer registry, Eastern Cancer Registration and Information Centre, East Anglia, UK. Source of controls: Men attending general practice in East Anglia with no known prostate cancer diagnosis, frequency matched to cases by age and geographic region SNP_Prostate_Ghent: Hospital-based, Retrospective, Observational. Source of cases: Men treated with IMRT as primary or postoperative treatment for prostate cancer at the Ghent University Hospital between 2000 and 2010. Source of controls: Employees of the University hospital and members of social activity clubs, without a history of any cancer. SPAG: Hospital-based, Retrospective, Observational. Source of cases: Guernsey. Source of controls: Guernsey STHM2: Population-based, Retrospective, Observational. Source of cases: Cases were selected among men referred for PSA testing in laboratories in Stockholm County, Sweden, between 2010 and 2012. Source of controls: Controls were selected among men referred for PSA testing in laboratories in Stockholm County, Sweden, between 2010 and 2012. 
PCPT: Case-control from a randomized clinical trial. Source of cases: Randomized clinical trial. Source of controls: Randomized clinical trial SELECT: Case-cohort from a randomized clinical trial. Source of cases: Randomized clinical trial. Source of controls: Randomized clinical trial TAMPERE: Case-control - Finland, Retrospective, Observational, Population-based. Source of cases: Identified through linkage to the Finnish Cancer Registry and patient records; and the Finnish arm of the ERSPC study. Source of controls: Cohort participants without a diagnosis of cancer UGANDA: The Uganda Prostate Cancer Study is a case-control study of prostate cancer in Kampala, Uganda, that was initiated in 2011. Men with prostate cancer were enrolled from the Urology unit at Mulago Hospital and men without prostate cancer (i.e. controls) were enrolled from other clinics (i.e. surgery) at the hospital. UKGPCS: ICR, UK. Source of cases: Cases identified through clinics at the Royal Marsden hospital and nationwide NCRN hospitals. Source of controls: Ken Muir's control- 2000 ULM: Case-control - Germany. Source of cases: familial cases (n=162): identified through questionnaires for family history by collaborating urologists all over Germany; sporadic cases (n=308): prostatectomy series performed in the Clinic of Urology Ulm between 2012 and 2014. Source of controls: age-matched controls (n=188): age-matched men without prostate cancer and negative family history collected in hospitals of Ulm WUGS/WUPCS: Case series, USA. Source of cases: Identified through clinics at Washington University in St. Louis. Source of controls: Men diagnosed and managed with prostate cancer in University based clinic. Acknowledgement Statements: Aarhus: This study was supported by the Danish Strategic Research Council (now Innovation Fund Denmark) and the Danish Cancer Society. The Danish Cancer Biobank (DCB) is acknowledged for biological material. 
AHS: This work was supported by the Intramural Research Program of the NIH, National Cancer Institute, Division of Cancer Epidemiology and Genetics (Z01CP010119). ATBC: This research was supported in part by the Intramural Research Program of the NIH and the National Cancer Institute. Additionally, this research was supported by U.S. Public Health Service contracts N01-CN-45165, N01-RC-45035, N01-RC-37004, HHSN261201000006C, and HHSN261201500005C from the National Cancer Institute, Department of Health and Human Services. BioVu: The dataset(s) used for the analyses described were obtained from Vanderbilt University Medical Center's BioVU which is supported by institutional funding and by the National Center for Research Resources, Grant UL1 RR024975-01 (which is now at the National Center for Advancing Translational Sciences, Grant 2 UL1 TR000445-06). Canary PASS: PASS was supported by Canary Foundation and the National Cancer Institute's Early Detection Research Network (U01 CA086402) CCI: This work was awarded by Prostate Cancer Canada and is proudly funded by the Movember Foundation - Grant # D2013-36. The CCI group would like to thank David Murray, Razmik Mirzayans, and April Scott for their contribution to this work. CerePP French Prostate Cancer Case-Control Study (ProGene): None reported COH: SLN is partially supported by the Morris and Horowitz Families Endowed Professorship COSM: The Swedish Research Council, the Swedish Cancer Foundation CPCS1 & CPCS2: Department of Clinical Biochemistry, Herlev and Gentofte Hospital, Copenhagen University Hospital, Herlev Ringvej 75, DK-2730 Herlev, Denmark. CPCS1 would like to thank the participants and staff of the Copenhagen General Population Study for their important contributions. CPDR: Uniformed Services University for the Health Sciences HU0001-10-2-0002 (PI: David G. McLeod, MD) CPS-II: The American Cancer Society funds the creation, maintenance, and updating of the Cancer Prevention Study II cohort. 
CPS-II thanks the participants and Study Management Group for their invaluable contributions to this research. We would also like to acknowledge the contribution to this study from central cancer registries supported through the Centers for Disease Control and Prevention National Program of Cancer Registries, and cancer registries supported by the National Cancer Institute Surveillance Epidemiology and End Results program. EPIC: The coordination of EPIC is financially supported by the European Commission (DG-SANCO) and the International Agency for Research on Cancer. The national cohorts are supported by the Danish Cancer Society (Denmark); the Deutsche Krebshilfe, Deutsches Krebsforschungszentrum and Federal Ministry of Education and Research (Germany); the Hellenic Health Foundation, Greek Ministry of Health; Greek Ministry of Education (Greece); the Italian Association for Research on Cancer (AIRC) and National Research Council (Italy); the Dutch Ministry of Public Health, Welfare and Sports (VWS), Netherlands Cancer Registry (NKR), LK Research Funds, Dutch Prevention Funds, Dutch ZON (Zorg Onderzoek Nederland), World Cancer Research Fund (WCRF); the Statistics Netherlands (The Netherlands); the Health Research Fund (FIS), Regional Governments of Andalucía, Asturias, Basque Country, Murcia and Navarra, Spanish Ministry of Health ISCIII RETIC (RD06/0020), Red de Centros RCESP, C03/09 (Spain); the Swedish Cancer Society, Swedish Scientific Council and Regional Government of Skåne and Västerbotten, Fundacion Federico SA (Sweden); the Cancer Research UK, Medical Research Council (United Kingdom). EPICAP: The EPICAP study was supported by grants from Ligue Nationale Contre le Cancer, Ligue départementale du Val de Marne; Fondation de France; Agence Nationale de sécurité sanitaire de l'alimentation, de l'environnement et du travail (ANSES). 
The EPICAP study group would like to thank all urologists, Antoinette Anger and Hasina Randrianasolo (study monitors), Anne-Laure Astolfi, Coline Bernard, Oriane Noyer, Marie-Hélène De Campo, Sandrine Margaroline, Louise N'Diaye, and Sabine Perrier-Bonnet (Clinical Research nurses). ERSPC: This study was supported by the Dutch Cancer Society (KWF 94-869, 98-1657, 2002-277, 2006-3518, 2010-4800), The Netherlands Organisation for Health Research and Development (ZonMW-002822820, 22000106, 50-50110-98-311, 62300035), The Dutch Cancer Research Foundation (SWOP), and an unconditional grant from Beckman-Coulter-Hybritech Inc. ESTHER: The ESTHER study was supported by a grant from the Baden Württemberg Ministry of Science, Research and Arts. The ESTHER group would like to thank Hartwig Ziegler, Sonja Wolf, Volker Hermann, Heiko Müller, Karina Dieffenbach, Katja Butterbach for valuable contributions to the study. FHCRC: The FHCRC studies were supported by grants R01-CA056678, R01-CA082664, and R01-CA092579 from the US National Cancer Institute, National Institutes of Health, with additional support from the Fred Hutchinson Cancer Research Center. FHCRC would like to thank all the men who participated in these studies. Gene-PARE: The Gene-PARE study was supported by grants 1R01CA134444 from the U.S. National Institutes of Health, PC074201 and W81XWH-15-1-0680 from the Prostate Cancer Research Program of the Department of Defense and RSGT-05-200-01-CCE from the American Cancer Society. Hamburg-Zagreb: None reported HPFS: The Health Professionals Follow-up Study was supported by grants UM1CA167552, CA133891, CA141298, and P01CA055075. 
HPFS are grateful to the participants and staff of the Physicians' Health Study and Health Professionals Follow-Up Study for their valuable contributions, as well as the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, and WY. IMPACT: The IMPACT study was funded by The Ronald and Rita McAulay Foundation, CR-UK Project grant (C5047/A1232), Cancer Australia, AICR Netherlands A10-0227, Cancer Australia and Cancer Council Tasmania, NIHR, EU Framework 6, Cancer Councils of Victoria and South Australia, and Philanthropic donation to Northshore University Health System. We acknowledge support from the National Institute for Health Research (NIHR) to the Biomedical Research Centre at The Institute of Cancer Research and Royal Marsden Foundation NHS Trust. IMPACT acknowledges the IMPACT study steering committee, collaborating centres, and participants. IPO-Porto: The IPO-Porto study was funded by Fundação para a Ciência e a Tecnologia (FCT; UID/DTP/00776/2013 and PTDC/DTP-PIC/1308/2014) and by IPO-Porto Research Center (CI-IPOP-16-2012 and CI-IPOP-24-2015). MC and MPS are research fellows from Liga Portuguesa Contra o Cancro, Núcleo Regional do Norte. SM is a research fellow from FCT (SFRH/BD/71397/2010). IPO-Porto would like to express our gratitude to all patients and families who have participated in this study. Karuprostate: The Karuprostate study was supported by the French National Health Directorate and by the Association pour la Recherche sur les Tumeurs de la Prostate. Karuprostate thanks Séverine Ferdinand. KULEUVEN: F.C. and S.J. are holders of grants from FWO Vlaanderen (G.0684.12N and G.0830.13N), the Belgian federal government (National Cancer Plan KPC_29_023), and a Concerted Research Action of the KU Leuven (GOA/15/017). TVDB is holder of a doctoral fellowship of the FWO. 
LAAPC: This study was funded by grant R01CA84979 (to S.A. Ingles) from the National Cancer Institute, National Institutes of Health. Malaysia: The study was funded by the University Malaya High Impact Research Grant (HIR/MOHE/MED/35). Malaysia thanks all associates in the Urology Unit, University of Malaya, Cancer Research Initiatives Foundation (CARIF) and the Malaysian Men's Health Initiative (MMHI). MCCS: MCCS cohort recruitment was funded by VicHealth and Cancer Council Victoria. The MCCS was further supported by Australian NHMRC grants 209057, 251553, and 504711, and by infrastructure provided by Cancer Council Victoria. Cases and their vital status were ascertained through the Victorian Cancer Registry (VCR) and the Australian Institute of Health and Welfare (AIHW), including the National Death Index and the Australian Cancer Database. MCC-Spain: The study was partially funded by the Accion Transversal del Cancer, approved on the Spanish Ministry Council on the 11th October 2007, by the Instituto de Salud Carlos III-FEDER (PI08/1770, PI09/00773-Cantabria, PI11/01889-FEDER, PI12/00265, PI12/01270, and PI12/00715), by the Fundación Marqués de Valdecilla (API 10/09), by the Spanish Association Against Cancer (AECC) Scientific Foundation and by the Catalan Government DURSI grant 2009SGR1489. Samples: Biological samples were stored at the Parc de Salut MAR Biobank (MARBiobanc; Barcelona) which is supported by Instituto de Salud Carlos III FEDER (RD09/0076/00036). Also sample collection was supported by the Xarxa de Bancs de Tumors de Catalunya sponsored by Pla Director d'Oncologia de Catalunya (XBTC). MCC-Spain acknowledges the contribution from Esther Gracia-Lavedan in preparing the data. We thank all the subjects who participated in the study and all MCC-Spain collaborators. MD Anderson: Prostate Cancer Case-Control Studies at MD Anderson (MDA) supported by grants CA68578, ES007784, DAMD W81XWH-07-1-0645, and CA140388. 
MDACC_AS: None reported MEC: Funding provided by NIH grant U19CA148537 and grant U01CA164973. MIAMI (WFPCS): ACS MOFFITT: The Moffitt group was supported by the US National Cancer Institute (R01CA128813, PI: J.Y. Park). NMHS: Funding for the Nashville Men's Health Study (NMHS) was provided by the National Institutes of Health, grant number R01CA121060. PCaP only data: The North Carolina - Louisiana Prostate Cancer Project (PCaP) is carried out as a collaborative study supported by the Department of Defense contract DAMD 17-03-2-0052. For HCaP-NC follow-up data: The Health Care Access and Prostate Cancer Treatment in North Carolina (HCaP-NC) study is carried out as a collaborative study supported by the American Cancer Society award RSGT-08-008-01-CPHPS. For studies using both PCaP and HCaP-NC follow-up data please use: The North Carolina - Louisiana Prostate Cancer Project (PCaP) and the Health Care Access and Prostate Cancer Treatment in North Carolina (HCaP-NC) study are carried out as collaborative studies supported by the Department of Defense contract DAMD 17-03-2-0052 and the American Cancer Society award RSGT-08-008-01-CPHPS, respectively. For any PCaP data, please include: The authors thank the staff, advisory committees and research subjects participating in the PCaP study for their important contributions. For studies using PCaP DNA/genotyping data, please include: We would like to acknowledge the UNC BioSpecimen Facility and LSUHSC Pathology Lab for our DNA extractions, blood processing, storage and sample disbursement (https://genome.unc.edu/bsp). For studies using PCaP tissue, please include: We would like to acknowledge the RPCI Department of Urology Tissue Microarray and Immunoanalysis Core for our tissue processing, storage and sample disbursement. 
For studies using HCaP-NC follow-up data, please use: The Health Care Access and Prostate Cancer Treatment in North Carolina (HCaP-NC) study is carried out as a collaborative study supported by the American Cancer Society award RSGT-08-008-01-CPHPS. The authors thank the staff, advisory committees and research subjects participating in the HCaP-NC study for their important contributions. For studies that use both PCaP and HCaP-NC, please use: The authors thank the staff, advisory committees and research subjects participating in the PCaP and HCaP-NC studies for their important contributions. PCMUS: The PCMUS study was supported by the Bulgarian National Science Fund, Ministry of Education and Science (contract DOO-119/2009; DUNK01/2-2009; DFNI-B01/28/2012) with additional support from the Science Fund of Medical University - Sofia (contract 51/2009; 8I/2009; 28/2010). PHS: The Physicians' Health Study was supported by grants CA34944, CA40360, CA097193, HL26490, and HL34595. PHS members are grateful to the participants and staff of the Physicians' Health Study and Health Professionals Follow-Up Study for their valuable contributions, as well as the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, and WY. PLCO: This PLCO study was supported by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH. PLCO thanks Drs. Christine Berg and Philip Prorok, Division of Cancer Prevention at the National Cancer Institute, the screening center investigators and staff of the PLCO Cancer Screening Trial for their contributions to the PLCO Cancer Screening Trial. We thank Mr. Thomas Riley, Mr. Craig Williams, Mr. Matthew Moore, and Ms. Shannon Merkle at Information Management Services, Inc., for their management of the data and Ms. Barbara O'Brien and staff at Westat, Inc. 
for their contributions to the PLCO Cancer Screening Trial. We also thank the PLCO study participants for their contributions to making this study possible. Poland: None reported. PROCAP: PROCAP was supported by the Swedish Cancer Foundation (08-708, 09-0677). PROCAP thanks and acknowledges all of the participants in the PROCAP study. We thank Carin Cavalli-Björkman and Ami Rönnberg Karlsson for their dedicated work in the collection of data. Michael Broms is acknowledged for his skilful work with the databases. KI Biobank is acknowledged for handling the samples and for DNA extraction. We acknowledge The NPCR steering group: Pär Stattin (chair), Anders Widmark, Stefan Karlsson, Magnus Törnblom, Jan Adolfsson, Anna Bill-Axelson, Ove Andrén, David Robinson, Bill Pettersson, Jonas Hugosson, Jan-Erik Damber, Ola Bratt, Göran Ahlgren, Lars Egevad, and Roy Ehrnström. PROGReSS: The PROGReSS study is funded by grants from the Spanish Ministry of Health (INT15/00070; INT16/00154; FIS PI10/00164, FIS PI13/02030; FIS PI16/00046); the Spanish Ministry of Economy and Competitiveness (PTA2014-10228-I), and Fondo Europeo de Desarrollo Regional (FEDER 2007-2013). ProMPT: Funded by CRUK, NIHR, MRC, Cambridge Biomedical Research Centre. ProtecT: Funded by NIHR. ProtecT and ProMPT would like to acknowledge the support of The University of Cambridge, Cancer Research UK. Cancer Research UK grants (C8197/A10123) and (C8197/A10865) supported the genotyping team. We would also like to acknowledge the support of the National Institute for Health Research which funds the Cambridge Biomedical Research Centre, Cambridge, UK. We would also like to acknowledge the support of the National Cancer Research Prostate Cancer: Mechanisms of Progression and Treatment (PROMPT) collaborative (grant code G0500966/75466) which has funded tissue and urine collections in Cambridge. 
We are grateful to staff at the Wellcome Trust Clinical Research Facility, Addenbrooke's Clinical Research Centre, Cambridge, UK for their help in conducting the ProtecT study. We also acknowledge the support of the NIHR Cambridge Biomedical Research Centre, the DOH HTA (ProtecT grant), and the NCRI/MRC (ProMPT grant) for help with the bio-repository. The UK Department of Health funded the ProtecT study through the NIHR Health Technology Assessment Programme (projects 96/20/06, 96/20/99). The ProtecT trial and its linked ProMPT and CAP (Comparison Arm for ProtecT) studies are supported by Department of Health, England; Cancer Research UK grant number C522/A8649, Medical Research Council of England grant number G0500966, ID 75466, and The NCRI, UK. The epidemiological data for ProtecT were generated through funding from the Southwest National Health Service Research and Development. DNA extraction in ProtecT was supported by USA Dept of Defense award W81XWH-04-1-0280, Yorkshire Cancer Research and Cancer Research UK. The authors would like to acknowledge the contribution of all members of the ProtecT study research group. The views and opinions expressed therein are those of the authors and do not necessarily reflect those of the Department of Health of England. The bio-repository from ProtecT is supported by the NCRI (ProMPT) Prostate Cancer Collaborative and the Cambridge BMRC grant from NIHR. We thank the National Institute for Health Research, Hutchison Whampoa Limited, the Human Research Tissue Bank (Addenbrooke's Hospital), and Cancer Research UK. 
PROtEuS: PROtEuS was supported financially through grants from the Canadian Cancer Society (13149, 19500, 19864, 19865) and the Cancer Research Society, in partnership with the Ministère de l'enseignement supérieur, de la recherche, de la science et de la technologie du Québec, and the Fonds de la recherche du Québec - Santé. PROtEuS would like to thank its collaborators and research personnel, and the urologists involved in subject recruitment. We also wish to acknowledge the special contribution made by Ann Hsing and Anand Chokkalingam to the conception of the genetic component of PROtEuS. QLD: The QLD research is supported by The National Health and Medical Research Council (NHMRC) Australia Project Grants (390130, 1009458) and NHMRC Career Development Fellowship and Cancer Australia PdCCRS funding to J Batra. The QLD team would like to acknowledge and sincerely thank the urologists, pathologists, data managers and patient participants who have generously and altruistically supported the QLD cohort. RAPPER: RAPPER is funded by Cancer Research UK (C1094/A11728; C1094/A18504) and Experimental Cancer Medicine Centre funding (C1467/A7286). The RAPPER group thank Rebecca Elliott for project management. SABOR: The SABOR research is supported by NIH/NCI Early Detection Research Network, grant U01 CA0866402-12. Also supported by the Cancer Center Support Grant to the Cancer Therapy and Research Center from the National Cancer Institute (US) P30 CA054174. SCCS: SCCS is funded by NIH grant R01 CA092447, and SCCS sample preparation was conducted at the Epidemiology Biospecimen Core Lab that is supported in part by the Vanderbilt-Ingram Cancer Center (P30 CA68485). 
Data on SCCS cancer cases used in this publication were provided by the Alabama Statewide Cancer Registry; Kentucky Cancer Registry, Lexington, KY; Tennessee Department of Health, Office of Cancer Surveillance; Florida Cancer Data System; North Carolina Central Cancer Registry, North Carolina Division of Public Health; Georgia Comprehensive Cancer Registry; Louisiana Tumor Registry; Mississippi Cancer Registry; South Carolina Central Cancer Registry; Virginia Department of Health, Virginia Cancer Registry; Arkansas Department of Health, Cancer Registry, 4815 W. Markham, Little Rock, AR 72205. The Arkansas Central Cancer Registry is fully funded by a grant from National Program of Cancer Registries, Centers for Disease Control and Prevention (CDC). Data on SCCS cancer cases from Mississippi were collected by the Mississippi Cancer Registry which participates in the National Program of Cancer Registries (NPCR) of the Centers for Disease Control and Prevention (CDC). The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the CDC or the Mississippi Cancer Registry. SCPCS: SCPCS is funded by CDC grant S1135-19/19, and SCPCS sample preparation was conducted at the Epidemiology Biospecimen Core Lab that is supported in part by the Vanderbilt-Ingram Cancer Center (P30 CA68485). SEARCH: SEARCH is funded by a program grant from Cancer Research UK (C490/A10124) and supported by the UK National Institute for Health Research Biomedical Research Centre at the University of Cambridge. SNP_Prostate_Ghent: The study was supported by the National Cancer Plan, financed by the Federal Office of Health and Social Affairs, Belgium. 
SPAG: Wessex Medical Research, Hope for Guernsey, MUG, HSSD, MSG, Roger Allsopp. STHM2: STHM2 was supported by grants from The Strategic Research Programme on Cancer (StratCan), Karolinska Institutet; the Linné Centre for Breast and Prostate Cancer (CRISP, number 70867901), Karolinska Institutet; The Swedish Research Council (number K2010-70X-20430-04-3) and The Swedish Cancer Society (numbers 11-0287 and 11-0624); Stiftelsen Johanna Hagstrand och Sigfrid Linnérs minne; Swedish Council for Working Life and Social Research (FAS), number 2012-0073. STHM2 acknowledges the Karolinska University Laboratory, Aleris Medilab, Unilabs and the Regional Prostate Cancer Registry for performing analyses and helping to retrieve data. We thank Carin Cavalli-Björkman and Britt-Marie Hune for their enthusiastic work as research nurses and Astrid Björklund for skilful data management. We wish to thank the BBMRI.se biobank facility at Karolinska Institutet for biobank services. PCPT & SELECT are funded by Public Health Service grants U10CA37429 and 5UM1CA182883 from the National Cancer Institute. SWOG and SELECT thank the site investigators and staff and, most importantly, the participants who donated their time to this trial. TAMPERE: The Tampere (Finland) study was supported by the Academy of Finland (251074), The Finnish Cancer Organisations, Sigrid Juselius Foundation, and the Competitive Research Funding of the Tampere University Hospital (X51003). The PSA screening samples were collected by the Finnish part of ERSPC (European Study of Screening for Prostate Cancer). TAMPERE would like to thank Riina Liikanen, Liisa Maeaettaenen and Kirsi Talala for their work on samples and databases. 
UGANDA: None reported. UKGPCS: UKGPCS would also like to thank the following for funding support: The Institute of Cancer Research and The Everyman Campaign, The Prostate Cancer Research Foundation, Prostate Research Campaign UK (now Prostate Action), The Orchid Cancer Appeal, The National Cancer Research Network UK, The National Cancer Research Institute (NCRI) UK. We are grateful for support of NIHR funding to the NIHR Biomedical Research Centre at The Institute of Cancer Research and The Royal Marsden NHS Foundation Trust. UKGPCS would also like to acknowledge the NCRN nurses, data managers, and consultants for their work in the UKGPCS study. UKGPCS would like to thank all urologists and other persons involved in the planning, coordination, and data collection of the study. ULM: The Ulm group received funds from the German Cancer Aid (Deutsche Krebshilfe). WUGS/WUPCS: WUGS would like to thank the following for funding support: The Anthony DeNovi Fund, the Donald C. McGraw Foundation, and the St. Louis Men's Group Against Cancer.
Following the demonstration of its life-saving effect in clinical trials, dexamethasone quickly became standard-of-care in the treatment of severe COVID-19. Beneficial effects were reported for patients requiring supplemental oxygen or invasive mechanical ventilation. Yet, a substantial proportion of patients still progressed to a critical condition or succumbed to the disease, despite timely initiation of glucocorticoid treatment. No molecular or cellular correlate of this beneficial treatment response has been defined. Here, we identify distinct cellular and molecular changes in circulating immune cells in patients with COVID-19 elicited in response to dexamethasone treatment. The most profound transcriptional changes in response to dexamethasone treatment were noted in monocytes and in B cells. In monocytes, we observed a reversal of hallmark signatures previously associated with COVID-19 severity, and the induction of a specific monocyte substate enriched in glucocorticoid response gene expression which we here refer to as dexamethasone response cluster. Molecular responses to dexamethasone treatment were directly linked to clinical outcome, since the reversal of pathogenic signatures and induction of glucocorticoid response genes were enriched in patients who survived the disease in response to dexamethasone treatment. Cellular responsiveness of circulating monocytes was thus identified as a correlate of clinical response to dexamethasone treatment. Changes in monocyte transcriptomes could also be linked to epigenetic alterations highlighting early differences in monocytes of dexamethasone responders and non-responders. Further, monocyte single-cell transcriptome-derived signatures were enriched in whole blood transcriptomes from patients with fatal outcome in two independent cohorts, highlighting the potential clinical value for early identification of non-responders who are refractory to dexamethasone treatment. 
Overall, our findings indicate that the life-saving effects of dexamethasone in COVID-19 patients are linked to a specific immunomodulatory effect based on the reversion of monocyte dysregulation. Our study elucidates the potential mechanism of action of one of the most efficient and cost-effective drugs for the treatment of COVID-19. It further highlights the potential of single-cell omics for monitoring target engagement of immunomodulatory drugs and for the stratification of patients for precision medicine approaches.
IDH-mutant lower-grade gliomas can undergo malignant progression via temozolomide-driven hypermutation. Patient-derived cells (PDC) that model the genetically distinct hypermutated (HM) tumor subgroup are generally lacking, and few if any human brain tumor cell models are from defined evolutionary time points. Here, we characterize multiple PDC derived from independent surgical specimens of IDH1-mutant recurrences, including an ATRX and TP53-mutant astrocytoma and a 1p/19q co-deleted and TERT promoter-mutant oligodendroglioma. We determined the evolutionary time points represented by each PDC using exome sequencing and phylogenetic reconstruction, comparing the PDC and single cell clones of the PDC (scPDC) to multiple spatiotemporal tumor tissue samples, PDC-derived xenografts (PDX) and patient-matched blood. The tumor samples exhibited TMZ-induced mutagenesis and a branching pattern of evolution. We found clear evidence of two fully independent founder HM clones in the tumor tissue that are faithfully represented by independent PDC. The PDC, scPDC and PDX also shared the mutagenesis signature and represent the mid and later evolutionary time points of their corresponding tumors. The PDC maintained the tumor subtype-defining features over many passages, including heterozygosity of the IDH1 R132H mutation, production of 2-hydroxyglutarate (2-HG), and subtype-specific telomere maintenance mechanisms. The PDC from both tumors exhibited anchorage-independent growth in soft agar. The oligodendroglioma PDC formed infiltrative intracranial tumors with characteristic oligodendroglioma histology, initially with a long period to tumor formation. We conclude that the PDC, scPDC and PDX faithfully model the heterogeneous clonal origins of the corresponding tumor tissue. The multilevel analysis also provides new insight into the intratumoral heterogeneity and vast mutational load of HM glioma. 
The PDC from multiple evolutionary time points presented in the context of full clinical timelines may be useful to model evolution and intratumoral heterogeneity, important sources of therapeutic failure.
This study is a collaboration between the Center for Applied Genomics (CAG) at Children's Hospital of Philadelphia (CHOP) and the Brain Behavior Laboratory at the University of Pennsylvania (Penn). The cohort consists of youths aged 8-21 years who consulted the CHOP network and volunteered to participate in genomic studies of complex pediatric disorders. All participants underwent clinical assessment, including a neuropsychiatric structured interview and review of electronic medical records. They were also administered a neuroscience based computerized neurocognitive battery (CNB) and a subsample underwent neuroimaging. These are described separately below. Clinical Testing: GOASSESS, a computerized, structured screener developed from a modified version of the Kiddie-Schedule for Affective Disorders and Schizophrenia (K-SADS, Kaufman et al. 1997, PMID: 9204677). Components of the interview include a timeline of life events, demographics and medical history, Global Assessment of Functioning, and Interviewer Observations. A psychopathology symptom and criterion-related assessment of mood disorders (depression, mania/hypomania), anxiety disorders (overanxious/generalized-, separation-, social-anxiety, specific phobia, panic disorder, agoraphobia, obsessive compulsive disorder, post-traumatic stress disorder), behavioral disorders (attention deficit hyperactivity disorder, oppositional defiant disorder, conduct disorder), psychosis spectrum (psychosis and prodromal symptoms), eating disorders, suicidal thinking and behavior, and treatment history. This is also based on K-SADS. An abbreviated Family Interview for Genetics Studies (FIGS) to assess major domains of psychopathology in the proband's first-degree relatives. 
Computerized Neurocognitive Battery: The CNB, developed for large-scale studies, yields measures of accuracy and speed for domains of executive-control functions (abstraction, attention, working memory), episodic memory (verbal, facial, spatial), complex cognitive processing (language reasoning, nonverbal reasoning, spatial processing), social cognition (emotion identification, emotion intensity differentiation, age differentiation) and sensorimotor and motor speed. The following neurobehavioral domains were assessed: Penn Conditional Exclusion Test is a measure of abstraction and concept formation. Participants decide which of 4 objects does not belong with the other 3, based on one of three sorting principles, which change. Feedback is used. Attention: The Penn Continuous Performance Test. Participants respond to a set of 7-segment displays whenever they form a digit or letter. Working Memory: The Letter N-back Test displays sequences of uppercase letters with a stimulus duration of 500 ms (ISI 2,500 ms.) In the 0-back condition, participants respond to a single target (i.e., X). In the 1-back condition they respond if the letter is identical to that preceding it. In the 2-back condition, they respond if the letter is identical to that presented two trials back. Verbal Memory: The Penn Word Memory Test presents 20 target words that are then mixed with 20 distracters equated for frequency, length, concreteness and low imageability. A 20 min delayed recall procedure is also administered. Face Memory: The Penn Face Memory Test presents 20 digitized faces that are then mixed with 20 distracters equated for age, gender and ethnicity. The procedure is repeated at 20 min delay. Spatial Memory: The Visual Object Learning Test uses Euclidean shapes as stimuli with the same paradigm as the word and face. Language and Analogical Reasoning: The Penn Verbal Reasoning consists of verbal analogy problems. 
Spatial Processing: Penn Line Orientation Test presents two lines at an angle, and participants click on a button that makes one line rotate until it has the same angle as the other. Emotion Processing: Facial displays of 4 emotions (Happy, Sad, Anger, Fear) and Neutral faces, 8 each, are presented and the subject identifies the emotion in a multiple-choice format. The facial stimuli are balanced for gender, age, and ethnicity. Sensory-motor Processing Speed: The task requires moving the mouse and clicking on a green square that disappears after the click. The square gets increasingly small and appears in unpredictable locations. Neuroimaging Protocol: Studies were performed at Penn using a Siemens Trio (Erlangen, Germany) 3T scanner equipped with 40 mT/m gradients and 200 mT/m/s slew-rates. RF transmission utilized a quadrature body-coil, and reception a 12-channel head coil optimized for parallel imaging. Total image acquisition time was about 45 min. Structural Imaging: The T1-weighted protocol utilized a 3D, inversion-recovery, and magnetization-prepared rapid acquisition gradient echo. Relevant imaging procedures include: Structural magnetic imaging; Diffusion tensor imaging; ASL perfusion; BOLD fMRI. Neuroimaging tasks: Fractal N-Back Task of Spatial Working Memory; Face Emotion Identification Task. Neuroimages: The current data release includes over 9700 MRI images that may be downloaded through Authorized Access.
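The response rule described for the Letter N-back Test (respond to a fixed target in the 0-back condition, or to a letter that repeats the one shown n trials earlier in the 1-back and 2-back conditions) can be illustrated with a minimal Python sketch. This is not the CNB software; the function name `nback_targets` and its parameters are hypothetical and chosen only for illustration, and timing (stimulus duration, ISI) is not modeled.

```python
def nback_targets(letters, n, zero_back_target="X"):
    """Mark the trials on which a response is expected.

    letters: sequence of uppercase letters, in presentation order.
    n: 0, 1 or 2 (the n-back condition).
    zero_back_target: the single target letter used when n == 0
        (the description gives X as the example target).
    Returns a list of booleans, one per trial.
    """
    targets = []
    for i, letter in enumerate(letters):
        if n == 0:
            # 0-back: respond whenever the fixed target letter appears.
            targets.append(letter == zero_back_target)
        else:
            # n-back: respond if the letter matches the one n trials back.
            # The first n trials have nothing to compare against.
            targets.append(i >= n and letter == letters[i - n])
    return targets


if __name__ == "__main__":
    sequence = list("ABABC")
    print(nback_targets(sequence, 2))
```

Scoring accuracy would then compare a participant's responses with this list trial by trial.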
The Electronic Medical Records and Genomics (eMERGE) Network is a National Institutes of Health (NIH)-organized and funded consortium of U.S. medical research institutions. The primary goal of the eMERGE Network is to develop, disseminate, and apply approaches to research that combine biorepositories with electronic medical record (EMR) systems for genomic discovery and genomic medicine implementation research. eMERGE was announced in September 2007 and began its third phase in September 2015. eMERGE III consists of nine study sites, two central sequencing and genotyping facilities, and a coordinating center. eMERGE Phase III aims to: 1) sequence and assess the phenotypic implication of rare variants in a custom designed eMERGEseq panel consisting of 109 genes (including 56 ACMG actionable finding list genes and the top 6 genes from each site relevant to their specific aims), as well as approximately 1400 SNPs; 2) assess the phenotypic implications of these variants by developing, validating and implementing new phenotype algorithms, 3) integrate genetic variants into EMRs to inform clinical care; and 4) create community resources. Included in this study are: ~24,000 eMERGE participants from 10 eMERGE III study sites. Corresponding demographics, body mass index measurements. Top PheWAS codes generated from a collated list of ICD codes from all study sites. Study sites and participants include: Cincinnati Children's Hospital Medical Center (CCHMC): Cincinnati Children's Hospital Medical Center (CCHMC) is a not-for-profit hospital and research center pioneering breakthrough treatments, providing outstanding family-centered patient care and training healthcare professionals for the future, and dedicated to improving health and welfare of children and to the shared purpose of discovery and practical application of new genomic information to the ordinary care of children. 
We bring a comprehensive electronic health record (EPIC), a deidentified i2b2 data warehouse of 680K patient records, a biobank with >261,000 consents that allow return of results to >84,000 patients and guardians who have provided DNA samples, and hundreds of faculty and senior staff who make genomics or informatics an active focus of their research. CCHMC will help the eMERGE III Steering Committee identify genes for the eMERGE III targeted sequencing panel, provide 3,000 DNA samples from CCHMC patients to be sequenced, review targeted gene panels from clinical care at CCHMC for somatic mosaicism and reinterpretation, and further develop and disseminate a software workflow suite for sequence analysis. We will also extend our work generating phenotype algorithms using heuristic and machine learning methods to many new childhood diseases. We will develop tools to evaluate adolescent return of results preferences, examine the ethical and legal obligations and potential to reanalyze results, and develop clinical decision support for phenotyping, test ordering, and returning sequencing results. Children's Hospital of Philadelphia (CHOP): The Center for Applied Genomics (CAG) is a specialized Center of Emphasis at the Children's Hospital of Philadelphia (CHOP), and one of the world's largest genetics research programs, with access to state-of-the-art high-throughput sequencing and genotyping technology. Our primary goal is to translate basic research findings to medical innovations. We aim to develop new and better ways to diagnose and treat children affected by rare and complex medical disorders, including asthma, autism, epilepsy, pediatric cancer, learning disabilities, and a range of rare diseases. Ultimately, our objective is to generate new diagnostic tests and to guide physicians to the most appropriate therapies. 
Participants were recruited from the CAG biorepository (n>450,000), specifically from >100,000 CHOP pediatric patients and family members, which is enriched for rare-diseases (n>12,000). Center for Applied Genomics, The Children's Hospital of Philadelphia We gratefully thank all the children and their families who enrolled in this study, and all individuals who donated blood samples for research purposes. Genotyping for this project was performed at the Center for Applied Genomics and supported by an Institutional Development Award from The Children's Hospital of Philadelphia. Sequencing was supported by the National Institutes of Health through an award from the National Human Genome Research Institute's Electronic Medical Records and Genomics (eMERGE) program (U01HG008684). Columbia University: The goal of the Columbia eMERGE III project is to develop methods for integrating genomic data in EHRs and to study the impact of such genomic informatics interventions on the health of a diverse, underserved urban adult English- and Spanish-speaking patient population in Northern Manhattan served by Columbia University Medical Center/New York-Presbyterian Hospital system. The study group is 2500 patients recruited from diverse clinics and community outreach centers of self-reported White (~61%), Asian (~11%), African-American (~11%), American Indian/Alaska Native (<1%) racial and Hispanic (~33%) ethnic backgrounds. There are two subgroups in the study cohort - a retrospective group (N=1052) that includes patients from oncology and nephrology clinics, and a prospective one (N=1448) that includes healthy individuals as well as participants with diverse medical conditions. Confirmed pathogenic variants in 70 selected genes will be returned to participants and their healthcare providers through the EHR integration. Participants are able to choose the results they receive and will have the freedom to meet with a genetic counselor and a geneticist to review results. 
The impact of genetic testing on clinical care is determined by periodic monitoring of EHRs. Geisinger: Samples and phenotype data in this study were provided by the Geisinger MyCode® Community Health Initiative. Participants are recruited across the Geisinger System via online consents or in-person consents at a hospital or clinic visit. Enrollment is ongoing with over 100,000 individuals currently consented. Partners Healthcare (Harvard University): The Partners HealthCare Biobank is a large research program designed to help researchers understand how people's health is affected by their genes, lifestyle, and environment. This large research data and sample repository provides access to high-quality, consented blood samples to help foster research, advance our understanding of the causes of common diseases, and advance the practice of medicine. For the Partners research community (Massachusetts General Hospital and Brigham and Women's Hospital), the Biobank provides: banked samples (plasma, serum, and DNA) collected from consented patients; blood samples that were discarded after clinical testing in the Crimson Cores maintained in the Brigham and Women's Hospital and Massachusetts General Hospital Pathology Departments; sample handling and preparation services; linkage of the Biobank data to the Partners Research Patient Data Registry (RPDR), a research instance of our electronic clinical chart; and data access through our research portal. To date, over 70,000 Partners patients have consented to enroll, give a blood sample, receive research results, and be re-contacted for additional research studies. The Biobank has enabled Partners investigators to compete for nationally recognized grants in personalized medicine such as a clinical electronic Medical Records and Genomics network (eMERGE) site and the national All of Us program. The Biobank currently supports over 120 Partners investigators and over 130 million dollars in NIH research. 
Kaiser Permanente Washington (KPWA) / University of Washington (UW): KPWA participants were enrolled in the eMERGE Network through the Northwest Institute of Genetic Medicine (NWIGM) biorepository, and provided the appropriate consent to receive clinically relevant genetic results (N=2,500). NWIGM is based at the University of Washington and co-managed by the University of Washington and KPWA. The purpose of the NWIGM biorepository is to build infrastructure and resources to carry out a broad range of future genetic research. KPWA members enrolled in the biorepository are asked to provide informed consent to provide a DNA sample for storage in the NWIGM biorepository. The consent is purposefully broad to serve the dual purpose of reducing the burden on researchers who wish to use this biorepository and the IRB committees who will be responsible for reviewing these requests in the future. Participants were eligible if aged 50-65 years at the time of their enrollment into the NWIGM repository, living, enrolled in KPWA's integrated group practice, and had completed an online Health Risk Appraisal. The selection algorithm was based on several data sources from the EHR at KPWA. 1) Demographics - participants with self-reported race as Asian ancestry were prioritized and selected to enrich for non-European ancestry. The KPWA eMERGE cohort includes N=1,245 members of Asian ancestry. 2) Participants were also selected for a history of colorectal cancer (N=1,255), in order to allow us to enrich for germline pathogenic variants. Mayo Clinic: The Return of Actionable Variants Empirical (RAVE) Study was approved by the Mayo Clinic IRB. We recruited 2537 participants from Mayo Clinic biobanks in Rochester, MN, who had hypercholesterolemia or colon polyps, thereby enriching for Familial hypercholesterolemia (FH) and monogenic causes of colorectal cancer (CRC). 
Additional eligibility criteria were: 1) residents of Southeast MN who were alive and aged 18-70 years; 2) LDL-C level >155 or >120 mg/dl while on lipid-lowering therapy; 3) no known cause of secondary hyperlipidemia; and 4) no cognitive impairment or dementia that would compromise their ability to give written informed consent. Based on these criteria, we identified 5270 eligible patients and obtained informed consent from 3030 participants. Recruitment was conducted in waves and utilized mailed recruitment packets consisting of a study brochure, a written informed consent form, a baseline psychosocial questionnaire, and a return postage-paid envelope. DNA of 2537 participants was sent for CLIA-certified targeted sequencing of 109 genes including genes associated with FH and CRC. Targeted sequencing and genotyping were performed in a Clinical Laboratory Improvement Amendments (CLIA)-certified laboratory. Northwestern University: Samples and data used in this study were obtained from patients from Northwestern Medicine, an integrated healthcare system, formed through a partnership of Northwestern Memorial HealthCare and Northwestern University Feinberg School of Medicine. Participants include a retrospective cohort from the Northwestern Pharmacogenomics Study, funded through the eMERGE II project, NHGRI (3U01HG006388-02S1) and a prospective cohort from the Genetic Testing and Your Health Study, funded through the eMERGE III project, NHGRI (U01HG008673). Patients were eligible to participate if they were 18 years or older and saw a physician at Northwestern Medicine. Patients consented to genetic testing and to allow their results to be placed in their electronic medical record. Vanderbilt University Medical Center: Vanderbilt University Medical Center (VUMC) participants were enrolled in the eMERGE Network through the Vanderbilt Genome-Electronic Records (VGER) project. 
Patients provided the appropriate consent to receive clinically relevant genetic results (N=2,700). Participants were eligible if they were aged 21 or over, had a healthcare provider at VUMC, and had visited the provider at least 3 times in the past 3 years.
Meharry Medical College: Inclusion of ethnic groups in genomic research is critical to identify possible reasons for health disparities. African Americans are being enrolled at various outpatient clinics of Nashville General Hospital at Meharry, an inner-city hospital primarily serving a poorer patient population. A total of 500 African Americans with four cancer types demonstrating health disparities in this population (prostate, colon, breast, and lung) are identified and approached by clinical research coordinators. The purpose of the study is to determine whether any genetic information can be identified from these patients, who have or are at high risk of one of these disparate cancers. All participants provide written informed consent and HIPAA authorization to provide blood samples for broad research use and permission to access data in their hospital electronic medical record for research now and in the future. An extensive demographic profile is obtained and entered into a REDCap database. Blood samples are obtained, and the extracted DNA is tested for a panel of alleles at Baylor. In addition, de-identified coded samples are processed and stored in a central biorepository for further DNA, RNA and proteomic analyses. The survey and phlebotomy are performed at the time of the initial contact and agreement to participate. Nearly all patients approached willingly agree to participate for potential benefit to themselves, family members, or humankind. Little concern is voiced about providing samples for genetic analysis. Study investigators will share results with the participants and providers if testing does not indicate high risk.
Results indicating increased risk or actionable alleles for the patient and/or family will be returned by a genetic counselor. The health of the patients in this cohort will continue to be followed in the EMR to identify any future associations that might explain health disparities in African Americans. Proposals from investigators to study the genetic or proteomic samples, as well as the clinical and demographic information in the repository, will be reviewed.
Please note that this version of the dataset has a handful of mismatches between genotyped and provided sex. Data with the following IDs should be removed prior to analysis: 42025287, 42137441, 42412243, 42456938, 42456946, 42672223
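The exclusion list above can be applied as a simple filter before analysis. The following is a minimal sketch, not part of the dataset's own tooling. It assumes the IDs are eight digits each (an inference from the 48-digit run in the source description, which does not state the ID width), and the record layout and the `subject_id` key are hypothetical.

```python
# The run of digits exactly as it appears in the dataset description.
EXCLUDED_RUN = "420252874213744142412243424569384245694642672223"


def split_fixed_width(run, width=8):
    """Split a run of digits into fixed-width IDs (the width is an assumption)."""
    if len(run) % width != 0:
        raise ValueError("run length is not a multiple of the assumed ID width")
    return [run[i:i + width] for i in range(0, len(run), width)]


def drop_excluded(records, excluded_ids, id_key="subject_id"):
    """Return the records whose ID is not in the exclusion list.

    `records` is a list of dicts; `id_key` is a hypothetical column name.
    """
    excluded = set(excluded_ids)
    return [rec for rec in records if str(rec[id_key]) not in excluded]


excluded_ids = split_fixed_width(EXCLUDED_RUN)
```

If the real IDs have a different width, only `split_fixed_width` needs to change; the filter itself is independent of how the list was obtained.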
Uploading files
Users who hold an ega-box-XXX account can upload files using either INBOX or FTP. Users who have a Submitter role associated with their email will only be able to upload files using INBOX.
Before uploading your files, please make sure that the names of any files to be uploaded to EGA do not contain special characters, such as # ? ( ) [ ] / \ = + < > : ; " ' , * ^ | &. These characters can cause issues with the archiving process, leading to problems for end users.
The EGA is a shared, public service with limited storage. To manage the available resources, we enforce a limit of 10TB per submission account at any one time. If you exceed this limit, a “permission denied” message will be displayed. This will prevent you from uploading more files, but you will still be able to connect to your inbox.
For submissions larger than 10TB, please perform uploads in 10TB batches: register all the metadata and then finalise the submission. Upload the next batch of files and repeat the same metadata registration and finalisation process until you have completed the file upload. Further information can be found in the SP documentation.
INBOX
The INBOX is only compatible with files encrypted using the Crypt4gh tool.
Before uploading
If you are not a registered EGA user, you will first need an EGA user account. Please note that it may take a few days for your account to be activated, as it needs to be vouched for by the EGA Helpdesk. Once your account is validated, you will be able to request a submitter role.
[Optional] Meanwhile, you can create a public key and add it to your EGA account profile. This option is not available for old submission accounts (e.g., ega-box-NNN).
As soon as you have been granted a submitter role, you will be able to connect to the EGA inbox with your username and password using the SFTP protocol. If you have registered a public key in your profile, you can also connect using this key.
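The file-name rule and the 10TB limit described earlier in this section can be checked locally before connecting. The sketch below is illustrative only and is not an EGA tool. The character set is the list given in this section, which is introduced with "such as" and so may not be exhaustive. Treating 10TB as 10 * 10**12 bytes is an assumption, and the function names are hypothetical.

```python
# Characters named in this section as problematic in file names.
FORBIDDEN_CHARS = set('#?()[]/\\=+<>:;"\',*^|&')

# Assumption: 10TB is read as decimal terabytes.
DEFAULT_LIMIT_BYTES = 10 * 10**12


def special_chars_in(filename):
    """Return the sorted list of listed special characters found in a file name.

    Pass the bare file name, not a path, because "/" is itself on the list.
    """
    return sorted(set(filename) & FORBIDDEN_CHARS)


def plan_batches(files, limit_bytes=DEFAULT_LIMIT_BYTES):
    """Group (name, size_in_bytes) pairs, in order, into batches under the limit."""
    batches, current, used = [], [], 0
    for name, size in files:
        if size > limit_bytes:
            raise ValueError(f"{name} alone exceeds the per-account limit")
        if current and used + size > limit_bytes:
            batches.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        batches.append(current)
    return batches
```

Each batch returned by `plan_batches` corresponds to one upload, metadata registration, and finalisation cycle.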
To upload files to your account, you can use the graphical user interface (GUI) or the command line.
Graphical User Interface (GUI)
We recommend using FileZilla, a free, open-source FTP client. However, you can use any other GUI that allows connecting over the SFTP protocol. With FileZilla as your GUI, follow these steps to upload files:
Create a new connection in Site Manager (File > Site Manager) and select the following options (Figure 1):
- Protocol: SFTP - SSH File Transfer Protocol
- Host: __EGA_INBOX_DOMAIN__
- Logon Type: Key file
- User: your EGA username
- Key file: Path/to/your/private_key
Figure 1: Process of establishing a new connection to __EGA_INBOX_DOMAIN__ using a key file as the logon method in FileZilla. The figure showcases the FileZilla version 3.52.2 operating on IOS v11.2.3. By following the depicted steps, users can create a secure and efficient connection to the inbox, ensuring seamless data transfers.
Click Connect, and you will log in remotely to your home directory. You can think of this folder as storage "in the EGA cloud" to which you will add your files for the EGA. The uploading area has three folders:
- To-encrypt: Files uploaded to this folder will be encrypted automatically on the fly.
- Encrypted: Files uploaded to this folder must already be encrypted with Crypt4gh. Upload your files here if your connection is unstable or you have problems completing the upload to to-encrypt.
- Etc: This folder contains two files that allow the server to show you your username and group instead of some internal numbers. Please do not upload files here; otherwise, you will get a permission denied error.
Find the files you want to upload by browsing your local storage (left side of your screen in FileZilla). Select all the files you want to upload, then right-click on them and select Upload (Figure 2).
Figure 2: Step-by-step process of manually uploading files to __EGA_INBOX_DOMAIN__ using FileZilla, with FileZilla version 3.52.2 operating on IOS v11.2.3.
The figure demonstrates how users can transfer data from their local storage to the "EGA cloud" by following the depicted steps.
Please note that regardless of which folder you upload your files to, both folders (to-encrypt, encrypted) point to the same path (/) (Figure 3). Therefore, you will see your files in both folders.
Figure 3: Both folders, to-encrypt and encrypted, point to the same path (/).
If your connection is unstable, please encrypt your files first using Crypt4gh. Then upload them to the ‘encrypted’ folder.
The example above shows how to connect to __EGA_INBOX_DOMAIN__ using the private key. However, if you prefer to log in using your credentials, you can do so. Please go to the Frequently Asked Questions (FAQs) for more information.
SFTP command line
To upload files securely to your private area of the EGA, you can use SFTP (Secure File Transfer Protocol) with your favorite SFTP client. Here's what you need to know to get started:
- Connect to the target host __EGA_INBOX_DOMAIN__. This is the new hostname for the EGA SFTP service.
- Log in with your EGA username and key files (or password).
- Upload files to your private EGA inbox to ensure that only you can access the files.
By following these steps, you can securely upload your files to the EGA for safe storage and sharing.
Using the SFTP command line client in Linux/Unix
- Open a terminal and type sftp username@hostname
- Enter your EGA password
- To see a list of available SFTP commands, type help
  sftp> put - Upload file
  sftp> get - Download file
  sftp> cd path - Change remote directory to ‘path’
  sftp> pwd - Display remote working directory
  sftp> lcd path - Change the local directory to ‘path’
  sftp> lpwd - Display local working directory
  sftp> ls - Display the contents of the remote working directory
  sftp> lls - Display the contents of the local working directory
- Type the "put" command to upload files. For example: put *.bam
- Use the bye command to close the connection (SFTP session).
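The interactive session above can also be scripted. The following is a minimal sketch, assuming the OpenSSH `sftp` client and its `-b` batch-file option. The helper name `build_sftp_batch` and the default target folder are illustrative choices, not part of the EGA documentation. The helper only generates the batch text and does not open a connection.

```python
def build_sftp_batch(files, remote_dir="encrypted"):
    """Return the text of an sftp batch file that uploads `files` into `remote_dir`.

    `remote_dir` defaults to the 'encrypted' folder named in this guide, so the
    files are expected to be Crypt4gh-encrypted already.
    """
    lines = [f"cd {remote_dir}"]
    for path in files:
        if '"' in path:
            # Keep the quoting simple: refuse names that would break it.
            raise ValueError(f"cannot quote file name: {path!r}")
        lines.append(f'put "{path}"')
    lines.append("bye")
    return "\n".join(lines) + "\n"


batch_text = build_sftp_batch(["sample1.bam.c4gh", "sample2.bam.c4gh"])
```

Saved to a file such as upload.batch, the text could then be run with sftp -b upload.batch username@hostname. Batch mode cannot prompt for a password, so it is normally paired with key-based authentication.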
After uploading
- Once you have uploaded files to the inbox, please bear in mind that the checksum needs to be calculated, which can take up to two days. You will only be able to link your files to a run/analysis once the encrypted checksum has been calculated.
- When linking your files to the 'Run' or 'Analysis', ensure that the file name matches the file path '/name' in the INBOX folder.
- Please delete the files from your SFTP INBOX after all the runs/analyses have been registered and the files are ingested (SP > Files > Files ingested). This will clear your inbox space and allow you to upload more files. It will also prevent the files from reappearing in your Submitter Portal inbox.
Frequently Asked Questions
Specific to the inbox
What username should I use to log in to my inbox?
Logging in to the EGA website and accessing your inbox and outbox both require your username. If you have forgotten your registered username, please contact our Helpdesk team for assistance.
How are checksums calculated in your inbox?
If you encrypt the file beforehand and upload it to the "encrypted" folder, the unencrypted checksum will not be calculated until the file is ingested (i.e., until it is used in a run/analysis). If the file is uploaded to the "to-encrypt" folder, then both checksums are calculated.
Please bear in mind that after files have been uploaded to the inbox, the checksum must be calculated, which can take from a few hours to two days.
Specific to using keys to authenticate
Can I access one EGA account from different devices?
Yes, you can access your account from different devices by linking several public keys to your EGA account. Each device can generate a unique public-private key pair, and the corresponding public keys can be linked to the same account. This way, you can use different keys on different devices and still have access to the same account and data.
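For the checksum step described in the inbox FAQ above, it can help to record a local checksum of each file before upload, so that there is something to compare against once the inbox has finished its own calculation. This is a minimal sketch using Python's standard hashlib module. The guide does not state which algorithm the inbox reports, so the MD5 default here is an assumption, and the function names are hypothetical.

```python
import hashlib


def chunks_checksum(chunks, algorithm="md5"):
    """Hex digest of an iterable of byte chunks (the algorithm is an assumption)."""
    digest = hashlib.new(algorithm)
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def read_chunks(path, chunk_size=1024 * 1024):
    """Yield a file's contents in fixed-size chunks to keep memory use low."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def file_checksum(path, algorithm="md5"):
    """Checksum of a file on disk, e.g. the encrypted file before upload."""
    return chunks_checksum(read_chunks(path), algorithm)
```

Computing the digest over both the original file and its Crypt4gh-encrypted copy gives values that correspond to the "unencrypted" and "encrypted" checksums the FAQ distinguishes.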
I have several keys and I don't remember which one is which When generating SSH keys, it's a good practice to add a comment using the -C flag. This will allow you to add a descriptive tag to your key, making it easier to identify later on. Here's an example command that generates an SSH key with a comment: ssh-keygen -t ed25519 -C work-pass In this example, we're generating an ed25519 SSH key with the comment work-pass. Once you have multiple keys with different comments, you can use the comments to easily identify each key. To view the comments for your existing SSH keys, you can use the following command: ssh-keygen -l -f /path/to/key This will display the key fingerprint and the associated comment. By checking the comments, you should be able to identify which key is which. What if I can't find my SSH keys for uploading files with a key file, and how can I use new keys? If you can't find your SSH keys, don't worry - you can make new ones. To do this, open your terminal or command prompt and type a command to make a new SSH key. You can pick a name for the key, and choose a password to keep it safe. After making the key, you can add the new key to your account or server where you want to upload files using the key file. This usually involves copying and pasting the key's "public" (e.g. file.pub) part to the right place. If you lose track of the key again, just make a new one and add it again. Keep in mind that SSH keys belong to you and your computer, so if you switch computers or accounts, you'll need to make new keys. I don't want to type the passphrase every time I use the key. What can I do? You can use an ssh-agent to avoid typing the passphrase every time you use the key. An ssh-agent is a program that stores your private keys in memory and provides them to ssh when needed. 
You can add your key to the ssh-agent using the command ssh-add followed by the path to your key file.
Here's an example of the steps to follow:
- Open a terminal window.
- Start the ssh-agent by typing the command eval $(ssh-agent).
- Add your key to the ssh-agent by typing the command ssh-add [key filepath]. For instance, if your key file is located in the home directory with the name mykey, the command will look like this: ssh-add ~/mykey
After adding your key to the ssh-agent, you should be able to use ssh without having to enter your passphrase every time.
Can I use my password for authentication (without my private key)?
If you prefer to use your username and password for authentication instead of your private key, you can still do so. When using a Graphical User Interface (GUI) such as FileZilla, you can select Ask for password as your Logon Type (Figure 3). This option will prompt you to enter your password when you click Connect, instead of using your private key.
Figure 3: Process of establishing a new connection to __EGA_INBOX_DOMAIN__ using your password as the logon method in FileZilla. The figure showcases the FileZilla version 3.52.2 operating on IOS v11.2.3. By following the depicted steps, users can create a secure and efficient connection to the inbox, ensuring seamless data transfers.
It's worth noting that using a password for authentication can be less secure than using an SSH key, as passwords can be more easily compromised through various means. However, if you choose to use your password for authentication, selecting "Ask for password" as your Logon Type is a good way to do so securely via a GUI.
Why is it better to use my key and not my password?
Using SSH keys for authentication is generally considered more secure and convenient than using passwords.
SSH keys are more difficult to crack than passwords, and they can be restricted to specific users and machines, giving you more control over access. Once you set up your SSH keys, you can use them to authenticate quickly and easily, without having to enter a password every time. This makes automation of tasks, such as uploading encrypted files, much simpler. Additionally, SSH keys provide better logging, allowing you to keep track of who is accessing your systems and when. All in all, using SSH keys is a good practice for improving security and convenience in your authentication process.
Data Use Ontology
Data Use Ontology at EGA
The EGA is committed to its involvement in the work of GA4GH. In an effort to enhance data discoverability and streamline data access, EGA has implemented the use of the Data Use Ontology (DUO), based on consent codes as described in Dyke et al. 2017. The DUO codes are displayed on the live dataset page of your submission to advise any would-be requester on how the data can be used. They also enhance data discoverability, as users can search on these codes to find applicable datasets.
DUO can be browsed online via the Ontology Lookup Service. Learn more by reading the Data Use Ontology publication and the GA4GH Machine-Readable Consent Guidance. Check our DAC Portal Take The Tour and learn how to add DUO codes to your policy.
Term | Shorthand | Label | Description
DUO:0000004 | NRES | no restriction | This data use permission indicates there is no restriction on use.
DUO:0000042 | GRU | general research use | This data use permission indicates that use is allowed for general research use for any research purpose.
DUO:0000006 | HMB | health or medical or biomedical research | This data use permission indicates that use is allowed for health/medical/biomedical purposes; does not include the study of population origins or ancestry.
DUO:0000007 | DS | disease specific research | This data use permission indicates that use is allowed provided it is related to the specified disease.
DUO:0000011 | POA | population origins or ancestry research only | This data use permission indicates that use of the data is limited to the study of population origins or ancestry.
DUO:0000012 | RS | research specific restrictions | This data use modifier indicates that use is limited to studies of a certain research type.
DUO:0000015 | NMDS | no general methods research | This data use modifier indicates that use does not allow methods development research (e.g., development of software or algorithms).
DUO:0000016 | GSO | genetic studies only | This data use modifier indicates that use is limited to genetic studies only (i.e., studies that include genotype research alone or both genotype and phenotype research, but not phenotype research exclusively).
DUO:0000018 | NPUNCU | not for profit, non commercial use only | This data use modifier indicates that use of the data is limited to not-for-profit organizations and not-for-profit use, non-commercial use.
DUO:0000019 | PUB | publication required | This data use modifier indicates that requestor agrees to make results of studies using the data available to the larger scientific community.
DUO:0000020 | COL | collaboration required | This data use modifier indicates that the requestor must agree to collaboration with the primary study investigator(s).
DUO:0000021 | IRB | ethics approval required | This data use modifier indicates that the requestor must provide documentation of local IRB/ERB approval.
DUO:0000022 | GS | geographical restriction | This data use modifier indicates that use is limited to within a specific geographic region.
DUO:0000024 | MOR | publication moratorium | This data use modifier indicates that requestor agrees not to publish results of studies until a specific date.
DUO:0000025 | TS | time limit on use | This data use modifier indicates that use is approved for a specific number of months.
DUO:0000026 | US | user specific restriction | This data use modifier indicates that use is limited to use by approved users.
DUO:0000027 | PS | project specific restriction | This data use modifier indicates that use is limited to use within an approved project.
DUO:0000028 | IS | institution specific restriction | This data use modifier indicates that use is limited to use within an approved institution.
DUO:0000029 | RTN | return to database or resource | This data use modifier indicates that the requestor must return derived/enriched data to the database/resource.
DUO:0000043 | CC | clinical care use | This data use modifier indicates that use is allowed for clinical use and care.
DUO:0000044 | NPOA | population origins or ancestry research prohibited | This data use modifier indicates use for purposes of population, origin, or ancestry research is prohibited.
DUO:0000045 | NPU | not for profit organisation use only | This data use modifier indicates that use of the data is limited to not-for-profit organisations.
DUO:0000046 | NCU | non-commercial use only | This data use modifier indicates that use of the data is limited to not-for-profit use.
Note: For the consent code DUO:0000007, where data is restricted to use on a specific disease, please accompany it with an appropriate ontology term from MONDO. For example, if the data is restricted to research into juvenile idiopathic arthritis, the code should be displayed as DUO:0000007; MONDO:0011429.
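The pairing rule for DUO:0000007 can be expressed as a small formatting helper. This is a hedged sketch rather than an EGA API. The function name is hypothetical, the seven-digit identifier patterns are inferred from the codes shown above, and rejecting a MONDO term on any other DUO code is this sketch's own choice, since the guidance only describes the disease-specific case.

```python
import re

# Patterns inferred from the examples in this section (seven-digit local IDs).
DUO_PATTERN = re.compile(r"DUO:\d{7}")
MONDO_PATTERN = re.compile(r"MONDO:\d{7}")
DISEASE_SPECIFIC = "DUO:0000007"


def format_consent_code(duo_code, mondo_code=None):
    """Return the display string for a DUO code, pairing DS with a MONDO term."""
    if not DUO_PATTERN.fullmatch(duo_code):
        raise ValueError(f"not a DUO identifier: {duo_code!r}")
    if duo_code == DISEASE_SPECIFIC:
        if mondo_code is None or not MONDO_PATTERN.fullmatch(mondo_code):
            raise ValueError("DUO:0000007 needs an accompanying MONDO term")
        return f"{duo_code}; {mondo_code}"
    if mondo_code is not None:
        # Sketch-level choice: only the disease-specific code takes a disease term.
        raise ValueError("a MONDO term is only expected with DUO:0000007")
    return duo_code
```

For the juvenile idiopathic arthritis example above, format_consent_code("DUO:0000007", "MONDO:0011429") reproduces the string shown in the note.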
Data Protection 1 About the EGA The European Genome-phenome Archive (EGA) was formally launched in 2008 at the European Bioinformatics Institute (EMBL-EBI), an outstation of the European Molecular Biology Laboratory (EMBL), to address an identified need for archiving and sharing the results of genome-wide association studies from the Wellcome Trust Case Control Consortium. In late 2012, with the signing of a memorandum of understanding (and subsequent formal agreement in 2016) between EMBL-EBI and the Centre for Genomic Regulation (CRG), the EGA formally became a joint project of the two institutes. The two institutes work together to support the EGA services, including supporting submissions, web site, strategic leadership, and data infrastructure developments. 2 EMBL-EBI & GDPR The EGA is co-managed by EMBL-EBI and CRG. EMBL-EBI is an international organisation established by treaty and has certain privileges and immunities (e.g. exemptions from the application of national law) and also may self-regulate its activities (e.g. establish its own institutional legal framework) within the framework of its founding act of 1973. The General Data Protection Regulation (GDPR) is a European Union (EU) regulation that legislates how organisations can share and process personal data of EU citizens. EMBL places great value in maintaining collaboration with researchers who are subject to GDPR. For that reason, it is of utmost importance for EMBL to handle data received from those collaborators in a secure and responsible manner. Mindful of its public mandate and the sensitivity of the data it handles, EMBL has always ensured a high level of data protection in its activities. 
Since the introduction of GDPR in May 2018, EMBL has established its internal policy on General Data Protection (IP 68), exercising its right to self-regulate its operations. IP 68 establishes a robust personal data protection framework that provides for data protection principles, enforceable data subject rights, and oversight and redress mechanisms offering a level of protection comparable with GDPR.
3 CRG & GDPR
The Centre for Genomic Regulation (CRG) is an international biomedical research institute of excellence, created in July 2000, in which the Catalan Government is the main participant. It is a non-profit foundation and its mission is to discover and advance knowledge for the benefit of society, public health and economic prosperity. The CRG is a CERCA center. CERCA is the collective organisation for all research centres of excellence in Catalonia. CERCA ensures these centres develop successfully by promoting synergies and strategic cooperation, improving their visibility and the impact of their research, and promoting dialogue among both public and private stakeholders. As a legal entity based in Spain and operating within the EU, the CRG ensures compliance with the GDPR and the legal regulations on personal data protection applicable at the national level, as well as any other legislation that may replace, modify or supplement them.
4 EGA & GDPR
EGA GDPR Schema
4.1 Genetic and phenotypic data
Within GDPR, there are two main actors: data controllers and data processors. Data controllers are persons or entities which determine the purposes for which and the means by which personal data may be processed, e.g. companies, researchers, or universities. For EGA, the data controller is ultimately the data producer and the submitter(s) who submit the data to EGA. The data controller also creates a Data Access Committee (DAC), which will decide on data access permissions at EGA.
Data processors are the persons or entities which process the data on behalf of a data controller. With regard to GDPR, EGA is a data processor as it processes data as instructed by the data controller. GDPR applies to any organisation which accesses personal data from an individual within the EU. Under GDPR, personal data is defined as any data that is identifiable, including names and email addresses as well as health-related and genetic data. EGA does not accept personally identifiable data except genetic and phenotypic data, so all other data submitted to EGA, such as names and addresses, must be pseudonymised. GDPR requires that data controllers implement data protection principles, such as data minimisation, to minimise the risk of data leakage, and protect the rights of the data subjects. As a data processor, EGA has a set of security policies that are followed to minimise the risk of unauthorised data access or data loss. In its role as a data processor, EGA requires all submitters to sign a Data Processing Agreement (DPA) when the submission account is first created. This agreement is only required to be signed once per submitter, and will remain valid for future submissions to EGA. 4.2 Other personal data The EGA also collects personal data as part of our interactions with submitters, data access committees, and researchers accessing data distributed by EGA. The below privacy notices explain what personal data is collected by the specific service you are requesting, for what purposes, how it is processed, and how we keep it secure. 
Privacy Notices for EGA
Item | Title | Version | Last Updated
EGA Data Access Committee Account | Privacy Notice for EGA Data Access Committee Account | 1.0 | February 6, 2019
EGA User Account | Privacy Notice for EGA User Account | 1.0 | February 6, 2019
EGA Helpdesk Service | Privacy Notice for EGA Helpdesk Service | 1.0 | February 6, 2019
EGA Website Service | Privacy Notice for EGA Website Service | 1.0 | February 6, 2019
Documentation
Item | Title | Version | Description
EGA Security Overview | Security Document | 1.1 | The EGA Security Document provides an overview of EGA’s practices in ensuring the security of data stored at EGA.
EGA Data Processing Agreement | Data Processing Agreement | 1.5 | The Data Processing Agreement must be completed and returned as part of the submission process. Please note that this document is non-negotiable.
Authorised Submitters | Authorised Submitters Formulary | 1.0 | The Authorised Submitters Form must be completed and returned as part of the submission process. Please list here all those who should have access to the submission account in order to submit to the EGA.
Dispute Resolution
Any controversy or claim arising out of, or relating to, the DPA (including the enforceability or breach thereof, any question regarding its existence, validity or termination) or relating to the EGA Service shall be resolved using the internal dispute resolution mechanisms of EGA, including those related to Data Protection. The EGA’s internal dispute resolution mechanism has the following procedure:
- EGA OPERATIONAL PHASE: Meetings between EGA staff and the Data Controller.
- LEGAL MANAGEMENT PHASE: Meetings between legal teams of EMBL, CRG and the Data Controller.
- DIRECTION MANAGEMENT PHASE: Negotiation between the legal representatives of EMBL, the CRG and the Data Controller.
If the internal dispute resolution mechanism doesn’t resolve the controversy or claim, the next phase is:
- ARBITRATION PHASE: Resolution by arbitration under the WIPO Expedited Arbitration Rules (“Rules”).
The Kaiser Permanente Research Program on Genes, Environment, and Health (RPGEH) is a resource developed to facilitate research on genetic and environmental factors on common diseases and healthy aging. The RPGEH resource links biospecimens, health surveys, and comprehensive electronic medical records on broadly consented adult members of Kaiser Permanente Medical Care Plan, Northern California Region (KPNC). KPNC is an integrated health care delivery system with a membership of approximately 3.3 million people in northern California. The membership of KPNC is representative of the general population in the 14 county area in which facilities are located, although extremes of income are underrepresented. At the end of 2013, the RPGEH resource included: (1) demographic and behavioral surveys from over 430,000 participants; (2) biospecimens (DNA, serum, plasma, and/or saliva) from over 204,000 participants, including over 13,000 pregnant women; (3) genome-wide genotype data (70 billion SNP genotypes) on over 100,000 participants, including the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort; and (4) the longitudinal electronic medical records of the participants. The RPGEH was developed beginning in 2005 at the Division of Research of Kaiser Permanente Northern California by Catherine Schaefer (Director), Neil Risch (Co-Director), Lisa Croen, Eric Jorgenson, Lawrence Kushi, Charles Quesenberry, Sarah Rowell, Carol Somkin, Stephen Van den Eeden, Larry Walter, and Rachel Whitmer. Funding of the RPGEH was provided to C. Schaefer (PI) and N. Risch (co-PI) by the Wayne and Gladys Valley Foundation, The Ellison Medical Foundation, the Robert Wood Johnson Foundation, Kaiser Permanente Northern California, and the Kaiser Permanente National and Regional Community Benefit Programs. The GERA cohort was funded by a grant from NIH to RPGEH and UCSF (RC2 AG036607; C. Schaefer and N. Risch, PIs). 
At the time of the award of the RC2 project in late 2009, the RPGEH had established a cohort of about 140,000 individuals who had answered a detailed survey, provided saliva samples for extraction of DNA, and given broad consent for the use of their data in studies of health and disease. Survey and Cohort Recruitment. Initially, the RPGEH developed electronic disease registries to enable identification of phenotypes, using algorithms applied to EMR data. In 2007, the RPGEH mailed a four page survey to 1.9 million adult (≥ 18 years old) members of KPNC who had been members for two years or more, to obtain data on demographic and behavioral factors complementary to the clinical data in the EMR. The survey materials included a cover letter introducing the RPGEH, a two page list of Frequently Asked Questions, and the survey, which included questions on demographic factors such as education, race-ethnicity, income and marital status, dietary factors, physical activity, smoking, and alcohol consumption, as well as reproductive history and reproductive health. Members whose electronic medical records indicated a preference for written communications in Chinese or Spanish received survey materials both in English and a Chinese or Spanish translation. Approximately 400,000 completed surveys were returned. Saliva Sample Collection. Beginning in July 2008, respondents to the survey were asked to sign and return a consent form and authorization for use and disclosure of protected health information. The consent form authorized broad use of biospecimens, survey data, and data from participants' electronic health records for use in studies of genetic and environmental influences on health and disease. Respondents who returned completed consent forms were mailed (Oragene) saliva collection kits; more than 132,000 saliva samples were collected in two years. Completed saliva kits were scanned and archived in a temporary biorepository at the KPNC Division of Research. 
In late 2009, the RPGEH began collection of saliva samples from the California Men's Health Study (CMHS), a cohort that had been previously assembled in 2002-2003 and had been excluded from the RPGEH survey mailing with the intent of later adding CMHS participants to the assembled RPGEH cohort. The CMHS was developed to facilitate research on prostate cancer and other conditions in older men; the study protocol is described in Enger, et al., 2006. It enrolled and surveyed more than 40,000 men in KPNC, ages 45-69 years, who were members of KPNC during 2002-2003. CMHS men completed two mailed surveys with demographic and behavioral data similar to that of the RPGEH. The data on analogous variables were reconciled and integrated with the data derived from the RPGEH cohort for use in the RPGEH resource. By 2011, RPGEH collected approximately 15,400 saliva samples from men participating in the CMHS. RPGEH Access and Collaborations Website and Procedures. The RPGEH maintains a web portal for inquiries and applications for collaboration and access to data. The url is: https://rpgehportal.kaiser.org/. RPGEH has an application process and an Access Review Committee that reviews applications for collaboration and use. For more information, please contact RPGEH through the website.
Data Access
NOTE: Please refer to the “Authorized Access” section below for information about how access to the data from this accession differs from many other dbGaP accessions.
Related Studies
Parent cohort phenotype data can be accessed through ARIC-BioLINCC, Framingham-BioLINCC, and CHS-BioLINCC.
Objectives
To determine the cardiovascular and other consequences of sleep-disordered breathing and to test whether sleep-disordered breathing is associated with an increased risk of coronary heart disease, stroke, all-cause mortality and hypertension by examining subjects from well-characterized and established epidemiologic cohorts.
Background
Obstructive sleep apnea syndrome (OSA) is a potentially debilitating condition characterized by repetitive episodes of apnea while asleep, nocturnal oxygen desaturation, excessive daytime sleepiness, and loud disruptive snoring. Epidemiologic data from middle-aged adults indicate that OSA is common, with prevalence rates of 4% in men and 2% in women. Prior studies implicated OSA as a risk factor for the development of hypertension, ischemic heart disease, congestive heart failure, stroke and consequently premature death. Questions arose as to whether an increased propensity for cardiovascular and cerebrovascular diseases was limited to only those with frank OSA or whether more subtle forms of sleep-disordered breathing (SDB) would also confer elevated risk. Further evidence was also needed to clarify whether SDB, including OSA, is an independent risk factor for the development of cardiovascular or cerebrovascular disease. Known cardiovascular and cerebrovascular disease risk factors such as obesity and smoking are commonly present in those with SDB; therefore, apparent associations between SDB and cardiovascular and cerebrovascular diseases may have resulted from the effects of these concomitant risk factors.
Moreover, it was not understood whether factors such as race, age, gender, and prevalent cardiovascular or cerebrovascular disease might interact with SDB to alter future cardiovascular and cerebrovascular disease risk. Mechanisms underlying any propensity to develop cardiovascular or cerebrovascular disease with SDB had not been firmly established (Quan, et al., 1997, PMID: 9493915). Participants: Participants in SHHS were recruited from nine existing NHLBI epidemiological studies in which data on cardiovascular risk factors had been collected previously. The “parent” cohorts included: two sites of the Atherosclerosis Risk in Communities Study (ARIC); three sites of the Cardiovascular Health Study (CHS); the Framingham Offspring Cohort; the Strong Heart Study (SHS) sites in South Dakota, Oklahoma, and Arizona; the New York Hypertension Cohorts; and the Tucson Epidemiologic Study of Airways Obstructive Diseases and the Health and Environment Study. From these parent cohorts, a sample of participants who met the inclusion criteria (age 40 years or older; no history of treatment of sleep apnea; no tracheostomy; no current home oxygen therapy) was invited to participate in the baseline examination of the SHHS, which included an initial polysomnogram (SHHS-1). Several cohorts over-sampled snorers in order to increase the study-wide prevalence of sleep-disordered breathing. In all, 6441 individuals were enrolled between November 1, 1995 and January 31, 1998. During exam cycle 3 (January 2001-June 2003), a second polysomnogram (SHHS-2) was obtained in 3295 of the participants. Due to sovereignty issues, Strong Heart Study participants are not included in the shared SHHS data. Data are available from a total of 5839 participants who consented to share data (1920 ARIC, 1249 CHS, 997 Framingham Offspring and OMNI 1, and 1673 from other studies).
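The inclusion criteria and the shared-data counts quoted above can be expressed as a small check. This is only an illustrative sketch: the `Candidate` record, its field names, and the function name are hypothetical and do not correspond to variables in the SHHS data dictionary; only the criteria and the counts come from the description.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical screening record; field names are illustrative only."""
    age_years: int
    treated_for_sleep_apnea: bool
    has_tracheostomy: bool
    on_home_oxygen: bool


def meets_inclusion_criteria(c: Candidate) -> bool:
    """Apply the four SHHS inclusion criteria stated in the description."""
    return (
        c.age_years >= 40
        and not c.treated_for_sleep_apnea
        and not c.has_tracheostomy
        and not c.on_home_oxygen
    )


# Counts of participants consenting to share data, as quoted in the text.
SHARED_COUNTS = {
    "ARIC": 1920,
    "CHS": 1249,
    "Framingham Offspring and OMNI 1": 997,
    "Other studies": 1673,
}
ENROLLED = 6441  # individuals enrolled in SHHS-1, per the text

SHARED_TOTAL = sum(SHARED_COUNTS.values())
# Enrolled participants absent from the shared data (Strong Heart Study
# participants plus anyone who did not consent to sharing).
NOT_SHARED = ENROLLED - SHARED_TOTAL
```

The per-cohort counts sum to the stated total of 5839, which leaves 602 of the 6441 enrolled individuals outside the shared data set.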
Design: The Sleep Heart Health Study added in-home polysomnography to the data collected in each of the parent studies at a baseline SHHS exam and a follow-up approximately 4 years later. Trained and certified technicians obtained the sleep studies with the Compumedics PS polysomnograph in an unattended setting, usually in the homes of the participants. The recording montage consisted of: C3/A2 and C4/A1 EEGs, sampled at 125 Hz; right and left electrooculograms (EOGs), sampled at 50 Hz; a bipolar submental electromyogram (EMG), sampled at 125 Hz; thoracic and abdominal excursions (THOR and ABDO), recorded by inductive plethysmography bands and sampled at 10 Hz; "airflow" detected by a nasal-oral thermocouple (Protec, Woodinville, WA), sampled at 10 Hz; finger-tip pulse oximetry (Nonin, Minneapolis, MN), sampled at 1 Hz; ECG from a bipolar lead, sampled at 125 Hz for most SHHS-1 studies and 250 Hz for SHHS-2 studies; heart rate (PR) derived from the ECG and sampled at 1 Hz; body position (using a mercury gauge sensor); and ambient light (on/off, by a light sensor secured to the recording garment). This montage provides data on the occurrence of sleep-disordered breathing, sleep stages, heart rate, oximetry, and arousals. Each participant in the parent studies was also asked to complete the Sleep Habits Questionnaire, which covers usual sleep pattern, snoring, and sleepiness.
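For readers working with the recordings, the montage above can be written down as a lookup table of sampling rates. The following is a minimal sketch, not the layout of the distributed files: the channel labels and function names are hypothetical, body position and ambient light are set to `None` because the description gives no sampling rate for them, and the SHHS-1 ECG rate is the one stated for "most" studies.

```python
from typing import Dict, Optional

# ECG sampling rate differs between the two exams, per the description.
ECG_HZ = {"SHHS-1": 125, "SHHS-2": 250}


def montage(exam: str) -> Dict[str, Optional[int]]:
    """Return channel label -> sampling rate in Hz (None if not stated)."""
    return {
        "EEG C3/A2": 125,
        "EEG C4/A1": 125,
        "EOG right": 50,
        "EOG left": 50,
        "EMG submental": 125,
        "THOR": 10,
        "ABDO": 10,
        "Airflow": 10,
        "Oximetry": 1,
        "ECG": ECG_HZ[exam],
        "PR": 1,
        "Body position": None,   # rate not given in the description
        "Ambient light": None,   # rate not given in the description
    }


def total_rate_hz(exam: str) -> int:
    """Sum of the stated sampling rates across all channels."""
    return sum(hz for hz in montage(exam).values() if hz is not None)


def samples_recorded(exam: str, hours: float) -> int:
    """Samples across channels with a stated rate for a given duration."""
    return int(total_rate_hz(exam) * hours * 3600)
```

For example, `samples_recorded("SHHS-1", 8)` estimates how many samples an 8-hour recording would contain across the channels with a stated rate; the 8-hour duration is an arbitrary illustration, not a figure from the study.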