North Carolina Clinical Genomic Evaluation by Next-generation Exome Sequencing This study is part of a larger consortium project investigating the validity and best use of next-generation sequencing (in particular, whole exome sequencing, or WES) in clinical care. Participants are patients who were either seen in the UNC Cancer and Adult Genetics Clinic or referred to the study by their physician. They will be approached by their physician or a genetic counselor for recruitment. Once enrolled, a clinical geneticist or genetic counselor will obtain consent and collect blood samples to be analyzed using WES. Results may include information related to a diagnosis and incidental information. Medically actionable incidental findings will be CLIA-certified and returned to participants in a routine genetic counseling session, along with diagnostic findings. Eligible adult participants will be randomized to have the opportunity to choose to get certain types of non-medically actionable incidental findings, as well. Their decisions will be investigated, as will psychosocial and behavioral responses to sequencing and receiving sequencing information. This is a longitudinal, mixed methods study (i.e., multiple assessments pre- and post-return of results, with both quantitative and qualitative methods used to gather data). Because only the quantitative component of the study uses randomization, only measures and procedures associated with that component are included here. The third study release includes data of additional n=189 subjects.
Cognitive impairment is a common and disabling problem in Parkinson's disease (PD). Identification of genetic variants that influence the presence or severity of cognitive deficits in PD might provide a clearer understanding of the pathophysiology underlying this important nonmotor feature. We are presently undertaking a large-scale, two-stage study designed to identify genetic risk factors for cognitive impairment in PD. The study population is divided into a discovery (Stage I) and a validation (Stage II) sample of patients enrolled in the PD Cognitive Genetics Consortium (PDCGC). Each patient has undergone a detailed neurological evaluation and cognitive testing. Clinical and genetic data for the project are stored and managed at the Coordinating Center at the University of Washington and VA Puget Sound Health Care System in Seattle. Stage I of the project is now complete; 1,219 PD patients were genotyped for 249,336 variants using the NeuroX array. Participants underwent assessments of learning and memory (Hopkins Verbal Learning Test-Revised [HVLT-R]), working memory/executive function (Letter-Number Sequencing and Trail Making Test [TMT] A and B), language processing (semantic and phonemic verbal fluency), visuospatial abilities (Benton Judgment of Line Orientation [JoLO]), and global cognitive function (Montreal Cognitive Assessment). We excluded individuals who were of non-European ancestry, failed genotyping, were missing data for one or more covariates, were related to another participant in the cohort, or failed to complete at least half of the cognitive tests. After these quality control measures were implemented, 1,105 participants remained and were included in all subsequent analyses. For common variants we used linear regression to test for association between genotype and cognitive performance with adjustment for important covariates. Rare variants were analyzed using the optimal unified sequence kernel association test. 
The significance threshold was defined as a false discovery rate-corrected P-value (PFDR) of 0.05. Eighteen common variants in 13 genomic regions exceeded the significance threshold for one of the cognitive tests. These included GBA rs2230288 (E326K; PFDR = 2.7 × 10^-4) for JoLO, PARP4 rs9318600 (PFDR = 0.006) and rs9581094 (PFDR = 0.006) for HVLT-R total recall, and MTCL1 rs34877994 (PFDR = 0.01) for TMT B-A. Analysis of rare variants did not yield any significant gene regions. We have conducted the first large-scale PD cognitive genetics analysis and nominated several new putative susceptibility genes for cognitive impairment in PD. These results will require replication in independent PD cohorts, and efforts to validate the findings in PDCGC Stage II are in progress.
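As an illustration of how a false discovery rate-corrected P-value can be obtained, the sketch below applies the Benjamini-Hochberg step-up adjustment to a handful of raw P-values. The abstract does not state which FDR procedure was used, so Benjamini-Hochberg is an assumption here, and the function name and the example P-values are hypothetical rather than study data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order.

    Assumption: the study's FDR method is not specified in the abstract;
    this is one common choice, shown for illustration only.
    """
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, scaling each by m / rank and
    # keeping a running minimum so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values (not taken from the study).
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = benjamini_hochberg(raw)

# A variant is called significant when its adjusted P-value is <= 0.05.
significant = [p <= 0.05 for p in adj]
print(adj)
print(significant)
```

With a threshold of 0.05 on the adjusted values, only the two smallest hypothetical P-values would be called significant in this toy example, even though four of the five raw values fall below 0.05.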
Ensuring the comparability of data generated across different sequencing platforms has become a pressing concern in efforts to uncover robust links between the microbiome and human health. In this study, we conducted a comprehensive comparison of taxonomic and functional profiles from matched human gut microbiome sample pairs, sequenced using both the MGISEQ-2000 (MGI) and NovaSeq 6000 (Illumina NovaSeq) platforms.
This repository contains raw sequencing data for the sputum fungal microbiota of a generally healthy population in Guangdong province, China. The study aims to provide comprehensive insight into the multi-kingdom airway microbiome, how it may underlie the interplay between exposure and health outcomes, and whether it can be used to inform airway health and disease.
Phenotypic data for ~12,500 samples from the AWI-Gen Phase 1 population cross-sectional study of older adults (men and women, mostly between 40 and 60 years of age). The study covers six study sites in four sub-Saharan African countries: Ghana, Burkina Faso, Kenya and South Africa. Some groups are missing data for specific variables. Data include questionnaire data (demography, health history, family health history, behaviour and infection data); anthropometry; and laboratory assays on blood and urine.
The Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO) is a collaborative effort comprised of a coordinating center and scientific researchers from well-characterized cohort and case-control studies conducted in North America and Europe. This international consortium aims to accelerate the discovery of common and rare genetic risk variants for colorectal cancer by conducting large-scale meta-analyses of existing and newly generated genome-wide association study (GWAS) data, replicating and fine-mapping GWAS discoveries, and investigating how genetic risk variants are modified by environmental risk factors. To expand these efforts, we assembled case-control sets or nested case-control sets from 20 different North American or European studies. Summary descriptions and study participant inclusion/exclusion criteria for each of these studies are detailed below. The Black Women's Health Study (BWHS): The BWHS is the largest follow-up study of the health of African-American women (Cozier et al., 2004; Rosenberg et al., 1995) [PMID: 15018884; PMID: 7722208]. The purpose is to identify and evaluate causes and preventives of cancers and other serious illnesses in African-American women. Among the diseases being studied are breast cancer, colorectal cancer, type 2 diabetes, uterine fibroids, systemic lupus erythematosus, and cardiovascular disease. The study began in 1995, when 59,000 black women from all parts of the United States enrolled through postal questionnaires. The women provided demographic and health data on the 1995 baseline questionnaire, including information on weight, height, smoking, drinking, contraceptive use, use of other selected medications, illnesses, reproductive history, physical activity, diet, use of health care, and other factors. The participants are followed through biennial questionnaires to determine the occurrence of cancers and other illnesses and to update information on risk factors. 
Self-reports of cancer are confirmed through medical records and state cancer registry records. Mouthwash-swish samples, as a source of DNA, were obtained from ~26,000 BWHS participants in 2002-2007. DNA was isolated from the mouthwash-swish samples at the Boston University Molecular Core Genetics Laboratory using the QIAAMP DNA Mini Kit (Qiagen). All incident colorectal cancer cases with a DNA sample were included in the present analysis. Two controls per case, selected from among BWHS participants free of colorectal cancer at end of follow-up, were matched to cases on year of birth (±2 years) and geographical region of residence (Northeast, South, Midwest, and West). A total of 209 colorectal cancer cases and 423 controls were sent for genotyping. Campaign Against Cancer and Heart Disease (CLUE II): The Campaign Against Cancer and Heart Disease is a prospective cohort designed to identify biomarkers and other factors associated with risk of cancer, heart disease, and other conditions (Kakourou et al., 2015) [PMID: 26220152]. A total of 32,894 participants were recruited from May through October 1989 from Washington County, Maryland and surrounding communities. Colorectal cancer cases (n = 297) and matched controls (n = 296) were identified between 1989 and 2000 among participants in the CLUE II cohort of Washington County, Maryland. Colorectal Cancer Study of Austria (CORSA): In the ongoing colorectal cancer study of Austria (CORSA), more than 13,000 Caucasian participants have been recruited within the province-wide screening project "Burgenland Prevention Trial of Colorectal Disease with Immunological Testing" (B-PREDICT) since 2003 (Hofer et al., 2011) [PMID: 21422235]. All inhabitants of the Austrian province Burgenland aged between 40 and 80 years are annually invited to participate in fecal immunochemical testing, and haemoccult-positive screening participants are invited for colonoscopy. 
CORSA includes genomic DNA and plasma of colorectal cancer cases, low-risk and high-risk adenomas, and colonoscopy-negative controls. Controls received a complete colonoscopy and were free of colorectal cancer or polyps. CORSA participants have been recruited in the four KRAGES hospitals in Burgenland, Austria, and additionally, at the Medical University of Vienna (Department of Surgery), the Viennese hospitals "Rudolfstiftung" and the "Sozialmedizinisches Zentrum Sud", and at the Medical University of Graz (Department of Internal Medicine). 1403 colorectal cancer and advanced colorectal adenoma cases, and 1404 matched controls were selected for the study. Distribution of factors sex and age (5 year strata) were evenly matched between cases and controls. Cancer Prevention Study II (CPS II): The CPS II Nutrition cohort is a prospective study of cancer incidence and mortality in the United States, established in 1992 and described in detail elsewhere (Calle et al., 2002; Campbell et al., 2014) [PMID: 12015775; PMID: 25472679]. At enrollment, participants completed a mailed self-administered questionnaire including information on demographic, medical, diet, and lifestyle factors. Follow-up questionnaires to update exposure information and to ascertain newly diagnosed cancers were sent biennially starting in 1997. Reported cancers were verified through medical records, state cancer registry linkage, or death certificates. The Emory University Institutional Review Board approves all aspects of the CPS II Nutrition Cohort. A total of 360 cases and 359 controls were selected for this study. Czech Republic Colorectal Cancer Study (Czech Republic CCS): Cases with positive colonoscopy results for malignancy, confirmed by histology as colon or rectal carcinomas, were recruited between September 2003 and May 2012 in several oncological departments in the Czech Republic (Prague, Pilsen, Benesov, Brno, Liberec, Ples, Pribram, Usti and Labem, and Zlin). 
Two control groups, sampled at the same time as case recruitment, were included in the study. The first group consisted of hospital-based individuals with a negative colonoscopy result for malignancy or idiopathic bowel diseases. The reasons for the colonoscopy were: i) positive fecal occult blood test, ii) hemorrhoids, iii) abdominal pain of unknown origin, and iv) macroscopic bleeding. The second control group consisted of healthy blood donor volunteers from a blood donor center in Prague. All individuals were subjected to standard examinations to verify the health status for blood donation and were cancer-free at the time of the sampling. Details of CRC cases and controls have been reported previously (Vymetalkova et al., 2014; Naccarati et al., 2016; Vymetalkova et al., 2016) [PMID: 24755277; PMID: 26735576; PMID: 27803053]. All subjects were informed and provided written consent to participate in the study. They approved the use of their biological samples for genetic analyses, according to the Declaration of Helsinki. The design of the study was approved by the Ethics Committee of the Institute of Experimental Medicine, Prague, Czech Republic. All subjects included in the study were Caucasians and comprised 1792 cases and 1764 matched controls. Controls were matched to CRC cases at a 1:1 ratio. Matching was done on age and sex. Age was matched within ±5 years, whereas sex was matched exactly. For the cases without matched controls, matching was done only on sex. Early Detection Research Network (EDRN): The aim of the EDRN initiative is to develop and sustain a biorepository for support of translational research (Amin et al., 2010) [PMID: 21031013]. High-quality biospecimens were accrued and annotated with pertinent clinical, epidemiologic, molecular and genomic information. A user-friendly annotation tool and query tool were developed for this purpose. 
The various components of this annotation tool include common data elements (CDEs) developed from the College of American Pathologists (CAP) Cancer Checklists and North American Association of Central Cancer Registries (NAACCR) standards. The CDEs provide semantic and syntactic interoperability of the data sets by describing them in the form of metadata or data descriptors. A total of 352 colorectal case samples and 399 controls were selected for this study. Controls were matched to CRC cases based on age and sex. The EPICOLON Consortium (EPICOLON): The EPICOLON Consortium comprises a prospective, multicentre and population-based epidemiology survey of the incidence and features of CRC in the Spanish population (Fernandez-Rozadilla et al., 2013) [PMID: 23350875]. Cases were selected as patients with de novo histologically confirmed diagnosis of colorectal adenocarcinoma. Patients with familial adenomatous polyposis, Lynch syndrome or inflammatory bowel disease-related CRC, and cases where patients or family refused to participate in the study were excluded. Hospital-based controls were recruited through the blood collection unit of each hospital, together with cases. All of the controls were confirmed to have no history of cancer or other neoplasm and no reported family history of CRC. Controls were randomly selected and matched with cases for hospital, sex and age (±5 years). A total of 370 cases and 370 controls were selected for genotyping. Hawaii Adenoma Study: For this adenoma study, two flexible-sigmoidoscopy screening clinics were first used to recruit participants on Oahu, Hawaii. Adenoma cases were identified either from the baseline examination at the Hawaii site of the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial during 1996-2000 or at the Kaiser Permanente Hawaii's Gastroenterology Screening Clinic during 1995-2007. 
In addition, starting in 2002 and up to 2007, we also approached for recruitment all eligible patients who underwent a colonoscopy in the Kaiser Permanente Hawaii Gastroenterology Department. Cases were patients with histologically confirmed first-time adenoma(s) of the colorectum and were of Japanese, Caucasian or Hawaiian race/ethnicity. Controls were selected among patients with a normal colorectum and were individually matched to the cases on age at exam, sex, race/ethnicity, screening date (±3 months) and clinic and type of examination (colonoscopy or flexible sigmoidoscopy). We recruited 1016 adenoma cases (67.8% of all eligible) and 1355 controls (69.2% of all eligible); 889 cases and 1169 controls agreed to give a blood sample, and 29 cases and 34 controls a mouthwash sample. A total of 989 cases and 1185 controls were genotyped for this study. Columbus-area HNPCC Study (HNPCC, OSUMC): Patients with colorectal adenocarcinoma diagnosed at six participating hospitals were eligible for this study, regardless of age at diagnosis or family history of cancer. Patients with a clinical diagnosis of familial adenomatous polyposis were not eligible for this study. These six hospitals perform the vast majority of all operations for CRC in the Columbus metropolitan area (population 1.7 million). The institutional review board at all participating hospitals approved the research protocol and consent form in accordance with assurances filed with and approved by the United States Department of Health and Human Services. Briefly, during the period of January 1999 through August 2004, 1,566 eligible patients with CRC were accrued to the study (Hampel et al., 2008) [PMID: 18809606]. A total of 1472 colorectal cancer samples had enough blood DNA remaining to be sent for genotyping. Control samples were provided by the Ohio State University Medical Center's (OSUMC) Human Genetics Sample Bank. 
The Columbus Area Controls Sample Bank is a collection of control samples for use in human genetics research that includes both donors' anonymized biological specimens and linked phenotypic data. The data and samples are collected under the protocol "Collection and Storage of Controls for Genetics Research Studies", which is approved by the Biomedical Sciences Institutional Review Board at OSUMC. Recruitment takes place in OSUMC primary care and internal medicine clinics. If individuals agree to participate, they provide written informed consent, complete a questionnaire that includes demographic, medical and family history information, and donate a blood sample. Between 4 and 7 ml of blood is drawn into each of 3 ACD Solution A tubes and is used for genomic DNA extraction and the establishment of an EBV-transformed lymphoblastoid cell culture, cell pellet in Trizol, and plasma. Controls were matched to CRC cases at a 1:1 ratio. Matching was done on age at reference time (age_ref), race, and sex. Age_ref was matched within ±5 years. Sex and race were matched exactly. For the cases without matched controls, matching was done only on sex and race at a 1:1 ratio. Since controls are fewer than cases, each control is matched to at most two cases. Health Professionals Follow-up Study (HPFS): A parallel prospective study to the NHS (Nurses' Health Study). The HPFS cohort comprised 51,529 men aged 40-75 who, in 1986, responded to a mailed questionnaire (Rimm et al., 1990) [PMID: 2090285]. Participants provided information on health-related exposures, including current and past smoking history, age, weight, height, diet, physical activity, aspirin use, and family history of colorectal cancer. Colorectal cancer and other outcomes were reported by participants or next-of-kin and were followed up through review of the medical and pathology record by physicians. Overall, more than 97% of self-reported colorectal cancers were confirmed by medical record review. 
Information was abstracted on histology and primary location. Incident cases were defined as those occurring after the subject provided the blood sample. Prevalent cases were defined as those occurring after enrollment in the study but before the subject provided the blood sample. Follow-up evaluation has been excellent, with 94% of the men responding to date. Colorectal cancer cases were ascertained through January 1, 2008. In 1993-1995, 18,825 men in the HPFS mailed blood samples by overnight courier, which were aliquoted into buffy coat and stored in liquid nitrogen. In 2001-2004, 13,956 men in the HPFS who had not provided a blood sample previously mailed in a swish-and-spit sample of buccal cells. Incident cases were defined as those occurring after the subject provided a blood or buccal sample. Prevalent cases were defined as those occurring after enrollment in the study in 1986, but before the subject provided either a blood or buccal sample. After excluding participants with histories of cancer (except nonmelanoma skin cancer), ulcerative colitis, or familial polyposis, case-control sets were previously constructed. In addition to colorectal cancer cases and controls, a set of adenoma cases and matched controls with available DNA from buffy coat were selected for genotyping. Over the follow-up period, data were collected on endoscopic screening practices and, if individuals had been diagnosed with a polyp, the polyps were confirmed to be adenomatous by medical record review. Adenoma cases were ascertained through January 1, 2008. A separate case-control set was constructed of participants diagnosed with advanced adenoma matched to control participants who underwent a lower endoscopy in the same time period and did not have an adenoma. Advanced adenoma was defined as an adenoma 1 cm or larger in diameter and/or with tubulovillous, villous, or high-grade dysplasia/carcinoma-in-situ histology. 
Matching criteria included year of birth (within 1 year) and month/year of blood sampling (within 6 months), the reason for their lower endoscopy (screening, family history, or symptoms), and the time period of any prior endoscopy (within 2 years). Controls matched to cases with a distal adenoma either had a negative sigmoidoscopy or colonoscopy examination, and controls matched to cases with proximal adenoma all had a negative colonoscopy. In total, 159 advanced adenoma cases and 109 controls were selected for genotyping. Leeds Colorectal Cancer Study (LCCS): Following local ethical approval, colorectal cancer cases were recruited from 1997 until 2012 in Leeds, UK through surgical clinics. Initially, funding was provided by the UK Ministry of Agriculture, Farming and Fisheries (subsequently the Food Standards Agency) and Imperial Cancer Research Fund (subsequently Cancer Research UK). Recruitment also occurred similarly in Dundee, Perth and York between 1997 and 2001 using the same protocol, and the data and samples were combined. Pathologically confirmed cases were consented at outpatient clinics, providing information on known and postulated risk factors for colorectal cancer (diet, lifestyle and family history) as well as providing a blood sample for DNA. Exclusion criteria included pre-existing diverticular disease and an inability to complete the questionnaire. The General Practitioners (GPs) of cases (all UK residents have a nominated General Practitioner to whom to refer initial medical queries) were asked to send letters to other persons on their patient list of the same gender and born within 5 years of the case. Subsequently, to enhance the number of controls, we systematically invited patients from selected GP practices. Diet was assessed in cases and controls using an extensive dietary and lifestyle questionnaire modified from that produced by the European Prospective Investigation in Cancer (EPIC). 
The frequency with which each specific food item was eaten was recorded, and we also obtained average fruit and vegetable consumption as a cross-check. In total, 1591 cases and 739 controls provided a DNA sample. The North Carolina Colon Cancer Studies (NCCCS I/II): The North Carolina Colon Cancer Studies (NCCCS I-colon and NCCCS II-rectal) were population-based case-control studies conducted in 33 counties of North Carolina. Cases were identified using the rapid case ascertainment system of the North Carolina Central Cancer Registry. Patients with a first diagnosis of histologically confirmed invasive adenocarcinoma of the colon (cecum through sigmoid colon) between October 1996 and September 2000 were classified as potential cases in the NCCCS I. The NCCCS II included patients with a first diagnosis of histologically confirmed invasive adenocarcinoma of the sigmoid colon, rectosigmoid, or rectum (hereafter collectively referred to as rectal cancer) between May 2001 and September 2006. Additional eligibility requirements were: age 40-80 years, residence in one of the 33 counties, ability to give informed consent and complete an interview, possession of a driver's license or identification card issued by the North Carolina Department of Motor Vehicles (if under the age of 65), and no objection from the primary physician to contacting the individual. Controls, identified and sampled during the respective study dates, were selected from two sources. Potential controls under the age of 65 were identified using the North Carolina Department of Motor Vehicles records. For those 65 years and older, records from the Center for Medicare and Medicaid Services were used. Controls were matched to cases using randomized recruitment strategies. Recruitment probabilities were defined using strata of 5-year age, sex, and race groups. 
Dietary information was collected using a modified version of the semiquantitative food frequency questionnaire developed at the National Cancer Institute. In addition, participants were asked about vitamin and mineral supplementation, special diets, restaurant eating, sodium use, and fats used in cooking. In NCCCS I, 515 colorectal cases and 687 matched controls were sent for genotyping. In NCCCS II, 796 colorectal cases and 823 controls were sent for genotyping. Controls were matched to CRC cases at a 1:1 ratio. Matching was done on age, race, and sex. Age was matched within ±5 years. Race and sex were matched exactly. For the cases without matched controls, matching was done only on sex and race. Nurses' Health Study (NHS): The NHS cohort began in 1976 when 121,700 married female registered nurses aged 30-55 years returned the initial questionnaire that ascertained a variety of important health-related exposures (Belanger et al., 1978) [PMID: 248266]. Since 1976, follow-up questionnaires have been mailed every 2 years. Colorectal cancer and other outcomes were reported by participants or next-of-kin and followed up through review of the medical and pathology record by physicians. Overall, more than 97% of self-reported colorectal cancers were confirmed by medical-record review. Information was abstracted on histology and primary location. The rate of follow-up evaluation has been high: as a proportion of the total possible follow-up time, follow-up evaluation has been more than 92%. Colorectal cancer cases were ascertained through June 1, 2008. In 1989-1990, 32,826 women in NHS I mailed blood samples by overnight courier, which were aliquoted into buffy coat and stored in liquid nitrogen. In 2001-2004, 29,684 women in NHS I who did not previously provide a blood sample mailed a swish-and-spit sample of buccal cells. Incident cases were defined as those occurring after the subject provided a blood or buccal sample. 
Prevalent cases were defined as those occurring after enrollment in the study in 1976 but before the subject provided either a blood or buccal sample. After excluding participants with histories of cancer (except nonmelanoma skin cancer), ulcerative colitis, or familial polyposis, case-control sets were previously constructed from which DNA was isolated from either buffy coat or buccal cells for genotyping. In addition to colorectal cancer cases and controls, a set of advanced adenoma cases and matched controls with available DNA from buffy coat were selected for genotyping. Over the follow-up period, data were collected on endoscopic screening practices and, if individuals had been diagnosed with a polyp, the polyps were confirmed to be adenomatous by medical record review. Adenoma cases were ascertained through June 1, 2011. A separate case-control set was constructed of participants diagnosed with advanced adenoma matched to control participants who underwent a lower endoscopy in the same time period and did not have an adenoma. Advanced adenoma was defined as an adenoma more than 1 cm in diameter and/or with tubulovillous, villous, or high-grade dysplasia/carcinoma-in-situ histology. Matching criteria included year of birth (within 1 year) and month/year of blood sampling (within 6 months), the reason for their lower endoscopy (screening, family history, or symptoms), and the time period of any prior endoscopy (within 2 years). Controls matched to cases with a distal adenoma either had a negative sigmoidoscopy or colonoscopy examination, and controls matched to cases with proximal adenoma all had a negative colonoscopy. A total of 272 cases and 236 matched controls were sent to CIDR for the advanced adenoma case-control set. 
Northern Swedish Health and Disease Study (NSHDS): Comprises over 110,000 participants, including approximately one third with repeated sampling occasions, from three population-based cohorts (Dahlin et al., 2010; Myte et al., 2016) [PMID: 20197478; PMID: 27367522]. The largest is the ongoing Vasterbotten Intervention Programme, in which all residents of Vasterbotten County are invited to a health examination upon turning 30 (some years), 40, 50 and 60 years of age. Extensive measured and self-reported health and lifestyle data, as well as blood samples for central biobanking in Umea, Sweden, are collected at the health exam. Leucocyte DNA samples for 1:1-matched CRC case-control sets from the NSHDS, of which 878 samples are included in this study, have been selected for genotyping. This is in addition to 354 samples from the NSHDS previously analyzed as part of the multicenter EPIC cohort. Cancer-specific and overall survival data are available for all patients. For at least 425 patients, archival tumor tissue has been analyzed for the BRAF V600E mutation and by sequencing codon 12 and 13 for KRAS mutations, as well as for MSI screening status by immunohistochemistry and for an eight-gene CIMP panel using quantitative real-time PCR (MethyLight). Ohio Colorectal Cancer Prevention Initiative (OCCPI, OSUMC): OCCPI (ClinicalTrials.gov identifier: NCT01850654) is a population-based study of colorectal cancer patients diagnosed in one of 51 hospitals throughout the state of Ohio from January 1, 2013 through December 31, 2016. The OCCPI was created to decrease CRC incidence in Ohio by identifying patients with hereditary predisposition (statewide universal tumor screening for newly diagnosed CRC patients), increase colonoscopy compliance for first-degree relatives of CRC patients, and encourage future research through the creation of a biorepository. 
The 51 Ohio hospitals participating in the OCCPI were selected to represent a cross-section of clinical centers in the state based on high reported volume of CRC patients, affiliation with a high-volume hospital, or interest in participation. Institutional Review Board (IRB) approval was obtained by the individual hospitals, Community Oncology Programs, or by ceding review to the OSU IRB. Written informed consent was obtained. A total of 2139 colorectal cases were genotyped. Patients were considered eligible for this study if they were age 18 or older at the time of enrollment and had a surgical resection (or biopsy if unresectable) in the state of Ohio demonstrating an adenocarcinoma of the colorectum from 1/1/13 - 12/31/16. Matched control samples were selected from the Ohio State University Medical Center's (OSUMC) Human Genetics Sample Bank in an identical way to the selection for the Columbus-area HNPCC Study (please refer to the description for the Columbus-area HNPCC Study). Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (PLCO): PLCO enrolled 154,934 participants (men and women, aged between 55 and 74 years) at ten centers into a large, randomized, two-arm trial to determine the effectiveness of screening to reduce cancer mortality. Sequential blood samples were collected from participants assigned to the screening arm. Participation was 93% at the baseline blood draw. In the observational (control) arm, buccal cells were collected via mail using the "swish-and-spit" protocol and participation rate was 65%. Details of this study have been previously described (Huang et al., 2016) [PMID: 27673363] and are available online (http://dcp.cancer.gov/plco). For this study 1651 advanced adenoma cases and 1392 controls were selected for genotyping. 
Selenium and Vitamin E Prevention Trial (SELECT): The Selenium and Vitamin E Cancer Prevention Trial (SELECT) was a double-blind, placebo controlled clinical trial which explored using selenium and vitamin E alone and in combination to prevent prostate cancer in healthy men (Lippman et al., 2009) [PMID: 19066370]. Secondary endpoints included the prevention of colorectal and lung cancers. SELECT was conducted at 427 sites and centers in the United States, Canada and Puerto Rico; 35,533 men 55 years and older (50 or older if African American) were randomized beginning August 22, 2001. Supplementation was discontinued on October 23, 2008 due to futility. 308 colorectal cancer cases and 308 matched controls were selected from the SELECT population and sent for genotyping. Screening Markers For Colorectal Disease Study and Colonoscopy and Health Study (SMS-REACH): Details on this study population were previously reported (Burnett-Hartman et al., 2014) [PMID: 24875374]. Participants were enrollees in an integrated health-care delivery system in western Washington State (Group Health Cooperative, Seattle, Washington) aged 24-79 years who underwent an index colonoscopy for any indication between 1998 and 2007 and donated a buccal-cell or blood sample for genotyping analysis. Study recruitment took place in 2 phases, with phase 1 occurring in 1998-2003 and phase 2 occurring in 2004-2007. Persons who had undergone a colonoscopy less than 1 year prior to the index colonoscopy, persons with inadequate bowel preparation for the index colonoscopy, and persons with a prior or new diagnosis of colorectal cancer, a familial colorectal cancer syndrome (such as familial adenomatous polyposis), or another colorectal disease were ineligible. Patients diagnosed with adenomas or serrated polyps and persons who were polyp-free at the index colonoscopy (controls) were systematically recruited during both phases of recruitment. 
Approximately 75% agreed to participate and provided written informed consent. Based on medical records, persons who agreed to participate and those who refused study participation were similar with respect to age, sex, and colorectal polyp status. Study protocols were approved by the institutional review boards of the Group Health Cooperative and the Fred Hutchinson Cancer Research Center (Seattle, Washington). A total of 575 cases and 508 matched controls were selected for the study. Controls were matched to CRC cases in a 1:1 ratio. Matching was done on age_ref, race, and sex. Age_ref was matched within +-5 years. The Women's Health Initiative (WHI): WHI is a long-term national health study that has focused on strategies for preventing heart disease, breast and colorectal cancer, and osteoporotic fractures in postmenopausal women. The original WHI study included 161,808 postmenopausal women enrolled between 1993 and 1998. The Fred Hutchinson Cancer Research Center in Seattle, WA serves as the WHI Clinical Coordinating Center for data collection, management, and analysis of the WHI. The WHI has two major parts: a partial factorial randomized Clinical Trial (CT) and an Observational Study (OS); both were conducted at 40 Clinical Centers nationwide. The CT enrolled 68,132 postmenopausal women between the ages of 50 and 79 into trials testing three prevention strategies. If eligible, women could choose to enroll in one, two, or all three of the trial components. The components are: Hormone Therapy Trials (HT): This double-blind component examined the effects of combined hormones or estrogen alone on the prevention of coronary heart disease and osteoporotic fractures, and associated risk for breast cancer. Women participating in this component with an intact uterus were randomized to estrogen plus progestin (conjugated equine estrogens [CEE], 0.625 mg/d plus medroxyprogesterone acetate [MPA], 2.5 mg/d) or a matching placebo. 
Women with prior hysterectomy were randomized to CEE or placebo. Both trials were stopped early, in July 2002 and March 2004, respectively, based on adverse effects. All HT participants continued to be followed without intervention until close-out. Dietary Modification Trial (DM): The Dietary Modification component evaluated the effect of a low-fat and high fruit, vegetable and grain diet on the prevention of breast and colorectal cancers and coronary heart disease. Study participants were randomized to either their usual eating pattern or a low-fat dietary pattern. Calcium/Vitamin D Trial (CaD): This double-blind component began 1 to 2 years after a woman joined one or both of the other clinical trial components. It evaluated the effect of calcium and vitamin D supplementation on the prevention of osteoporotic fractures and colorectal cancer. Women in this component were randomized to calcium (1000 mg/d) and vitamin D (400 IU/d) supplements or a matching placebo. The Observational Study (OS) examines the relationship between lifestyle, environmental, medical and molecular risk factors and specific measures of health or disease outcomes. This component involves tracking the medical history and health habits of 93,676 women not participating in the CT. Recruitment for the observational study was completed in 1998 and participants were followed annually for 8 to 12 years. All centrally confirmed cases of invasive colorectal cancer, or deaths from colorectal cancer, were selected as potential cases from the September 30, 2015 database. Controls were participants free of colorectal cancer (invasive or in situ) as of September 30, 2015. Potential cases and controls were excluded if they (1) were non-White; (2) had a history of colorectal cancer at baseline; (3) were lost to follow-up after enrollment; (4) were dbGaP ineligible; (5) had <1.25 ug of DNA; (6) were selected for WHI study M26 Phase I or II; or (7) were selected for WHI study AS224 and also included in the imputation project. 
A total of 578 cases and 104,429 controls met the eligibility criteria. Each case was matched with 1 control (1:1) that exactly met the following matching criteria: age (+-5 years), 40 randomization centers (exact), WHI date (+-3 years), CaD date (+-3 years), OS flag (exact), HRT assignments (exact), DM assignments (exact), and CaD assignments (exact). Control selection was done in a time-forward manner, selecting one control for each case from the risk set at the time of the case's event. The matching algorithm was allowed to select the closest match based on a criterion that minimizes an overall distance measure (Bergstralh EJ, Kosanke JL. Computerized matching of cases to controls. Technical Report #56, Department of Health Sciences Research, Mayo Clinic, Rochester MN. April 1995). Each matching factor was given the same weight. When exact matches could not be found, the matching criteria were gradually relaxed among unmatched cases and controls until all cases had found matched controls. Using the matching criteria specified above, 559 of the 578 eligible cases found exact matches. The matching criteria were then relaxed to: Age+-5, randomization centers, WHI date +- 3 years, CaD date +- 3 years, OS flag, HRT flag, DM flag, CaD flag. 17 of the remaining 19 unmatched cases found matched controls. By matching on Age+-5, randomization centers, WHI date +- 3 years, CaD date +- 3 years, OS flag, HRT flag, the remaining 2 unmatched cases found their matches.
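The matching procedure described for the WHI can be illustrated with a small sketch. This is a simplified greedy matcher under stated assumptions, not the Mayo Clinic algorithm cited above and not the study's actual code: the Person fields, the function names, and the relaxed caliper values in the usage example are all hypothetical, only one exact factor (center) and two caliper factors (age, enrollment year) are modeled, and the time-forward risk-set selection is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    # Hypothetical record layout; the real matching used more factors.
    pid: str
    age: float
    center: int
    enroll_year: float


def distance(case, control):
    """Overall distance with equal weight on each continuous factor."""
    return abs(case.age - control.age) + abs(case.enroll_year - control.enroll_year)


def eligible(case, control, age_caliper, year_caliper):
    """Exact match on center, caliper match on age and enrollment year."""
    return (
        case.center == control.center
        and abs(case.age - control.age) <= age_caliper
        and abs(case.enroll_year - control.enroll_year) <= year_caliper
    )


def greedy_match(cases, controls, caliper_steps):
    """Give each case one unused control, relaxing the calipers step by step.

    caliper_steps is a list of (age_caliper, year_caliper) pairs, strictest
    first. Cases left unmatched at one step are retried at the next.
    """
    matches = {}
    used = set()
    for age_caliper, year_caliper in caliper_steps:
        for case in cases:
            if case.pid in matches:
                continue
            pool = [
                c for c in controls
                if c.pid not in used and eligible(case, c, age_caliper, year_caliper)
            ]
            if pool:
                # Closest control wins; ties are broken by identifier.
                best = min(pool, key=lambda c: (distance(case, c), c.pid))
                matches[case.pid] = best.pid
                used.add(best.pid)
    return matches


# Toy data: the second caliper step (10, 3) is an invented relaxation.
cases = [Person("case1", 62, 1, 1995), Person("case2", 70, 2, 1996)]
controls = [
    Person("ctrl1", 64, 1, 1996),
    Person("ctrl2", 61, 1, 1995),
    Person("ctrl3", 78, 2, 1997),
]
example_matches = greedy_match(cases, controls, [(5, 3), (10, 3)])
```

In this toy run, case1 is matched at the strict step to the closer of its two eligible controls, while case2 finds a control only after the age caliper is widened. A greedy pass of this kind depends on the order in which cases are processed; an optimal matcher would minimize the total distance over all pairs instead.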
This was an open-label, single-arm Phase II study of lamivudine in patients who had progressed on systemic therapy for advanced colorectal cancer with TP53 mutations. The phase II study had a two-stage design, with a target accrual of 20 evaluable patients for the first stage and a total of 32 patients for the whole study. The first 9 patients were treated with lamivudine at 150 mg PO bid continuously for 28-day cycles. Subsequent patients (10-32) received a higher dose of 600 mg PO bid continuously for 28-day cycles. Tumor assessments were performed every 8 weeks until documented disease progression by Response Evaluation Criteria in Solid Tumors (RECIST) or drug intolerance. Whole genome sequencing (WGS) and total RNA-seq were performed on pre- and post-treatment biopsies in this trial.
Heterogeneity in the tumor microenvironment (TME) of follicular lymphoma (FL) can affect clinical outcomes. We developed a new organoid culture method for cultivating patient-derived lymphoma organoids (PDLOs), which include cells from the native FL TME. We generated organoids from 12 FL patients at diagnosis, clinical progression, or relapse. These organoids were profiled with targeted DNA sequencing and mRNA sequencing. We evaluated the stability of organoids in culture at serial weekly timepoints (D0, D7, D14, D21). We treated organoids with CD19:CD3 or CD20:CD3 bispecific antibodies or unconjugated anti-CD20, anti-CD19, and anti-CD3 as a control, and profiled DNA/RNA at D11 to evaluate treatment response. This system is intended as a platform for advancing precision medicine efforts in FL through patient-specific modeling, high-throughput screening, TME signature identification, and treatment response evaluation.
In this ERC-funded ProstOmics project we have used spatial and bulk multi-omics on fresh frozen prostate cancer samples to investigate cancer biology and find biomarkers to improve patient treatment. All tissue material was collected from prostate cancer patients undergoing radical prostatectomy who had not received any prior cancer-specific treatment. Of the 498 prostate tissue samples (114 patients) included in our project, 176 samples (N=37 patients) have been analyzed with bulk transcriptomics (RNA-seq). Some of these 176 samples were also analyzed with spatial transcriptomics (n=32, N=8, Visium 10x RNA-seq) and DNA methylomics (n=96, N=24, array). All datasets include metadata for histopathological evaluation. Patient metadata include information on age at surgery, time (months) until reported relapse, pre-surgery PSA and post-surgery T-stage.
This study provides a comprehensive benchmarking resource for somatic variant detection in cell-free DNA (cfDNA) from cancer patients. Longitudinal plasma samples from colorectal and breast cancer cohorts were selected to create patient-matched dilution series spanning ultra-low to high circulating-tumour-DNA (ctDNA) fractions, while preserving each individual’s germline and clonal haematopoiesis background. Deep whole-genome sequencing (150×) and ultra-deep whole-exome sequencing (2,000×) generated a reference call set of ~37,000 single-nucleotide variants and ~58,000 insertions/deletions. These data enabled systematic evaluation of nine somatic variant callers across variable ctDNA levels and sequencing depths, and were further used to explore machine-learning–guided parameter tuning. The resulting dataset offers an openly accessible framework for developers and clinicians to assess and optimize somatic variant calling in liquid biopsy applications.
Sequencing technologies are providing increasingly detailed insight into genetic makeup and are paving their way into molecular diagnostics. The field will benefit from rigorous and bias-free measures for the quality of sequence data and for the proper representation of the complexity of the original samples. While current methodologies rely on the availability of a well-characterized reference genome, we propose kMer profiling for alignment-free assessment of the quality, comparability, and complexity of sequencing datasets. We show that kMer detects technical artefacts such as high duplication rates, library chimaeras, and differences in library preparation protocols in whole-genome, whole-exome, and RNA sequencing data. Additionally, it successfully captures the complexity and diversity of microbiomes. Thus, kMer allows for a robust evaluation of the quality and complexity of sequencing data without relying on any prior information and opens the way to more reliable biological reasoning.
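The general idea of alignment-free k-mer profiling can be sketched in a few lines. This is an illustration of the concept only, not the kMer tool's implementation or interface: the function names are hypothetical, reads are assumed to be plain strings already parsed from a sequencing file, and the distance shown (Euclidean distance between relative k-mer frequencies) is one simple choice among many.

```python
from collections import Counter
from math import sqrt


def count_kmers(reads, k):
    """Count every k-mer made up only of A, C, G and T across all reads."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip k-mers containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def abundance_spectrum(counts):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    return Counter(counts.values())


def profile_distance(a, b):
    """Euclidean distance between two profiles scaled to relative frequencies."""
    total_a = sum(a.values())
    total_b = sum(b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both profiles must contain at least one k-mer")
    keys = set(a) | set(b)
    # Counter returns 0 for k-mers absent from one of the profiles.
    return sqrt(sum((a[key] / total_a - b[key] / total_b) ** 2 for key in keys))


# Toy reads; a real dataset would hold millions of reads and use a larger k.
library = count_kmers(["ACGTAC", "ACGNAC"], 3)
duplicated = count_kmers(["ACGTAC", "ACGNAC"] * 2, 3)
spectrum = abundance_spectrum(library)
duplicated_spectrum = abundance_spectrum(duplicated)
```

In this toy case, duplicating every read leaves the relative frequencies unchanged, so profile_distance(library, duplicated) is zero, while the abundance spectrum shifts to higher multiplicities. That is the kind of signal a duplication check could look at, with no reference genome involved.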
The mechanism underlying the occurrence of lung cancer metastasis to different tissues/organs remains elusive. We investigated the genomic evolution and immune microenvironments of paired primary-metastatic tumors by employing multi-region whole-exome sequencing in 179 samples of 106 tumors from 51 lung cancer patients and subsequent immunohistochemistry assays in 70 of them. Our data revealed differences in genomic landscapes, molecular determinants, evolutionary dynamics, and lymphocyte infiltration among different metastatic sites. We provide quantitative evidence that metastatic seeding in lung cancer commonly arises late. Most distant metastases originated from independent origins of earlier lymph node spreads. Immune-heterogeneity and -homogeneity were primarily driven by arm-level and focal copy number events in primary tumors, respectively. These findings imply a combinatorial role of multiple factors in shaping patterns of dissemination and advance the clinical evaluation and intervention of lung cancer metastasis.
Tumor heterogeneity is believed to represent a barrier to pre-operative genomic characterization in kidney cancer. Previous studies of heterogeneity in clear cell renal cell carcinoma (ccRCC) evaluated only large and metastatic tumors. In small renal tumors in which multiple biopsies are not feasible, the extent of heterogeneity remains unknown. In this study, we evaluated the extent of genomic heterogeneity in small and large ccRCC. A total of 23 small (<4 cm) and 24 large (>7 cm) ccRCC had 3 regions sampled for evaluation of copy number, clear-cell A/clear-cell B (ccA/ccB) classification, and cell cycle progression (CCP) score. Small tumors have less genomic complexity and significantly fewer subclonal events. Pre-treatment genomic characterization based on a single biopsy in small ccRCC can provide insight into biologic potential to make clinical decisions.
The goal of this project is to develop a smartphone-based platform to monitor and support individuals with COVID-19 symptoms (who may need testing) and those who have already tested positive. The app will integrate a Bluetooth-enabled thermometer and pulse oximeter into an approach uniquely designed for low-resource settings and underserved populations. This project will focus on the American Indian/Alaskan Native (AI/AN) underserved population and will determine the feasibility of the digital health solution to collect symptom survey data and vital data including temperature, pulse, and oxygen saturation level. The complete study will include 300 participants, all medically underserved, with at least 51% from the AI/AN community. A combination of hard and soft data will be collected consisting of vital data, medical history and questionnaire data, and participant COVID-19 viral and antibody testing results collected during the study. DOI: https://rapids.ll.mit.edu/10.57895/wv88-by98
The Follow-up of Ovarian Cancer Genetic Association and Interaction Studies (FOCI) was one of five projects funded in 2010 as part of the NCI's Genetic Associations and Mechanisms in Oncology (GAME-ON) initiative (http://epi.grants.cancer.gov/gameon/). FOCI represents a collective effort that builds upon the strengths and history of collaboration inherent in the Ovarian Cancer Association Consortium (OCAC), a multidisciplinary group comprised of epidemiologists, genetic epidemiologists, statistical geneticists, molecular and cell biologists and clinicians that was formed in 2005. The other four funded GAME-ON projects were: the ColoRectal Transdisciplinary Study (CORECT), Elucidating Loci Involved in Prostate Cancer Susceptibility (ELLIPSE), Discovery, Biology, and Risk of Inherited Variants in Breast Cancer (DRIVE), and Transdisciplinary Research in Cancer of the Lung (TRICL). As part of our aim to discover, expand, and replicate ovarian cancer susceptibility loci, the GAME-ON projects and other consortia formed the OncoArray network (http://epi.grants.cancer.gov/oncoarray/) to develop and genotype a new custom genotyping array in large numbers of cancer cases and controls (over 400,000 samples) across multiple cancer types. The FOCI data include over 50,000 ovarian cancer cases and controls genotyped with the OncoArray at the Center for Inherited Disease Research (CIDR). Genotype calling and quality control procedures were performed under a standardized protocol across the OncoArray consortium, and over 490,000 SNPs passed QC and are included under this dbGaP submission.
Cohort Description: The Coronary Artery Risk Development in Young Adults (CARDIA) study is a study examining the development and determinants of clinical and subclinical cardiovascular disease and their risk factors. It began in 1985-6 with a group of 5115 black and white men and women aged 18-30 years. The participants were selected so that there would be approximately the same number of people in subgroups of race, gender, education (high school or less and more than high school), and age (18-24 and 25-30 years) in each of 4 centers: Birmingham, AL; Chicago, IL; Minneapolis, MN; and Oakland, CA. These same participants were asked to participate in follow-up examinations during 1987-1988 (Year 2), 1990-1991 (Year 5), 1992-1993 (Year 7), 1995-1996 (Year 10), 2000-2001 (Year 15), 2005-2006 (Year 20), 2010-2011 (Year 25), 2015-2016 (Year 30), and 2021-2022 (Year 35). Data Being Submitted: Wave 1 questionnaire data includes 397 variables for up to 2434 CARDIA participants in C4R. Wave 2 questionnaire data includes 448 variables for up to 1901 CARDIA participants in C4R. Dried Blood Spot/Serosurvey data includes 7 variables for up to 1332 CARDIA participants in C4R. Derived data includes 43 variables for up to 2723 CARDIA participants in C4R. Phenotype data includes 113 variables for up to 2723 CARDIA participants in C4R.
Schizophrenia is a chronic, severe, disabling brain disorder that affects approximately 1% of the population worldwide. Epidemiologic studies have clearly demonstrated that genetics play a strong role in etiology, but the inheritance is very complex. Innovative analytic approaches and creative ways of combining disparate data sets will be necessary for making breakthroughs in identifying causal pathways and ultimately new drug targets. This dbGaP Collection consists of all genetic studies of schizophrenia available in dbGaP that have been consented for general research use. The goal is to facilitate identification of datasets with related scientific content in order to expedite the application process and ascertainment of datasets of interest for increased scientific discovery. The Open Translational Science in Schizophrenia (OPTICS) Project was launched by Janssen Research & Development, LLC, part of the Janssen Pharmaceutical Companies of Johnson & Johnson, with a group of leading research organizations including the National Institutes of Health, Yale University School of Medicine, Rutgers University and the Harvard T.H. Chan School of Public Health, to create a new forum for collaborative analysis of Janssen's schizophrenia clinical trial data and other publicly available data about schizophrenia with the goal of creating new models for conducting research. The OPTICS project is one part of a larger effort at Janssen and other Johnson & Johnson research companies to share clinical trials data to enhance public health and advance science and medicine. Qualified investigators and physicians may apply for access to anonymized clinical trials data from Janssen. For more information please visit https://sites.google.com/site/opticsschizophrenia/.
The Breast Cancer Family Registry (BCFR) is a multi-center prospective cohort, comprised of over 30,000 women and men from nearly 12,000 families from the United States, Canada, and Australia. Our BCFR resource has been used, and continues to be used, by breast cancer researchers around the world in order to find new ways to prevent, diagnose, and treat cancer. The BCFR provides an extensive and diverse range of resources, expertise, and specialized skills, and has several unique strengths: 1) the collection of a large number of individuals and families across a wide spectrum of breast cancer risk, including both affected and unaffected individuals; 2) the large collection of families with early-onset breast cancer; 3) the large collection of racial/ethnic minority families not replicated elsewhere; 4) the extensive molecular characterization performed to date; and 5) active follow-up of both probands and family members. Thus, the BCFR comprises a unique cohort of probands and family members at familial/genetic risk of breast cancer that will continue to facilitate a wide range of research studies, such as gene discovery, examination of cancer-related outcomes and risk factors in high-risk subjects, investigation of novel behavioral interventions, and cancer prevention trials among at-risk family members. Consequently, the BCFR Cohort, as one of the few cohorts available worldwide with biospecimens and extensive molecular and genetic characterization, combined with epidemiologic data and long-term follow up, will be an invaluable resource for translational research in the genetic epidemiology of breast cancer.
Endometriosis is a common gynecological disorder affecting 11% of reproductive-aged women and is a leading cause of pain and infertility. We investigated global DNA methylation (DNAm) profiles in endometrium associated with endometriosis, menstruation, and genetic variation. Endometrial samples were collected from 984 patients with confirmed endometriosis (n=637) and women without endometriosis (n=347). Patients were recruited through the University of California San Francisco, USA, University of Melbourne, Australia, Endometriosis CaRe Centre in Oxford, UK, and EXPPECT Centre, The University of Edinburgh, UK, using the World Endometriosis Research Foundation Endometriosis Phenome and Biobanking Harmonization Project (WERF EPHect) standardized protocols for tissue collection and processing, and participant characteristics and clinical annotation. DNA was extracted from samples and was used to generate DNAm data using the Illumina Infinium MethylationEPIC Beadchip and genotype data using the Axiom Precision Medicine Research array. Estimates from this study suggest that 15.4% of the variation in endometriosis is captured by DNAm. DNAm analysis identified significant differences in DNAm profiles and DNAm networks associated with endometriosis, endometriosis sub-phenotypes, and menstrual cycle phases. Integration of DNAm and genetic data in a DNAm quantitative trait locus (mQTL) analysis identified 118,185 independent cis-mQTLs, including 51 associated with risk of endometriosis, highlighting target genes contributing to disease risk. This study identified novel factors affecting epigenetic regulation in endometrium associated with endometriosis risk and disease heterogeneity and provides an important data resource for reproductive medicine. Genotype data generated as part of this study are available on dbGaP and methylation data are available on GEO.
MVP is an ongoing prospective cohort study and mega‐biobank in the Department of Veterans Affairs Healthcare System designed to study genetic influences on health and disease among veterans. This is the accession to hold publicly available results. Results from sensitive phenotypes or MVP population subsets can be found at accession phs001672 and will require an application for access.
Cardiomyocytes derived from induced pluripotent stem cells (iPSCs) may represent a promising therapeutic strategy for severely damaged myocardium. This study aimed to assess the efficacy and safety of clinical grade human iPSC-derived cardiomyocyte (hiPSC-CM) patches and conduct a pre-clinical proof-of-concept analysis.
Paired PCR-free whole genome sequencing data of a matched metastatic melanoma cell line (COLO829) and normal across three lineages and across separate institutions, with independent library preparations, sequencing, and analysis. The data was generated with mean mapped coverages of 99X for COLO829 and 103X for the paired normal across three institutions. Overall, common events include >35,000 point mutations, 446 small insertion/deletions, and >6,000 genes affected by copy number changes. We present this reference to the community as an initial standard for enabling quantitative evaluation of somatic mutation pipelines across institutions.
Blood plasma samples (n=168) and matched diagnostic formalin-fixed paraffin-embedded (FFPE) tissue samples (n=69) of DLBCL patients, PMBCL patients and healthy controls were collected between 2016-2021. Plasma samples were collected at diagnosis, at interim evaluation, after treatment, and in case of refractory or relapsed disease. RNA was extracted from 200 µl plasma using the miRNeasy serum/plasma kit and from FFPE tissue using the miRNeasy FFPE kit. RNA was subsequently sequenced on a NovaSeq 6000 instrument using the SMARTer Stranded Total RNA-seq pico v3 library preparation kit.
The 'Genome-Wide Associations Environmental Interactions in the Lung Health Study' at Johns Hopkins University aims to test for genetic associations with lung function decline, a primary outcome associated with chronic obstructive pulmonary disease (COPD), using banked DNA and phenotype data on 4,287 European Americans from the longitudinal, multicenter Lung Health Study (LHS). The broad goals of the LungGO/ESP-GO fall into two general categories: (i) discovery of all variants (i.e., common and rare) in all protein-coding regions of the human genome (i.e., the exome) conferring risk to complex pulmonary diseases including COPD, in a subset of the LHS cohort. The Johns Hopkins University LHS cohort offers a unique opportunity to elucidate genetic variants that cause COPD. The Lung Health Study I was a randomized multicenter clinical trial with 5,887 participants carried out from October 1986 to April 1994, designed to test the effectiveness of smoking cessation and bronchodilator administration in smokers aged 35 to 60 with mild lung function impairment. Participants were randomly assigned to one of three groups: usual care, who received no intervention; smoking intervention with the inhaled bronchodilator ipratropium bromide; or smoking intervention with an inhaled placebo. The effect of intervention was evaluated by the rate of decline of forced expiratory volume in one second (FEV1). For the GWAS, only the subset of European American LHS participants for whom lung function data from three time points or more are available was included. Thus, the GWAS represents 73% of the 5,887 volunteers who participated in the LHS study. Importantly, LHS subjects included had similar demographics (including age, gender and BMI) and rates of lung function decline (mean annual change in FEV1% predicted: -0.96 %/yr vs. -0.99 %/yr, p=0.57) compared with those not included in the GWAS, reflecting little selection bias for our primary outcome. 
They were, however, more likely to have quit smoking after 5 years. This study is part of the Gene Environment Association Studies initiative (GENEVA, http://www.genevastudy.org) funded by the trans-NIH Genes, Environment, and Health Initiative (GEI). The overarching goal is to identify novel genetic factors that contribute to lung function through large-scale genome-wide association studies of smokers enrolled in a multicenter clinical trial. Genotyping was performed at the Johns Hopkins University Center for Inherited Disease Research (CIDR). Data cleaning and harmonization were done at the GEI-funded GENEVA Coordinating Center at the University of Washington.