The Kaiser Permanente Research Program on Genes, Environment, and Health (RPGEH) is a resource developed to facilitate research on the effects of genetic and environmental factors on common diseases and healthy aging. The RPGEH resource links biospecimens, health surveys, and comprehensive electronic medical records on broadly consented adult members of the Kaiser Permanente Medical Care Plan, Northern California Region (KPNC). KPNC is an integrated health care delivery system with a membership of approximately 3.3 million people in northern California. The membership of KPNC is representative of the general population in the 14-county area in which facilities are located, although extremes of income are underrepresented. At the end of 2013, the RPGEH resource included: (1) demographic and behavioral surveys from over 430,000 participants; (2) biospecimens (DNA, serum, plasma, and/or saliva) from over 204,000 participants, including over 13,000 pregnant women; (3) genome-wide genotype data (70 billion SNP genotypes) on over 100,000 participants, including the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort; and (4) the longitudinal electronic medical records of the participants. The RPGEH was developed beginning in 2005 at the Division of Research of Kaiser Permanente Northern California by Catherine Schaefer (Director), Neil Risch (Co-Director), Lisa Croen, Eric Jorgenson, Lawrence Kushi, Charles Quesenberry, Sarah Rowell, Carol Somkin, Stephen Van den Eeden, Larry Walter, and Rachel Whitmer. Funding of the RPGEH was provided to C. Schaefer (PI) and N. Risch (co-PI) by the Wayne and Gladys Valley Foundation, The Ellison Medical Foundation, the Robert Wood Johnson Foundation, Kaiser Permanente Northern California, and the Kaiser Permanente National and Regional Community Benefit Programs. The GERA cohort was funded by a grant from NIH to RPGEH and UCSF (RC2 AG036607; C. Schaefer and N. Risch, PIs).
At the time of the award of the RC2 project in late 2009, the RPGEH had established a cohort of about 140,000 individuals who had answered a detailed survey, provided saliva samples for extraction of DNA, and given broad consent for the use of their data in studies of health and disease. Survey and Cohort Recruitment. Initially, the RPGEH developed electronic disease registries to enable identification of phenotypes, using algorithms applied to EMR data. In 2007, the RPGEH mailed a four-page survey to 1.9 million adult (≥ 18 years old) members of KPNC who had been members for two years or more, to obtain data on demographic and behavioral factors complementary to the clinical data in the EMR. The survey materials included a cover letter introducing the RPGEH, a two-page list of Frequently Asked Questions, and the survey itself, which included questions on demographic factors such as education, race-ethnicity, income, and marital status; on dietary factors, physical activity, smoking, and alcohol consumption; and on reproductive history and reproductive health. Members whose electronic medical records indicated a preference for written communications in Chinese or Spanish received survey materials both in English and in a Chinese or Spanish translation. Approximately 400,000 completed surveys were returned. Saliva Sample Collection. Beginning in July 2008, respondents to the survey were asked to sign and return a consent form and an authorization for use and disclosure of protected health information. The consent form authorized broad use of biospecimens, survey data, and data from participants' electronic health records in studies of genetic and environmental influences on health and disease. Respondents who returned completed consent forms were mailed Oragene saliva collection kits; more than 132,000 saliva samples were collected in two years. Completed saliva kits were scanned and archived in a temporary biorepository at the KPNC Division of Research.
In late 2009, the RPGEH began collection of saliva samples from the California Men's Health Study (CMHS), a cohort that had been assembled in 2002-2003 and had been excluded from the RPGEH survey mailing with the intent of later adding CMHS participants to the assembled RPGEH cohort. The CMHS was developed to facilitate research on prostate cancer and other conditions in older men; the study protocol is described in Enger et al., 2006. It enrolled and surveyed more than 40,000 men, ages 45-69 years, who were members of KPNC during 2002-2003. CMHS men completed two mailed surveys that collected demographic and behavioral data similar to those of the RPGEH. The data on analogous variables were reconciled and integrated with the data derived from the RPGEH cohort for use in the RPGEH resource. By 2011, the RPGEH had collected approximately 15,400 saliva samples from men participating in the CMHS. RPGEH Access and Collaborations Website and Procedures. The RPGEH maintains a web portal for inquiries and for applications for collaboration and access to data. The URL is https://rpgehportal.kaiser.org/. The RPGEH has an application process and an Access Review Committee that reviews applications for collaboration and use. For more information, please contact the RPGEH through the website.
The dataset contains whole-genome sequencing (WGS) data from bulk-sorted therapy-related myeloid neoplasm (t-MN) cells and reference cells (MSCs, B cells, or T cells). In addition, for 4 patients, WGS data from single hematopoietic stem and progenitor cells, obtained from samples taken at t-MN diagnosis, are included. These cells were either clonally expanded before WGS, or their DNA was amplified directly via the primary template-directed amplification (PTA) protocol (as indicated in the sample name).
This project aims to understand somatic variants, differentially expressed genes, and signaling pathways in HBV- and NASH-related HCC tumors.
Aims: To identify new therapeutic targets in small cell lung cancer (SCLC), genome-wide mutation analysis was performed. Methods: Genomic DNA was extracted from formalin-fixed or methanol-fixed tissue samples. 71 Mb of DNA fragments containing whole coding exons were enriched using the SureSelect Human All Exon V4+UTRs Kit (Agilent Technologies), followed by 100-bp paired-end sequencing on a HiSeq 2000 (Illumina). Participants/Materials: 51 of 1042 cases of pathologically diagnosed small cell lung cancer that were registered in the National Cancer Hospital East Lung Cancer Database in 1992-2012 and for which surgically resected or biopsy samples were suitable for DNA extraction and further analyses.
The LifeLines-DEEP cohort is a sub-cohort of the LifeLines cohort (167,729 participants), which employs a broad range of investigative procedures to assess the biomedical, socio-demographic, behavioral, physical, and psychological factors that contribute to health and disease in the general Dutch population (Scholtens 2015). A subset of approximately 1,500 LifeLines participants also took part in LifeLines-DEEP. For these participants, additional biological materials were collected, including material for analysis of gut microbiome composition. The phenotyping and processing of LifeLines-DEEP have been described in Tigchelaar (2015).
Whole-genome sequencing was performed for 32 SRCC patients, and the raw data were processed using standard procedures. Files containing the genomic variant calls were obtained in the final step.
Genome-wide association study performed on the EPICOLON2 cohort, comprising colorectal cancer cases and matched controls of Spanish origin.
Illumina exome chip genotyping data for 943 PDAC cases and 3,908 controls from the Chinese population. Genotypes were called with the Illumina GenomeStudio software, and the selected variants were re-called with zCall. Standard quality control was performed.
PURPOSE: Cancer of unknown primary (CUP) is a group of metastatic tumors in which the standard diagnostic work-up fails to identify the site of origin of the tumor. The potential impact of precision oncology on this group of patients is large, since their tumors might have actionable driver mutations that can provide treatment options otherwise not available for patients with these fatal cancers. This study investigated whether comprehensive genomics analyses could inform on the origin of the tumor. PATIENT AND METHODS: Here we describe a patient whose tumor was misdiagnosed at least three times. Next-generation sequencing, a PDX mouse model, and bioinformatics were used to identify an actionable mutation, to predict resistance development to the targeted therapy, and to correctly diagnose the origin of the tumor. The Cancer Genome Atlas was used to benchmark the bioinformatics workflow. RESULTS: Despite the lack of a known primary tumor site and the absence of diagnostic immunohistochemical markers, the origin of the patient's tumor was established using the novel bioinformatics workflow. This included a mutational signature analysis of the sequenced metastases and comparison of their transcriptomic profiles to a pan-cancer panel of tumors from The Cancer Genome Atlas. We further discuss the strengths and limitations of the latter approaches in the context of three potentially incorrectly diagnosed TCGA lung tumors. CONCLUSION: Comprehensive genomics analyses could inform on the origin of tumors in patients suffering from CUP.
Background: Single-cell micro-metastases of solid tumors often occur in the bone marrow. These disseminated tumor cells (DTCs) may resist therapy and lie dormant or progress to cause overt bone and visceral metastases. The molecular nature of DTCs remains elusive, as well as when and from where in the tumor they originate. Here, we apply single-cell sequencing to identify and trace the origin of DTCs in breast cancer. Results: We sequence the genomes of 63 single cells isolated from six non-metastatic breast cancer patients. By comparing the cells' DNA copy number aberration (CNA) landscapes with those of the primary tumors and lymph node metastasis, we establish that 53% of the single cells morphologically classified as tumor cells are DTCs disseminating from the observed tumor. The remaining cells represent either non-aberrant ‘normal’ cells or aberrant cells of unknown origin that have CNA landscapes discordant from the tumor. Further analyses suggest that the prevalence of aberrant cells of unknown origin is age-dependent, and that at least a subset is hematopoietic in origin. Evolutionary reconstruction analysis of bulk tumor and DTC genomes enabled ordering of CNA events in molecular pseudo-time and traced the origin of the DTCs to either the main tumor clone, primary tumor subclones, or subclones in an axillary lymph node metastasis. Conclusions: Single-cell sequencing of bone marrow epithelial-like cells, in parallel with intra-tumor genetic heterogeneity profiling from bulk DNA, is a powerful approach to identify and study DTCs, yielding insight into metastatic processes. A heterogeneous population of CNA-positive cells is present in the bone marrow of non-metastatic breast cancer patients, only some of which are derived from the observed tumor lineages.