Affymetrix SNP6.0 genotype data for prostate cancer patients
21 unlinked autosomal microsatellite loci for 30 Central Asian populations
High-quality variant call files generated by BioScope and converted to VCF format. Complete dataset for all 300 samples.
Summary data from Meta-analysis of Genome-Wide-Association Studies for plasma levels of Coagulation Factor XI (FXI)
Genomics to select patients with metastatic breast cancer for targeted therapy (microarray_agilent)
AS genotyping data for lead SNPs using Illumina Global Array V2.0
exon 11 mutated UWB1.289 and COV362 cell lines
Whole exome sequencing and RNA sequencing of B-Cell Precursor Lymphoblastic Lymphoma tissue biopsies.
DAC for human embryo ATAC+RNA single cell sequencing samples
Committee that handles the majority of policies for the Linnarsson Lab at Karolinska Institutet
RNA-seq count matrix for 296 bulk pre-treatment tumors from IMblaze370
18 WGBS lanes for 9 samples of pilocytic astrocytoma.
TBD
First 1106 16S rDNA data for the Flemish Gut Flora Project
The IPM BioMe Biobank, founded in September 2007, is an ongoing, broadly-consented electronic health record (EHR)-linked clinical care biobank that enrolls participants non-selectively from the Mount Sinai Medical Center patient population. BioMe currently comprises >42,000 participants from diverse ancestries, characterized by a broad spectrum of longitudinal biomedical traits. Participants are enrolled through an opt-in process and consent to be followed throughout their clinical care (past, present, and future) in real time, allowing us to integrate their genomic information with their EHRs for discovery research and clinical care implementation. BioMe participants consent for recall, based on their genotype and/or phenotype, permitting in-depth follow-up and functional studies for selected participants at any time. Phenotypic and genomic data are stored in a secure database and made available to investigators, contingent on approval by the BioMe Governing Board. BioMe uses a "data-broker" system to protect confidentiality. Ancestral diversity - BioMe participants represent a broad racial, ethnic and socioeconomic diversity with a distinct and population-specific disease burden. Specifically, BioMe participants are predominantly of African (AA, 24%), Hispanic/Latino (HL, 35%), European (EA, 32%), and other/mixed ancestry (OA, 10%). Participants who self-identify as Hispanic/Latino further report being of Puerto Rican (39%), Dominican (23%), Central/South American (17%), Mexican (5%) or other Hispanic (16%) ancestry. More than 40% of European ancestry participants are genetically determined to be of Ashkenazi Jewish ancestry. With this broad ancestral diversity, BioMe is uniquely positioned to examine the impact of demographic and evolutionary forces that have shaped common disease risk. 
Phenotypes available in BioMe - BioMe has a high-quality and validated set of fully implemented clinical phenotype data that has been culled by a multi-disciplinary team of experienced investigators, clinicians, information technologists, data managers, and programmers who apply advanced medical informatics and data mining tools to extract and harmonize EHRs. BioMe, as a cohort, offers great versatility for designing nested case-control sample sets, particularly for studying longitudinal traits and co-morbidity in disease burden. Biomedical and clinical outcomes: The BioMe Biobank is linked to Mount Sinai's system-wide Epic EHR, which captures a full spectrum of biomedical phenotypes, including clinical outcomes, covariate and exposure data from past, present and future health care encounters. As such, the BioMe Biobank has a longitudinal design, as participants consent to make all of their EHR data from past (dating back as far as 2003), present and future inpatient or outpatient encounters available for research, without restriction. The median number of outpatient encounters is 21 per participant, reflecting predominant enrollment of participants with common chronic conditions from primary care facilities. Environmental data: The clinical and EHR information is complemented by detailed demographic and lifestyle information, including ancestry, residence history, country of origin, personal and familial medical history, education, socio-economic status, physical activity, smoking, dietary habits, alcohol intake, and body weight history, which is collected in a systematic manner by interview-based questionnaire at time of enrollment. The IPM BioMe Biobank contributed ~10,600 DNA samples for whole genome sequencing to the TOPMed program. Samples were selected for the Coronary Artery Disease (CAD) and the Chronic Obstructive Pulmonary Disease (COPD) working groups. 
Using a Case-Definition-Algorithm (CDA), we identified ~4,100 individuals with CAD (~50% women) and ~3,000 individuals as controls (65% women). In addition, we identified ~800 individuals with COPD (62% women) and 1,800 individuals as controls (72% women). Another 600 BioMe participants with Atrial Fibrillation, all of African ancestry, were included.
This study (DA033813; PI: Andrew W Bergen; PMID: 26132489) includes samples from two laboratory studies of nicotine metabolism. The Pharmacokinetics of Nicotine Metabolism in Twins study (PKTWIN; PI: Gary E Swan; PMID: 15527659) was based on recruitment from a twin registry (PMID: 23084148). The Integrated Research Project on Tobacco Use and Dependence (IRP; PI: Gary E Swan; PMID: 14578134) was based on recruitment from a pedigree-based longitudinal study of risk factors for substance use, the Smoking in Families study (SMOFAM; DA03706; PI: Hy Hops). These two laboratory studies (PKTWIN and IRP/SMOFAM) served as the Stage I dataset to interrogate Drug Metabolizing Enzyme and Transporter genes with a targeted SNP array for association with the Nicotine Metabolite Ratio (NMR, the ratio of trans-3'-hydroxycotinine to cotinine), an established biomarker of nicotine metabolism. In addition to the laboratory studies, samples from eight RCTs (PMID: 23249876) with the NMR and smoking-related measures were used to test SNPs identified in Stage I (PMID: 26132489). In a third stage, a lung cancer meta-analysis database (PMID: 24880342) was used to assess association of SNPs identified in Stage II with lung cancer. The objectives of the study were to identify novel genes and SNPs contributing to nicotine metabolism (Stage I), and to validate PK SNPs associated with the NMR from individuals participating in a clinical laboratory protocol with the NMR obtained from treatment-seeking smokers, and then to investigate association with prospective smoking cessation (Stage II). This study built upon existing studies of nicotine metabolism and randomized trials of smoking cessation therapies. Enhanced knowledge of the genes influencing nicotine metabolism and prospective abstinence may help personalize smoking cessation treatment and risk assessment for smoking-related diseases. 
For Stage I, both subject [fixed-dose NMR, covariates (age, BMI, ethnicity, sex, smoking status, and hormone use), and pedigree relationships] and sample (common DMET SNP genotype, genotyping quality control) data are available in this accession. The analysis protocol, quality control summaries, summary genotype, summary phenotype, and analysis results are available for Stage I, II and III samples (PMID: 26132489). Extensive discussion of the prior literature on CYP2A6 association with the NMR, abstinence, smoking heaviness and lung cancer risk is available (PMID: 26132489). The NMR has previously been associated with CYP2A6 activity, response to smoking cessation treatments, and cigarette consumption. We searched for drug metabolizing enzyme and transporter (DMET) gene variation associated with the NMR and prospective abstinence in 2,946 participants of laboratory studies of nicotine metabolism and of clinical trials of smoking cessation therapies. Stage I was a meta-analysis of the association of 507 common single nucleotide polymorphisms (SNPs) at 173 DMET genes with the NMR in 449 participants of two laboratory studies. Nominally significant associations were identified in ten genes after adjustment for intragenic SNPs; CYP2A6 and two CYP2A6 SNPs attained experiment-wide significance adjusted for correlated SNPs (CYP2A6 PACT=4.1E-7, rs4803381 PACT=4.5E-5, rs1137115 PACT=1.2E-3). Stage II comprised mega-regression analyses of 10 DMET SNPs with pretreatment NMR and prospective abstinence in up to 2,497 participants from eight trials. The rs4803381 and rs1137115 SNPs were associated with pretreatment NMR at genome-wide significance. In post-hoc analyses of CYP2A6 SNPs, we observed nominally significant association with: abstinence in one pharmacotherapy arm; cigarette consumption among all trial participants; and lung cancer in four case-control studies. CYP2A6 minor alleles were associated with reduced NMR, cigarettes per day (CPD), and lung cancer risk. 
We confirmed the major role that CYP2A6 plays in nicotine metabolism, and made novel findings with respect to genome-wide significance and associations with CPD, abstinence and lung cancer risk. Additional multivariate analyses with patient variables and genetic modeling will improve prediction of nicotine metabolism, disease risk and smoking cessation treatment prognosis.