Whole genome sequencing was conducted on 10 tumor/germline paired samples along with 20 additional unpaired tumor samples from patients with Waldenström's macroglobulinemia. Tumor lymphoplasmacytic lymphoma cells were obtained from CD19+ selected bone marrow mononuclear cells. Germline tissue was obtained from CD19-depleted peripheral blood mononuclear cells. High molecular weight DNA was then submitted to Complete Genomics for whole genome sequencing and aligned to HG19/NCBI human reference build 37.
This study consists of whole genome sequencing (target: average 30x coverage) of 110 European-ancestry (EA), early-onset, family-history-positive breast cancer cases, 21 Asian cases, 25 African-American cases, and 24 controls from six studies participating in the Discovery, Biology, and Risk of Inherited Variants in Breast Cancer (DRIVE) consortium, part of the NCI's Genetic Associations and Mechanisms in Oncology (GAME-ON) initiative (http://epi.grants.cancer.gov/gameon/).
MI cases were ascertained from two studies: (i) the British Heart Foundation Family Heart Study and (ii) the BRICCS Study. Control subjects were drawn from the controls recruited as part of the UK Aneurysm Growth Study (UKAGS). All exome sequencing was performed at the Broad Institute of Harvard and MIT; sequence capture was performed using Illumina's ICE Capture reagent, and sequencing was performed on an Illumina HiSeq 2000 or 2500.
The goal of this study is to identify previously unknown genetic causes of human syndromic cleft lip/palate by using genome sequencing. Many previous studies have shown that isolated cleft lip and/or palate is seldom a monogenic or Mendelian disease. We hope that by focusing on syndromic presentations, we will find novel causes of disorders that include orofacial clefting among their phenotype(s). Data provided will be FASTQ files from sequencing partners.
We generated droplet-based 10x Multiome (paired RNA+ATAC) data from PBMCs of 38 IBD patients. Patients have either ulcerative colitis (UC) or Crohn's disease (CD). To test the responsiveness of the cells to bacterial stimuli, we stimulated samples with either RPMI, LPS or S. Salmonella (total: 80 samples). Single-nuclei libraries are multiplexed across donors. Genotypes are provided in this dataset to perform genetic demultiplexing, and phenotypes provide the mapping information needed to retrace donors, pools, and stimulations.
Single-cell RNA-sequencing of brain organoids at day 120 grown in four distinct growth protocols designed to generate dorsal (D) and ventral (V) forebrain, midbrain (M) and striatum (S) tissue using eight different pluripotent stem cell lines. For each cell line-protocol combination, two organoids (biological replicates) were grown. For cell line 176, the experiment was conducted in two independent biological replicates (repetitions E1 and E2).
Owing to its rarity, only a few studies have reported the molecular characteristics of adult cerebellar glioblastoma (C-GBM), a subtype located within the infratentorial brain region that comprises 1% of glioblastoma cases. By identifying genomic profiles from 19 adult C-GBM samples, we revealed the genetic intertumoral heterogeneity in C-GBM as well as genomic characteristics distinct from those of supratentorial glioblastomas (S-GBMs), emphasizing the need for individualized therapies for C-GBM patients.
Somatic RNA for 40 samples matched to the WGS was extracted using the Qiagen QIAsymphony RNA protocol (cat no 931636). The tissue was initially homogenised using a Qiagen Bioruptor, followed by the manufacturer's recommended protocol (including DNase digestion). The resulting RNA then underwent quality control as follows: firstly, A260 and A280 were measured on a Denovix DS-11 Fx to qualitatively assess RNA purity from the A260/280 and A260/230 ratios. A260/280 had to be 2.0 and A260/230 had to be 2.0-2.2. Then RNA was quantified using the Life Technologies Qubit RNA BR kit (cat no Q10210). RNAseq was carried out by the Edinburgh Clinical Research Facility on an Illumina NextSeq 500. Total RNA samples were assessed on the Agilent Bioanalyser (Agilent Technologies, #G2939AA) with the RNA 6000 Nano Kit (#5067-1512) for quality and integrity of total RNA, and then quantified using the Qubit 2.0 Fluorometer (Thermo Fisher Scientific Inc, #Q32866) and the Qubit RNA HS assay kit (#Q32855). Libraries were prepared from total RNA samples using the NEBNext Ultra II Directional RNA library prep kit for Illumina (#E7760S) with the NEBNext rRNA Depletion kit (#E6310) according to the provided protocol. 400 ng of total RNA was added to the ribosomal RNA (rRNA) depletion reaction using the NEBNext rRNA depletion kit (Human/mouse/rat) (#E6310). This step uses specific probes that bind to the rRNA in order to cleave it. rRNA-depleted RNA was then DNase treated and purified using Agencourt RNAClean XP beads (Beckman Coulter Inc, #66514). RNA was then fragmented and primed with random primers before undergoing first strand and second strand synthesis to create cDNA. cDNA was end repaired before ligation of sequencing adapters, and libraries were enriched by PCR using the NEBNext Multiplex oligos for Illumina set 1 and 2 (#E7500). Final libraries had an average peak size of 271 bp.
Libraries were quantified by fluorometry using the Qubit dsDNA HS assay and assessed for quality and fragment size using the Agilent Bioanalyser with the DNA HS Kit (#5067-4626). Sequencing was performed using the NextSeq 500/550 High-Output v2 (150 cycle) Kit (#FC-404-2002) on the NextSeq 550 platform (Illumina Inc, #SY-415-1002). Libraries were combined in an equimolar pool based on the library quantification results and run across five High-Output v2.5 flow cells.
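The equimolar pooling step mentioned above can be illustrated with a small calculation. This is only a sketch under stated assumptions, not the facility's actual procedure: it uses the common approximation of about 660 g/mol per base pair of double-stranded DNA, and the function names, sample names, and the target amount per library are hypothetical.

```python
# Sketch: convert fluorometric concentration (ng/uL) and mean fragment size (bp)
# into molarity (nM), then compute the volume of each library needed so that
# every library contributes the same number of femtomoles to the pool.

# Assumption: average mass of one dsDNA base pair is ~660 g/mol.
AVG_BP_MASS_G_PER_MOL = 660.0


def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Return library molarity in nM from concentration and mean fragment size."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * mean_fragment_bp)


def equimolar_volumes_ul(conc_ng_per_ul: dict, mean_fragment_bp: dict,
                         fmol_per_library: float) -> dict:
    """Return the volume (uL) of each library that contains fmol_per_library.

    Uses the identity 1 nM == 1 fmol/uL, so volume = fmol / nM.
    """
    volumes = {}
    for name, conc in conc_ng_per_ul.items():
        molarity = library_molarity_nm(conc, mean_fragment_bp[name])
        volumes[name] = fmol_per_library / molarity
    return volumes


if __name__ == "__main__":
    # Hypothetical concentrations; 271 bp is the average peak size reported above.
    concentrations = {"lib_A": 4.2, "lib_B": 6.8}
    sizes = {"lib_A": 271, "lib_B": 271}
    for name, vol in equimolar_volumes_ul(concentrations, sizes, 20.0).items():
        print(f"{name}: {vol:.2f} uL")
```

More concentrated libraries contribute proportionally smaller volumes, which is what keeps read depth comparable across samples sharing a flow cell.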
Somatic RNA for 37 samples was extracted using the Qiagen QIAsymphony RNA protocol (cat no 931636). The tissue was initially homogenised using a Qiagen Bioruptor, followed by the manufacturer's recommended protocol (including DNase digestion). The resulting RNA then underwent quality control as follows: firstly, A260 and A280 were measured on a Denovix DS-11 Fx to qualitatively assess RNA purity from the A260/280 and A260/230 ratios. A260/280 had to be 2.0 and A260/230 had to be 2.0-2.2. Then RNA was quantified using the Life Technologies Qubit RNA BR kit (cat no Q10210). RNAseq was carried out by the Edinburgh Clinical Research Facility on an Illumina NextSeq 500. Total RNA samples were assessed on the Agilent Bioanalyser (Agilent Technologies, #G2939AA) with the RNA 6000 Nano Kit (#5067-1512) for quality and integrity of total RNA, and then quantified using the Qubit 2.0 Fluorometer (Thermo Fisher Scientific Inc, #Q32866) and the Qubit RNA HS assay kit (#Q32855). Libraries were prepared from total RNA samples using the NEBNext Ultra II Directional RNA library prep kit for Illumina (#E7760S) with the NEBNext rRNA Depletion kit (#E6310) according to the provided protocol. 400 ng of total RNA was added to the ribosomal RNA (rRNA) depletion reaction using the NEBNext rRNA depletion kit (Human/mouse/rat) (#E6310). This step uses specific probes that bind to the rRNA in order to cleave it. rRNA-depleted RNA was then DNase treated and purified using Agencourt RNAClean XP beads (Beckman Coulter Inc, #66514). RNA was then fragmented and primed with random primers before undergoing first strand and second strand synthesis to create cDNA. cDNA was end repaired before ligation of sequencing adapters, and libraries were enriched by PCR using the NEBNext Multiplex oligos for Illumina set 1 and 2 (#E7500). Final libraries had an average peak size of 271 bp.
Libraries were quantified by fluorometry using the Qubit dsDNA HS assay and assessed for quality and fragment size using the Agilent Bioanalyser with the DNA HS Kit (#5067-4626). Sequencing was performed using the NextSeq 500/550 High-Output v2 (150 cycle) Kit (#FC-404-2002) on the NextSeq 550 platform (Illumina Inc, #SY-415-1002). Libraries were combined in an equimolar pool based on the library quantification results and run across five High-Output v2.5 flow cells.
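The RNA purity criteria stated earlier in this description (A260/280 of 2.0 and A260/230 of 2.0-2.2) could be applied programmatically along the following lines. This is a hedged sketch: the description does not say how strictly "2.0" was interpreted, so the tolerance on the A260/280 ratio is an assumption, as are the function and parameter names.

```python
# Sketch: check spectrophotometric purity ratios against the stated criteria.

def passes_purity_qc(a260: float, a280: float, a230: float,
                     ratio_280_target: float = 2.0,
                     ratio_280_tol: float = 0.05,  # assumed tolerance, not from the source
                     ratio_230_range: tuple = (2.0, 2.2)) -> bool:
    """Return True if both absorbance ratios satisfy the purity criteria."""
    ratio_280 = a260 / a280  # low values commonly suggest protein carry-over
    ratio_230 = a260 / a230  # low values commonly suggest salt or solvent carry-over
    ok_280 = abs(ratio_280 - ratio_280_target) <= ratio_280_tol
    ok_230 = ratio_230_range[0] <= ratio_230 <= ratio_230_range[1]
    return ok_280 and ok_230


if __name__ == "__main__":
    # Hypothetical absorbance readings for two samples.
    print(passes_purity_qc(a260=2.0, a280=1.0, a230=0.95))
    print(passes_purity_qc(a260=2.0, a280=1.2, a230=0.95))
```

Keeping the thresholds as parameters makes it easy to rerun the check if the acceptance window actually used by the laboratory differs from the one assumed here.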
The electronic Medical Records and Genomics (eMERGE) Network is a consortium of ten participating sites (Cincinnati Children's Hospital Medical Center/Boston Children's Hospital, Children's Hospital of Philadelphia, Essentia Institute of Rural Health, Marshfield Clinic Research Foundation and Pennsylvania State University, Geisinger Clinic, Group Health Cooperative/University of Washington, Mayo Clinic, Icahn School of Medicine at Mount Sinai, Northwestern University, Vanderbilt University Medical Center) funded by the NHGRI to investigate the use of electronic medical record (EMR) systems for genomic research. The goal of eMERGE is to conduct genome-wide association studies in approximately 55,000 individuals using EMR-derived phenotypes and DNA from linked Biorepositories. Using electronic phenotyping methods, the consortium used DNA samples from all participating sites to explore the genetic determinants of over forty phenotypes, including Abdominal aortic aneurysm; Ace-Inhibitor/Cough; Attention Deficit Hyperactivity Disorder; Age-related macular disease; Appendicitis; Asthma; Atopic Dermatitis; Autism; Benign Prostatic Hyperplasia; Carotid artery disease as a Quantitative Measure; caMRSA; Cataract; Clostridium difficile colitis; Extreme Obesity; Chronic Kidney Disease; Chronic Kidney Disease and Type 2 Diabetes; Chronic Kidney Disease, Type 2 Diabetes and Hypertension; Colon Polyps; Cardiorespiratory Fitness; Dementia; Diverticulosis; Diabetic retinopathy; Gastroesophageal Reflux Disease; Glaucoma; Height; Heart failure; Hypothyroidism; Lipids; Ocular hypertension; Peripheral Arterial Disease; QRS duration; Red blood cell indices; Remission of Diabetes after ROUX-EN-Y gastric bypass surgery; Resistant hypertension; MACE while on Statins; Type 2 Diabetes; Venous Thromboembolism; White blood cell indices; and Zoster virus infection, as well as using the phenome-wide association study (PheWAS) paradigm to replicate and discover relationships between targeted 
genotypes and multiple phenotypes. Sites and participants include: Children's Hospital of Philadelphia (CHOP): The Center for Applied Genomics (CAG) at the Children's Hospital of Philadelphia (CHOP) is a high-throughput, highly automated genotyping and sequencing facility equipped with state-of-the-art genotyping and sequencing platforms. Children who are treated at the Children's Hospital Healthcare Network and their parents may be eligible to take part in a major initiative to collect more than 100,000 blood samples, covering a wide range of pediatric diseases. A large majority of participants consenting to prospective genomic analyses also consent to analysis of their de-identified electronic medical records (EMRs). EMRs are longitudinal, with a mean duration of 6.5 years. Cincinnati Children's Hospital Medical Center/Boston Children's Hospital (CCHMC/BCH): Cincinnati Children's Hospital Medical Center (CCHMC) and Boston Children's Hospital (BCH) are pediatric institutions dedicated to improving health and welfare of children and to the shared purpose of discovery and practical application of new genomic information to the ordinary care of children. The CCHMC/BCH site has been built on a five-year history of collaboration, particularly in patient electronic medical record (EMR)-related informatics, the basis of much of eMERGE II. CCHMC and BCH together bring an extraordinary faculty to eMERGE II who are committed to diseases that afflict children, specifically phenotypes that focus upon diseases of children in ways that will leverage the available eMERGE adult GWAS and EMRs to discover meaningful use results. CCHMC/BCH plans to demonstrate real-time execution of phenotypic selection across their two distinct pediatric institutions as a model for ensuring phenotypic standardization and for national scalability.
They will also look carefully at parents' responses to, and use of, their children's research results, to better understand the factors that influence their decisions about learning incidental findings. In addition to patient and parent perceptions, CCHMC/BCH will also explore clinician perceptions of pharmacogenetic research results after EMR integration. Geisinger Health System: A research cohort of adult Geisinger Clinic patients was enrolled from community-based primary care clinics of the Geisinger Health System. Patients were eligible for enrollment if they were a primary care patient of a Geisinger Clinic physician and were scheduled for a non-emergent clinic visit. All participants provided written informed consent and HIPAA authorization. Consenting patients agreed to provide blood samples for broad biomedical research use, and permission to access data in their Geisinger electronic medical record for research. The enrollment rate was 90% of patients approached. The demographics of the cohort approximate those of the Geisinger Clinic outpatient population. Research blood samples were collected during an outpatient clinical phlebotomy encounter. Research blood samples are coded and stored in a central biorepository. Samples are linkable to clinical data in a de-identified manner for research via an IRB-approved data broker process. For genomic analysis, DNA is extracted from EDTA-anticoagulated whole blood. Group Health (GH)/University of Washington (UW): GH participants for the PGx project were enrolled in the eMERGE Network through the Northwest Institute of Genetic Medicine (NWIGM) biorepository, and provided the appropriate consent to receive clinically relevant genetic results (N~6300). Participants were eligible if aged 50-65 years at the time of their enrollment into the NWIGM repository, living, enrolled in GH's integrated group practice, and had completed an online Health Risk Appraisal.
The selection algorithm was based on several data sources from the EHR at Group Health: 1. Demographics - participants with self-reported race as Asian or African ancestry were prioritized and selected to enrich for non-European ancestry; 2. Diagnosis and procedure codes - participants were selected if found to have a history of hypertension, atrial fibrillation (AF), or congestive heart failure (CHF). Participants with a history of arrhythmia were added if the entire selection algorithm did not generate 900 individuals. We also enriched for participants with EHR evidence of actionable indications related to PGRNSeq genes. Participants were selected if found to have an ICD9 code for malignant hyperthermia, hypertension, atrial fibrillation, congestive heart failure or long QT syndrome (LQTS); 3. Laboratory values - if participants had any laboratory event of creatine kinase (CK) >1000, and were dispensed statins within 6 months of the event, then they were selected; and 4. Medications - participants were excluded if ever on carbamazepine or had a current regimen of warfarin. Essentia Institute of Rural Health, Marshfield Clinic, Pennsylvania State University (Marshfield): The Marshfield Clinic Personalized Medicine Research Project is a population-based biobank in central Wisconsin with more than 20,000 adult subjects who provided written, informed consent to access their medical records and provided a blood sample from which DNA was extracted and plasma and serum stored. In addition to an average of 30 years of medical history data, a questionnaire about environmental exposures, including a detailed food frequency questionnaire, is available to facilitate gene/environment studies. Mayo Clinic: The Mayo biobank is a disease-specific biobank for vascular diseases including peripheral arterial disease (PAD). PAD patients were identified from individuals referred to the non-invasive vascular laboratory for lower extremity arterial evaluation. 
Since 1997, laboratory findings have been recorded into an electronic database employing an in-house software package for data archiving and retrieval; this data becomes part of the Mayo EMR. Patients referred to the center with suspected PAD undergo a comprehensive non-invasive evaluation including the ankle-brachial index (ABI) - the ratio of blood pressure measured at the ankles to blood pressure measured in the upper arms. Control subjects are identified from patients referred to the Cardiovascular Health Clinic for stress ECG. The prevalence of PAD in patients with normal exercise capacity who do not have inducible ischemia on the stress ECG was <1%. Data regarding risk factors for atherosclerosis such as diabetes, dyslipidemia, hypertension, and smoking are ascertained from the EMR. Icahn School of Medicine at Mount Sinai (Mt. Sinai): The Institute for Personalized Medicine (IPM) Biobank Project is a consented, EMR-linked medical care setting biorepository of the Mount Sinai Medical Center (MSMC) drawing from a population of over 70,000 inpatients and 800,000 outpatient visits annually. MSMC serves diverse local communities of upper Manhattan, including Central Harlem (86% African American), East Harlem (88% Hispanic Latino), and Upper East Side (88% Caucasian/white) with broad health disparities. IPM Biobank populations include 28% African American (AA), 38% Hispanic Latino (HL) predominantly of Caribbean origin, 23% Caucasian/White (CW). IPM Biobank disease burden is reflective of health disparities with broad public health impact: average body mass index of 28.9 and frequencies of hypertension (55%), hypercholesterolemia (32%), diabetes (30%), coronary artery disease (25%), chronic kidney disease (23%), among others.
Biobank operations are fully integrated in clinical care processes, including direct recruitment from clinical sites, waiting areas and phlebotomy stations by dedicated Biobank recruiters independent of clinical care providers, prior to or following a clinician standard-of-care visit. Recruitment currently occurs at a broad spectrum of over 30 clinical care sites. Northwestern University: The NUgene Project is a repository with longitudinal medical information from participating patients at affiliated hospitals and outpatient clinics from the Northwestern University Medical Center. Participants' DNA samples are coupled with data from a self-reported questionnaire and continuously updated data from our Electronic Medical Record (EMR) representing actual clinical care events. Northwestern has a state-of-the-art, comprehensive inpatient and outpatient EMR system of over 2 million patients. NUgene has broad access to participant data for all outpatient visits as well as inpatient data via a consolidated data warehouse. NUgene participants consent to distribution and use of their coded DNA samples and data for a broad range of genetic research by third-party investigators. Vanderbilt University: BioVU, Vanderbilt's DNA databank, is an enabling resource for exploration of the relationships among genetic variation, disease susceptibility, and variable drug responses, and represents a key first step in moving the emerging sciences of genomics and pharmacogenomics from research tools to clinical practice. BioVU acquires DNA from discarded blood samples collected from routine patient care. The biobank is linked to de-identified clinical data extracted from Vanderbilt's EMR, which forms the basis for phenotype definitions used in genotype-phenotype correlations.