Rhabdomyosarcoma (RMS) describes rare soft-tissue tumors that exhibit features of skeletal muscle differentiation. The most common subtypes in children are alveolar and embryonal rhabdomyosarcoma, with the alveolar subtype characterized by PAX3/7 fusions. A lesser-known and rarer subtype, pleomorphic rhabdomyosarcoma (PRMS), occurs most frequently in adults between the ages of 40 and 50. This pleomorphic subtype is often misdiagnosed, and little is known about its molecular characteristics. Here, we conducted comprehensive genomic, transcriptomic, and methylation profiling of these tumors.
Genetic analysis of patients with Inherited Retinal Dystrophies (IRDs) was carried out by Whole Genome Sequencing (WGS). The main purpose of this study is to identify simple and complex mutations responsible for IRD in patients. WGS was performed on selected affected and unaffected individuals using the Illumina HiSeqX10. The reads were aligned to human reference genome build 19 (hg19), and variant calling was performed using the Genome Analysis Toolkit (GATK). The genotyping quality of single nucleotide variants (SNVs) and indels was assessed using the variant quality score recalibration approach implemented in GATK. Copy number variations (CNVs) were called using Genome STRiP and SpeedSeq. This large set of whole genome sequencing data from individuals of different ethnicities can be stored and shared through dbGaP. These data could serve as a source for checking the frequencies or the pathogenicity of selected variants in different ethnicities.
This study is a part of NHGRI's Center for Common Disease Genomics (CCDG), which is a collaborative large-scale genome sequencing effort to comprehensively identify rare risk and protective variants contributing to multiple common disease phenotypes. Current estimates anticipate that the CCDG program will sequence approximately 140K whole genomes and 225K whole exomes during the life of the project. The Cardiovascular Disease working group of the CCDG considered five diseases: early-onset coronary artery disease (EOCAD), stroke, atrial fibrillation, congestive heart failure and type 2 diabetes. Atrial fibrillation (AF) is projected to affect between 6 and 12 million individuals in the US by 2050. AF is also associated with increased risks of stroke, dementia, heart failure, and death, and with high health care costs. Many risk factors for AF have been identified, including advancing age, cardiovascular disease (CVD), and CVD risk factors. However, little is known about how to prevent AF. Furthermore, therapies for AF are only partially effective, and are themselves associated with substantial morbidity. Previously, heritable forms of AF were considered rare; yet in the last decade, it has been established that AF, and in particular early-onset forms of AF, are heritable. Genome-wide association studies (GWAS) provide a powerful tool to identify common variants underlying disease risk. The AFGen Consortium currently consists of investigators from more than 25 studies with >20,000 individuals with AF and >100,000 without AF. In the latest analyses, 14 loci have been identified for AF. Broadly, the loci implicate genes related to cardiopulmonary development, cardiac-expressed ion channels, and cell signaling molecules.
Source: https://ccdg.rutgers.edu/sites/default/files/CCDG_CVD_EOAF_FINAL_w_link.pdf
Analysis of 165 pharmacokinetic-related gene polymorphisms (the number of polymorphisms may increase) using DNA derived from the blood of approximately 1,000 individuals from the general Japanese population.
The Isogo Lab (Molecular Life Sciences 2, Tokai University School of Medicine) has been trying to find new genes that contribute to the development of multiple diseases, and to develop new methods for human genome diversity analysis that will lead to disease-screening technology by finding disease-related genes and incorporating gene analysis technology. Using this method, we are trying to clarify the relationship between genes and the characteristics of rheumatoid arthritis and psoriasis vulgaris in order to find effective treatments and prevention methods. Raw sequencing data, metadata, VCF files, and individual-level phenotype data are available at https://anvilproject.org/data. For questions about availability, contact help@lists.anvilproject.org.
Type 2 diabetes mellitus (T2D) affects approximately 21 million individuals in the U.S., or almost 10% of the U.S. adult population. Because diabetes is determined by both genetic and environmental factors, a better understanding of the etiology of diabetes requires a careful investigation of gene-environment interactions. The Nurses' Health Study (NHS) and Health Professionals' Follow-up Study (HPFS) are well-characterized cohort studies of women and men for whom stored blood and DNA samples are available as well as detailed information on dietary and lifestyle variables. The major goals of the project include: 1. To conduct a GWA analysis among 3,000 cases of T2D and 3,000 healthy controls in NHS/HPFS cohorts. 2. To use information on the joint effects of genes and a list of carefully selected environmental exposures at the initial screening stage to test gene-environment interactions. This approach optimizes our power to detect variants that have a sizeable marginal effect and those with a small marginal effect but a sizeable effect in a stratum defined by an environmental exposure. For this analysis, we have developed a joint test of genetic marginal effect and gene-environment interaction. This flexible two-degree-of-freedom test generally provides greater power than standard methods and has the potential to uncover both marginal genetic effects and stratum-specific effects. The Version 1 (v1) dbGaP release of data from the GENEVA Diabetes Study (NHS/HPFS) includes data from the NHS only. The Version 2 (v2) dbGaP release includes data from both the NHS and HPFS. This study is part of the Gene Environment Association Studies initiative (GENEVA, http://www.genevastudy.org) funded by the trans-NIH Genes, Environment, and Health Initiative (GEI). The overarching goal is to identify novel genetic factors that contribute to type 2 diabetes mellitus through large-scale genome-wide association studies of well-characterized cohorts of nurses and health professionals. 
Genotyping was performed at the Broad Institute of MIT and Harvard, a GENEVA genotyping center. Data cleaning and harmonization were done at the GEI-funded GENEVA Coordinating Center at the University of Washington.
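The joint two-degree-of-freedom test of genetic marginal effect and gene-environment interaction described above can be illustrated with a minimal sketch. It assumes one common construction, a likelihood-ratio test comparing a logistic model containing only the exposure against a model that adds the genotype and the genotype-by-exposure product; this is not necessarily the exact procedure used in the study. All function names, the simulated effect sizes, and the sample size are hypothetical and chosen for illustration only.

```python
import math
import random

def _solve(a, b):
    """Solve a small dense linear system a.x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson; return the maximized log-likelihood."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            eta = sum(b * x for b, x in zip(beta, xi))
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    hess[a][c] += w * xi[a] * xi[c]
        step = _solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    loglik = 0.0
    for xi, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, xi))
        loglik += yi * eta - math.log1p(math.exp(eta))
    return loglik

def joint_test(g, e, y):
    """Joint 2-df likelihood-ratio test of the G main effect and the GxE interaction."""
    null_X = [[1.0, ei] for ei in e]                          # intercept + E
    full_X = [[1.0, ei, gi, gi * ei] for gi, ei in zip(g, e)]  # adds G and GxE
    stat = max(2.0 * (fit_logistic(full_X, y) - fit_logistic(null_X, y)), 0.0)
    p_value = math.exp(-stat / 2.0)  # chi-square survival function for exactly 2 df
    return stat, p_value

def simulate(n, seed=0):
    """Simulate a variant with no effect in the unexposed but an effect in the exposed."""
    rng = random.Random(seed)
    g, e, y = [], [], []
    for _ in range(n):
        gi = float(sum(rng.random() < 0.3 for _ in range(2)))  # genotype coded 0/1/2
        ei = 1.0 if rng.random() < 0.5 else 0.0                # binary exposure
        eta = -0.5 + 0.3 * ei + 0.8 * gi * ei                  # hypothetical effect sizes
        yi = 1 if rng.random() < 1.0 / (1.0 + math.exp(-eta)) else 0
        g.append(gi)
        e.append(ei)
        y.append(yi)
    return g, e, y

if __name__ == "__main__":
    g, e, y = simulate(3000, seed=1)
    stat, p_value = joint_test(g, e, y)
    print(f"joint 2-df LRT statistic = {stat:.2f}, p = {p_value:.3g}")
```

Because the null model drops the genotype term and the interaction term together, a single statistic picks up a variant whose effect is confined to one exposure stratum as well as one with a plain marginal effect. With exactly two degrees of freedom the chi-square tail probability reduces to exp(-stat/2), so the sketch needs no statistics library.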
This file set has 1478 Greenlandic individuals scored on the Illumina MEGA array (1,748,250 sites). The data is in PLINK bed/bim/fam format. The individuals originate from the B2018 population survey.
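As a quick sanity check on a downloaded file set of this kind, the expected size of the .bed file can be derived from the sample and site counts given above. The sketch below assumes the standard PLINK 1 variant-major binary layout (three header bytes, then two bits per genotype with each variant padded to a whole byte); the function names are illustrative and not part of any PLINK tool.

```python
import math

# PLINK 1 binary header: two magic bytes plus 0x01 for the variant-major layout
PLINK_BED_MAGIC = bytes([0x6C, 0x1B, 0x01])

def expected_bed_size(n_samples: int, n_variants: int) -> int:
    """Size in bytes of a variant-major PLINK 1 .bed file."""
    bytes_per_variant = math.ceil(n_samples / 4)  # 2 bits per genotype, padded per variant
    return len(PLINK_BED_MAGIC) + bytes_per_variant * n_variants

# 2-bit genotype code -> number of A1 alleles (None marks a missing genotype)
_CODE_TO_A1_COUNT = {0b00: 2, 0b01: None, 0b10: 1, 0b11: 0}

def decode_variant_block(block: bytes, n_samples: int) -> list:
    """Decode one variant's packed genotypes; the lowest-order bit pair is the first sample."""
    calls = []
    for i in range(n_samples):
        code = (block[i // 4] >> (2 * (i % 4))) & 0b11
        calls.append(_CODE_TO_A1_COUNT[code])
    return calls

if __name__ == "__main__":
    # 1478 individuals and 1,748,250 sites, as stated in the description
    print(expected_bed_size(1478, 1_748_250))  # prints 646852503
```

If the size on disk differs from this figure, the .bed file is probably truncated or does not match the accompanying .bim and .fam files.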
This dataset contains the TCR-beta sequences of 10 head and neck squamous carcinomas and 19 nasopharyngeal carcinomas. Libraries were prepared by a customised targeted amplification of the VDJ regions and sequenced on the Illumina MiSeq.
This submission contains the sequencing data used in the CRISPR iPSC methods paper. Specifically, it comprises three FASTQ files, each representing one replicate of an experiment to transduce the Toronto KnockOut CRISPR Library - Version 3 (TKOv3) into induced pluripotent stem cell (iPSC) derived macrophages. The sequenced material is the TKOv3 guide RNAs extracted from the transduced iPSC derived macrophages.
This study is the first phase of the Moroccan Genome Project, which included the complete sequencing of 109 genomes from the Kingdom of Morocco. The sequencing was performed using the Illumina NovaSeq 6000 platform, with a mean coverage of 30X.
This is human phenotype data for participants in a gut microbiome study. This data was collected at the same time as the stool samples used for the microbiome component. Participants were also part of the AWI-Gen Phase 1 main study. https://www.ebi.ac.uk/ena/data/view/PRJEB40733
The Electronic Medical Records and Genomics (eMERGE) Network is a National Institutes of Health (NIH)-organized and funded consortium of U.S. medical research institutions. The primary goal of the eMERGE Network is to develop, disseminate, and apply approaches to research that combine biorepositories with electronic medical record (EMR) systems for genomic discovery and genomic medicine implementation research. eMERGE was announced in September 2007 and began its third phase in September 2015. eMERGE III consists of nine study sites, two central sequencing and genotyping facilities, and a coordinating center. eMERGE Phase III aims to: 1) sequence and assess the phenotypic implication of rare variants in a custom designed eMERGEseq panel consisting of 109 genes (including 56 ACMG actionable finding list genes and the top 6 genes from each site relevant to their specific aims), as well as approximately 1400 SNPs; 2) assess the phenotypic implications of these variants by developing, validating and implementing new phenotype algorithms; 3) integrate genetic variants into EMRs to inform clinical care; and 4) create community resources. Included in this study are: ~24,000 eMERGE participants from 10 eMERGE III study sites; corresponding demographics and body mass index measurements; and top PheWAS codes generated from a collated list of ICD codes from all study sites. Study sites and participants include: Cincinnati Children's Hospital Medical Center (CCHMC): Cincinnati Children's Hospital Medical Center (CCHMC) is a not-for-profit hospital and research center pioneering breakthrough treatments, providing outstanding family-centered patient care and training healthcare professionals for the future, and dedicated to improving health and welfare of children and to the shared purpose of discovery and practical application of new genomic information to the ordinary care of children.
We bring a comprehensive electronic health record (EPIC), a deidentified i2b2 data warehouse of 680K patient records, a biobank with >261,000 consents that allow return of results to >84,000 patients and guardians who have provided DNA samples, and hundreds of faculty and senior staff who make genomics or informatics an active focus of their research. CCHMC will help the eMERGE III Steering Committee identify genes for the eMERGE III targeted sequencing panel, provide 3,000 DNA samples from CCHMC patients to be sequenced, review targeted gene panels from clinical care at CCHMC for somatic mosaicism and reinterpretation, and further develop and disseminate a software workflow suite for sequence analysis. We will also extend our work generating phenotype algorithms using heuristic and machine learning methods to many new childhood diseases. We will develop tools to evaluate adolescent return of results preferences, examine the ethical and legal obligations and potential to reanalyze results, and develop clinical decision support for phenotyping, test ordering, and returning sequencing results. Children's Hospital of Philadelphia (CHOP): The Center for Applied Genomics (CAG) is a specialized Center of Emphasis at the Children's Hospital of Philadelphia (CHOP), and one of the world's largest genetics research programs, with access to state-of-the-art high-throughput sequencing and genotyping technology. Our primary goal is to translate basic research findings into medical innovations. We aim to develop new and better ways to diagnose and treat children affected by rare and complex medical disorders, including asthma, autism, epilepsy, pediatric cancer, learning disabilities, and a range of rare diseases. Ultimately, our objective is to generate new diagnostic tests and to guide physicians to the most appropriate therapies.
Participants were recruited from the CAG biorepository (n>450,000), specifically from >100,000 CHOP pediatric patients and family members, a group enriched for rare diseases (n>12,000). Center for Applied Genomics, The Children's Hospital of Philadelphia: We gratefully thank all the children and their families who enrolled in this study, and all individuals who donated blood samples for research purposes. Genotyping for this project was performed at the Center for Applied Genomics and supported by an Institutional Development Award from The Children's Hospital of Philadelphia. Sequencing was supported by the National Institutes of Health through an award from the National Human Genome Research Institute's Electronic Medical Records and Genomics (eMERGE) program (U01HG008684). Columbia University: The goal of the Columbia eMERGE III project is to develop methods for integrating genomic data in EHRs and to study the impact of such genomic informatics interventions on the health of a diverse, underserved urban adult English- and Spanish-speaking patient population in Northern Manhattan served by the Columbia University Medical Center/New York-Presbyterian Hospital system. The study group comprises 2500 patients, recruited from diverse clinics and community outreach centers, of self-reported White (~61%), Asian (~11%), African-American (~11%), American Indian/Alaska Native (<1%) racial and Hispanic (~33%) ethnic backgrounds. There are two subgroups in the study cohort: a retrospective group (N=1052) that includes patients from oncology and nephrology clinics, and a prospective one (N=1448) that includes healthy individuals as well as participants with diverse medical conditions. Confirmed pathogenic variants in 70 selected genes will be returned to participants and their healthcare providers through the EHR integration. Participants are able to choose the results they receive and will have the freedom to meet with a genetic counselor and a geneticist to review results.
The impact of genetic testing on clinical care is determined by periodic monitoring of EHRs. Geisinger: Samples and phenotype data in this study were provided by the Geisinger MyCode® Community Health Initiative. Participants are recruited across the Geisinger System via online consents or in-person consents at a hospital or clinic visit. Enrollment is ongoing with over 100,000 individuals currently consented. Partners Healthcare (Harvard University): The Partners HealthCare Biobank is a large research program designed to help researchers understand how people's health is affected by their genes, lifestyle, and environment. This large research data and sample repository provides access to high-quality, consented blood samples to help foster research, advance our understanding of the causes of common diseases, and advance the practice of medicine. For the Partners research community (Massachusetts General Hospital and Brigham and Women's Hospital), the Biobank provides: banked samples (plasma, serum, and DNA) collected from consented patients; blood samples that were discarded after clinical testing in the Crimson Cores maintained in the Brigham and Women's Hospital and Massachusetts General Hospital Pathology Departments; sample handling and preparation services; a link from the biobank data to the Partners Research Patient Data Registry (RPDR), a research instance of our electronic clinical chart; and data access through our research portal. To date, over 70,000 Partners patients have given their consent to enroll, to give a blood sample, to receive research results, and to be re-contacted for additional research studies. The Biobank has enabled Partners investigators to compete for nationally recognized grants in personalized medicine such as a clinical electronic Medical Records and Genomics network (eMERGE) site and the national All of Us program. The Biobank currently supports over 120 Partners investigators and over 130 million dollars in NIH research.
Kaiser Permanente Washington (KPWA) / University of Washington (UW): KPWA participants were enrolled in the eMERGE Network through the Northwest Institute of Genetic Medicine (NWIGM) biorepository, and provided the appropriate consent to receive clinically relevant genetic results (N=2,500). NWIGM is based at the University of Washington and co-managed by the University of Washington and KPWA. The purpose of the NWIGM biorepository is to build infrastructure and resources to carry out a broad range of future genetic research. KPWA members enrolled in the biorepository are asked to provide informed consent to provide a DNA sample for storage in the NWIGM biorepository. The consent is purposefully broad to serve the dual purpose of reducing the burden on researchers who wish to use this biorepository and on the IRB committees who will be responsible for reviewing these requests in the future. Participants were eligible if aged 50-65 years at the time of their enrollment into the NWIGM repository, living, enrolled in KPWA's integrated group practice, and had completed an online Health Risk Appraisal. The selection algorithm was based on several data sources from the EHR at KPWA. 1) Demographics - participants with self-reported race as Asian ancestry were prioritized and selected to enrich for non-European ancestry. The KPWA eMERGE cohort includes N=1,245 members of Asian ancestry. 2) Participants were also selected for a history of colorectal cancer (N=1,255), in order to enrich for germline pathogenic variants. Mayo Clinic: The Return of Actionable Variants Empirical (RAVE) Study was approved by the Mayo Clinic IRB. We recruited 2537 participants from Mayo Clinic biobanks in Rochester, MN, who had hypercholesterolemia or colon polyps, thereby enriching for familial hypercholesterolemia (FH) and monogenic causes of colorectal cancer (CRC).
Additional eligibility criteria were: 1) residents of Southeast MN who were alive and aged 18-70 years; 2) LDL-C level >155 or >120 mg/dl while on lipid-lowering therapy; 3) no known cause of secondary hyperlipidemia; and 4) no cognitive impairment or dementia that would compromise their ability to give written informed consent. Based on these criteria, we identified 5270 eligible patients and obtained informed consent from 3030 participants. Recruitment was conducted in waves and utilized mailed recruitment packets consisting of a study brochure, a written informed consent form, a baseline psychosocial questionnaire, and a return postage-paid envelope. DNA of 2537 participants was sent for CLIA-certified targeted sequencing of 109 genes including genes associated with FH and CRC. Targeted sequencing and genotyping were performed in a Clinical Laboratory Improvement Amendments (CLIA)-certified laboratory. Northwestern University: Samples and data used in this study were obtained from patients from Northwestern Medicine, an integrated healthcare system formed through a partnership of Northwestern Memorial HealthCare and Northwestern University Feinberg School of Medicine. Participants include a retrospective cohort from the Northwestern Pharmacogenomics Study, funded through the eMERGE II project, NHGRI (3U01HG006388-02S1), and a prospective cohort from the Genetic Testing and Your Health Study, funded through the eMERGE III project, NHGRI (U01HG008673). Patients were eligible to participate if they were 18 years or older and saw a physician at Northwestern Medicine. Patients consented to genetic testing and to allow their results to be placed in their electronic medical record. Vanderbilt University Medical Center: Vanderbilt University Medical Center (VUMC) participants were enrolled in the eMERGE Network through the Vanderbilt Genome-Electronic Records (VGER) project.
Patients provided the appropriate consent to receive clinically relevant genetic results (N=2,700). Participants were eligible if aged 21 or over, had a healthcare provider at VUMC, and visited the provider at least 3 times in the past 3 years. Meharry Medical College: Inclusion of ethnic groups in genomic research is critical to identify possible reasons for health disparities. African-Americans are being enrolled in various outpatient clinics of Nashville General Hospital at Meharry, an inner city hospital primarily serving a poorer patient group. A total of 500 African Americans with four cancer types demonstrating health disparities in this population (prostate, colon, breast, and lung) are identified and approached by clinical research coordinators. The purpose of the study is to determine if any genetic information can be identified from these patients who have or are at high risk of one of these disparate cancers. All participants provide written informed consent and HIPAA authorization to provide blood samples for broad research use and permission to access data in their hospital electronic medical record for research now and in the future. An extensive demographic profile is obtained and entered into a REDCap database. Blood samples are obtained for a panel of alleles from extracted DNA at Baylor. In addition, de-identified coded samples are processed and stored in a central biorepository for further DNA, RNA and proteomic analyses. The survey and phlebotomy are performed at the time of the initial contact and agreement to participate. Nearly all patients approached willingly agree to participate for potential benefit to themselves, family members, or humankind. Little concern is voiced about providing samples for genetic analysis. Study investigators will share results with the participants and providers if testing does not indicate high risk.
Results indicating increased risk or actionable alleles for the patient and/or family will be returned by a genetic counselor. The health of the patients in this cohort will continue to be followed in the EMR to identify any future associations that might explain health disparities in African Americans. Proposals will be reviewed from investigators to study the genetic or proteomic samples as well as the clinical and demographic information in the repository. Please note that this version of the dataset has a handful of mismatches between genotyped and provided sex. Data with the following IDs should be removed prior to analysis: 42025287, 42137441, 42412243, 42456938, 42456946, 42672223