RNA-seq dataset used for the validation of CDK6 cis-regulatory mutation annotated by OncoCis. NB: BAM files for the manuscript A_Proteomic_Chronology_of_Gene_Expression_through_the_Cell_Cycle_in_Human_Myeloid_Leukemia_Cells are now available at the following link: http://www.ebi.ac.uk/ena/data/view/ERP008483
The dataset includes BAM, FASTQ and decompressed gVCF files for 50 samples from Benin generated for the H3Africa Chip Design Study.
WGS DATA FILES FOR SJPhLike
Description of the disorder: A very common disorder presenting to pediatricians/pediatric endocrinologists is childhood growth failure. Sometimes the cause is evident, for example, growth hormone deficiency. In other children, the etiology remains unknown despite extensive evaluation, resulting in the unhelpful diagnosis of severe idiopathic short stature (SISS). These conditions are quite heterogeneous, including children with isolated growth failure and others who also have other abnormalities such as developmental delay or a constellation of congenital anomalies (syndromic short stature). Sometimes, the disorder appears to primarily affect the growth plate, which drives skeletal growth and thereby determines overall body proportions, whereas in other children, the disorder affects skeletal and non-skeletal tissues equally. Some cases of SISS have a polygenic inheritance while others appear to follow a Mendelian inheritance model (recessive, dominant, or X-linked). Very recently, genome-wide analysis for copy number variants (CNVs) and whole-exome sequencing have begun to identify some of the molecular etiologies of these disorders (1). Identifying the molecular etiology of growth disorders has clinical and scientific value. Clinically, identifying a molecular cause prevents extensive further testing and may direct anticipatory care for associated medical problems. For example, we recently studied aggrecan (ACAN) gene mutations in families with autosomal dominant short stature and accelerated skeletal maturation. These mutations affect both growth plate cartilage, causing linear growth failure, and articular cartilage, causing osteochondritis dissecans and early-onset osteoarthritis (1). Etiological classification of idiopathic growth failure allows more precise characterization of prognosis and response to treatment, which are currently highly imprecise because of the locus heterogeneity.
In some cases, finding the genetic etiology points to a novel treatment approach that targets the specific molecular pathway involved. The proposed project is central to the main focus of our group, the Section on Growth and Development, NICHD. Our primary goal is to investigate cellular and molecular mechanisms governing childhood growth and to gain insight into the many human genetic disorders causing childhood growth failure. The proposed project is well suited for the intramural program because it takes advantage of Clinical Center expertise to phenotype subjects with SISS. Study subjects: We will study subjects with SISS and nuclear family members. SISS will be defined by height SDS < -2.5 for age without evident cause after routine evaluation including: growth hormone axis evaluation; thyroid function testing; celiac disease screening; urinalysis; CBC; chemistry; karyotype (girls, for Turner syndrome); and testing for single gene defects based on the clinical evaluation (for example, SHOX or Noonan-associated genes). Candidate families will include isolated growth disorders and growth disorders that are accompanied by congenital anomalies, developmental delay, or other syndromic short stature. Strong preference will be given to subjects with a severe phenotype and a pedigree that indicates a Mendelian inheritance. Multiple independent families with the same phenotype will have priority. The pool of applicants for recruitment is large, and we receive many inquiries by emails and phone consultations from pediatric endocrinologists for advice regarding diagnosis and management of unusual growth disorders, including familial disease. Often these families are seeking further evaluation and are willing to participate in a research study. 
From this pool, we will select pedigrees with very favorable Mendelian characteristics, for example de novo dominant occurrences where two normal parents have a child with SISS, and the child grows up to be an adult who passes this phenotype on to multiple grandchildren in the next generation. Subjects and family members will be brought to the NIH Clinical Center (NIHCC) for outpatient evaluation. Participants will be evaluated by pediatric endocrinology fellows (as part of our training program) and by senior staff to establish the clinical findings and construct a pedigree. Subjects will receive additional biochemical and imaging studies at the Clinical Center to complete the phenotyping and assign affected status. The growth abnormality will be evaluated by assessing body proportions, relative organ size, and skeletal imaging as indicated. Associated clinical abnormalities beyond altered growth will be characterized with the help of other Clinical Center subspecialists. Subjects and family members will be evaluated by SNP microarray and whole-exome sequencing. We anticipate 4-5 persons for each of 16-20 families, for a total of 72-90 whole-exome sequences. Half of the families will be recruited and studied within the first year and half in the second year. We will use freshly collected peripheral blood as the DNA source for SNP array and next-generation sequencing. Analytic approach: The candidate genes will be chosen based on 1) inheritance state consistency, 2) population frequency in the ESP and UDP databases, and 3) predictions of deleteriousness. We will use VarSifter and the Broad Institute Integrative Genomics Viewer to filter and visualize these data. The genetic model will be dependent on the family's pedigree. For a simple trio, we will explore variants using genetic models including autosomal recessive, de novo (dominant), compound heterozygous, deletion/point mutation recessive, and X-linked (male only).
The candidate variants will be identified using Boolean logic sets in VarSifter following intramural NHGRI/UDP methods. We will also use SNP microarray data to identify copy number variations, complete/single copy deletions, duplications, non-paternity, consanguinity for homozygosity mapping, uniparental isodisomy, mosaicism, and segregation patterns (bed file generation for use in VarSifter filter work). After a list of candidate sequence variants has been generated, annotation will include using the Exome Variant Server, PolyPhen-2, MutationTaster, SIFT, and CADD predictions of deleteriousness. Biological laboratory data will be included to prioritize candidate variants. Our group has expertise in the molecular mechanisms regulating both skeletal growth (2,3) and growth of other tissues (4,5), which may be helpful in this phase of the analysis. The most promising candidate mutations will be confirmed by Sanger sequencing and studied functionally, in vitro and/or in vivo. For mutations that affect skeletal growth, we will use experimental systems related to growth plate cartilage. For in vitro studies, we have experience transfecting chondrocyte cell lines, such as ATDC5, and primary chondrocytes. We will determine whether the mutation alters protein and/or cell function. In vivo studies can be used to explore pathophysiology. We have recently successfully used a new approach, the CRISPR/Cas9 system, to knock out multiple loci in mice (unpublished), which can be used again in the future to create mouse models efficiently. References: 1. J Clin Endocrinol Metab, 2014 (PMID: 24762113). 2. J Mol Endocrinol, 2014 (PMID: 24740736). 3. Hum Mol Genet, 2012 (PMID: 22914739). 4. Proc Natl Acad Sci U S A, 2013 (PMID: 23530192). 5. Endocr Rev, 2011 (PMID: 21441345).
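The trio-based genetic models named in the analytic approach above (de novo dominant, autosomal recessive, compound heterozygous, X-linked in a male proband) can be expressed as simple Boolean filters on genotypes. The following is a minimal illustrative sketch only, not the VarSifter workflow itself: the `TrioVariant` record, its field names, the genotype coding (number of alternate alleles) and the 1% frequency cutoff are all assumptions made for the example. The deletion/point mutation recessive model and the deleteriousness predictions are not covered, since they need CNV calls and external annotation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrioVariant:
    """One variant in a child/mother/father trio (hypothetical record layout)."""
    gene: str
    chrom: str       # e.g. "7" or "X"
    child: int       # genotypes coded as number of alternate alleles: 0, 1 or 2
    mother: int
    father: int
    pop_freq: float  # population allele frequency from a reference database


def fits_de_novo(v):
    # Heterozygous in the child, absent from both parents.
    return v.child == 1 and v.mother == 0 and v.father == 0


def fits_autosomal_recessive(v):
    # Homozygous in the child, both parents heterozygous carriers.
    return v.chrom != "X" and v.child == 2 and v.mother == 1 and v.father == 1


def fits_x_linked_male(v):
    # Male proband carries the allele on X, mother is a carrier, father is not.
    # Hemizygous calls may be coded as 1 or 2 depending on the caller.
    return v.chrom == "X" and v.child >= 1 and v.mother == 1 and v.father == 0


def compound_het_genes(variants):
    # Genes with at least one maternally and one paternally inherited
    # heterozygous autosomal variant in the child.
    maternal, paternal = set(), set()
    for v in variants:
        if v.chrom == "X" or v.child != 1:
            continue
        if v.mother == 1 and v.father == 0:
            maternal.add(v.gene)
        elif v.mother == 0 and v.father == 1:
            paternal.add(v.gene)
    return maternal & paternal


def candidate_genes(variants, max_freq=0.01):
    """Map each gene to the set of genetic models its rare variants fit.

    max_freq is an assumed cutoff, not a value taken from the study.
    """
    rare = [v for v in variants if v.pop_freq <= max_freq]
    comp_het = compound_het_genes(rare)
    result = {}
    for v in rare:
        models = set()
        if fits_de_novo(v):
            models.add("de_novo")
        if fits_autosomal_recessive(v):
            models.add("autosomal_recessive")
        if fits_x_linked_male(v):
            models.add("x_linked_male")
        if v.gene in comp_het and v.child == 1 and v.mother + v.father == 1:
            models.add("compound_het")
        if models:
            result.setdefault(v.gene, set()).update(models)
    return result
```

In practice each filter would be combined with the affected status of every family member in the pedigree, which this trio-only sketch ignores.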
This data set includes the following summary-level data files used for the 13k analysis of T2D-GENES data:
  wes.variants.list: list of variants to keep for any analysis of the exome data
  wes.assoc.samples.list: list of samples to keep for association analysis
  wes.assoc.variants.list: list of variants to keep for association analysis
  wes.sv.assoc.txt: single-variant association analysis results
  wes.gene.ptv.variants.list.txt: list of protein-truncating variants to use in gene-level analysis
  wes.gene.ptv.assoc.txt: results from gene-level tests of protein-truncating variants
  wes.gene.nsstrict.variants.list.txt: list of NSstrict variants to use in gene-level analysis
  wes.gene.nsstrict.assoc.txt: results from gene-level tests of NSstrict variants
  wes.gene.nsbroad.variants.list.txt: list of NSbroad variants to use in gene-level analysis
  wes.gene.nsbroad.assoc.txt: results from gene-level tests of NSbroad variants
  wes.gene.ns.variants.list.txt: list of nonsynonymous variants to use in gene-level analysis
  wes.gene.ns.assoc.txt: results from gene-level tests of nonsynonymous variants
These are the per-gene fragment counts, expressed as log2CPM (log2 counts per million), associated with the BAM files in EGAD00001003806, in tab-separated format. Counts for 36 postmortem brain samples from 9 non-demented control subjects and 9 subjects with hereditary cerebral hemorrhage with amyloidosis-Dutch type are included (1 frontal cortex sample and 1 occipital cortex sample per subject). RNA samples were depleted of ribosomal RNA with the Ribo-Zero Gold Human kit (Illumina), and strand-specific RNA-seq libraries were generated. Paired-end sequencing was performed on an Illumina HiSeq2500 system (2x50bp reads). Alignments were performed using GSNAP v2014-12-23 with setting "--npaths 1" on the GRCh38 reference genome without the alternative contigs. Fragment-per-gene counting was performed using HTSeq-count v0.6.1p1 with setting "--stranded reverse". The gene annotation used for quantification was UCSC RefSeq genes for GRCh38, downloaded on 2015-07-13.
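For readers unfamiliar with the log2CPM scale, the transformation from raw per-gene fragment counts can be sketched as below. This is an illustration under stated assumptions, not the submitters' pipeline: the description does not say which prior count or library-size definition was used, so the `prior_count` of 0.5, the `+ 1` offset on the library size, and the function and argument names are all hypothetical choices.

```python
import math


def log2_cpm(counts_by_sample, prior_count=0.5):
    """Convert raw fragment counts to log2 counts per million.

    counts_by_sample maps sample name -> {gene name -> raw count}.
    prior_count avoids log2(0); its value here is an assumption,
    not a parameter documented for this dataset.
    """
    result = {}
    for sample, gene_counts in counts_by_sample.items():
        # Library size: total fragments assigned to genes in this sample.
        library_size = sum(gene_counts.values())
        result[sample] = {
            gene: math.log2((count + prior_count) / (library_size + 1.0) * 1e6)
            for gene, count in gene_counts.items()
        }
    return result
```

Because each sample is scaled by its own library size, values are comparable across samples with different sequencing depths, and a difference of 1 on this scale corresponds to a two-fold difference in counts per million.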
Arcagen is an EORTC/SPECTA pan-European project that aims to recruit 1000 rare cancer patients from different tumour domains of EURACAN. This study collected samples of advanced or metastatic rare cancers from patients older than 12 years and analysed them using Foundation Medicine next-generation sequencing (NGS) panels (FoundationOne CDx for FFPE samples or FoundationOne Liquid CDx for blood samples). Here we are submitting the dataset that contains NGS files from rare thoracic malignancies (n=102).
Precision mapping of genetic alterations in cancer can enable better selection of therapies and improved outcomes when combined with new sequencing diagnostics. We describe whole-exome sequences from cervical adenocarcinomas and paired normal samples in Hong Kong Chinese women. These data uncover a heterogeneous genomic landscape but identify commonly aberrant loci including FAT1, ARID1A, ERBB2 and PIK3CA that may provide a focus for the development of individualized targeted therapies for Chinese women with cervical adenocarcinoma.
To define a genetic syndrome of combined immunodeficiency, severe autoimmunity, and developmental delay, four patients from two families who had similar syndromic features were studied. To identify disease-causing mutations, we performed whole-exome sequencing for one patient and her healthy parent from Family 1 and also for one patient from Family 2. Disease segregated with novel autosomal recessive mutations in a single gene, tripeptidyl-peptidase II (TPP2). The result defines a new human metabolic immunodeficiency.
Mutations in splicing factor genes are common in myelodysplastic syndromes, but the reason for selection of these mutations remains incompletely understood. This study aimed to identify the effect of minor intron retention due to ZRSR2 mutations in myelodysplastic syndromes. Nine samples from patients with myelodysplastic syndromes bearing ZRSR2 mutations and ten samples from patients with myelodysplastic syndromes not bearing any splicing factor mutations were subjected to RNA sequencing followed by transcriptomic analysis for mis-splicing events.