Identifying and understanding changes in cancer genomes is essential for the development of targeted therapeutics. Here we analyse systematically more than 70 pairs of primary human colon tumours by applying next-generation sequencing to characterize their exomes, transcriptomes and copy-number alterations. We have identified 36,303 protein-altering somatic changes that include several new recurrent mutations in the Wnt pathway gene TCF7L2, chromatin-remodelling genes such as TET2 and TET3 and receptor tyrosine kinases including ERBB3. Our analysis for significantly mutated cancer genes identified 23 candidates, including the cell cycle checkpoint kinase ATM. Copy-number and RNA-seq data analysis identified amplifications and corresponding overexpression of IGF2 in a subset of colon tumours. Furthermore, using RNA-seq data we identified multiple fusion transcripts including recurrent gene fusions involving R-spondin family members RSPO2 and RSPO3 that together occur in 10% of colon tumours. The RSPO fusions were mutually exclusive with APC mutations, indicating that they probably have a role in the activation of Wnt signalling and tumorigenesis. Consistent with this we show that the RSPO fusion proteins were capable of potentiating Wnt signalling. The R-spondin gene fusions and several other gene mutations identified in this study provide new potential opportunities for therapeutic intervention in colon cancer.
Structural variations (SVs) are large genomic rearrangements that can drive many diseases. Conventional short-read whole genome sequencing (cWGS) allows their identification with base-pair resolution, but suffers from a high false discovery rate. cWGS taps into short-range information from short reads, while linked-read sequencing (10XWGS) utilizes long-range information. 10XWGS allows linkage of short reads originating from the same large DNA molecule with a unique barcode captured in a gel bead in emulsion. This mitigates alignment-based artefacts from cWGS, especially in repetitive regions. However, the false discovery rate of this technology is unclear. In this study, we performed a comprehensive analysis of different types and sizes of SVs predicted from these two technologies. The SVs common to both technologies were found to be highly specific by PCR and Sanger sequencing, while the validation rate dropped for events called by only one technology. Further, we propose a novel enrichment approach for filtering out false positive calls from both technologies independently. To this end, we trained a machine learning model for each technology and used it to characterise SVs from the MCF7 cell line and a primary breast cancer tumor with high precision. This approach would be valuable in understanding true mechanisms driven by SVs in various diseases.
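As an illustration of how calls shared by two technologies might be separated from technology-specific calls, the following minimal Python sketch matches two call sets by chromosome, SV type and breakpoint proximity. The `SV` record, the function names and the 500 bp breakpoint tolerance are assumptions made for this example; they are not taken from the study, and the sketch handles intrachromosomal events only.

```python
from typing import NamedTuple


class SV(NamedTuple):
    """Hypothetical minimal record for an intrachromosomal SV call."""
    chrom: str
    start: int
    end: int
    svtype: str  # e.g. "DEL", "DUP", "INV"


def same_event(a: SV, b: SV, tol: int = 500) -> bool:
    """Treat two calls as the same event if type and chromosome agree and
    both breakpoints lie within `tol` base pairs (assumed tolerance)."""
    return (
        a.chrom == b.chrom
        and a.svtype == b.svtype
        and abs(a.start - b.start) <= tol
        and abs(a.end - b.end) <= tol
    )


def split_by_support(calls, other, tol: int = 500):
    """Split `calls` into those also found in `other` and those that are not."""
    common, specific = [], []
    for call in calls:
        matched = any(same_event(call, o, tol) for o in other)
        (common if matched else specific).append(call)
    return common, specific


# Toy call sets standing in for the two technologies.
cwgs_calls = [SV("chr1", 1000, 5000, "DEL"), SV("chr2", 100, 900, "DUP")]
linked_calls = [SV("chr1", 1100, 5200, "DEL"), SV("chr2", 100, 900, "INV")]
common, cwgs_only = split_by_support(cwgs_calls, linked_calls)
```

In this toy example the chr1 deletion is supported by both call sets, whereas the chr2 duplication is specific to the first set because the SV types disagree.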
This collection contains all of NCI's authorized individual-level genomic datasets currently in dbGaP that are approved for General Research Use (GRU) and have no further limitations beyond those outlined in the model Data Use Certification Agreement. Access to this study will include any additional authorized individual-level GRU datasets that become available. Renewal of this study is required annually.
We conducted a cohort-based study to investigate the association between genetic background and diet/lifestyle in 51 healthy Japanese individuals. We analyzed some participants living in Nagahama. A SNP array (Japonica Array v2) was used for genotyping. Shotgun metagenomic sequencing data from the fecal microbiome were also analyzed.
We searched for genomic signatures of positive selection in the genome-wide data of 432 people from eight different northern Russian populations (Russians from the Archangelsky and Vologdsky regions, Izhemski Komi, Priluzski Komi, Veps, Khanty, Mansi and Nenets), who were genotyped using microarrays for 700,000 SNPs (InfiniumOmniExpress-24v1-2_A1), by testing for extended haplotype homozygosity (EHH).
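For readers unfamiliar with the statistic, EHH over an interval is commonly defined as the probability that two randomly chosen haplotypes are identical across all sites in that interval. The minimal Python sketch below computes it for a toy set of phased haplotypes. The function name and the input encoding (one string of alleles per haplotype) are assumptions made for this example, and microarray genotypes would first need to be phased; this is not the pipeline used in the study.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, end):
    """EHH over the sites between index `core` and index `end` (inclusive).

    `haplotypes` is a list of equal-length allele strings, normally restricted
    to chromosomes that carry the same allele at the core site.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    lo, hi = sorted((core, end))
    # Group haplotypes that are identical across the whole interval.
    groups = Counter(h[lo:hi + 1] for h in haplotypes)
    # Fraction of haplotype pairs that fall in the same group.
    return sum(comb(c, 2) for c in groups.values()) / comb(n, 2)


# Toy data: four haplotypes that all carry allele "1" at core site 1.
haps = ["0101", "0101", "0111", "1100"]
decay = [ehh(haps, 1, end) for end in (1, 2, 3)]
```

EHH equals 1 at the core itself and decays with distance as recombination and mutation break up the shared haplotype; an unusually slow decay around a common allele is the signal that tests based on EHH look for.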
The goal of the proposed study is to use the Hepatitis B virus (HBV) vaccine as a model for a future Human Immunodeficiency Virus (HIV) vaccine trial, examining the efficacy of a community-based outreach intervention and of an accelerated vaccine schedule as methods for increasing acceptance of and adherence to HBV vaccination protocols among not-in-treatment drug users. The study also examines the effect of HBV vaccination coupled with community-based outreach intervention on reducing the incidence of HIV, HBV and Hepatitis C Virus (HCV) infections and the frequency of needle use and sexual risk behaviors related to transmission of these viruses. A secondary purpose is to assess the antibody response after HBV vaccination as a measure of immunological response in drug users.
Even though whole genome sequence (WGS) data have been generated and published in many studies, much of this information is not yet processed for use in downstream analysis. This study's aim is to provide single nucleotide variations (SNPs) from 1342 WGS normal-tumor pairs across 18 different cancer types provided by The Cancer Genome Atlas (TCGA) project. Individual-level data for TCGA can be accessed by requesting access to phs000178. These data include variations within self-reported white and African-American populations. Variations that exist within the tumor tissue but are absent in the associated normal organ tissue (as compared to the human reference genome) are reported. The published data include SNPs and small insertions and deletions, which were generated through a pipeline that includes the VarScan2 variant calling software.
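The reporting rule described above (present in the tumor, absent in the matched normal) can be illustrated with a minimal Python sketch. The `Variant` record and the function name are assumptions made for this example. The sketch is a plain set difference on already-called variants and is not the VarScan2 algorithm, since real somatic callers also weigh read support in both samples.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record relative to the reference genome."""
    chrom: str
    pos: int
    ref: str
    alt: str


def tumor_only(tumor_calls, normal_calls):
    """Return tumor variants that were not called in the matched normal."""
    normal_set = set(normal_calls)
    return [v for v in tumor_calls if v not in normal_set]


# Toy call sets for one normal-tumor pair.
normal = [Variant("chr1", 12345, "A", "G")]
tumor = [
    Variant("chr1", 12345, "A", "G"),    # also in normal: germline, dropped
    Variant("chr7", 55555, "C", "T"),    # tumor only: reported
    Variant("chr17", 777, "G", "GA"),    # tumor-only small insertion: reported
]
reported = tumor_only(tumor, normal)
```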
Neurofibromatosis 1 is a hereditary syndrome characterized by the development of numerous benign neurofibromas, a small subset of which progress to malignant peripheral nerve sheath tumors (MPNSTs). To better understand the genetic basis for MPNSTs, we performed whole genome sequencing on four MPNSTs from patients with neurofibromatosis 1 and found that each of them had a somatic, inactivating mutation of SUZ12, a chromatin-modifying gene adjacent to the NF1 gene responsible for the benign neurofibromas in these patients. We then performed targeted sequencing on an additional 46 MPNSTs and found that 12 had somatic mutations in SUZ12. Fifteen of the 17 (88%) mutations in SUZ12 were predicted to inactivate protein function, implicating it as a tumor suppressor gene possibly responsible for the progression from neurofibromas to MPNSTs.
Description of the disorder: A very common disorder presenting to pediatricians/pediatric endocrinologists is childhood growth failure. Sometimes the cause is evident, for example, growth hormone deficiency. In other children, the etiology remains unknown despite extensive evaluation, resulting in the unhelpful diagnosis of severe idiopathic short stature (SISS). These conditions are quite heterogeneous, including children with isolated growth failure and others who also have other abnormalities such as developmental delay or a constellation of congenital anomalies (syndromic short stature). Sometimes, the disorder appears to primarily affect the growth plate, which drives skeletal growth and thereby determines overall body proportions, whereas in other children, the disorder affects skeletal and non-skeletal tissues equally. Some cases of SISS have a polygenic inheritance while others appear to follow a Mendelian inheritance model, recessive, dominant or X-linked. Very recently, genome-wide analysis for copy number variants (CNVs) and whole-exome sequencing have begun to identify some of the molecular etiologies of these disorders [1]. Identifying the molecular etiology of growth disorders has clinical and scientific value. Clinically, identifying a molecular cause prevents extensive further testing and may direct anticipatory care for associated medical problems. For example, we recently studied aggrecan (ACAN) gene mutations in families with autosomal dominant short stature and accelerated skeletal maturation. These mutations affect both growth plate cartilage, causing linear growth failure, and articular cartilage, causing osteochondritis dissecans and early-onset osteoarthritis [1]. Etiological classification of idiopathic growth failure allows more precise characterization of prognosis and response to treatment, which are currently highly imprecise because of the locus heterogeneity.
In some cases, finding the genetic etiology points to a novel treatment approach that targets the specific molecular pathway involved. The proposed project is central to the main focus of our group, the Section on Growth and Development, NICHD. Our primary goal is to investigate cellular and molecular mechanisms governing childhood growth and to gain insight into the many human genetic disorders causing childhood growth failure. The proposed project is well suited for the intramural program because it takes advantage of Clinical Center expertise to phenotype subjects with SISS. Study subjects: We will study subjects with SISS and nuclear family members. SISS will be defined by height SDS < -2.5 for age without evident cause after routine evaluation including: growth hormone axis evaluation; thyroid function testing; celiac disease screening; urinalysis; CBC; chemistry; karyotype (girls, for Turner syndrome); and testing for single gene defects based on the clinical evaluation (for example, SHOX or Noonan-associated genes). Candidate families will include isolated growth disorders and growth disorders that are accompanied by congenital anomalies, developmental delay, or other syndromic short stature. Strong preference will be given to subjects with a severe phenotype and a pedigree that indicates a Mendelian inheritance. Multiple independent families with the same phenotype will have priority. The pool of applicants for recruitment is large, and we receive many inquiries by emails and phone consultations from pediatric endocrinologists for advice regarding diagnosis and management of unusual growth disorders, including familial disease. Often these families are seeking further evaluation and are willing to participate in a research study. 
From this pool, we will be selecting pedigrees with very favorable Mendelian characteristics, for example de novo dominant occurrences where two normal parents have a child with SISS, and the child grows up to be an adult who passes this phenotype on to multiple grandchildren in the next generation. Subjects and family members will be brought to the NIH Clinical Center (NIHCC) for outpatient evaluation. Participants will be evaluated by pediatric endocrinology fellows (as part of our training program) and by senior staff to establish the clinical findings and construct a pedigree. Subjects will receive additional biochemical and imaging studies at the Clinical Center to complete the phenotyping and assign affected status. The growth abnormality will be evaluated by assessing body proportions, relative organ size, and skeletal imaging as indicated. Associated clinical abnormalities beyond altered growth will be characterized with the help of other Clinical Center subspecialists. Subjects and family members will be evaluated by SNP microarray and whole-exome sequencing. We anticipate 4-5 persons for each of 16-20 families, for a total of 72-90 whole-exome sequences. Half of the families will be recruited and studied within the first year and half in the second year. We will use freshly collected peripheral blood as the DNA source for SNP array and NextGen Sequencing. Analytic approach: The candidate genes will be chosen based on 1) inheritance state consistency, 2) population frequency in the ESP and UDP databases, and 3) predictions of deleteriousness. We will use VarSifter and the Broad Institute Integrative Genomics Viewer to filter and visualize these data. The genetic model will be dependent on the family's pedigree. For a simple trio, we will explore variants using genetic models including autosomal recessive, de novo (dominant), compound heterozygous, deletion/point mutation recessive, and X-linked (male only).
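To make the trio models above concrete, the following minimal Python sketch tests three of them (de novo dominant, homozygous autosomal recessive and compound heterozygous) on genotypes coded as alternate-allele counts. The function names, the 0/1/2 genotype coding and the gene labels are assumptions made for this example; it is not VarSifter's implementation and it omits the deletion/point mutation recessive and X-linked models.

```python
def classify_variant(child, mother, father):
    """Classify one autosomal variant in an affected child with unaffected parents.

    Genotypes are alternate-allele counts: 0 (hom. ref), 1 (het), 2 (hom. alt).
    Returns the name of the matching model, or None if neither model fits.
    """
    if child == 1 and mother == 0 and father == 0:
        return "de_novo"
    if child == 2 and mother == 1 and father == 1:
        return "autosomal_recessive"
    return None


def compound_het_genes(variants):
    """Find genes in which the child carries at least one heterozygous variant
    inherited only from the mother and at least one only from the father.

    `variants` is an iterable of (gene, child, mother, father) tuples.
    """
    maternal, paternal = set(), set()
    for gene, child, mother, father in variants:
        if child != 1:
            continue
        if mother == 1 and father == 0:
            maternal.add(gene)
        elif father == 1 and mother == 0:
            paternal.add(gene)
    return sorted(maternal & paternal)


# Toy trio data with placeholder gene labels.
trio_variants = [
    ("GENE_A", 1, 1, 0),  # maternal het
    ("GENE_A", 1, 0, 1),  # paternal het in the same gene
    ("GENE_B", 1, 1, 0),  # maternal het only
]
candidates = compound_het_genes(trio_variants)
```

In practice such genotype tests would be combined with the population frequency and deleteriousness criteria listed above before a variant is treated as a candidate.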
The candidate variants will be identified using Boolean logic sets in VarSifter following intramural NHGRI/UDP methods. We will also use SNP microarray data to identify copy number variations, complete/single copy deletions, duplications, non-paternity, consanguinity for homozygosity mapping, uniparental isodisomy, mosaicism, and segregation patterns (bed file generation for use in VarSifter filter work). After a list of candidate sequence variants has been generated, annotation will include using the Exome Variant Server, PolyPhen-2, MutationTaster, SIFT, and CADD predictions of deleteriousness. Biological laboratory data will be included to prioritize candidate variants. Our group has expertise in the molecular mechanisms regulating both skeletal growth [2,3] and growth of other tissues [4,5], which may be helpful in this phase of the analysis. The most promising candidate mutations will be confirmed by Sanger sequencing and studied functionally, in vitro and/or in vivo. For mutations that affect skeletal growth, we will use experimental systems related to growth plate cartilage. For in vitro studies, we have experience transfecting chondrocyte cell lines, such as ATDC5, and primary chondrocytes. We will determine whether the mutation alters protein and/or cell function. In vivo studies can be used to explore pathophysiology. We have recently successfully used a new approach, the CRISPR/Cas9 system, to knock out multiple loci in mice (unpublished), which can be used again in the future to create mouse models efficiently.
References:
[1] J Clin Endocrinol Metab, 2014 (PMID: 24762113)
[2] J Mol Endocrinol, 2014 (PMID: 24740736)
[3] Hum Mol Genet, 2012 (PMID: 22914739)
[4] Proc Natl Acad Sci U S A, 2013 (PMID: 23530192)
[5] Endocr Rev, 2011 (PMID: 21441345)
It is apparent from our recent population genetic and admixture mapping work [1–3] that a substantial part of the ancestral input into the South African Coloured (SAC) population is from the San or Khoe groups, which are not well represented in publicly available genetic databases. It is a reasonable assumption that the SAC population in the Western Cape may have derived more genetic input from the present KhoeSan population in the Northern Cape than from the San residing in Namibia. Our group concluded that it is indeed the southern African KhoeSan group, the ≠Khomani, that best represents the KhoeSan contribution seen in the SAC [4]. More recently, it has been shown that there are ancestry-related increases in TB susceptibility, especially with increased Bantu-speaking African and KhoeSan ancestry [5]. We aim to elucidate the epidemiological and human-host genetic risk factors for TB and the immunological pathways modulating TB infection. The proposed sub-study will enroll people evaluated for TB at Northern Cape community health clinics and their contacts from their households and/or community. We will conduct a demographic interview and collect saliva, which will be used for genetic analysis, and blood, which will be used to determine latent TB infection status and to capture immunological responses to mycobacterial infections at the cellular level and by RNA sequencing.