Three types of samples were used: 19 normal adjacent, 17 adenoma, and 19 colorectal tumor tissue samples. These included 10 matched colorectal cancer-normal pairs and 1 matched adenoma-normal pair, each pair from the same patient. Tissue specimens were formalin-fixed and paraffin-embedded (FFPE). DNA was isolated using the QIAamp FFPE Tissue kit (Qiagen, Hilden, DE) according to the manufacturer’s instructions. The processed DNA was run on the Illumina Human MethylEPIC® v1.0 BeadChip array.
This dataset contains the aligned whole genome sequencing data of cell line 380. This cell line was established from the peripheral blood of a 15-year-old boy with acute lymphoblastic leukemia at relapse; it shows an immature phenotype and carries an IGH-MYC (t(8;14)) as well as an IGH-BCL2 (t(14;18)) chromosomal translocation. The sequencing was performed on an Illumina X-ten sequencer.
The contribution of genetic predisposing factors to the development of pediatric acute lymphoblastic leukemia (ALL), the most frequently diagnosed cancer in childhood, has not been fully elucidated. Children presenting with multiple de novo leukemias are more likely to suffer from genetic predisposition. Here, we selected five of these patients and analyzed the mutational spectrum of normal and malignant tissues.
We constructed de novo genome assemblies for six family trios from diverse Middle Eastern ancestries (Sudan, Jordan, Syria, Qatar, and Afghanistan), involving probands with various unresolved neurodevelopmental conditions. We generated high-quality, nearly complete genome assemblies for the trios, revealing extended novel sequence impacting known genes, novel HLA/KIR alleles, and strong signals of inbreeding, with runs of homozygosity covering large parts of individual chromosomes. We also identified potential disease variants underlying the unresolved symptoms. In addition, the assemblies uncovered unique variation relative to existing references and improved mapping and variant calling for Middle Eastern genomes. The dataset available through dbGaP includes raw short (Illumina) and long (PacBio) sequencing reads and assembled haplotypes.
Genomic translocation events frequently underlie cancer development through generation of gene fusions with oncogenic properties. Identification of such fusion transcripts by transcriptome sequencing might help to discover new potential therapeutic targets. We developed TRUP (Tumor-specimen suited RNA-seq Unified Pipeline; https://github.com/ruping/TRUP), a computational approach that combines split-read and read-pair analysis with de novo assembly for the identification of chimeric transcripts in cancer specimens. We apply TRUP to RNA-seq data of different tumor types and find it to be more sensitive than alternative tools in detecting chimeric transcripts, such as secondary rearrangements in EML4-ALK-positive lung tumors or recurrent inactivating rearrangements affecting RASSF8.
Long-range sequencing with a low error rate has been challenging. Sequence assembly and phasing usually require a high-quality reference genome for mapping, so highly variable genomic regions, or regions with no reference genome information, are difficult to work on. In this study, we describe novel bench protocols and algorithms that address both problems at once, yielding ultra-low-error-rate haplotype-phased sequence assemblies of regions 10 kb in length on a short-read sequencing platform. We accomplish this by imprinting each template strand from a target region with a dense and unique mutation pattern. The mutation process randomly and independently converts ~50% of cytosines to uracils. Short-read sequencing libraries are made from both mutated and unmutated templates. A conservative de Bruijn graph approach seeds an assembly of the mutated templates, which we then extend by mapping paired-end reads. We next partition the template assemblies into two or more haplotypes after using the unmutated sequence library to recover almost all of the mutated bases. The final haplotype is assembled and corrected for residual template mutations and PCR errors. We obtain per-base error rates below 10^-9. We apply this method to a human family, correctly assembling and phasing three genomic intervals, including the highly polymorphic HLA-B gene.
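The imprinting idea can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the study's pipeline: the function names (`imprint`, `converted_positions`, `restore`) are hypothetical, uracil is represented as T (as it would read after amplification), and the unmutated sequence is assumed to be already aligned base-for-base to the mutated template.

```python
import random


def imprint(template: str, rate: float, rng: random.Random) -> str:
    """Convert each cytosine to uracil (written as T) independently with probability `rate`."""
    return "".join(
        "T" if base == "C" and rng.random() < rate else base
        for base in template
    )


def converted_positions(original: str, mutated: str) -> tuple:
    """Positions where the imprinted copy differs from the original; this is the template's pattern."""
    return tuple(i for i, (a, b) in enumerate(zip(original, mutated)) if a != b)


def restore(mutated: str, unmutated: str) -> str:
    """Undo conversions: a T in the mutated copy opposite a C in the unmutated sequence is read as C."""
    return "".join(
        u if m == "T" and u == "C" else m
        for m, u in zip(mutated, unmutated)
    )


# Two template copies of the same (randomly generated, illustrative) region
rng = random.Random(0)
region = "".join(rng.choice("ACGT") for _ in range(2000))
copy_a = imprint(region, 0.5, rng)
copy_b = imprint(region, 0.5, rng)
print(len(converted_positions(region, copy_a)), len(converted_positions(region, copy_b)))
```

Because each cytosine is converted independently, two copies of the same region almost never share a pattern, which is what lets short reads be assigned to the template they came from. The real method works from reads and assemblies rather than full-length aligned strings.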
BRCA1 splice isoforms d11 and d11q can contribute to PARP inhibitor (PARPi) resistance by splicing out mutation-containing exons, producing truncated, partially functional proteins. However, the clinical impact and underlying drivers of BRCA1 exon skipping remain undetermined. We analyzed nine ovarian and breast cancer patient-derived xenografts (PDX) with BRCA1 exon 11 frameshift mutations for splice isoform expression and therapy response.
We will apply whole genome sequencing of trio families to determine how patterns of germline mutation throughout the genome determine risk for Autism Spectrum Disorder (ASD). We will investigate the nature of intrinsic hypermutability and the extrinsic forces, such as paternal age, that influence rates of germline mutation. We will accomplish these goals through the following specific aims: Specific Aim 1 will characterize germline de novo mutations (DNMs) by whole genome sequencing in families. These studies will identify and validate ~8,000 de novo point mutations and structural variants in trios and controls to determine the parent of origin of DNMs. Specific Aim 2 will identify hotspots for germline mutation based on the regional density of DNMs in the genome, and determine the effects of DNA sequence features on rates of mutation. We will determine the association of mutation hotspots with ASD in the discovery sample and in genomic datasets from an independent sample of 2700 cases and 2700 controls. Specific Aim 3 will characterize the effects of extrinsic factors, including parental age and environment, on genome-wide rates of mutation. We will quantify the effect of paternal age on pathogenic and neutral alleles in sperm and investigate whether some DNMs confer a germline selective advantage. The findings of this study will provide fundamental insights into the genetic basis of autism risk and the genetic mechanism of the observed parental age effects in ASD. We will identify genes that confer significant risk for autism, and we will determine how intrinsic properties of the genome interact with extrinsic forces to determine risk for disease in offspring.
The 340 de novo Acute Myeloid Leukemia (AML) patients (ages 1 month to 21 years) were enrolled in COG-AAML03P1 (NCT00070174). All patients received a standard chemotherapy regimen of ara-C, daunorubicin, and etoposide (ADE) with the addition of one 3 mg/m2 dose of Gemtuzumab Ozogamicin (GO) during induction I as well as intensification II. The 1022 de novo AML patients (ages 0–30 years) were enrolled in COG-AAML0531 (NCT01407757). They were randomly assigned to receive either the standard ADE regimen alone (ADE arm, n = 511) or the ADE regimen with the addition of one 3 mg/m2 dose of GO during induction I as well as intensification II (ADE+GO arm, n = 511). Detailed study design, treatment regimen, and clinical outcomes of these two trials have been previously published (Cooper TM, Franklin J, Gerbing RB, et al., PMID:21766293 and Gamis AS, Alonzo TA, Meshinchi S, et al., PMID:25092781). The current study used genomic DNA from 1,225 pediatric patients treated in these two trials: 470 patients treated with standard chemotherapy in COG-AAML0531 (ADE arm) and 755 patients treated with the addition of GO to standard therapy in the COG-AAML03P1 and COG-AAML0531 trials (ADE+GO arm). A total of 132 SNPs in 42 genes within DNA-damage repair (DDR) pathways or genes implicated in mediating calicheamicin were selected for genotyping. These SNPs were genotyped using the Sequenom platform at the Biomedical Genomics Center, University of Minnesota. All SNPs had a call rate of more than 0.98 and were in accordance with Hardy–Weinberg equilibrium. These genotypes were used to test for association with clinical endpoints as defined in the COG-AAML03P1 and COG-AAML0531 trials.
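The two SNP quality-control criteria mentioned above (call rate and Hardy–Weinberg equilibrium) can be sketched as follows. This is an illustrative sketch only: the abstract does not state which HWE test or significance threshold was used, so the chi-square goodness-of-fit test shown here is an assumed, commonly used choice (an exact test is another), and the function names and example counts are hypothetical.

```python
from math import erfc, sqrt


def call_rate(genotypes):
    """Fraction of samples with a genotype call; None marks a failed call."""
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 degree of freedom) against Hardy-Weinberg proportions.

    Takes observed counts of the three genotypes at a biallelic SNP and
    returns (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts for one SNP
chi2, p_value = hwe_chi_square(n_aa=410, n_ab=620, n_bb=195)
print(f"chi2={chi2:.3f}, p={p_value:.3f}")
```

A SNP would pass this sketch's QC if `call_rate(...)` exceeds 0.98 and the HWE p-value is above whatever threshold the analyst chooses; for SNPs with low minor-allele counts, an exact test is generally preferred over the chi-square approximation.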
Study 1 2R01-NS050375 (PI: DOBYNS, William B.) The genetic basis of mid-hindbrain malformations Our general goal for this project is to advance our understanding of human developmental disorders that involve the brainstem and cerebellum - brain structures derived from the embryonic midbrain and hindbrain - that affect a minimum of 2.4 per 1000 resident births based on data from the CDC. Importantly, this large class of disorders co-occurs with more common developmental disorders such as autism, mental retardation and some forms of infantile epilepsy, and shares some of the same causes. With this renewal, we propose to expand the scope of our work beyond single phenotypes and genes to focus on delineating the critical phenotype spectra to which the most common mid-hindbrain malformations (MHM) belong, and defining the underlying biological networks that are disrupted. To pursue these goals, we will use our large and growing cohort of human subjects to map additional MHM loci using SNP microarrays that provide both high-resolution autozygosity and linkage data in informative families as well as detect critical copy number variants in sporadic subjects. The causative genes will be identified using traditional Sanger or new high-throughput sequencing methods as appropriate based on the size of the critical region. We will use these and other known MHM causative genes to construct and revise model biological networks of genes and proteins, and test these genes and networks in additional patients using a candidate gene or, more accurately, a candidate network approach. These approaches need to be supported by ongoing active subject recruitment, as studies of comparable disorders such as mental retardation and autism have benefited from even larger numbers of subjects than we have so far collected. We need to use new high-throughput sequencing methods to more efficiently test larger critical regions, and to test entire gene networks rather than individual genes in matched cohorts of subjects. 
At every step (phenotype analysis, CNV analysis, model network construction, and high-throughput sequencing), we will need expanded bioinformatics capabilities. Finally, we need to test the biological function of new genes and networks to support our gene identification studies. We expect that these studies will contribute immediately to more accurate diagnosis and counseling, and over time will lead to development of specific treatments for a subset of these disorders. We further expect that studies of mid-hindbrain development will have broad significance for human developmental disorders generally, providing compelling evidence for a connection between cerebellar development and other classes of developmental disorders such as autism, mental retardation and epilepsy. Study 2 R01-NS058721 (PI: DOBYNS, William B.) De novo copy number variation and gene discovery in human brain malformations The number of recognized brain malformations and syndromes has grown rapidly during the past several decades, yet relatively few causative genes have been identified, especially for three common malformations that have been associated with numerous cytogenetically visible chromosome deletions and duplications, and that often occur together: agenesis of the corpus callosum (ACC), cerebellar vermis hypoplasia (CVH) including Dandy-Walker malformation (DWM), and polymicrogyria (PMG). We propose to perform high-resolution array comparative genome hybridization (aCGH), an emerging technology able to detect small copy number variants (CNV), in 700 probands with one or more of these three malformations. Our central hypothesis states that more than 10% of patients with ACC, CVH or PMG will have de novo CNV below the resolution of routine cytogenetic analysis, but detectable by current array platforms. We therefore expect to identify 70-100 patients with small CNV. 
We will distinguish CNV found in normal individuals from potentially disease-associated changes, and will confirm CNV using fluorescence in situ hybridization (FISH) and microsatellite (STRP) analysis. We will give highest priority to CNV that are de novo and involve 2 or more BACs, and secondary priority to familial and smaller CNV excluding known polymorphisms. After that, we will evaluate and rank candidate genes in the critical regions using information from public databases and our own expression studies, and perform mutation analysis of the best candidate genes from well-defined critical regions by sequencing in a large panel of subjects with phenotypes that match the phenotypes of the patients whose CNV define the critical regions. Here, we will use more refined criteria to supplement our clinical classification, such as the developmental level and presence of epilepsy or other birth defects. Any abnormalities found will be analyzed using existing data regarding polymorphisms (e.g., dbSNP), cross-species comparisons, and functional assays appropriate for the specific sequence change. Study 2A In 1995, we described a novel multiple congenital anomaly syndrome associated with facial dysmorphism (congenital ptosis, high arched eyebrows, shallow orbits, trigonocephaly), colobomas of the eyes, neuronal migration malformation (frontal predominant lissencephaly) and variable hearing loss. We hypothesized that this syndrome results from de novo mutations and used trio-based exome sequencing to identify de novo mutations in the ACTB and ACTG1 genes. Study 2B In 1997 and 2004, we and others defined two novel developmental syndromes associated with markedly enlarged brain size, or megalencephaly, and other highly recognizable features. 
The megalencephaly-capillary malformation syndrome (MCAP) consists of megalencephaly and associated growth dysregulation with variable asymmetry, developmental vascular anomalies, distal limb malformations, variable cortical malformation, and a mild connective tissue dysplasia. The megalencephaly-polymicrogyria-polydactyly-hydrocephalus syndrome (MPPH) resembles MCAP but lacks vascular malformations and syndactyly. We hypothesized that MCAP and MPPH result from mutations - including postzygotic events - in the same pathway, and studied them together. Using a combination of exome sequencing, Sanger sequencing, restriction-enzyme assays, and targeted ultra-deep sequencing in 50 families with MCAP or MPPH, we identified de novo germline or postzygotic mutations in three core components of the phosphatidylinositol-3-kinase/AKT pathway. These include two mutations in AKT3, a recurrent mutation in PIK3R2, and multiple mostly postzygotic mutations in PIK3CA (Rivière JB, Mirzaa GM, O'Roak BJ, Beddaoui M, Alcantara D, Conway RL, St-Onge J, Schwartzentruber JA, Gripp KW, Nikkel SM, Worthylake T, Sullivan CT, Ward TR, Butler HE, Kramer NA, Albrecht B, Armour CM, Armstrong L, Caluseriu O, Cytrynbaum C, Drolet BA, Innes AM, Lauzon JL, Lin AE, Mancini GMS, Meschino WS, Reggin JD, Saggar AK, Lerman-Sagie T, Uyanik G, Weksberg R, Zirn B, Beaulieu CL, FORGE Canada Consortium, Majewski J, Bulman DE, O'Driscoll M, Shendure J, Graham Jr. JM, Boycott KM, Dobyns WB. De novo germline and postzygotic mutations in AKT3, PIK3R2 and PIK3CA cause a spectrum of related megalencephaly syndromes. Nat. Genet. In press). Study 3 2R01-NS046616 (PI: GOLDEN, Jeffrey A) The role of ARX in normal and abnormal brain development This subcontract from the Children's Hospital of Philadelphia to the University of Chicago (UC) is intended to support research studies of the ARX and functionally related genes in human subjects with any one of several specific developmental disorders. 
The Co-investigator at UC (W.B. Dobyns) will identify a series of patients with mental retardation and severe infantile epilepsy, some of whom will have specific brain malformations and others who will have normal brain structure by brain imaging studies, and collect research samples from these subjects with informed consent. The studies to be performed will include mutation analysis of ARX, mutation analysis of specific downstream target genes, X inactivation studies in humans and X inactivation studies in mutant mice. The results will be analyzed to determine the significance of any changes found in the gene.