Search Results - EGA European Genome-Phenome Archive

Profiling the genomic landscape and evolutionary history of polyploid giant cancer cells in undifferentiated pleomorphic sarcomas

Polyploid giant cancer cells (PGCCs), characterised by multi-nucleation and atypical nuclear morphology, are a common feature of undifferentiated pleomorphic sarcomas. While PGCCs may be a critical substrate for cancer evolution, their formation pathways and genomic consequences remain relatively underexplored. In this study, we characterise PGCCs in undifferentiated pleomorphic sarcomas, as well as their histological mimics, and use topographic single-cell DNA sequencing (scDNA-seq) to investigate their genomic landscape. We selected PGCCs based on their nuclear morphology, including mono-nucleated or multi-nucleated bizarre, misshapen nuclei, and analysed them at single-cell resolution. Our findings highlight PGCCs as a highly heterogeneous and evolutionarily dynamic component of undifferentiated pleomorphic sarcomas.

Study EGAS50000001445

Assessment of RNA-Seq Sample Preparation Methodology

The goal of this study was twofold: to determine whether common purification techniques have any effect on downstream differential expression analysis, and to evaluate combinations of alignment and differential expression software for reliability. To this end, blood was collected from three individuals and pooled during and after extraction. After pooling and mixing were completed, samples were divided into three aliquots for testing. The first aliquot was a control and was not purified or concentrated. It was diluted to produce samples of varying concentrations for testing. The second aliquot was diluted to 20 ng/μL and used to test six different variations on the AMPure XP bead purification procedure. The last aliquot was diluted to concentrations of 60, 30, and 9.6 ng/μL and purified using MinElute columns. Samples were submitted for total RNA-Seq library preparation and sequencing. Library preparation was performed using the TruSeq Stranded Total RNA with Ribo-Zero Globin kit (20020612, Illumina Inc.), and 2x150 bp PE sequencing was done on an Illumina NovaSeq 6000 with the S4 reagent kit. Comparisons were made between methods (AMPure vs. unpurified, AMPure vs. MinElute, MinElute vs. unpurified) to assess the effects of purification method on downstream differential expression. Comparisons were also made within methods, using the varying concentrations tested for unpurified samples and for MinElute purified samples, to assess the effects of concentration on differential expression. Variations of the AMPure procedure were also compared against the unmodified procedure to assess their effectiveness. A subset of samples was selected for the comparison of alignment and differential expression packages. 
Unpurified high concentration samples eluted in RNase-free water were compared to unpurified high concentration samples eluted in BR5, a buffer from the PAXgene blood miRNA kit, with the expectation that there should be no or very few differentially expressed genes. Unpurified high concentration samples eluted in RNase-free water were also compared to unpurified low concentration samples eluted in RNase-free water, with possibly a small number of differentially expressed genes anticipated. A third dataset of simulated RNA-seq data was created with a known rate of differential expression. These files were aligned using Bowtie2, HISAT2, kallisto, RSEM, Rsubread, Salmon, and STAR. Results were then analyzed for differential expression using ALDEx2, baySeq, DEGseq, DESeq2, edgeR, limma, NOISeq, PoissonSeq, and SAMseq. Differential expression results of all three comparisons were evaluated to determine which combinations provided the most reliable results for both real and simulated data.
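The benchmark design described above can be pictured as a grid of comparison x aligner x differential expression tool. The following is a minimal sketch, assuming every aligner was paired with every differential expression tool for each of the three comparisons (the description does not state that the full cross was run). The comparison labels and the function name `benchmark_grid` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Tool names as listed in the study description.
ALIGNERS = ["Bowtie2", "HISAT2", "kallisto", "RSEM", "Rsubread", "Salmon", "STAR"]
DE_TOOLS = ["ALDEx2", "baySeq", "DEGseq", "DESeq2", "edgeR",
            "limma", "NOISeq", "PoissonSeq", "SAMseq"]

# Hypothetical labels for the three comparisons described in the text.
COMPARISONS = ["water_vs_BR5", "high_vs_low_concentration", "simulated"]


def benchmark_grid(comparisons, aligners, de_tools):
    """Return one record per (comparison, aligner, DE tool) combination.

    Assumption: the full cross product is evaluated.
    """
    return [
        {"comparison": c, "aligner": a, "de_tool": d}
        for c, a, d in product(comparisons, aligners, de_tools)
    ]


grid = benchmark_grid(COMPARISONS, ALIGNERS, DE_TOOLS)
pairs = {(run["aligner"], run["de_tool"]) for run in grid}
print(f"{len(pairs)} aligner/DE-tool pairs, {len(grid)} runs in total")
```

Under this assumption there are 7 x 9 = 63 aligner/tool pairs, each evaluated on three comparisons.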

Study phs003001

Cryptic Splice Mutation in the Fumarate Hydratase Gene in Patients With Clinical Manifestations of Hereditary Leiomyomatosis and Renal Cell Cancer (HLRCC)

Four patients with hysterectomy were evaluated for biochemical and molecular evidence of autosomal dominant Fumarate Hydratase (FH) alterations causing Hereditary Leiomyomatosis and Renal Cell Cancer (HLRCC). HLRCC is an autosomal dominant condition characterized by the development of cutaneous and uterine leiomyomas, and risk for development of an aggressive form of papillary renal cell cancer. Enzyme assay, western blot analyses, direct nanopore RNA sequencing, and whole genome sequencing (WGS) were utilized. The study identified a cryptic splice mutation in intron 9 of the FH gene that results in retention of 57 base pairs of intronic sequence in the affected allele of the mature FH mRNA.

Study phs003381

Long-read-transcriptome-sequencing of CLL and MDS patients uncovers common molecular effects of SF3B1 mutations

Mutations in SF3B1 occur frequently in patients with chronic lymphocytic leukemia (CLL) and myelodysplastic syndromes (MDS), and a full-length transcriptome approach can expand our current knowledge on SF3B1 mutation effects on RNA splicing. We applied long-read-transcriptome-sequencing (LRTS) to 44 MDS and CLL patients with and without SF3B1 mutations and found a large fraction (>60%) of novel isoforms. Furthermore, we revealed that mutation effects on alternative splicing were largely common in both cancer types and specifically altered the usage of introns as well as 3’-splice-sites. We combined the LRTS with genome-wide SF3B1-RNA binding maps and show multimodal binding at 3’-splice-sites highlighting a window of 12-21nt upstream of the canonical 3’-splice-site in which a dynamic switch in splice site usage is observable in patients carrying SF3B1 mutations. Our work presents the hitherto most complete LRTS study in CLL and MDS and provides a resource for further research on aberrant splicing in cancer.

Study EGAS50000000053

Gene expression profiles of single disseminated breast cancer cells

This dataset comprises 56 EpCAM-positive cells derived from bone marrow aspirates of breast cancer patients or patients without a cancerous disease (30 cells from 21 M0-stage and 11 cells from five M1-stage breast cancer patients, 15 cells from seven non-cancer patients serving as controls). EpCAM-positive cells from breast cancer patients were considered disseminated tumor cells as they harbored copy number alterations and showed high expression of the epithelial marker EpCAM and the mammary luminal progenitor marker KIT in comparison to EpCAM-positive bone marrow cells from non-cancer patients. Paired-end RNA sequencing of the samples was performed on an Illumina NovaSeq 6000; raw data are provided in FASTQ format.

Dataset EGAD00001006359

HBCC Postmortem Psychiatric Molecular Studies

This postmortem study, conducted by the DIRP NIMH Human Brain Collection Core (HBCC), examines molecular, genetic and epigenetic signatures in the brains of hundreds of subjects with or without mental disorders. The brain tissues are obtained under protocols approved by the CNS IRB (NCT00001260), with the permission of the next-of-kin (NOK) through the Offices of the Chief Medical Examiners (MEOs) in the District of Columbia, Northern Virginia and Central Virginia. Additional samples were obtained from the University of Maryland Brain and Tissue Bank (contracts NO1-HD-4-3368 and NO1-HD-4-3383) (http://www.medschool.umaryland.edu/btbank/) and the Stanley Medical Research Institute (http://www.stanleyresearch.org/brain-research/). Clinical characterization, neuropathological screening, toxicological analyses, and dissections of various brain regions were performed as previously described (Lipska et al. 2006; PMID: 16997002). All patients met DSM-IV criteria for a lifetime Axis I diagnosis of psychiatric disorders including schizophrenia or schizoaffective disorder, bipolar disorder and major depression. Controls had no history of psychiatric diagnoses or addictions.
SNP array: Array-based genotyping was performed on most samples published in this collection. The number of SNPs assayed via Illumina chips varied between 650,000 and 5 million. Cerebellar tissue was generally used for genotyping studies.
#    Diagnosis                              SNP Array
1    Anxiety Disorder                       1
2    Autism Spectrum Disorder               13
3    Bipolar Disorder                       114
4    Control                                387
5    Eating Disorder (ED)                   2
6    Major Depressive Disorder (MDD)        186
7    Obsessive Compulsive Disorder (OCD)    5
8    Post-Traumatic Stress Disorder (PTSD)  0
9    Schizophrenia                          220
10   Other                                  7
11   Tic Disorder                           3
12   Undetermined                           1
13   Williams Syndrome                      2
Table: Numbers of samples in each diagnostic category.
DNA extraction: 45-80 mg of cerebellar tissue was pulverized for DNA extractions. The QIAamp DNA mini Kit (Qiagen) method was employed for tissue DNA extraction. 
The tissue was initially lysed using Tissue Lyser (Qiagen) and extractions were accomplished according to manufacturer's protocol. The DNA was captured in 500 uL elution buffer. The concentrations were measured using Thermo Scientific's NanoDrop 1000/NanoDrop ONE. The mean yield was 128.85 ug (+/- 79.48), the mean ratio of 260/280 was 1.87 (+/- 0.105), and the mean ratio of 260/230 was 2.48 (+/- 1.75).
Genotyping methods: Three types of Illumina Beadarray chips were used: HumanHap650Y, Human1M-Duo, and HumanOmni5M-Quad (San Diego, California). The genotyping was done according to the manufacturer's protocol (Illumina Proprietary, Catalog # WG-901-5003, Part # 15025910 Rev.A, June 2011). Approximately 400 ng of DNA was used, and each DNA sample was QC tested for 260/280 ratio by NanoDrop and for DNA band intactness on 2% agarose gel. Briefly, the samples were whole-genome amplified, fragmented, precipitated and resuspended in appropriate hybridization buffer. Denatured samples were hybridized on prepared Bead Array Chips. After hybridization, the Bead Chip oligonucleotides were extended by a single fluorescent labeled base, which was detected by fluorescence imaging with an Illumina Bead Array Reader, iScan. Normalized bead intensity data obtained for each sample were loaded into the Illumina Genome Studio (Illumina, v.2.0.3) with cluster position files provided by Illumina, and fluorescence intensities were converted into SNP genotypes.
Microarray: We generated RNA expression data using array technology for psychiatric subjects compared to non-psychiatric subjects as controls. We used tissues from three different brain regions, i.e. hippocampus, dorsolateral prefrontal cortex (DLPFC), and dura mater, for a large cohort of individuals (total number 552 subjects for hippocampus, 800 for DLPFC and 146 for dura). Total RNA was extracted from ~100 mg of tissue using the RNeasy kit (Qiagen) according to the manufacturer's protocol. 
RNA quality and quantity were examined using the Bioanalyzer (Agilent, Inc) and NanoDrop (Thermo Scientific, Inc), respectively. Samples with RNA integrity number (RIN)
#    Diagnosis                              DLPFC  Hippo  Dura
1    Anxiety Disorder                       1      0      0
2    Autism Spectrum Disorder               14     6      0
3    Bipolar Disorder                       90     49     0
4    Control                                336    270    75
5    Eating Disorder (ED)                   2      1      0
6    Major Depressive Disorder (MDD)        144    87     0
7    Obsessive Compulsive Disorder (OCD)    5      3      0
8    Post-Traumatic Stress Disorder (PTSD)  6      0      0
9    Schizophrenia                          192    125    71
10   Other                                  5      6      0
11   Tic Disorder                           3      3      0
12   Undetermined                           1      1      0
13   Williams Syndrome                      2      1      0
Table: Numbers of samples in each diagnostic category.
RNA-Seq of Dorso-lateral prefrontal cortex: All brains were collected and the dorsolateral prefrontal cortical (DLPFC) samples dissected at the HBCC, DIRP, NIMH. Dorsolateral prefrontal cortex (DLPFC) specimens were dissected from right or left hemisphere of frozen coronal slabs. The study was funded by the DIRP, NIMH under contract (#HHSN 271201400099C) with Icahn School of Medicine at Mount Sinai, 1106402 One Gustave L. Levy Place, Box 3500, New York NY 10029-6574. RNA extraction, library preparation and sequencing were performed under contract at Icahn School of Medicine. The Common Mind Consortium (CMC) provided project management support.
RNA isolation: Total RNA from 468 HBCC samples was isolated from approximately 100 mg homogenized tissue from each sample by TRIzol/chloroform extraction and purification with the Qiagen RNeasy kit (Cat#74106) according to manufacturer's protocol. Samples were processed in randomized batches of 12. The order of extraction for schizophrenia, bipolar, and MDD disorders and control samples was assigned randomly with respect to diagnosis and all other sample characteristics. The mean total RNA yield was 24.2 ug (+/- 9.0). The RNA Integrity Number (RIN) was determined by 4200 Agilent TapeStation System. Samples with RIN
DLPFC RNA-Seq quantified expression data are provided for 364 samples. 
Data were generated, QC'd, processed and quantified as follows:
RNA library preparation and sequencing: All samples submitted to the New York Genome Center for RNAseq were prepared for sequencing in randomized batches of 94. The sequencing libraries were prepared using the KAPA Stranded RNAseq Kit with RiboErase (KAPA Biosystems). rRNA was depleted from 1 ug of RNA using the KAPA RiboErase protocol that is integrated into the KAPA Stranded RNAseq Kit. The insert size and DNA concentration of the sequencing library were determined on Fragment Analyzer Automated CE System (Advanced Analytical) and Quant-iT PicoGreen (ThermoFisher), respectively.
Schizophrenia  Bipolar  Control
89             65       210
Table: Numbers of samples in each diagnostic category.
RNA-Seq of subgenual anterior cingulate cortex (sgACC): All 200 post-mortem brain samples (61 controls; 39 bipolar disorder; 46 schizophrenia; 54 major depressive disorder) were collected by the HBCC, DIRP, NIMH.
RNA Extraction and Quality Assessment: Tissue from sgACC was pulverized and stored at -80°C. Total RNA was extracted from 50-80 mg of the tissue using QIAGEN RNeasy Lipid Tissue Mini Kit (QIAGEN, Cat. # 74804) with DNase treatment (QIAGEN, Cat. # 79254). The RNA Integrity Number (RIN) for each sample was assessed with high-resolution capillary electrophoresis on the Agilent Bioanalyzer 2100 (Agilent Technologies, Palo Alto, California). The RNA concentration and 260/280 ratio (2.1 +/- 0.032 SD) were determined with NanoDrop (Thermo Scientific).
RNA sequencing: Stranded RNA-Seq libraries were constructed after rRNA depletion using Ribo-Zero GOLD (Illumina). RNA sequencing was performed at the National Institutes of Health Intramural Sequencing Center (NISC).
Schizophrenia  Bipolar  Control  MDD
46             39       61       54
Table: Numbers of samples in each diagnostic category.
Whole Genome Sequencing: All brains were collected and dissected at the HBCC, DIRP, NIMH. 
This study generates whole genome sequencing data using sequencing of DNA in the dorsolateral prefrontal cortex (DLPFC), anterior cingulate cortex (ACC) or cerebellum of 443 individuals with schizophrenia, bipolar disorder and major depressive disorder and non-psychiatric controls. The study was funded by the DIRP, NIMH under contract (#HHSN 271201400099C) with Icahn School of Medicine at Mount Sinai, 1106402 One Gustave L. Levy Place, Box 3500, New York NY 10029-6574. DNA extraction, library preparation and sequencing were performed under contract at Icahn School of Medicine. The Common Mind Consortium (CMC) provided project management support. All specimens were dissected from right or left hemisphere of frozen coronal slabs.
DNA Library Preparation and Sequencing: All samples submitted to the New York Genome Center for WGS were prepared for sequencing in randomized batches of 95. The sequencing libraries were prepared using the Illumina PCR-free DNA sample preparation Kit. The insert size and DNA concentration of the sequencing library were determined on Fragment Analyzer Automated CE System (Advanced Analytical) and Quant-iT PicoGreen (ThermoFisher), respectively. A quantitative PCR assay (KAPA), with primers specific to the adapter sequence, was used to determine the yield and efficiency of the adaptor ligation process. Sequencing was performed on the Illumina HiSeqX with 30X coverage.
Schizophrenia  Bipolar  Control
115            78       230
Table: Numbers of samples in each diagnostic category.
ChIP-Seq: All brains were collected and the dorsolateral prefrontal cortical (DLPFC) samples dissected at the HBCC, DIRP, NIMH. This study generates epigenetic data using sequencing of DNA after chromatin immunoprecipitation (ChIP-Seq) for marks H3K4me3 and H3K27ac in the dorsolateral prefrontal cortex (DLPFC). Dorsolateral prefrontal cortex (DLPFC) specimens were dissected from right or left hemisphere of frozen coronal slabs. 
The study was funded by the DIRP, NIMH under contract (#HHSN 271201400099C) with Icahn School of Medicine at Mount Sinai, 1106402 One Gustave L. Levy Place, Box 3500, New York NY 10029-6574. Chromatin precipitation, library preparation and sequencing were performed under contract at Icahn School of Medicine. The Common Mind Consortium (CMC) provided project management support. Chromatin immunoprecipitation (ChIP) assays for histone marks H3K4me3 and H3K27ac were carried out using Native ChIP. Micrococcal Nuclease (MNase) (Sigma, N3755) treatment was used to digest chromatin into mononucleosomes. The following antibodies were used for chromatin pull-down: anti-H3K4me3 (Cell Signaling, Cat# 9751BC, lot 7) and anti-H3K27ac (Active Motif, Cat# 39133, Lot # 31814008). Histone modification-enriched genomic DNA fragments were recovered using Protein A/G magnetic beads (Thermo Scientific, 88803-88938 or Millipore 16-663), and then washed, eluted, and treated with RNase A and proteinase K. Final ChIP DNA products were isolated using phenol-chloroform extraction followed by ethanol precipitation. The efficiency of each ChIP assay was validated using Qubit concentration measurement and qPCR for positive (GRIN2B, DARPP32) and negative (HBB) control genomic regions. Only ChIP assays that passed quality control were further processed for library preparation and sequencing; this included ChIP DNA that was not detectable on Qubit but showed a good signal and expected enrichment patterns in qPCR.
HISTONE_MARK   H3K27ac  H3K4me3  Input
Bipolar        56       4        7
Control        158      11       24
Schizophrenia  79       11       12
Table: Numbers of individuals in each assay grouped by histone mark or input.
Long-Read Whole-Genome Sequencing (WGS)
Cohort Description: Brain specimens were obtained from the Human Brain Collection Core (HBCC), part of the NIH NeuroBioBank. Samples were collected under protocols approved by the NIH CNS Institutional Review Board (IRB) (NCT03092687), with informed consent from next-of-kin (NOK). 
Collection was coordinated through the Offices of the Chief Medical Examiners (MEOs) in Washington, D.C., Northern Virginia, and Central Virginia. Clinical metadata and documentation are publicly available via the NIMH Data Archive (NDA) (Collection #3151): https://nda.nih.gov/edit_collection.html?id=3151
Eligibility Criteria:
No clinical diagnosis of major neuropsychiatric or neurodegenerative disease
No diagnosis of cognitive impairment during life
All individuals were confirmed to be neurologically normal at time of death
Demographics:
Initial cohort size: 155 individuals
Ancestry: All individuals self-identified as African or African-admixed
Mean age at death: 44.2 years (range: 18–85 years)
Sex distribution: 36.4% female
Sample Processing: Frozen frontal cortex tissue was dissected and processed according to the public protocol: https://www.protocols.io/view/processing-human-frontal-cortex-brain-tissue-for-p-kxygxzmmov8j/v2. High-molecular-weight DNA was extracted and libraries were prepared using the Oxford Nanopore Technologies (ONT) LSK-114 kit. Sequencing was performed using ONT PromethION flow cells (R10.4.1 chemistry).
Data Processing and Quality Control:
Basecalling: Conducted using Guppy v6.38
Read Alignment: Reads were aligned to the GRCh38 reference genome using minimap2
Sample Identity Verification: Sample identity was validated by comparing ONT-derived SNP calls with matched short-read WGS genotypes to ensure concordance and prevent sample swaps
Variant Calling and Phasing: Reads were base-called with Guppy v6.38. Reads were aligned to GRCh38 using minimap2. We verified sample identity by cross-checking ONT SNV calls with the existing short-read WGS genotypes, confirming no sample switches. The napu pipeline (https://github.com/nanoporegenomics/napu_wf) produced haplotype-resolved assemblies, joint small-variant (SNV/indel) calls, and multi-caller structural-variant sets, all reported on GRCh38 and phased where possible. 
Raw signal data were basecalled to obtain 5-methyl-cytosine (5mC) status; methylation tags were added to the phased BAM files. Genome-wide methylation summaries are provided in BED format.
Dataset Filtering and Exclusions:
All 155 samples underwent sequencing and SNP-based ancestry inference
8 samples were excluded due to ancestry inconsistent with African or African-admixed background
1 sample was excluded due to insufficient sequencing quality
Final Sample Set: 146 high-quality samples from individuals of African or African-admixed ancestry were retained for downstream analyses
See PMID: 39764002 for further analysis details.
Diagnosis  #Samples
Control    155
Table: Diagnostic Summary.
Note: The data derived from HBCC resources were removed from dbGaP and are now available in the NIMH Data Archive (NDA). They include genotypes, short read whole genome sequencing (WGS), epigenetics (DNA methylation, ChIP-seq for histones), RNA expression (qPCR, microarray, RNA-seq, single nucleus RNA-seq) of various brain regions in cases with schizophrenia, bipolar disorder, major depression, substance use disorders and normative controls. Please access our NDA collection (https://nda.nih.gov/edit_collection.html?id=3151) for further detail.

Study phs000979

Gastrointestinal Cancer Treatment Responders

Comprehensive genomic profiling of colon adenocarcinomas has revealed multiple recurrent alterations that may inform new treatment strategies. Clinical trials of these agents are ongoing, although the mechanisms of response and resistance to these agents are not well characterized. The goal of this study is to perform comprehensive profiling of pre-treatment and post-resistance tumor and germline samples obtained from patients with colon cancer who receive these agents by carrying out whole exome sequencing and RNA sequencing, and to use these data to identify mechanisms of response and resistance.

Study phs000803

Acquired Cross-Resistance in Small Cell Lung Cancer Patient-Derived Xenografts

Here we present whole genome sequencing and RNA sequencing of patient-derived xenograft (PDX) models of small cell lung cancer (SCLC). These models were derived at a variety of clinical time points from either biopsy/resection samples, malignant effusions, or circulating tumor cells (CTCs), and grown in the subcutaneous flank of NSG (NOD.Cg-Prkdcscid Il2rgtm1Wjl /SzJ) mice. The first analysis published with these data was to determine the genomic and transcriptomic features of PDX models derived after relapse that demonstrated in vivo resistance to multiple chemotherapy regimens (cross-resistance).

Study phs003486

Hypermutation of the inactive X chromosome is a frequent event in cancer

Mutation is a fundamental process in tumorigenesis. However, the degree to which the rate of somatic mutation varies across the human genome and the mechanistic basis underlying this variation remain to be fully elucidated. As part of the ICGC PedBrain and Malignant Lymphoma (MMML-Seq) consortium, we performed a cross-cancer comparison of whole genomes comprising a diverse set of childhood and adult tumors, including both solid and hematopoietic malignancies. In addition, we performed whole genome sequencing of clonally expanded hematopoietic stem/progenitor cells (HSPCs) from healthy individuals to compare somatic mutation rates.

Study EGAS00001000565

Human skin cancer (BCC, SCC, melanoma) and healthy control skin

Smart-seq2 single cell RNA sequencing of human BCC, SCC, melanoma (ALM) and healthy control skin samples.

Dataset EGAD50000000540

12312 results for "cancer rna-seq"
