scRNA-seq dataset of colonic organoids derived from epithelium in biopsies taken from three healthy human individuals. The organoids were either grown in standard conditions (control) or treated with IL22 (treated). The dataset includes 6 samples in total: one control from each individual (ctrl1, ctrl2, ctrl3) and one treated sample from each (treat1, treat2, treat3). The samples were multiplexed using the antibody hashing technique, and all 6 were pooled into a single sample (organoids). The raw files therefore have to be demultiplexed before analysis. The information needed for demultiplexing, including which files belong to which sample, can be found in map_file.csv, attached to each sample. The dataset includes raw FASTQ files and processed CSV count matrices. FASTQ files are divided into HTO (hashtag) and RNA (transcriptome) files. HTO has one index file (I1) and two read files (R1, R2); RNA has two index files (I1, I2) and two read files (R1, R2). The FASTQ files are for the pooled (organoids) sample and need to be demultiplexed. Count matrices contain comma-separated values with cell barcodes as column names and gene names as row names. Because the count matrices were created after the demultiplexing step, there is one matrix for each of the 6 individual samples. scRNA-seq data from human colon organoids were analysed in the same manner as for the Colitis dataset, apart from the following changes. Data were generated with the Cell Hashing technique, which uses oligo-tagged antibodies against surface proteins to barcode single cells. This allows samples to be multiplexed together and run in a single experiment. The data were demultiplexed using the HTODemux() function from Seurat (Hao et al., 2021).
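As an illustration of the count matrix layout described above (cell barcodes as column names, gene names as row names), the following is a minimal sketch of reading one per-sample matrix with the Python standard library. The function name `load_count_matrix`, the barcodes, the gene names and the counts are all made up for the example, and the sketch assumes the first header field is the row-name column and that all values are whole-number counts.

```python
import csv
import io


def load_count_matrix(handle):
    """Read a genes-by-cells CSV into {cell_barcode: {gene_name: count}}.

    Assumes the header row holds cell barcodes (after a first, possibly
    empty, row-name field) and each later row starts with a gene name.
    """
    reader = csv.reader(handle)
    header = next(reader)
    barcodes = header[1:]  # skip the row-name column
    counts = {barcode: {} for barcode in barcodes}
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        gene, values = row[0], row[1:]
        for barcode, value in zip(barcodes, values):
            # int(float(...)) accepts both "3" and "3.0"
            counts[barcode][gene] = int(float(value))
    return counts


# Tiny made-up matrix standing in for one per-sample file (e.g. ctrl1).
example_csv = io.StringIO(
    ",AAACCTGA,AAACGGGT,AAAGATGC\n"
    "LGR5,0,2,1\n"
    "MKI67,3,0,0\n"
)
counts = load_count_matrix(example_csv)

# Total counts per cell, a common first quality-control summary.
total_per_cell = {barcode: sum(genes.values()) for barcode, genes in counts.items()}
```

For a real file, the same function can be called on an open file handle, for example `with open(path, newline="") as fh: counts = load_count_matrix(fh)`. For full-size matrices a dataframe or sparse-matrix library would likely be a better fit than nested dictionaries.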
The majority of embryos that are created through IVF do not implant. It seems plausible that rates of implantation would improve if we had a better understanding of the molecular factors affecting embryo competence. Currently, the process of selecting an embryo for uterine transfer utilizes an ad-hoc combination of morphological criteria, the kinetics of development, and genetic testing for aneuploidy. However, no single criterion can ensure selection of a viable embryo. In contrast, RNA-sequencing of embryos could yield high-dimensional data, which may provide additional insight and illuminate the discrepancies among current selection criteria. Indeed, recent advances enabling the production of RNA-sequencing (RNA-seq) libraries from single cells have facilitated the application of this technique to the study of some transcriptional events in early human development. However, these studies have not assessed the quality of their constituent embryos relative to commonly used embryological criteria. Here, we provide a proof-of-principle advance for clinical selection procedures by generating high-quality RNA-seq libraries from a trophectoderm biopsy as well as from the remaining whole embryo. We combine state-of-the-art embryological methods with low-input RNA-seq to develop the first transcriptome-wide approach for use in future predictive embryology studies. Specifically, we demonstrate the capacity of RNA-seq as a promising tool in preimplantation screening by showing that biopsies of an embryo can capture valuable information content available in the whole embryo from which they are derived. Furthermore, we show that this technique can be used to generate an RNA-based digital karyotype, and to identify candidate competence-associated genes. Together, these data establish the foundation for a future RNA-based diagnostic in IVF.
There is currently a drive to establish cell-based assay systems of greater human biological and disease relevance through the use of well-characterised transformed cell lines, primary cells and complex cellular models (e.g. co-culture, 3D models). However, although the field is gaining valuable experience in running more non-standard and complex cell assays for target validation and compound pharmacology studies, there is a lack of a systematic approach to determine whether this expansion in cell assay models is reflected in increased human biological and disease relevance. The increasing wealth of publicly available transcriptomic and epigenomic (ENCODE and Epigenome Roadmap) data represents an ideal reference mechanism for determining the relationship between the cell types used for target and compound studies and primary human cells and tissues from both healthy volunteers and patients. The CTTV020 epigenomes of cell line project aims to generate epigenetic and transcriptomic profiles of cell lines and compare these with existing and newly generated reference data sets from human tissue and cell types. The aim is to identify assay systems which will provide greater confidence in translating target biology and compound pharmacology to patients. Multiple cell types commonly used within research have been grouped according to biology. Examples include erythroid, lung epithelial and hepatocyte cell types, and immortalised models of monocyte / macrophage biology. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-10-23.
Whole-genome sequencing (WGS) was performed for 13 pairs of tumor-normal samples from patients diagnosed with NKTL. Genomic DNA from tumor tissue was extracted with the QIAamp DNA Mini Kit. DNA for the matching normal sample was obtained from blood or buccal swabs and purified with the Blood and Cell Culture DNA Mini Kit or the E.Z.N.A. Tissue DNA Kit (Omega Bio-tek) according to the manufacturer’s instructions. DNA quantity and quality were assessed with the Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen) and agarose gel electrophoresis. All sequencing libraries were prepared using the TruSeq Nano DNA Library Prep Kit (Illumina). Paired-end sequencing was performed on Illumina HiSeq 2000 or HiSeq X Ten as 2x101 bp or 2x151 bp, respectively. Eight NKTL FFPE specimens were screened for somatic mutations using deep targeted capture sequencing (TCS). DNA was extracted from FFPE rolls or slides using the QIAamp DNA FFPE Tissue Kit (QIAGEN). The FFPE genomic DNA was treated with NEBNext FFPE DNA Repair Mix and assessed with the Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen). Libraries were generated from 10-200 ng DNA with the SureSelectXT Low Input Target Enrichment System for Illumina Paired-End Sequencing Library (Agilent Technologies) according to the manufacturer’s instructions. RNA-based probes were designed with SureDesign (Agilent Technologies) to capture 140 target genes. The captured libraries were then pooled in equimolar concentrations and sequenced on the Illumina NovaSeq 6000 platform with an SP or S1 chip. Reads aligning to 40 selected genes were isolated post-alignment for this submission. Prefixes used in filenames: T, tumor samples; N, matched-normal samples.
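The filename prefix convention above (T for tumor, N for matched normal) can be applied programmatically when sorting downloaded files. The following is a minimal sketch only: the function name `sample_type` and the example filenames are hypothetical, and nothing beyond the leading T or N is assumed about how the files are actually named.

```python
import os


def sample_type(filename):
    """Classify a file as tumor or matched normal from its leading letter.

    Only the prefix convention (T = tumor, N = matched normal) is taken
    from the dataset description; everything else is left uninterpreted.
    """
    name = os.path.basename(filename)
    if name.startswith("T"):
        return "tumor"
    if name.startswith("N"):
        return "normal"
    return "unknown"


# Hypothetical filenames, used only to exercise the function.
example_files = ["T01.bam", "N01.bam", "data/T02.bam", "readme.txt"]
by_type = {}
for path in example_files:
    by_type.setdefault(sample_type(path), []).append(path)
```

Files that match neither prefix are reported as "unknown" rather than guessed, so unexpected names surface early instead of being silently assigned to a group.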
Background and Rationale for the Childhood Cancer Survivor Study (CCSS)
Over the last several decades, advances in treatments for childhood and adolescent cancer have substantially improved survival following diagnosis. These improvements gave rise to the responsibility for investigating long-term treatment-associated morbidity and mortality. Early efforts to describe late effects were largely conducted through single-institution and limited consortia studies. However, by the mid-1980s, it became increasingly clear that these approaches had inherent limitations, including small sample size, convenience sampling, incompletely characterized populations, and limited length of follow-up. To overcome these limitations, the CCSS was proposed and funded by the National Cancer Institute (NCI) as a U01 grant in 1994. Subsequently, the strengths of the CCSS, including an efficient and extensive infrastructure, plus expanding database and biorepository, were recognized and appreciated. Thus, in consultation with the NCI, the CCSS was converted to a U24 (resource grant) funding mechanism to serve the scientific community in 2000. The overarching goal of the CCSS resource is to increase the conduct of innovative and high impact research related to pediatric cancer survivorship. CCSS has been used extensively by researchers from a wide range of disciplines to address a broad spectrum of topics. Strengths of the resource include its large size, comprehensive annotation of treatment exposures, ongoing longitudinal follow-up with characterization of a wide array of participant characteristics and outcomes, and an established biorepository.
Design of the Childhood Cancer Survivor Study
The Childhood Cancer Survivor Study (CCSS) is a multi-institutional, multi-disciplinary collaborative research resource comprised of a retrospective hospital-based cohort of survivors of childhood cancer and a comparison sibling cohort.
Eligible survivors from 31 participating institutions were diagnosed between 1970 and 1999, prior to age 21 years, with selected common pediatric cancers (leukemia, central nervous system tumors, Hodgkin lymphoma, non-Hodgkin lymphoma, kidney tumors, neuroblastoma, soft tissue sarcoma, or bone tumors). All patients who survived five years from the date of diagnosis were eligible, regardless of disease or treatment status. The baseline questionnaire was completed by 24,368 survivors and 5,039 siblings recruited to serve as a comparison group. To date, participants have completed three general follow-up surveys, as well as a number of specialized surveys on specific topics (e.g. health care, insurance, screening practices, men's and women's health issues, adolescent health, sleep and fatigue). In addition, biological samples (buccal cells, saliva and/or blood) have been collected for over 11,000 participants. Full descriptions of the design and characteristics of the CCSS have been previously published (Robison et al.; Leisenring et al.), and available data and samples are described at https://ccss.stjude.org/develop-a-study/gwas-data-resource.html.
Treatment Data in the Childhood Cancer Survivor Study
A key feature of CCSS is the availability of detailed treatment data, which were collected by abstraction of medical records for each individual member of the cohort. Detailed abstraction included dates of therapy, protocol information, and specific details regarding surgery, chemotherapy and radiation. Quantitative dose details were collected for 22 specific chemotherapeutic agents, including alkylating agents, anthracyclines, platinum compounds and epipodophyllotoxins. In addition to individual agent doses, algorithms have been created to calculate cumulative doses of all drugs in a specific class, such as anthracyclines (doxorubicin, daunomycin and idarubicin) or platinum agents (cisplatinum and carboplatinum).
Data abstracted for surgeries included dates as well as both the name and the corresponding International Classification of Diseases (9th revision) code of each procedure. For radiation treatment data, all relevant records were sent to the Radiation Physics Center at M.D. Anderson Cancer Center for detailed abstraction and dosimetry. Initial body region dosimetry was performed for all participants, followed by more detailed dosimetry as needed for specific studies.
Genomics Data in the Childhood Cancer Survivor Study
The NCI's Division of Cancer Epidemiology and Genetics and CCSS investigators collaborated to conduct genomics studies (SNP array genotyping and whole exome sequencing) using samples from the CCSS Biorepository. Studies included all cohort participants with available DNA, regardless of sex or ancestry, when the genomics studies were initiated.
Phenotype Data in the Childhood Cancer Survivor Study
Vital status and cause of death for both participants and non-participants are determined via linkage with the National Death Index (NDI). Identification of subsequent neoplasms is based on self-report, followed by validation using medical records, or via NDI. A wide array of additional health outcomes has been ascertained via a comprehensive set of questions on the CCSS questionnaires, covering potential adverse events across a range of organ systems (hearing/vision/speech, urinary, hormonal, heart and circulatory, respiratory, digestive, brain and nervous systems). In addition to health outcomes, longitudinal data have been collected on demographics, health behaviors, family history, screening practices, insurance status, and a range of psychosocial and neurocognitive factors. A full listing of available variables and copies of the CCSS questionnaires are available at http://ccss.stjude.org.
Research Areas in the Childhood Cancer Survivor Study
Extensive use by the research community has resulted in over 265 published manuscripts on a wide range of topics, including associations between treatment factors and mortality, subsequent neoplasms, chronic health conditions, cardiac events, neurocognitive sequelae, psychosocial factors, fertility, and health status. Additional topics have included health behaviors, screening practices, health care access and utilization, statistical and exposure assessment methodology, and development of risk prediction models. A full listing of published manuscripts using CCSS data is available on the CCSS website at https://ccss.stjude.org/published-research/publications.html.
The Childhood Cancer Survivor Study as a Resource for Investigators
The CCSS is an NCI-funded resource (U24 CA55727) to promote and facilitate research among long-term survivors of cancer diagnosed during childhood and adolescence. Interested investigators are encouraged to develop research ideas and propose projects within CCSS, whether or not they are from a participating CCSS institution. The CCSS is now accepting proposals to collaborate with CCSS and NCI investigators in the use of genomics data and corresponding outcomes-related data to address innovative research questions relating to potential genetic contributions to risk for treatment-related outcomes. Any researcher, or group of researchers, qualified to conduct genetic research can submit a proposal. There are no restrictions relative to country, institution, or prior involvement in CCSS. A full description of the process for developing a proposal for genetic research in CCSS can be found at https://ccss.stjude.org/develop-a-study/gwas-data-resource.html, along with listings of approved proposals.