This dataset includes RNA-seq data of samples from our paper titled "Dynamic phenotypic heterogeneity and the evolution of multiple RNA subtypes in Hepatocellular Carcinoma: The PLANET study." (National Science Review, nwab192. https://doi.org/10.1093/nsr/nwab192)
Multiple myeloma is a clonal plasma cell (PC) dyscrasia that arises from precursors and has been studied utilizing approaches focused on CD138+ cells. By combining single-cell (sc)RNA- with scB-cell receptor (BCR)-sequencing, we differentiate monoclonal/neoplastic from polyclonal/normal PCs, and find more dysregulated genes, especially in precursor patients, than by analyzing bulk PCs. To determine if this approach can identify oncogenes that contribute to disease pathobiology, MAD2L1 and MAT2A are validated as targets with drug-like molecules that suppress myeloma growth in pre-clinical models. Moreover, functional studies show a role of LAMP5, which is uniquely expressed in neoplastic PCs, in tumor progression and aggressiveness via interactions with c-MYC. Finally, a monoclonal antibody recognizing cell-surface LAMP5 shows efficacy as an antibody-drug conjugate and in a chimeric antigen receptor-guided T-cell format. These studies provide additional insights into myeloma biology and identify potential targeted therapeutic approaches that can be applied to reverse myeloma progression.
Adeno-associated virus (AAV) is a defective single-stranded DNA virus that is endemic in the human population (40-80%). AAV infection has long been considered non-pathogenic; however, a few years ago we reported for the first time recurrent clonal AAV2 insertions in the pathogenesis of human hepatocellular carcinoma (HCC) that developed in normal liver. These clonal viral insertions target cancer driver genes, including CCNA2, CCNE1, TERT, TNFSF10 and MLL4, leading to their overexpression. In almost all cases, the inserted viral sequences involved the 3’ inverted terminal repeat (ITR) of AAV2, which is important for virus integration into host DNA and exhibits promoter/enhancer activity. Here, we used RNA sequencing (RNA-seq) to investigate the functional impact of these insertions on the tissue, such as fusion transcript generation events.
Background and Rationale for the Childhood Cancer Survivor Study (CCSS)
Over the last several decades, advances in treatments for childhood and adolescent cancer have substantially improved survival following diagnosis. These improvements gave rise to the responsibility for investigating long-term treatment-associated morbidity and mortality. Early efforts to describe late effects were largely conducted through single-institution and limited consortia studies. However, by the mid-1980s, it became increasingly clear that these approaches had inherent limitations, including small sample size, convenience sampling, incompletely characterized populations, and limited length of follow-up. To overcome these limitations, the CCSS was proposed and funded by the National Cancer Institute (NCI) as a U01 grant in 1994. Subsequently, the strengths of the CCSS, including an efficient and extensive infrastructure, plus expanding database and biorepository, were recognized and appreciated. Thus, in consultation with the NCI, the CCSS was converted to a U24 (resource grant) funding mechanism to serve the scientific community in 2000. The overarching goal of the CCSS resource is to increase the conduct of innovative and high impact research related to pediatric cancer survivorship. CCSS has been used extensively by researchers from a wide range of disciplines to address a broad spectrum of topics. Strengths of the resource include its large size, comprehensive annotation of treatment exposures, ongoing longitudinal follow-up with characterization of a wide array of participant characteristics and outcomes, and an established biorepository.
Design of the Childhood Cancer Survivor Study
The Childhood Cancer Survivor Study (CCSS) is a multi-institutional, multi-disciplinary collaborative research resource comprised of a retrospective hospital-based cohort of survivors of childhood cancer and a comparison sibling cohort.
Eligible survivors from 31 participating institutions were diagnosed between 1970 and 1999, prior to age 21 years, with selected common pediatric cancers (leukemia, central nervous system tumors, Hodgkin lymphoma, non-Hodgkin lymphoma, kidney tumors, neuroblastoma, soft tissue sarcoma, or bone tumors). All patients who survived five years from the date of diagnosis were eligible, regardless of disease or treatment status. The baseline questionnaire was completed by 24,368 survivors and 5,039 siblings recruited to serve as a comparison group. To date, participants have completed three general follow-up surveys, as well as a number of specialized surveys on specific topics (e.g. health care, insurance, screening practices, men's and women's health issues, adolescent health, sleep and fatigue). In addition, biological samples (buccal cells, saliva and/or blood) have been collected for over 11,000 participants. Full descriptions of the design and characteristics of the CCSS have been previously published (Robison et al.; Leisenring et al.), and available data and samples are described at https://ccss.stjude.org/develop-a-study/gwas-data-resource.html.
Treatment Data in the Childhood Cancer Survivor Study
A key feature of CCSS is the availability of detailed treatment data, which were collected by abstraction of medical records for each individual member of the cohort. Detailed abstraction included dates of therapy, protocol information, and specific details regarding surgery, chemotherapy and radiation. Quantitative dose details were collected for 22 specific chemotherapeutic agents, including alkylating agents, anthracyclines, platinum compounds and epipodophyllotoxins. In addition to individual agent doses, algorithms have been created to calculate cumulative doses of all drugs in a specific class, such as anthracyclines (doxorubicin, daunomycin and idarubicin) or platinum agents (cisplatinum and carboplatinum).
Data abstracted for surgeries included dates and both the names and the corresponding International Classification of Diseases (9th revision) codes. For radiation treatment data, all relevant records were sent to the Radiation Physics Center at M.D. Anderson Cancer Center for detailed abstraction and dosimetry. Initial body region dosimetry was performed for all participants, followed by more detailed dosimetry as needed for specific studies.
Genomics Data in the Childhood Cancer Survivor Study
The NCI's Division of Cancer Epidemiology and Genetics and CCSS investigators collaborated to conduct genomics studies (SNP array genotyping and whole exome sequencing) using samples from the CCSS Biorepository. Studies included all cohort participants with available DNA, regardless of sex or ancestry, when the genomics studies were initiated.
Phenotype Data in the Childhood Cancer Survivor Study
Vital status and cause of death for both participants and non-participants are determined via linkage with the National Death Index (NDI). Identification of subsequent neoplasms is based on self-report, followed by validation using medical records, or via NDI. A wide array of additional health outcomes has been ascertained via a comprehensive set of questions on the CCSS questionnaires, covering potential adverse events across a range of organ systems (hearing/vision/speech, urinary, hormonal, heart and circulatory, respiratory, digestive, brain and nervous systems). In addition to health outcomes, longitudinal data have been collected on demographics, health behaviors, family history, screening practices, insurance status, and a range of psychosocial and neurocognitive factors. A full listing of available variables and copies of the CCSS questionnaires are available at http://ccss.stjude.org.
Research Areas in the Childhood Cancer Survivor Study
Extensive use by the research community has resulted in over 265 published manuscripts on a wide range of topics, including associations between treatment factors and mortality, subsequent neoplasms, chronic health conditions, cardiac events, neurocognitive sequelae, psychosocial factors, fertility, and health status. Additional topics have included health behaviors, screening practices, health care access and utilization, statistical and exposure assessment methodology, and development of risk prediction models. A full listing of published manuscripts using CCSS data is available on the CCSS website at https://ccss.stjude.org/published-research/publications.html.
The Childhood Cancer Survivor Study as a Resource for Investigators
The CCSS is an NCI-funded resource (U24 CA55727) to promote and facilitate research among long-term survivors of cancer diagnosed during childhood and adolescence. Interested investigators are encouraged to develop research ideas and propose projects within CCSS, whether or not they are from a participating CCSS institution. The CCSS is now accepting proposals to collaborate with CCSS and NCI investigators in the use of genomics data and corresponding outcomes-related data to address innovative research questions relating to potential genetic contributions to risk for treatment-related outcomes. Any researcher, or group of researchers, qualified to conduct genetic research can submit a proposal. There are no restrictions relative to country, institution, or prior involvement in CCSS. A full description of the process for developing a proposal for genetic research in CCSS can be found at https://ccss.stjude.org/develop-a-study/gwas-data-resource.html, along with listings of approved proposals.
Sparse profiling of CpG methylation in blood by microarrays has identified epigenetic links to common diseases. We apply methylC-capture sequencing (MCC-Seq) in a clinical population of ~200 adipose tissue and matched blood samples (Ntotal ~400), providing high-resolution methylation profiling (>1.3M CpGs) at regulatory elements. We link methylation to cardiometabolic risk through associations with circulating plasma lipid levels and identify lipid-associated CpGs with unique localization patterns in regulatory elements. We show distinct features of tissue-specific versus tissue-independent lipid-linked regulatory regions by contrasting with parallel assessments in ~800 independent adipose tissue and blood samples from the general population. We follow up on adipose-specific regulatory regions under (1) genetic and (2) epigenetic (environmental) regulation via integrational studies. Overall, the comprehensive sequencing of regulatory element methylomes reveals a rich landscape of functional variants linked genetically as well as epigenetically to plasma lipid traits.
Lung carcinoma is the leading cause of cancer death in the United States and worldwide; lung adenocarcinoma is the most common type of lung cancer. Pilot studies of lung adenocarcinoma with hybrid-capture-based whole exome sequencing will enable us to identify new targets for therapy and improve diagnosis. We will analyze a blend of whole exome and whole genome sequencing data as well as copy number and somatic mutation calls for 200 tumors and matched normal controls. When completed, this study will represent the most comprehensive lung adenocarcinoma genome dataset to date.
The Stratton team uses DNA sequencing for somatic mutations to advance understanding of the causes of cancer. In particular, the team investigates “mutational signatures”, which report the mutational processes operative over the lifetime of each individual, to understand whether they are generated by endogenous or exogenous exposures and the extent to which these vary between human populations. Our work brings together global cancer epidemiology and genomics and is particularly characterised by large-scale multinational studies. This study uses whole genome sequencing of colon crypts from healthy paediatric individuals to identify early-life exposure to carcinogens.
RNA-seq data for rare cells in the haematopoietic lineages, from adult and cord blood samples.
In-patient comparison of single-cell RNA-sequencing (scRNA-seq) and single-nucleus (snRNA-seq) technologies and accompanying tissue processing protocols on transjugular liver biopsy from decompensated cirrhosis patients (n = 3).
A multi-omic dataset of single-nucleus paired ATAC-seq + RNA-seq data of nuclei from the post-mortem human primary motor cortex from patients with ALS/ALS-FTD and unaffected controls.