Whole-transcriptome RNA sequencing of purified bone marrow blasts from 136 de novo, treatment-naive AML patients. For further details, we refer to the manuscript "The Proteogenomic Landscape of AML" by Jayavelu, Wolf, Buettner et al. mRNA extraction and whole-transcriptome sequencing: For transcriptome analysis, the TruSeq Total Stranded RNA kit was used, starting with 250 ng of total RNA, to generate RNA libraries following the manufacturer’s recommendations (Illumina, San Diego, CA, USA). 100 bp paired-end reads were sequenced on the NovaSeq 6000 (Illumina) with a median of 57 million reads per sample. RNA data analysis: Data quality control was performed with FastQC v0.11.9. Reads were aligned to the human reference genome (Ensembl GRCh38 release 82) using STAR v2.6.1. Gene count tables were generated during mapping, using Gencode v31 annotations. All downstream analyses were carried out using R v4.0 and Bioconductor v3.12 (Huber et al., 2015; R Core Team, 2020). Size-factor-based normalization was performed using DESeq2 v1.28.1 (Love et al., 2014).
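As an illustration of the size-factor-based normalization mentioned above, the following is a minimal Python sketch of the median-of-ratios idea that underlies DESeq2's size factors. It is not the pipeline used for this dataset (that analysis was run in R with DESeq2); the function names `size_factors` and `normalize` and the toy count table are hypothetical and chosen only for illustration.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of gene rows, each a list of raw counts (one per sample).
    Returns one size factor per sample.
    """
    n_samples = len(counts[0])
    # Keep only genes with nonzero counts in every sample so logs are finite.
    log_rows = [[math.log(c) for c in row]
                for row in counts if all(c > 0 for c in row)]
    if not log_rows:
        raise ValueError("no gene has nonzero counts in every sample")
    factors = []
    for j in range(n_samples):
        # Log ratio of each gene's count to its geometric mean across samples.
        log_ratios = [row[j] - sum(row) / n_samples for row in log_rows]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts):
    """Divide each sample's counts by that sample's size factor."""
    sf = size_factors(counts)
    return [[c / s for c, s in zip(row, sf)] for row in counts]


# Hypothetical toy table: sample 2 was sequenced twice as deeply as sample 1.
toy_counts = [[10, 20], [100, 200], [50, 100], [0, 5]]
toy_factors = size_factors(toy_counts)
toy_normalized = normalize(toy_counts)
```

In the toy table the two size factors differ by a factor of two, so the normalized counts of the three genes expressed in both samples become equal across samples.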
The study was conducted under the auspices of the Transdisciplinary Research In Cancer of the Lung (TRICL) Research Team, which is part of the Genetic Associations and MEchanisms in ONcology (GAME-ON) consortium and associated with the International Lung Cancer Consortium (ILCCO). Ethics: All participants provided written informed consent. All studies were reviewed and approved by institutional ethics review committees at the involved institutions. Sequencing data are derived from four sub-studies: Harvard, Liverpool, Toronto, and IARC. The Liverpool and Harvard studies are described first, followed by the Toronto and IARC studies. Liverpool Lung Project: The Liverpool Lung Project (LLP) 1 is a case-control and cohort study of over 11,500 individuals, with detailed epidemiological, clinical and outcome data and associated specimens (i.e. tumour tissue, blood, plasma, sputum, bronchial lavage, EBUS and oral brushings). The participants have completed a detailed lifestyle questionnaire, and updated data on clinical outcome and hospital events are collected through the Office for National Statistics, the Cancer Registry and Hospital Episode Statistics. The project is registered on the UK National Institute for Health Research (NIHR) lung cancer portfolio and has all the required ethical approvals and sponsorship arrangements in place. The LLP has detailed standard operating procedures (SOPs) for all aspects of recruitment, data and specimen collection, and data storage. The LLP Cohort study has 8,224 participants with blood samples and 7,761 with plasma samples.
The LLP case-control samples have been incorporated into a large number of international GWAS and molecular studies 2,3, methylation 4-7, microRNA 8 and next-generation studies 9-11, resulting in high-ranking publications, as well as forming the basis for the LLP risk prediction model 12-14, which has been utilised in the UK lung cancer screening trial (UKLS) 15-17. Patient and control DNAs were derived from EDTA-venous blood samples. Harvard Samples: David Christiani at the Harvard University School of Public Health has been directing research studies to investigate etiological factors influencing lung cancer development since 1983 and has amassed a collection of 2000 controls and 5055 lung cancer cases. He has been actively collecting and storing snap-frozen tumor samples since 1992. Around 1500 tumor samples have been collected, with an average wet tumor yield of about 30 grams, and 631 of these cases have completely annotated clinical and survival information. Pathology confirmation is provided by two pathologists. At the time of surgery, a minimum of 30 grams of wet lung tumor tissue and 30 grams of non-involved tissue from the same lobe are sectioned, flash frozen and sent to Dr. Christiani's lab for logging and storage. A blood sample for DNA and serum is collected. A structured interview is conducted with each case by trained research staff, and clinical outcomes and treatments are extracted and entered into the molecular epidemiology database at Harvard. Fresh frozen samples have been collected from 1451 lung cancer cases and are available for study. Samples from this collaborative study have played key roles in major studies, including the initial finding describing EGFR mutations in lung cancer 22. Participants in this study are patients > 18 years of age with newly diagnosed, histologically confirmed lung cancer.
Samples that are included in the analysis have the following histologies: Adenocarcinoma: 8140/3, 8250/3, 8260/3, 8310/3, 8480/3, 8560/3; LCC: 8012/3, 8031/3; squamous carcinoma: 8070/3, 8071/3, 8072/3, 8074/3; and other NSCLC: 8010/3, 8020/3, 8021/3, 8032/3, 8230/3. The Toronto Study: The Toronto study was conducted in the Greater Toronto Area between 1997 and 2014. Cases were recruited at the hospitals in the network of the University of Toronto and the Lunenfeld-Tanenbaum Research Institute. At the time of recruitment in the clinical setting, provisional diagnoses of lung carcinoma were first assigned based on clinical criteria. Diagnoses for all included cases were histologically confirmed by the reference pathologist, a specialist in pulmonary pathology, based on review of pathology reports from surgery, biopsy or cytology samples in 100% of cases. Diagnostic classification was done initially according to ICD-9, ICD-10, and ICD for Oncology-2, and subsequently converted to ICD-O-3. Tumors were grouped into the major categories included in this analysis according to primary cancer type based on the ICD-O-3 definitions. Controls were randomly selected from individuals visiting family medicine clinics and from Ministry of Finance Municipal Tax Tapes. All subjects were interviewed using a standard questionnaire, and information on lifestyle risk factors, occupational history, and medical and family history was collected. Blood samples were collected from more than 85% of the subjects. IARC: The IARC data are derived from case-control studies conducted in Russia and include samples that have available tissue samples. Patient and control DNAs were derived from EDTA-venous blood samples. The lung cancer patients were classified according to ICD-O-3; SQ: 8070/3, 8071/3, 8072/3, 8074/3; AD: 8140/3, 8250/3, 8260/3, 8310/3, 8480/3, 8560/3, 8251/3, 8490/3, 8570/3, 8574/3; with tumors with overlapping histologies classified as mixed.
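The histology grouping given for the samples included in the analysis can be expressed as a simple lookup table. The sketch below is only an illustration in Python: the code lists are copied from the first list in this description, while the names `HISTOLOGY_GROUPS`, `CODE_TO_GROUP` and `classify_histology`, and the choice to return `None` for codes outside that list, are assumptions and not part of the study's own tooling. The IARC sub-study uses a broader adenocarcinoma code list, which this sketch does not cover.

```python
# ICD-O-3 morphology codes per histology group, as listed for samples
# included in the analysis (group labels follow the description).
HISTOLOGY_GROUPS = {
    "adenocarcinoma": ["8140/3", "8250/3", "8260/3", "8310/3", "8480/3", "8560/3"],
    "LCC": ["8012/3", "8031/3"],
    "squamous carcinoma": ["8070/3", "8071/3", "8072/3", "8074/3"],
    "other NSCLC": ["8010/3", "8020/3", "8021/3", "8032/3", "8230/3"],
}

# Invert the table so each code maps to exactly one group.
CODE_TO_GROUP = {
    code: group
    for group, codes in HISTOLOGY_GROUPS.items()
    for code in codes
}


def classify_histology(code):
    """Return the histology group for an ICD-O-3 code, or None if unlisted.

    Returning None for unlisted codes is an assumption made for this sketch.
    """
    return CODE_TO_GROUP.get(code.strip())
```

For example, `classify_histology("8070/3")` gives `"squamous carcinoma"`, while a code such as `"8251/3"`, which appears only in the IARC adenocarcinoma list, gives `None` here.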
The Lung Cancer Transdisciplinary Research Cohort (phs000876) is utilized in the following dbGaP sub-studies, which hold the genotypes, other molecular data, and derived variables collected for this cohort: phs000877 Meta Analysis; phs000878 CIDR Lung Cancer; phs001681 Affy Axiom Array.
Using deep RNA and whole-exome sequencing of pre- and post-treatment autopsy samples, we reveal diverse clonal populations that arose through altered cell-intrinsic, tumor microenvironment, and immunologic remodeling mechanisms of resistance.
RNAseq (FASTQ files) of 5 tumors from 3 non-muscle-invasive bladder cancer patients. For patient UC2, RNA was extracted from 3 different synchronous tumors. In relation to the other datasets from the project: no RNA was available from patient UC4. NB: patient aliases are as follows: ROXY: UC1, OMAD: UC2, MAKI: UC3, HAPE: UC4.
A genome-wide approach based on next generation sequencing (NGS) is performed on trios (patient + father + mother) with SAID in order to identify known and new gene variants associated with certain forms of the disease. WGS was performed on frozen PBMCs. DNA was extracted using the QIAamp DNA Blood Midi Kit and quantified with NanoDrop. Sequencing was done on a NovaSeq 6000 with S4 flow cells, targeting 30× coverage.
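For orientation, a 30× target can be translated into an approximate number of read pairs using the usual relation coverage = (number of reads × read length) / genome size. The short Python sketch below works this through. The genome size (about 3.1 Gb) and the read length (2×150 bp) are assumptions made for the example and are not stated in this description, and the function name `read_pairs_needed` is hypothetical.

```python
def read_pairs_needed(target_coverage, genome_size_bp, read_length_bp):
    """Read pairs needed so that sequenced bases / genome size >= target."""
    bases_needed = target_coverage * genome_size_bp
    bases_per_pair = 2 * read_length_bp  # two mates per pair
    # Ceiling division on integers.
    return -(-bases_needed // bases_per_pair)


# Assumed values, not taken from the dataset description.
ASSUMED_GENOME_SIZE_BP = 3_100_000_000
ASSUMED_READ_LENGTH_BP = 150

pairs_for_30x = read_pairs_needed(30, ASSUMED_GENOME_SIZE_BP, ASSUMED_READ_LENGTH_BP)
print(pairs_for_30x)  # 310000000
```

Under these assumptions, roughly 310 million read pairs per sample would be needed. The estimate ignores duplicates, unmapped reads and uneven coverage, so it should be read as a lower bound.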
This research project was a collaboration between University College London, UK, and the Stanley Center at the Broad Institute. In this project we sequenced and analyzed the whole exomes of 1,623 schizophrenia, bipolar and control samples from collaborators in the UK. Genomic DNA from each sample was sequenced to a mean depth of 20x. Exomes were captured using Twist capture, and samples were sequenced on Illumina HiSeq X machines, producing CRAM files.
Genomic DNA was obtained from the M116 peripheral blood sample and was used for targeted deep sequencing (TDS) studies. Barcoded libraries were prepared according to the manufacturer’s instructions, using a probe-based panel (KAPA HyperCap, Roche®) targeting frequently mutated regions of 50 myeloid-related genes. Samples were run on a MiSeq (Illumina®) sequencer with paired-end 2x75 bp reads and a mean coverage of 1000X.
An scRNAseq dataset consisting of stem cell-derived islets treated with gradient separation (HUES8_Pure_Islets) and an unpurified control (HUES8_control_islets). scRNAseq was done on cryopreserved stage 7 stem cell (HUES8 cell line)-derived islets. Details about stem cell differentiation and sequencing can be found in the publication by Rajaei et al. (2025, Science Translational Medicine). This dataset corresponds to Fig 4 of the publication.
The dataset contains RNAseq profiles of 182 patients from the CA017-003 clinical trial. The AllPrep DNA/RNA FFPE kit was used to simultaneously purify genomic DNA and total RNA from formalin-fixed, paraffin-embedded (FFPE) tissue sections. RNAseq libraries (50PE, 50M) were constructed using the Illumina TruSeq RNA Access method. FASTQ files are included.
This dataset provides insight into the immune response in extrapulmonary tuberculosis (EPTB), using bulk RNA-seq data to analyze different disease severities. Hierarchical clustering identified three distinct levels of severity, revealing disease progression driven by interferon and IL-1β-mediated signaling. Additionally, the dataset helped develop a diagnostic gene expression signature for both EPTB and pulmonary TB.