Patients with metastatic pancreatic ductal adenocarcinoma (PDAC) survive longer when disease spreads to the lung but not to the liver. We generated overlapping, multi-omic datasets to identify molecular and cellular features that distinguish patients whose disease develops liver metastasis (liver cohort) from those whose disease progression results in lung metastasis without liver metastases (lung cohort). Lung cohort patients generally survived longer than liver cohort patients, independent of tumor subtype. We developed a pORG gene signature that distinguishes primary tumors in the liver and lung cohorts. We identified ongoing replication stress (RS) response pathways in high pORG/liver cohort tumors, while low pORG/lung cohort tumors had greater densities of lymphocytes and shared T cell clonal responses. Our study demonstrates that liver-avid PDAC is associated with tolerance to ongoing RS, limited tumor immunity, and less favorable outcomes, whereas low RS, lung-avid/liver-averse tumors are associated with active tumor immunity that may account for favorable outcomes. As expected, we found high frequencies of KRAS, TP53, CDKN2A, and SMAD4 gene alterations in our tumor samples. High pORG activity (GSVA scores) in our primary tumors appears to be positively correlated with alteration frequencies in both TP53 and CDKN2A. We did not see a correlation with alterations in KRAS, as almost all the tumor samples have KRAS alterations. From a de-identified dataset of 1,873 patients diagnosed with and/or treated for PDAC at our institution between 2004 and 2020, we identified 422 patients for whom we had specimens with sequencing data (N=374) and/or specific evidence of disease metastasis site(s) from the OHSU cancer registry or disease-relevant computed tomography (CT) scans to allow cohort classification.
Note that our study includes RNA-Seq, DNA-Seq panel, and TCR-Seq (T-cell receptor sequencing) data, but this dbGaP submission includes only the patients/samples with RNA-Seq and/or DNA-Seq panel data. Therefore, this submission includes 290 samples from 278 patients. TCR sequence data is available on the Adaptive Biotechnologies platform. Clinical course timepoints, patient demographics, stage, grade, nodal involvement, resection margins, and angiolymphatic invasion were provided as de-identified data by the OHSU cancer registry, with quality-control data verification in a subset by pathologists (BB and TM). We reviewed all available computed tomography (CT) scans for all patients with primary tumor resection dates recorded by the cancer registrar, with tumor samples analyzed by RNA-Seq, DNA-Seq, or TCR-Seq, and/or with additional information indicating metastatic spread (e.g., metastatic samples received for related studies). We abstracted the site of all lesions proven to be metastatic by biopsy and/or that clearly increased in size during progression or decreased in size during treatment, as long as a radiologist described the lesion as “likely”, “suspicious for”, “concerning for”, or “favor” metastasis. Clinical imaging was reviewed by a radiologist (AG) to validate patient assignments to the liver, lung, and neither liver nor lung (other recurrence site) cohorts. Time to recurrence was calculated from the earliest of either the recurrence date provided by the OHSU cancer registry or the date of the earliest lesion abstracted from CT reports. The subject attributes reported in this submission were up to date at the time of our data freeze, which occurred in July 2021.
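The time-to-recurrence rule described above can be illustrated with a minimal sketch. It is not the study's actual code: the function names are hypothetical, and the baseline date is an assumption, because the description does not state which starting point (for example, resection or diagnosis) was used.

```python
from datetime import date
from typing import Optional, Sequence


def earliest_recurrence(
    registry_date: Optional[date],
    ct_lesion_dates: Sequence[date],
) -> Optional[date]:
    """Return the earliest of the registry recurrence date and any CT lesion dates.

    Hypothetical helper: either source may be missing for a given patient.
    """
    candidates = list(ct_lesion_dates)
    if registry_date is not None:
        candidates.append(registry_date)
    return min(candidates) if candidates else None


def days_to_recurrence(
    baseline: date,
    registry_date: Optional[date],
    ct_lesion_dates: Sequence[date],
) -> Optional[int]:
    """Days from an assumed baseline date to the earliest recorded recurrence.

    Returns None when no recurrence evidence exists in either source.
    """
    recurrence = earliest_recurrence(registry_date, ct_lesion_dates)
    if recurrence is None:
        return None
    return (recurrence - baseline).days


if __name__ == "__main__":
    # Illustrative dates only; not taken from the dataset.
    print(days_to_recurrence(
        baseline=date(2015, 1, 10),
        registry_date=date(2015, 9, 1),
        ct_lesion_dates=[date(2015, 7, 15), date(2015, 11, 2)],
    ))
```

Here the CT-derived lesion date precedes the registry date, so it determines the recurrence time. A patient with no evidence from either source yields no value instead of an error.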
Whole-genome sequencing of childhood acute lymphoblastic leukaemia patients. Matched diagnostic and germline samples were obtained from bone marrow aspirates and sequenced on the Illumina platform to characterise the underlying genetic features (DOI: 10.1038/s41375-022-01806-8).
The dataset contains amplicon sequencing data from 48 samples from 35 patients with ovarian cancer. Cell-free DNA was collected from plasma. Sequencing was performed on an Ion Torrent platform, and the sequencing data is provided in BAM format.
This dataset includes small RNA sequencing data from extracellular vesicle-derived RNA. miRNA libraries were prepared using the QIAseq miRNA Library Kit with adaptations for low RNA input and sequenced on the Illumina NextSeq 550 platform.
Paired-read FASTQ files were derived from standard Illumina whole-exome sequencing (WES) of 103 DLBCL biopsy samples. This is one of three datasets associated with the multi-platform NGS efforts of the Cornell-NCI DLBCL genomic study.
Levels of 92 circulating proteins measured on the Olink platform (CVD III panel).
Bionano DLS optical mapping data for individual HV31 generated using DNA from peripheral blood mononuclear cells, to a molecule depth of ~153×. Optical mapping was performed at the Weatherall Institute of Molecular Medicine using the Bionano Saphyr platform.
The dataset contains 16 files in total: technical duplicates of 8 unique samples, collected pre- and post-BCG from four patients with non-muscle-invasive bladder cancer. These are bulk RNA-seq samples generated on a high-throughput sequencing platform.
The dataset comprises one VCF file containing variants in a list of genes (DNA repair and metabolism-associated genes), subset from WES of an adult AML cohort. The cohort contains 145 patient samples. WES was performed on an Illumina platform.
De novo assembly of eight immune system regions for individual HV31, generated using a multi-platform pipeline. A full description of the generation of these assemblies can be found at https://doi.org/10.1101/2021.02.03.429586.