Automated machine-learning approach for next generation profiling of sequence alterations, mutation burden, microsatellite instability, and structural variants in human cancers

Sequence and structural alterations, together with tumor mutation burden (TMB) and microsatellite instability (MSI), have been identified as biomarkers for determining response to targeted and immune checkpoint inhibitor therapies. However, widespread clinical adoption of these biomarkers has historically been limited by barriers such as evidence of clinical utility and reimbursement. We have developed a 2.2 Mb targeted NGS system and an automated machine-learning analysis approach (PGDx elio™ tissue complete, ETC) that has been FDA cleared for examination of 500+ cancer-related genes and 68 mononucleotide repeats for identification of sequence and structural alterations, TMB, and MSI in solid cancers in a clinical setting. We designed and trained this approach using sequence data from 4,174 cancers and >124,000 in silico alterations and evaluated the methodology in >2,550 tumor or non-cancerous normal samples. Independent analyses of ETC sequence changes in 440 formalin-fixed paraffin-embedded (FFPE) tumor or cell line samples using MSK-IMPACT™, ...

This dataset consists of 116 tumor and normal samples analyzed by whole-exome sequencing on HiSeq 2500 instruments with 100 bp paired-end reads, as well as 760 tumor and normal samples analyzed with the PGDx elio tissue complete assay. The PGDx elio tissue complete assay is a hybrid-capture approach targeting 500+ genes, with sequencing on NextSeq instruments using 150 bp paired-end reads. The BAM files provided have been adapter-masked and contain duplicate reads.
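
Because the BAM files retain duplicate reads, downstream users may want to know how many reads are duplicates before computing coverage or calling variants. The following is a minimal sketch, not part of the dataset's own tooling. It assumes that duplicates are marked with the standard SAM duplicate flag (0x400), which this description does not confirm; if they are unmarked, a duplicate-marking tool would have to be run first. The function name `count_duplicates` is hypothetical.

```python
DUPLICATE_FLAG = 0x400  # SAM specification: read is a PCR or optical duplicate


def count_duplicates(sam_lines):
    """Return (total_reads, duplicate_reads) for an iterable of SAM text lines.

    Assumption: duplicates are already flagged with 0x400 in the alignments.
    """
    total = 0
    duplicates = 0
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # FLAG is the second mandatory SAM column
        total += 1
        if flag & DUPLICATE_FLAG:
            duplicates += 1
    return total, duplicates
```

The input lines could come from the text output of `samtools view` on one of the BAM files, read line by line. A count of zero duplicates would suggest that the reads are present but not flagged, rather than that the library contains none.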
