Therapeutic decisions in oncology depend on a precise pathological classification of individual neoplasms. Recent years have seen intensified research into extracting clinically relevant information from patient-derived 'omics' data using Machine-Learning models. However, training such models comprehensively requires a sufficiently large number of training samples, which are usually not available for rare cancer types. The problem is worsened when neoplasms of an individual tissue segregate into different cancer subtypes, as discriminating between them requires even more training samples.
Here, we report on a new data-augmentation technique to support the training of Machine-Learning models on 'omics' data from pancreatic neuroendocrine neoplasms (panNENs). PanNENs display all of the properties described above: only about 2-3% of all pancreatic neoplasms are neuroendocrine, and they fall into different subtypes with distinctly different prognoses, which makes the precise classification of such samples both difficult and important for therapy ...