This DAC is created for the XPAND project by the Translational Bioinformatics unit.
This is the DAC responsible for granting access to sequencing data generated by Fondazione Michelangelo.
Background

Massively parallel sequencing technology has transformed cancer genomics. It is now feasible, in a clinically relevant time-frame and at a clinically manageable cost, to screen DNA from patient tumours for mutations essentially genome-wide. The challenge for personalised medicine will be to increase the sample size to thousands or tens of thousands of well-characterised cases in order to attain sufficient statistical power to stratify patients accurately across the complexity and genomic heterogeneity expected for most of the common tumour types. Currently, whole genome sequencing on this scale is not feasible, and targeted sequencing of relevant portions of the genome will be required.

Pilot data

We have developed protocols for large-scale, multiplexed sequencing of 100-200 genes in thousands of samples. Using robotic technology, genomic DNA from the cancer specimen is processed into sequencing libraries with unique DNA barcodes, which allow each sequencing read to be attributed to the sample it derives from. Currently, these sequencing libraries can be generated in a 96-well format using fully automated protocols, and we are exploring methods to expand this to a 384-well format. The sequencing libraries are pooled and hybridized to custom sets of RNA baits representing the genomic regions of interest. The pulled-down libraries are sequenced in pools of 48-96 samples per lane of an Illumina HiSeq. This protocol is already implemented at the Sanger Institute. We have published proof that somatic mutations in novel cancer genes can be identified from exome-wide sequencing. In unpublished pilot data, we have established the feasibility of robotic library production, custom pull-down, and multiplexed sequencing of barcoded libraries for 100 known myeloid cancer genes across 760 myelodysplasia samples.
Highlights of the data analysed so far are that coverage is remarkably even between samples; when 96 samples are run, average coverage per lane of sequencing is ~250x, with 90-95% of targeted exons covered by >25 reads; known mutations can be discovered in the data set; and the protocol is amenable to whole genome amplified DNA. The bioinformatic algorithms for identification of substitutions and indels in pull-down data are well established; we have pilot data showing that copy number changes, LOH and genomic rearrangements in specific regions of interest can also be identified by tiling of baits across the relevant loci.

Proposal

We propose to apply this methodology to 10,000 samples from patients with AML enrolled in clinical trials over the last 10-20 years. Oncogenic point mutations and potentially genomic rearrangements will be identified and linked to clinical outcome data, with a view to undertaking the following sorts of analyses:

- Identification of co-occurrence, mutual exclusivity and clusters of driver mutations.
- Correlation of prognosis with driver mutations and potentially gene-gene interactions.
- Exploration of genomic markers of drug response.

Ultimately, we would like to be in a position to release the mutation data together with matched clinical outcome data to genuine medical researchers via a controlled access approach, possibly within the COSMIC framework (www.sanger.ac.uk/genetics/CGP/cosmic/). The vision here is to generate a portal whereby a clinician faced with an AML patient and his/her mutational profile can obtain a "personalised" prediction of outcome, together with a fair assessment of the uncertainty of the estimate. With a sufficient sample size, there would also be the potential to develop decision support algorithms for therapeutic choices based on such data.
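As an illustration only, the co-occurrence and mutual-exclusivity analysis named in the proposal could be approached with a pairwise Fisher exact test on a 2x2 table of samples with and without mutations in two genes. The sketch below is not the project's actual pipeline: the function names, the `mutations` data layout (sample id mapped to a set of mutated genes) and the gene labels are all assumptions made for the example, and only the Python standard library is used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that x samples carry both mutations
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are no more likely than the observed one
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def pairwise_association(mutations, gene_a, gene_b):
    """Return (odds_ratio, p_value) for two genes.

    `mutations` is a hypothetical layout: dict of sample id -> set of mutated genes.
    """
    a = b = c = d = 0
    for genes in mutations.values():
        in_a, in_b = gene_a in genes, gene_b in genes
        if in_a and in_b:
            a += 1      # both genes mutated
        elif in_a:
            b += 1      # only gene_a mutated
        elif in_b:
            c += 1      # only gene_b mutated
        else:
            d += 1      # neither mutated
    # Haldane-Anscombe correction keeps the odds ratio finite if a cell is zero
    odds_ratio = ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))
    return odds_ratio, fisher_exact_two_sided(a, b, c, d)


# Toy data, invented purely for illustration
toy = {
    "s1": {"GENE_A", "GENE_B"}, "s2": {"GENE_A", "GENE_B"},
    "s3": {"GENE_A", "GENE_B"}, "s4": {"GENE_A"}, "s5": {"GENE_B"},
    "s6": set(), "s7": set(), "s8": set(),
}
odds_ratio, p_value = pairwise_association(toy, "GENE_A", "GENE_B")
```

Under this sketch, an odds ratio above 1 would point towards co-occurrence and one below 1 towards mutual exclusivity. Testing every gene pair across 100-200 genes would also call for a multiple-testing correction, which is omitted here.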
This project is analyzing tissue and blood samples from people with rheumatoid arthritis (RA) and lupus to pinpoint the genes, proteins, chemical pathways, and networks involved at the single-cell level. This type of modular, molecular analysis will allow comparisons across the diseases and will provide insights into key aspects of the disease process. The project will identify differences between those RA patients who respond to therapies and those who do not, as well as provide a better systems-level understanding of disease mechanisms in both RA and lupus. This knowledge is essential for the development of targeted therapies and for the application of existing and future therapies to appropriate patient populations. Additional datasets can be accessed through ImmPort (http://www.immport.org/immport-open/public/home/studySearch), accession SDY997.
ADVANCE (Atherosclerotic Disease, VAscular functioN, and genetiC Epidemiology) is a large epidemiological study of genetic and non-genetic determinants of coronary artery disease (CAD) that started in 2000 as a collaborative effort between researchers at Stanford University and Kaiser Permanente of Northern California. The overarching goal of the study is to improve our ability to prevent, diagnose and treat CAD. The initial study recruited over 3600 subjects (including 1873 subjects with incident clinically significant coronary disease and 1745 control subjects) from multiple race/ethnic backgrounds. A subset of ~500 subjects with very early onset coronary disease (men < 45 and women < 55) and ~500 similarly aged controls were genotyped using the Illumina 550K platform as part of an NIH-funded effort within the STAMPEED consortium.
T cell non-Hodgkin lymphomas (T-NHLs) represent a heterogeneous group of aggressive cancers of mature CD4+ T cells, for which therapeutic options are limited. Recent work revealed that the PDCD1-encoded immune checkpoint receptor PD-1 is a key tumor suppressor in T cells. PD-1 is recurrently inactivated in T-NHL, and the highest frequencies of PDCD1 deletions are detected in advanced disease, predicting worse prognosis. The tumor-suppressive mechanisms of PD-1 signaling remain unknown. In the present study, we identify transcriptional and epigenetic mechanisms underlying PD-1 tumor suppression in T-NHL. Some subjects included in this study overlap with the subjects from phs002456. To establish the link between the overlapping subjects, the Subject Consent datasets will be utilized.