This dataset included 19 paired diagnostic and remission samples from patients with high hyperdiploid acute lymphoblastic leukemia (ALL) that were collected from four different cohorts: the Division of Clinical Genetics, Lund University, Sweden. All samples were subjected to whole genome sequencing using the Illumina HiSeqX platform. Paired-end sequencing (2x150bp) was done to ~60x coverage for diagnostic samples and ~30x coverage for remission samples. The paired-end reads were aligned to the human reference genome GRCh37 (ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/vertebrate_mammalian/Homo_sapiens/all_assembly_versions/GCF_000001405.25_GRCh37.p13/GCF_000001405.25_GRCh37.p13_genomic.fna.gz) with the Burrows-Wheeler Aligner tool (version 0.7.17). Duplicate read marking and local realignment were performed with GATK (version 4.0.11.0).
This collection contains all of NCI's authorized individual-level genomic datasets currently in dbGaP that are approved for General Research Use (GRU) and have no further limitations beyond those outlined in the model Data Use Certification Agreement. Access to this study will include any additional authorized individual-level GRU datasets that become available. Renewal of this study is required annually.
NanoString raw data for a neoadjuvant combination PD-L1 plus CTLA-4 blockade trial in patients with cisplatin-ineligible operable urothelial carcinoma. All samples were FFPE tumor samples. Raw probe count data (.RCC files) were generated from the nCounter Digital Analyzer (4.0.0.3).
While thyroid nodules per se are frequent (4%–50%), thyroid cancer is rare (∼5% of all thyroid nodules). The minimally invasive Fine Needle Aspiration Cytology (FNAC) is the current gold standard for the diagnosis of thyroid nodule malignancy. However, proper discrimination of follicular neoplasias often requires more invasive diagnostic techniques. To develop a novel molecular classification system for thyroid cancer malignancy, we performed genomic profiling of 54 fresh frozen follicular-like thyroid samples using a whole exome sequencing approach (SureSelect V6 Target enrichment protocol and the Illumina NovaSeq 6000 platform).
A prospective multi-year clinical translational study including three cohorts of term infants experiencing their first Respiratory Syncytial Virus (RSV) season. All infants are less than or equal to nine months of age at study entry. The three subject cohorts represent the full spectrum of RSV disease severity and include a birth cohort, a cohort of infants hospitalized for RSV disease, and a cohort of infants evaluated at ambulatory settings for RSV infection. All infants are followed longitudinally and evaluated at recognition of acute RSV infection and twice during convalescence. Innate and adaptive immune status are comprehensively measured in association with clinical, environmental, viral, and bacteriologic factors. Genome-wide expression is assessed in the nasal airways and in sorted peripheral blood lymphocytes. The study goal is to identify host responses to RSV infection and factors associated with severe disease.
Parkinson's disease (PD) is an age-related, chronic and progressive neurodegenerative disorder characterized by a loss of multifocal neurons and subsequent motor symptoms. These overt motor symptoms are often preceded by prodromal non-motor symptoms. Though a number of genetic and environmental factors have been identified to play a role in PD, more exact methods for both diagnosing and assessing prognosis are yet to be discovered. Probing the transcriptomes of control and PD cells can give insight into changes caused by the disease (in both coding and non-coding gene expression), highlighting potential RNA biomarkers that may be used for PD diagnosis and as new drug targets. The study consists of two main sources of data: matched neural stem cells (NSC) and fully differentiated dopaminergic neurons derived from iPS cells. Both sets of data are made up of three samples from a control cell line and five samples carrying a mutation in one of several genes known to be linked to heritable PD: PARK2, PARK22 or PARK9. All transcriptome libraries were synthesized and sequenced using no-amplification non-tagging cap analysis of gene expression (nAnT-iCAGE) on the Illumina HiSeq 2500 platform. The resulting data were mapped to the human genome annotation (hg38), and processed CAGE tags were clustered ready for differential expression analyses. The 16 samples described here were sequenced across two lanes.
Aims: To identify new therapeutic targets in small cell lung cancer (SCLC), genome-wide mutation analysis was performed. Methods: Genomic DNA was extracted from formalin-fixed or methanol-fixed tissue samples. 71 Mb of DNA fragments containing whole coding exons were enriched using the SureSelect Human All Exon V4+UTRs Kit (Agilent Technologies), followed by 100-bp paired-end sequencing on a HiSeq 2000 (Illumina). Participants/Materials: 51 of 1042 cases of pathologically diagnosed small cell lung cancer that were registered to National Cancer Hospital East Lung Cancer Database in 1992-2012 and whose surgically resected or biopsy samples were suitable for DNA extraction for further analyses.
Background: Valley Fever is typically an infection of the lungs caused by the fungi Coccidioides immitis and Coccidioides posadasii. The incidence of Coccidioidomycosis (CM), or infection with Coccidioides, has dramatically increased over the last 20 years. This is particularly true in the Southwest of the United States, where people often breathe fungal spores that arise from the soil. Reasons for increased infection rates are thought to include population growth and construction in these endemic regions, an increase in the number of people whose immune systems are compromised due to infection or treatment with drugs required for organ transplants, climate change, as well as improved testing practices and greater physician awareness. Mild CM most commonly presents with flu-like symptoms and rashes, which can last weeks to months. Individuals with compromised immune systems, specifically those with substantial suppression of the immune cells known as T cells, can develop severe pulmonary and disseminated disease. Infection that remains localized to the lungs is referred to as pulmonary disease, but when the infection spreads out of the lungs into other parts of the body it represents a more serious condition referred to as disseminated disease, or disseminated CM. In nature, Coccidioides spp. exist as mold and live in dust and soil. When the contaminated soil or dust is disturbed by human activity, animals, or weather, the Coccidioides spores are released into the air. Airborne spores are taken up by breathing and settle in the lungs. Once in the moist and warm environment of the lung, spores transform into spherules, which divide and become filled with smaller spores, called endospores. When the spherules get large enough, they rupture and release these endospores, which can spread and disseminate to surrounding tissue. The cycle then repeats itself as these endospores develop into new spherules.
Different ethnic groups have been reported to vary in their susceptibility to developing disseminated CM after initial infection with Coccidioides. For example, evidence suggests that African-American and Filipino patients suffer the disseminated disease at a greater rate than other ethnicities. The suggestion that race plays a role in the clinical expression of the disease is still a source of debate amongst the scientific community, and any genetic mechanisms responsible for these differences have yet to be fully elucidated. If our genetic makeup influences our ability to limit the spread of infection, finding which DNA differences cause these variances could provide clues to how the body successfully fights infection, and provide opportunities to boost the body’s ability to do this. Further, if we are able to identify the specific genetic risk factors that correlate with the development of disseminated infection, physicians could perform genetic screenings to identify high-risk patients and provide them with preemptive antifungal therapy before they develop disseminated disease. The genome, made up of DNA, contains all of the information needed for humans to develop and grow. Genome-wide association studies (GWAS) allow us to look for inherited differences that are more common among people who share a particular trait, for example, height or susceptibility to certain diseases, compared to those who do not share the trait. Although some traits and diseases are controlled by a single gene, the majority are influenced by contributions from several, or even many, different genes.
To find evidence of genes that contribute to specific traits, GWAS typically compares genome information from large numbers of people who have a particular disease (referred to as “cases”), looking for DNA sequences that are common among these samples and are different from DNA sequences seen in large numbers of people who lack the trait but are as much like the cases as possible (referred to as “controls”). The DNA sequence data from each group, cases versus controls, are analyzed to see if there are specific genomic differences that tend to be associated with the disease. Methods: Two separate GWAS approaches were taken to look for genetic differences that could be responsible for the observed differences between the different patient populations we are studying. The first method, known as genotyping, scans for differences at a set of positions across the genome, which includes both the genes that encode our proteins and the larger amount of DNA that does not. The second method, known as exome sequencing, allows us to compare the entire sequence of the portion of the genome that codes for proteins. For this study, DNA from patients with either pulmonary or disseminated CM was genotyped and exome sequenced to look for DNA differences that are associated with one condition or the other. All patients were at least 18 years old, had no evidence of immunosuppression, and had proven or probable pulmonary coccidioidomycosis according to established diagnostic criteria. Of these patients, a subset demonstrated disseminated disease, i.e., they showed evidence of coccidioidal infection outside of the thorax by biopsy/aspiration, had radiographic imaging, and showed positive coccidioidal serology.
Our criteria for including patients with the pulmonary disease were that they must not require ongoing antifungal treatment or show evidence of active CM (in skin test positive patients), show no evidence of extrapulmonary dissemination, and have no evidence of ongoing pulmonary infection (pulmonary nodules are accepted) beyond six months from diagnosis. Patient DNA was purified from blood or from sputum samples by the labs of our collaborators, Drs. George Thompson (UC Davis School of Medicine) and John Galgiani (University of Arizona Health Sciences). Genome-wide association (GWAS) analysis was carried out to look for candidate loci associated with pulmonary versus disseminated disease, taking into account the population structure of the samples. Single nucleotide or insertion/deletion variants were identified from whole-exome sequences (WES) using the Picard/BWA/GATK pipeline.

Results: Table 1. Pulmonary versus Disseminated Cases of Coccidioidomycosis for GWAS, Sorted by Ethnicity

Ethnicity                 Pulmonary Cases   Disseminated Cases
Asian                     8                 5
Black/African American    16                64
Caucasian/White           40                15
Filipino                  0                 3
Hispanic/Latino           34                14
Indian                    2                 1
Mexican American          103               9
Pacific Islander          0                 1
Samoan                    0                 3
Vietnamese                1                 0
Unknown                   169               17
More than one race        0                 2
Total                     373               134

Table 1 shows the number of samples analyzed from patients with pulmonary versus disseminated disease, and patient ethnicity, where known. In all, we worked with 507 samples, including 134 samples from patients with disseminated disease and 373 samples from patients with pulmonary disease. Of these, 505 samples were genotyped using the Multi-Ethnic Global Array from Illumina Inc. In addition, we were able to generate whole-exome sequence from 498 patient samples. No significant associations were detected between samples from patients with pulmonary versus disseminated disease; that is, no particular DNA sequences were found to be significantly enriched in patients with disseminated disease compared to patients with pulmonary disease.
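As an illustration of the case-versus-control comparison described above, the following is a minimal sketch of a single-variant allelic chi-square test. It is not the study's analysis pipeline: it ignores the population-structure adjustment mentioned above, the function name and the allele counts are hypothetical, and the 5e-8 cutoff is the conventional genome-wide significance threshold rather than a value stated in this study.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Rows are the two groups being compared; columns are alternate and
    reference allele counts. Returns (chi-square statistic, p-value).
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_sums)
    if 0 in row_sums or 0 in col_sums:
        raise ValueError("every row and column needs at least one observation")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Conventional genome-wide threshold (an assumption, not taken from the study).
GENOME_WIDE_ALPHA = 5e-8

# Hypothetical allele counts for one variant: 134 disseminated patients
# contribute 268 alleles and 373 pulmonary patients contribute 746 alleles.
chi2, p = allelic_chi_square(case_alt=60, case_ref=208, ctrl_alt=120, ctrl_ref=626)
print(f"chi2={chi2:.3f} p={p:.3g} significant={p < GENOME_WIDE_ALPHA}")
```

In a real GWAS this test would be repeated for every genotyped variant, which is why such a strict per-variant threshold is used.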
The ability to detect genetic association between specific sequences and genetically determined traits is influenced by several factors, including how many patient samples are available to compare, how many different genes contribute to the trait, and how strong their contributions are. When the number of genes is small and the contribution of each gene is great, smaller numbers of patient samples are needed to detect an association. When more genes are involved, or the contribution from each gene is more modest, larger numbers of patient samples must be examined. While we were not able to detect any associations within this study, this does not mean that subsequent studies would not find such a connection. Our study suggests that significantly more samples should be analyzed in further studies. Whole-exome sequences were generated from 498 samples and were aligned to reference sequences to identify positions where the sequences differed from the reference. These data are being analyzed to determine if any variants are associated with pulmonary, versus disseminated, disease.
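The relationship between sample size, effect size, and detection ability described above can be made concrete with a rough power calculation. The sketch below uses a normal approximation for a two-sided comparison of allele frequencies between two groups. The function name and the allele frequencies are hypothetical, the 5e-8 threshold is the conventional genome-wide cutoff (an assumption, not a value from this study), and only the group sizes of 134 and 373 come from the text.

```python
from statistics import NormalDist


def approx_power(freq_cases, freq_controls, n_cases, n_controls, alpha=5e-8):
    """Approximate power to detect an allele-frequency difference.

    Uses a normal approximation to a two-proportion z-test and ignores the
    negligible opposite tail. Frequencies must be strictly between 0 and 1.
    Each person contributes two alleles.
    """
    std_normal = NormalDist()
    alleles_cases = 2 * n_cases
    alleles_controls = 2 * n_controls

    # Standard error of the difference in allele frequencies.
    se = (
        freq_cases * (1 - freq_cases) / alleles_cases
        + freq_controls * (1 - freq_controls) / alleles_controls
    ) ** 0.5

    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    z_effect = abs(freq_cases - freq_controls) / se
    return std_normal.cdf(z_effect - z_crit)


# Hypothetical modest effect: allele frequency 0.25 versus 0.20.
current = approx_power(0.25, 0.20, n_cases=134, n_controls=373)
tenfold = approx_power(0.25, 0.20, n_cases=1340, n_controls=3730)
print(f"power at study size: {current:.2g}; at ten times the size: {tenfold:.2g}")
```

Under these assumed frequencies, power rises as the groups grow, which is the pattern the paragraph above describes for traits influenced by many genes of modest effect.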
Microsatellite instability-high (MSI-H) colorectal cancers (CRCs) account for 10–15% of all CRCs. MSI-H CRC is characterized by a large number of somatic insertions/deletions (indels) resulting from either mutations within or the silencing of genes involved in the DNA mismatch repair (MMR) system. A subset of MSI-H CRCs is associated with Lynch syndrome (LS), which is caused by germline MMR gene mutations, leading to hereditary cancer predisposition. Other than frequent somatic mutations in BRAF, the transformation mechanisms underlying MSI-H CRC are largely unknown. Here, genomic DNA from 149 MSI-H CRC specimens was analyzed using whole-exome sequencing, and 93 of these samples were subjected to genome-wide DNA methylation analysis. Furthermore, transcriptome sequencing was conducted on 111 samples. Genomic/epigenomic analyses identified three subgroups within our cohort: (1) MSI-H CRCs with silenced MLH1 that share frequent indels, a specific mutation/copy number alteration profile and promoter DNA methylation; (2) LS-associated MSI-H CRCs with MMR genes containing germline mutations; and (3) the remaining MSI-H CRCs with frequent somatic disruptive mutations of MLH1 or MSH2. Unexpectedly, only the first group was found to frequently carry fusion-type protein kinases (15% of this group and 9.9% of all MSI-H CRCs) that are promising therapeutic targets. Thus, MSI-H CRCs can be classified into three subgroups with distinctive genomic as well as epigenomic statuses, probably reflecting the oncogenic processes of these cancers. Fusion-type kinases are enriched in MSI-H CRCs, shedding new light on potential treatment strategies.