Data supporting: "Understanding the malignant potential of gastric metaplasia of the oesophagus and its relevance to Barrett’s Oesophagus surveillance: individual-level data analysis" Black et al (WES OACs/BOs/normals)
The main goal of the project is to study the associations between the gut metagenome and human health. The dataset contains data for n=7211 FINRISK 2002 participants who underwent fecal sampling. Demultiplexed shallow shotgun metagenomic sequences were quality filtered and adapter trimmed using Atropos (Didion et al., 2017), and human filtered using Bowtie2 (Langmead and Salzberg, 2012).
The purpose of this study is to investigate the genetics of orofacial clefts (OFCs) in a large study population and, importantly, to incorporate subclinical phenotypic features into these studies. OFCs comprise a significant fraction of human birth defects (about 1/700 live births; Rahimov et al. 2012) and represent a major public health challenge, as individuals with these anomalies require surgical, nutritional, dental, speech, medical and behavioral interventions, thus imposing a substantial economic and personal burden (Berk and Marazita 2002*). The most common forms include OFCs of the lip alone (CL, Figure 1A), CL plus cleft palate (CL+CP, Figure 1B) or of the palate only (CP, Figure 1C). Individuals born with an OFC may have their first surgical repair at age 3 months, but this initial surgery is just the beginning of a lifetime of health burdens. An individual born with an OFC has an increased hospital use rate at most ages (up to a 233% increase for children ages 0-10 years and 16% for middle-aged adults; Wehby et al. 2012). Healthcare costs for children with OFCs are estimated to be 800% greater than those of their unaffected peers (Boulet et al. 2009). Data from Denmark show that people born with CL with or without CP (CL/P) have an increased mortality up to age 55, which may be attributed to an increased risk of suicide and/or certain cancers (Christensen et al. 2004). The focus of most OFC genetic research has been CL and/or CP. Furthermore, the majority of OFCs, i.e. about 70% of CL/P and 50% of CP, are considered "nonsyndromic" (Jones 1988), i.e. isolated anomalies with no other apparent cognitive or structural abnormalities. [Figure 1. Sample OFC types. A: bilateral cleft lip; B: cleft lip plus cleft palate; C: cleft palate alone.] The factors leading to the majority of nonsyndromic OFCs are still unclear, particularly at an individual family level.
As is true for many complex traits, substantial progress in gene identification has occurred in the OFC field in the last two years (Dixon et al. 2011; Marazita 2012). Genome-wide association studies (GWAS) and sequencing studies to date by our research team and others have focused on genetic risk factors for overt CL/P and CPO, and have been very successful. A major finding from this work is that OFCs exhibit significant genetic heterogeneity, i.e., multiple genetic regions have been implicated (Beaty et al. 2010; Ludwig et al. 2012). Thus, approaches are needed to understand this genetic heterogeneity. Are there GxG interactions at work? Are there subsets of families, each due to a different gene? Our research group has shown that a promising approach to dissecting the etiology of OFC is to focus on subclinical phenotypic features within entire cleft families (not just in affected cases, but also in their non-cleft relatives). These subtle features are believed to represent mild manifestations of the same underlying genetic susceptibility responsible for OFCs; as such, their inclusion in case-control and family-based genetic studies can help to clarify and refine the relationship between genotype and phenotype. The study population comprises a large number of families and individuals (~12,000 individuals) from multiple populations worldwide (Caucasians from the US and Europe, Asians from China and the Philippines, Mixed Native American/Caucasians from South America, and Africans from Nigeria and Ethiopia). There are cases, case families (nuclear families and extended kindreds), as well as controls with no history of OFC or other developmental defects. *Berk NW, Marazita ML (2002) Costs of Cleft Lip and Palate: Personal and Societal Implications. In: Wyszynski DF (ed) Cleft Lip and Palate: From Origin to Treatment. Oxford University Press, Inc., New York, pp 458-467.
WGS data from biliary tract cancer samples (Beaudry et al, 2025; n=55)
Integrated callset of high coverage Egyptian genomes from the Pagani et al. 2015 AJHG paper (doi: http://dx.doi.org/10.1016/j.ajhg.2015.04.019)
Bulk RNAseq PBMC
Bulk CD14 RNAseq
RNAseq data, Publication Fernandez-Cuesta et al., 2014, CD74-NRG1 fusions in lung adenocarcinoma
Fernandez-Cuesta et al., 2014, Nature Communications, RNA sequencing data set
Insertion of processed pseudogenes is known to occur in the germline but has not previously been observed in somatic cells. Formation of pseudogenes could represent a new class of mutation in cancers and a new source of potential driver events.