Study

GCAT | Genomes for life: cohort study of the genomes of Catalonia

Study ID: EGAS00001003018
Alternative Stable ID: (not listed)
Type: Other

Study Description

The GCAT Study has recruited 20 000 participants aged 40–65 years. Participants who agreed to take part in the study completed a self-administered computer-driven questionnaire and underwent blood pressure, cardiac frequency and anthropometry measurements. For each participant, blood plasma, blood serum and white blood cells are collected at baseline. A total of 5459 genomic profiles have been characterised by comprehensive genotyping. Genome-wide genotypes were generated using Illumina Infinium SNP-bead array technology. We chose the Multi-Ethnic Global (MEGAEX, V.2) consortium array, a multipurpose, multiethnic genotyping array with two million selected markers (including previously described germline mutations, insertions-deletions (InDels) and SNPs). We strictly followed the standard manufacturer-recommended automated protocol for the Infinium HTS Assay, scanned with a HiScan confocal scanner (Illumina, San Diego, California, USA). Genome Studio V.2011.1 was used for raw data analysis. Genotyping was performed at the Genomics and Bioinformatics Unit of the ...

Study Datasets 11 datasets.

Click on a Dataset ID in the table below to learn more, and to find out who to contact about access to these data

Dataset ID Description Technology Samples
EGAD00001007729
Sex, age at recruitment (2014-2018), and birthdate of GCAT Cohort individuals.
19329
EGAD00001007730
First 20 principal components for 4988 GCAT Cohort individuals genotyped with the Infinium Multi-Ethnic Global (MEGAEX2) array, with data for chromosomes 1-22. PLINK files with QC and imputation (SHAPEIT+IMPUTE).
4988
EGAD00001007731
Disease diagnoses of GCAT Cohort participants obtained from electronic health records (EHR), mainly covering the period from 2012 to 2017. Diagnoses are coded in ICD-9, and the position of each diagnosis indicates whether it is primary or secondary (up to 14 secondary diagnoses per visit). The date and origin of the visit are also specified (AP: primary care, UGR: emergency, AH: hospital care, SMA: outpatient medical service, SMH: hospital medical service).
17155
EGAD00001007774
This dataset contains genotypes (35.4M SNVs, indels and SVs) from 785 samples, after QC filtering, from the 808-sample WGS GCAT cohort.
785
EGAD00001008201
This dataset includes FASTQ files of 808 samples from the GCAT cohort. Technology used: HiSeq 4000, read length 150 bp, inner mate distance 300 bp. For each sample, the paired ends are generated in separate files. Each FASTQ is split into multiple lanes and grouped by the multiplex index.
Illumina HiSeq 4000 808
EGAD00001008202
This dataset includes BAM files of 808 samples from the GCAT cohort. Technology used: HiSeq 4000, read length 150 bp, inner mate distance 300 bp. For each sample, the paired ends are generated in separate files. Each FASTQ is split into multiple lanes and grouped by the multiplex index.
808
EGAD00001008210
This dataset contains raw genotypes (SNVs, indels and SVs) from 785 samples, without any filter applied, from the 808-sample WGS GCAT cohort.
785
EGAD00010001664
4988 samples from the GCAT cohort, genotyped with the MEGAex-Infinium Array, with data for chromosomes 1-22. PLINK files with QC and imputation (SHAPEIT+IMPUTE).
Illumina-Genotyping Array 4988
EGAD00010001665
4988 samples from the GCAT cohort, genotyped with the MEGAex-Infinium Array, with data for chromosomes 1-22. PLINK files with QC but not imputed.
Illumina-Genotyping Array 4988
EGAD00010002152
This resource contains the SV annotations produced with the AnnotSV tool. The description of the annotations can be found on the AnnotSV web page (https://lbgi.fr/AnnotSV/) or the GCAT-BSC web page (http://cg.bsc.es/GCAT_BSC_iberianpanel).
Illumina HiSeq 4000 785
EGAD00010002153
This dataset includes the .hap, .legend and .sample files from the GCAT|Panel (Iberian reference panel), built from 785 samples, after QC, from the 808-sample WGS GCAT cohort, including 30.3M SNVs, 5M indels and 89K SVs. This resource was generated using Shapeit4 and WhatsHap software. Technology used: HiSeq 4000, read length 150 bp, inner mate distance 300 bp.
Illumina HiSeq 4000 785
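
The counts and codes quoted in the dataset descriptions above can be cross-checked with a short script. The following is a minimal Python sketch, not part of the GCAT or EGA tooling: all constant and function names are hypothetical, and the figures are the rounded values given in the descriptions, so the totals are approximate. It sums the variant classes listed for the GCAT|Panel, computes how many WGS samples were dropped by QC, and expands the visit-origin codes used in the EHR diagnoses dataset.

```python
# Illustrative sketch only: names are hypothetical, figures are the rounded
# values quoted in the dataset descriptions on this page.

# Variant classes listed for the GCAT|Panel (EGAD00010002153).
PANEL_SNVS = 30_300_000
PANEL_INDELS = 5_000_000
PANEL_SVS = 89_000

# WGS sample counts before and after QC.
WGS_SAMPLES = 808
WGS_SAMPLES_AFTER_QC = 785

# Visit-origin codes from the EHR diagnoses dataset (EGAD00001007731).
VISIT_ORIGIN = {
    "AP": "primary care",
    "UGR": "emergency",
    "AH": "hospital care",
    "SMA": "outpatient medical service",
    "SMH": "hospital medical service",
}


def panel_variant_total() -> int:
    """Sum of the three variant classes listed for the reference panel."""
    return PANEL_SNVS + PANEL_INDELS + PANEL_SVS


def samples_removed_by_qc() -> int:
    """Number of WGS samples that did not pass QC."""
    return WGS_SAMPLES - WGS_SAMPLES_AFTER_QC


def describe_visit_origin(code: str) -> str:
    """Expand a visit-origin code; unknown codes are reported as such."""
    return VISIT_ORIGIN.get(code.strip().upper(), "unknown origin code")


if __name__ == "__main__":
    print(f"Panel variants (millions): {panel_variant_total() / 1e6:.1f}")
    print(f"WGS samples removed by QC: {samples_removed_by_qc()}")
    print(f"UGR = {describe_visit_origin('UGR')}")
```

Under these rounded figures, the panel total comes to roughly 35.4M variants, which appears consistent with the 35.4M genotypes reported for EGAD00001007774.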
