Introduction

The European Genome-phenome Archive (EGA) is available at the European Bioinformatics Institute (EBI) and the Centre for Genomic Regulation (CRG).

The EGA provides a service for the permanent archiving and distribution of personally identifiable genetic and phenotypic data resulting from biomedical research projects. Data held at the EGA were collected from individuals whose consent agreements authorise data release only for specific research use by bona fide researchers. Strict protocols govern how information is managed, stored and distributed by the EGA project.

Nature Genetics 47, 692–695 (2015) | doi:10.1038/ng.3312

EGA overview

Data Search and Access

Studies and datasets can be browsed on the public website at the EBI or the CRG, which provides information (metadata) about the aim, the experiments and the data used in the registered studies. Each study is assigned a stable accession that may be referenced in publications.

Data providers are assigned an individual page on the EGA website, from which their studies may be browsed.

The EGA implements a controlled access policy whereby access decisions reside with the Data Access Committee (DAC). The DAC is created by the submitting organisation and is typically composed of the individuals involved in the collection and analysis of the data. A DAC may be responsible for approving access to a single dataset or to multiple datasets.

Each dataset is covered by a Data Access Agreement (DAA), which defines the terms and conditions of use for the specified dataset(s). The DAA is created and provided by the DAC and must be signed by the individual wishing to access the given dataset(s).

The EGA only supports data access decisions that are based on ethical considerations resulting from the consent agreements with the research participants. Access decisions made for other reasons, such as scientific competitiveness, are not supported by the EGA.
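The access rules described above can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, field names, function names and example identifiers are hypothetical and do not correspond to any EGA software or API. It encodes two points from the text: the decision belongs to the DAC responsible for the dataset, and access also requires a signed DAA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    accession: str  # hypothetical identifier used only in this sketch
    dac_id: str     # the DAC responsible for access decisions on this dataset


@dataclass
class AccessRequest:
    user: str
    dataset: Dataset
    daa_signed: bool = False    # has the applicant signed the dataset's DAA?
    dac_approved: bool = False  # has the responsible DAC approved the request?


def approve(request: AccessRequest, deciding_dac: str) -> None:
    """Record an approval; only the DAC that controls the dataset may decide."""
    if deciding_dac != request.dataset.dac_id:
        raise PermissionError("only the dataset's own DAC can approve access")
    request.dac_approved = True


def may_download(request: AccessRequest) -> bool:
    """Access needs both a signed DAA and approval from the dataset's DAC."""
    return request.daa_signed and request.dac_approved
```

For example, a request for a dataset controlled by a hypothetical "DAC-A" stays blocked until the applicant has signed the DAA and "DAC-A" itself has approved; an approval attempted by any other DAC raises an error.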

A complete tutorial with examples of how to access a dataset can be found here.

The EGA Account

An EGA account is created by the DAC for an individual user, in response to a successful application to access one or more datasets.

Datasets are downloaded using the EGA Download Client, which provides instant access to all approved files; these may be filtered and downloaded by dataset.

In exceptional circumstances, data can also be provided in a temporary dropbox, which may be accessed using FTP or Aspera.

The EGA account and EGA Download Client are for personal use only; the terms and conditions of the account prohibit sharing of the account log-in details. The EGA will create new accounts or update existing ones only at the direction of the appropriate DAC. The EGA will also require DAC authorisation when the original application is to be updated (e.g. removal of users who have left the research group, or addition of new participants in the same project).

Data submission

The EGA accepts only de-identified data with a DAC-approved access plan. The accepted data types include raw data formats from the array-based and new sequencing platforms, as well as phenotype files describing study samples.

An overview of the types of data accepted at the EGA can be found here. 

The EGA offers a range of tools for file and metadata upload. The automated tools include the Java EgaCryptor, which encrypts and verifies all the accepted file types, and the EGA Webin portal for metadata submission.
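How EgaCryptor verifies files is not described on this page. As a general illustration of file verification around an upload, the sketch below compares a checksum computed from a local file with an expected value. It is not EgaCryptor's implementation: the function names are hypothetical, and the choice of SHA-256 is an assumption made only for this example.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path: Path, expected: str, algorithm: str = "sha256") -> bool:
    """Check that the file's checksum matches the expected hex digest."""
    return file_checksum(path, algorithm) == expected.strip().lower()
```

A submitter could record the checksum before transfer and call verify_file on the received copy; a mismatch indicates that the file was altered or corrupted in transit.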

A detailed and comprehensive description of the entire process can be found in the "Submit to EGA" section of this website. It includes a step-by-step tutorial that guides you through the whole submission, so we strongly recommend that potential submitters read it before starting work on a submission.