The EGA Submitter Portal provides tools that facilitate the submission of metadata for human data to the European Genome-phenome Archive (EGA). This page provides a video tutorial on how to use the EGA Submitter Portal. The page is divided into ordered sections that follow the steps of a submission.
In this tutorial video, we demonstrate how to use the Submitter Portal to register your metadata for a sequencing run submission. The video assumes that your files have already been encrypted and uploaded to your EGA submission account.
The metadata objects required for read submissions are as follows:
- Study: information about the sequencing study
- Samples: information about the sequencing samples
- Analysis: references the analysis (BAM) files; associated with Samples and Study
- DAC: contains information about the Data Access Committee (DAC)
- Policy: contains the Data Access Agreement (DAA); associated with DAC
- Dataset: contains the collection of runs/analysis data files to be subject to controlled access; associated with Policy
- **Study, samples, DAC and policy metadata can all be registered prior to uploading files**
If you are performing an array-based submission, use the Submitter Portal only to register the Study, Data Access Committee and Policy metadata objects. We are currently working on features that will allow array metadata submissions to be created in the portal.
The short video below contains a worked example with detailed instructions on how to use the EGA Submitter Portal to perform metadata submissions to the EGA.
- 0:36 – Explanation of the example
- 1:08 – What metadata needs to be submitted?
- 2:43 – Submitter Portal Common aspects
- 5:17 – Making a new submission
- 7:11 – Register a study
- 10:02 – Register samples
- 13:32 – Register experiments
- 15:35 – Link files and samples
- 17:43 – Register data access committee (DAC)
- 19:39 – Register data access policy
- 21:07 – Register dataset
- 23:41 – Submit objects to the server
Points to Notice
There is a strong relationship among EGA metadata objects. Unless the primary objects (study, samples and DAC) are properly submitted, the secondary objects linked to them (experiments, runs, analyses or policies) will not validate. The tertiary metadata object (dataset) requires all the other objects to be submitted before it can be validated and submitted. Should you prefer to submit everything at once, please generate all the objects without validation, then go to the "Edit title and description" tab and click "I'm done". This will validate and submit all objects together.
The EGA Submitter Portal video focuses on a single use case: the submission of runs.
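The dependency between primary, secondary and tertiary objects can be pictured as a small graph. The following Python sketch is illustrative only and is not an EGA API: the `DEPENDS_ON` mapping and both function names are hypothetical, the Analysis, Policy and Dataset links follow the descriptions on this page, and the Experiment and Run links are an assumption. It uses the standard-library `graphlib` module (Python 3.9+) to derive one order in which the objects would validate.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key lists the objects that must already
# be submitted before the key itself can validate.
# Assumption: experiments link to study and samples, and runs to experiments.
DEPENDS_ON = {
    "study": set(),
    "sample": set(),
    "dac": set(),
    "experiment": {"study", "sample"},
    "run": {"experiment"},
    "analysis": {"study", "sample"},
    "policy": {"dac"},
    "dataset": {"run", "analysis", "policy"},
}


def submission_order(depends_on=DEPENDS_ON):
    """Return one valid order in which the objects can be validated."""
    return list(TopologicalSorter(depends_on).static_order())


def can_validate(obj, submitted, depends_on=DEPENDS_ON):
    """Return True if everything `obj` depends on is already submitted."""
    return depends_on[obj] <= set(submitted)


if __name__ == "__main__":
    # Primary objects come first and the dataset always comes last.
    print(submission_order())
```

Several orders are valid, so the exact output can vary. The only guarantee is that each object appears after the objects it depends on.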
Aligned BAM files are expected to be submitted as runs (1-to-1 cardinality with samples). Analyses should only be used for BAM/BAI pairs, VCF files and phenotype linkage to samples. The analysis is an EGA-specific metadata object that links samples to files. This object also stores some metadata about your experiments, such as the experiment type, genome reference, or the platform used.
**If only BAM or CRAM alignment files are submitted but not the original unaligned FASTQ files, then please make sure that the BAM or CRAM files also contain the unaligned reads. This is critical to enable primary re-analysis and re-alignment of the dataset using new tools or future genome assemblies.**
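The 1-to-1 cardinality between runs and samples mentioned above can be checked before any files are linked in the portal. The sketch below is a hypothetical helper, not part of any EGA tool. It assumes that you keep your planned links as (sample alias, file name) pairs, and it reads "1-to-1" as one file per sample and one sample per file.

```python
from collections import defaultdict


def cardinality_violations(links):
    """Return samples and files that break a 1-to-1 sample/file mapping.

    `links` is an iterable of (sample_alias, file_name) pairs.
    """
    files_by_sample = defaultdict(set)
    samples_by_file = defaultdict(set)
    for sample, file_name in links:
        files_by_sample[sample].add(file_name)
        samples_by_file[file_name].add(sample)
    return {
        # Samples linked to more than one file.
        "samples_with_many_files": {
            s: sorted(f) for s, f in files_by_sample.items() if len(f) > 1
        },
        # Files linked to more than one sample.
        "files_with_many_samples": {
            f: sorted(s) for f, s in samples_by_file.items() if len(s) > 1
        },
    }
```

Both dictionaries in the result are empty when every sample maps to exactly one file and every file to exactly one sample.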
Prior to defining the Analysis
In order to register your analysis, you should first:
Please note that the EGA allows registered metadata to be re-used. Previously registered Study, DAC, Policy or Sample objects can therefore be re-used for the analysis data submission.
Defining the Analysis
- In the Submitter Portal accordion, select the option "Link files and samples" and click "Analysis Data".
- Start by selecting the sample(s) to be linked to the file, and populate the required attribute fields. Note that some fields are mandatory and must be populated.
- Finally, select the file and file type to be associated with the sample. If you wish to add additional files, click the button "Add additional files".
- Your analysis will be created in draft status. To learn more about validating, editing or deleting the analysis, view the Submitter Portal video section above.
Points to notice
When populating the chromosome field (mandatory), press the ENTER key after selecting the chromosome(s) in order to save your selection.
EGA objects can be identified by their unique accession IDs. These IDs are displayed everywhere, are shared among all EGA locations and use a prefix specific to each object type (see the list below).
| EGA Accession ID | EGA Object description |
|---|---|
| EGAS | EGA Study Accession ID |
| EGAC | EGA DAC Accession ID |
| EGAP | EGA Policy Accession ID |
| EGAN | EGA Sample Accession ID |
| EGAR | EGA Run Accession ID |
| EGAX | EGA Experiment ID |
| EGAZ | EGA Analysis Accession ID |
| EGAD | EGA Dataset Accession ID |
| EGAB | EGA Submission ID |
| EGAF | EGA File Unique Accession ID |
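The prefixes listed above lend themselves to a simple lookup, for example when sorting a mixed list of accessions by object type. The sketch below is illustrative and is not an EGA tool: the function name `object_type` is hypothetical, and it assumes that an accession is one of the listed four-letter prefixes followed by digits. The IDs in the usage note are made-up values.

```python
import re

# Prefix-to-object mapping taken from the list above.
EGA_PREFIXES = {
    "EGAS": "Study",
    "EGAC": "DAC",
    "EGAP": "Policy",
    "EGAN": "Sample",
    "EGAR": "Run",
    "EGAX": "Experiment",
    "EGAZ": "Analysis",
    "EGAD": "Dataset",
    "EGAB": "Submission",
    "EGAF": "File",
}

# Assumption: an accession is a four-letter prefix followed by digits only.
_ACCESSION = re.compile(r"^(EGA[A-Z])(\d+)$")


def object_type(accession):
    """Return the object type for an accession, or None if unrecognised."""
    match = _ACCESSION.match(accession.strip())
    if match is None:
        return None
    # Unknown prefixes (not in the list above) also yield None.
    return EGA_PREFIXES.get(match.group(1))
```

For instance, `object_type("EGAD00000000001")` returns `"Dataset"`, while a string with an unlisted prefix or without digits returns `None`.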
EGA Webin is an online tool that can be used to submit metadata (associated with sequence files) to the EGA. It can also be used to register the Study (EGAS), Data Access Committee (EGAC) and Policy (EGAP) for all array-based submissions.
The Webin platform is a historical tool that preceded the current Submitter Portal. It was developed and used to register metadata associated with sequence files. Detailed documentation about Webin and how to use it can be obtained here.
Click on the links below for guides on submitting specific metadata using Webin:
Webin will no longer be maintained; however, it can be used as a backup tool if there is any issue with your submission via the Submitter Portal. Please contact the EGA Helpdesk with any related queries.