
Submitter Portal API

The metadata submission process can be difficult and time-consuming. For this reason, the EGA has developed the Submitter Portal, a tool that was created to offer a simplified and user-friendly method of registering metadata.

The portal provides features that make entering metadata easier and help ensure that your data is registered correctly. The page is divided into logical sections and includes a video tutorial to guide you through the submission process. With the Submitter Portal, you can rest assured that the submission of your data is in good hands.

For those who want a more flexible and automated approach, we also provide programmatic access through the Submitter Portal API, in addition to the Submitter Portal user interface. With the API, you can integrate metadata submission into your own workflow for a more efficient and tailored experience.

Previous steps

Create your EGA account

To submit data, you first need to create an EGA account. Once your account has been verified, you must request a submitter role and sign the EGA Data Processing Agreement (DPA). Once the DPA is signed, you must send it to the EGA Helpdesk for validation.

Please note that if you already have an EGA account or an ega-box (submission account), you can skip this step.

Upload your files

Please note that all your files must be encrypted with the Crypt4GH tool before you upload them.

As soon as you have been assigned a submitter role, you will be able to connect to the EGA inbox and upload your files.
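As a rough sketch of the encryption step, each file could be encrypted as shown below. This assumes the Python `crypt4gh` command-line tool is installed and that the recipient public key has been saved to a local file. The key path in the usage line is a placeholder, not an official location, and both helper names are hypothetical.

```shell
#!/bin/bash
# Prerequisite (assumed): the Python crypt4gh command-line tool

# Hypothetical helper: derive the encrypted file name by appending .c4gh.
encrypted_name() {
  printf '%s.c4gh' "$1"
}

# Hypothetical helper: encrypt one file for a recipient public key.
# The crypt4gh CLI reads the plain file on stdin and writes to stdout.
encrypt_for_upload() {
  local input="$1" recipient_key="$2"
  crypt4gh encrypt --recipient_pk "$recipient_key" < "$input" > "$(encrypted_name "$input")"
}

# Usage (placeholder paths):
# encrypt_for_upload sample1.bam /path/to/recipient_key.pub
```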

Understand the EGA metadata schema

To get the most out of the EGA, it is important to understand the EGA metadata schema, a set of rules that specify how data is organised, described, and shared within the EGA. Learn all about the EGA metadata schema!

Register your DAC and policy

Two objects in the EGA metadata schema, DACs and policies, are registered in a separate portal. Both are registered using the DAC Portal, a tool developed by the EGA to help data controllers manage their data stored at the EGA. You can find the relevant information in the DAC Portal Guide.

Programmatic submission

All calls to the API must be authenticated. We use the OpenID Connect protocol.

API Usage Flow:

  1. Obtain Access and Refresh Token:
     The first step to using the API is to obtain an access token and a refresh token. These tokens are required for authentication and authorisation of API requests. To obtain them, you need to log in using your EGA credentials.

  2. Use Access Token in API Calls:
     Once you have obtained the access token, you must include it in every call to the API. The access token is valid for a limited time. When it expires, use the refresh token to obtain a new access token.

  3. Create Submission Object:
     After authentication, you can create a submission object. This object stores all the metadata objects and files related to the submission.

  4. Create EGA Metadata Objects:
     Within the submission object, you can create the other metadata objects required for the submission. These objects may include information about the submitter, study, sample, experiment, and run.

  5. Finalise Submission:
     Once all the metadata objects are created and linked to the submission, you can finalise the submission. Finalising the submission sends it to the Helpdesk team for review.

     Please note that all files linked to the submission must be ingested, and all objects created in the submission must be linked, before you can finalise it.
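The token refresh described in the usage flow above could look like the following sketch. It assumes the identity provider accepts the standard OAuth 2.0 `refresh_token` grant at the same token URL and with the same `sp-api` client that the login call uses. The function names are illustrative and not part of the API.

```shell
#!/bin/bash
# Prerequisites: curl, jq
IDP_URL='https://idp.ega-archive.org/realms/EGA/protocol/openid-connect/token'

# Hypothetical helper: build the form body for a refresh_token grant.
# Assumption: the identity provider supports this standard grant type.
refresh_request_body() {
  printf 'grant_type=refresh_token&client_id=sp-api&refresh_token=%s' "$1"
}

# Hypothetical helper: exchange a refresh token for a new access token.
refresh_access_token() {
  curl -s "$IDP_URL" -d "$(refresh_request_body "$1")" | jq -r '.access_token'
}

# Usage, after a login whose full JSON response was stored in $login_response:
# refresh_token=$(echo "$login_response" | jq -r '.refresh_token')
# access_token=$(refresh_access_token "$refresh_token")
```

Keeping the whole login response, rather than only `.access_token`, is what makes the refresh token available later.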

API Reference:

For a full API reference, please check our specification documentation.

Start your submission:

#!/bin/bash
# Prerequisites: curl, jq

IDP_URL='https://idp.ega-archive.org/realms/EGA/protocol/openid-connect/token'
SP_URL='https://submission.ega-archive.org/api'

Login

access_token=$(curl "$IDP_URL" \
-d "grant_type=password" \
-d "client_id=sp-api" \
-d "username=your-username" \
-d "password=your-password" | jq -r ".access_token")

echo "Access token is: $access_token"
echo "$SP_URL/submissions"
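Because `jq -r` prints the literal string `null` when a field is missing, a failed login leaves `access_token` set to `null` rather than empty. A small guard such as the hypothetical `require_value` below can stop the script early. This is a sketch of one possible check, not something the API provides.

```shell
#!/bin/bash
# Hypothetical helper: fail (and report on stderr) when a value is
# empty or the string "null", which is what jq -r prints for a missing field.
require_value() {
  local name="$1" value="$2"
  if [ -z "$value" ] || [ "$value" = "null" ]; then
    echo "Error: $name was not returned by the server" >&2
    return 1
  fi
}

# Usage:
# require_value "access_token" "$access_token" || exit 1
```

The same guard can be reused after each later call that extracts a `provisional_id`.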

Create submission

submission_id=$(curl "$SP_URL/submissions" \
-H "Authorization: Bearer $access_token" \
-H "Content-Type: application/json" \
-d "{ \"title\":\"My submission title\",
      \"description\":\"My submission description\" }" | jq -r ".provisional_id" )

echo "Submission id is: $submission_id"
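Hand-escaped JSON breaks as soon as a title or description contains a double quote or a backslash. One way to avoid this, sketched below, is to let `jq` build the request body. `submission_payload` is an illustrative helper name, and the field names are the ones used in the call above.

```shell
#!/bin/bash
# Prerequisite: jq

# Hypothetical helper: build the submission JSON body.
# jq escapes quotes and backslashes in the arguments.
submission_payload() {
  jq -n --arg title "$1" --arg description "$2" \
    '{title: $title, description: $description}'
}

# Usage:
# curl "$SP_URL/submissions" \
#   -H "Authorization: Bearer $access_token" \
#   -H "Content-Type: application/json" \
#   -d "$(submission_payload 'My "quoted" title' 'My submission description')"
```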

Enums

To find out which values are accepted for typed fields (for example, study_type), you can list the available enums with the following command:

enums_available=$(curl "$SP_URL/enums" \
-H "Authorization: Bearer $access_token" \
-H "Content-Type: application/json")

echo "Enums available: $enums_available"

# Enums available: ["status_types","study_types","biological_sex","case_controls","platform_models",
# "library_layouts","library_strategies","library_sources","library_selections","run_file_types",
# "analysis_types","experiment_types","genomes","chromosomes","dataset_types","repositories"]

Study types

To see the possible values of a specific enum, request it by name:

study_types_available=$(curl "$SP_URL/enums/study_types" \
-H "Authorization: Bearer $access_token" \
-H "Content-Type: application/json")

echo "Study types available: $study_types_available"
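Since the `/enums` response is a flat JSON array of strings, a script can check that an enum name is listed before requesting its values. The sketch below uses a plain `grep` on the response text. `enum_listed` is a hypothetical helper, and the approach assumes enum names contain no characters that need JSON escaping.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the quoted name appears in the JSON array text.
# Matching on the surrounding quotes prevents partial-name matches.
enum_listed() {
  local enums_json="$1" name="$2"
  printf '%s' "$enums_json" | grep -qF "\"$name\""
}

# Usage:
# enum_listed "$enums_available" "study_types" || echo "study_types is not available" >&2
```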

Create study

study_id=$(curl "$SP_URL/submissions/$submission_id/studies" \
-H "Authorization: Bearer $access_token" \
-H "Content-Type: application/json" \
-d "{ \"title\":\"My study title\",
      \"description\":\"My study description\",
      \"study_type\":\"Metagenomics\" }" | jq -r ".[].provisional_id" )

echo "study_id is: $study_id"

Create sample

sample_id=$(curl "$SP_URL/submissions/$submission_id/samples" \
-H "Authorization: Bearer $access_token" \
-H "Content-Type: application/json" \
-d "{ \"alias\":\"My unique sample alias 1\",
      \"biological_sex\":\"male\",
      \"phenotype\":\"nose\",
      \"subject_id\":\"192873366738836788\" }" | jq -r ".[].provisional_id" )

echo "sample_id is: $sample_id"

Create experiment

experiment_id=$(curl "$SP_URL/submissions/$submission_id/experiments" \
-H "Authorization: Bearer $access_token" \
-H "Content-Type: application/json" \
-d "{ \"design_description\":\"My experiment design\",
      \"study_provisional_id\": $study_id,
      \"instrument_model_id\": 1,
      \"library_layout\": \"SINGLE\",
      \"library_strategy\": \"WGS\",
      \"library_source\": \"GENOMIC\",
      \"library_selection\": \"RANDOM\" }" | jq -r ".[].provisional_id" )

echo "experiment_id is: $experiment_id"

Obtain IDs of files

You can filter the files by path prefix by adding the prefix parameter to the query string. For example: &prefix=/my_folder/a

files=$(curl "$SP_URL/files?status=inbox" \
-H "Authorization: Bearer $access_token" \
-H "Content-Type: application/json" )

echo "Files: $files"
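To combine the status filter with the prefix parameter, the query string can be assembled by a small helper, as in this sketch. `files_url` is an illustrative name. The prefix is passed through unencoded, so a path containing spaces or `&` would need URL-encoding first (for example with `curl -G --data-urlencode`).

```shell
#!/bin/bash
SP_URL='https://submission.ega-archive.org/api'

# Hypothetical helper: build the files URL for a status and an optional prefix.
files_url() {
  local status="$1" prefix="${2:-}"
  local url="$SP_URL/files?status=$status"
  if [ -n "$prefix" ]; then
    url="$url&prefix=$prefix"
  fi
  printf '%s' "$url"
}

# Usage:
# files=$(curl "$(files_url inbox /my_folder/a)" \
#   -H "Authorization: Bearer $access_token")
```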

Select the ID of the file you are interested in. The next call uses the provisional ID 7266.

Create run

run_id=$(curl "$SP_URL/submissions/$submission_id/runs" \
-H "Authorization: Bearer $access_token" \
-H "Content-Type: application/json" \
-d "{ \"run_file_type\": \"bam\",
      \"files\": [ 7266 ],
      \"experiment_provisional_id\": $experiment_id,
      \"sample_provisional_id\": $sample_id }" | jq -r ".[].provisional_id" )

echo "run_id is: $run_id"

Obtain policy accession id from your DAC

To obtain all policies:

policies=$(curl "$SP_URL/policies" \
-H "Authorization: Bearer $access_token" \
-H "Content-Type: application/json" )
echo "Policies: $policies"

Create dataset

In this example, we link the dataset to the policy with accession EGAP50000000000.

dataset_id=$(curl "$SP_URL/submissions/$submission_id/datasets" \
-H "Authorization: Bearer $access_token" \
-H "Content-Type: application/json" \
-d "{ \"title\": \"My dataset title\",
      \"description\": \"My dataset description\",
      \"dataset_types\": [ \"Whole genome sequencing\" ],
      \"policy_accession_id\": \"EGAP50000000000\",
      \"run_provisional_ids\": [ $run_id ] }" | jq -r ".[].provisional_id" )

echo "dataset_id is: $dataset_id"
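The `run_provisional_ids` field is an array, so a dataset can reference more than one run. If the run IDs are held in a Bash array, they can be joined into the comma-separated form the JSON body needs, as sketched here. `join_ids` is a hypothetical helper and assumes the IDs are plain numbers, as in the examples above.

```shell
#!/bin/bash
# Hypothetical helper: join all arguments with ", " for use inside a JSON array.
join_ids() {
  local joined="" id
  for id in "$@"; do
    # Prepend the separator only when something has already been collected.
    joined="${joined:+$joined, }$id"
  done
  printf '%s' "$joined"
}

# Usage:
# run_ids=(101 102 103)
# ... \"run_provisional_ids\": [ $(join_ids "${run_ids[@]}") ] ...
```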

Finalise submission

You can only finalise a submission once all its files have been ingested and all the objects created in it have been linked.

finalise=$(curl "$SP_URL/submissions/$submission_id/finalise" \
-H "Authorization: Bearer $access_token" \
-H "Content-Type: application/json" \
-d "{ \"expected_release_date\": \"2026-04-28\" }")
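The calls above do not check whether the server accepted the request. One option, sketched below, is to ask `curl` to append the HTTP status code to its output with `-w` and split it off afterwards. The helper names are illustrative, and treating any 2xx code as success is an assumption about the API.

```shell
#!/bin/bash
# Hypothetical helpers: split "body<newline>status", as produced by
# curl -w '\n%{http_code}', into its two parts.
response_status() { printf '%s' "${1##*$'\n'}"; }
response_body()   { printf '%s' "${1%$'\n'*}"; }

# Hypothetical helper: succeed only for 2xx status codes (assumed to mean success).
is_success() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
# raw=$(curl -s -w '\n%{http_code}' "$SP_URL/submissions/$submission_id/finalise" \
#   -H "Authorization: Bearer $access_token" \
#   -H "Content-Type: application/json" \
#   -d "{ \"expected_release_date\": \"2026-04-28\" }")
# is_success "$(response_status "$raw")" || { response_body "$raw" >&2; exit 1; }
```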