Metadata Distribution

This page describes how metadata is distributed within the EGA.

Our Metadata REST API lets you retrieve publicly available metadata for every EGA object type: studies, samples, experiments, runs, analyses, policies, DACs, and datasets. The API also supports cross-referencing between objects, so you can retrieve, for example, all the datasets associated with a specific DAC.

In addition, the metadata API can be used to query private data. If you have the necessary permissions, you can access private (behind-the-login) metadata for a specified list of datasets.

Metadata Distribution Index

Identifiers

Dataset Mappings

Website Download

Metadata API - Private

Identifiers

Every EGA object is identified by a unique accession, and the accession prefix indicates the object type. Here's a quick overview of the accession prefixes and their corresponding object types:

EGA Accession ID	EGA Object description
EGAS	EGA Study Accession ID
EGAC	EGA DAC Accession ID
EGAP	EGA Policy Accession ID
EGAN	EGA Sample Accession ID
EGAR	EGA Run Accession ID
EGAX	EGA Experiment ID
EGAZ	EGA Analysis Accession ID
EGAD	EGA Dataset Accession ID
EGAB	EGA Submission ID
EGAF	EGA File Unique Accession ID

For further information, see our metadata schema documentation.
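
As a quick illustration of the table above, the following sketch maps an accession to its object type by looking at the prefix. The function name ega_object_type is our own and is not part of any EGA tool; the sketch checks only the four-letter prefix and does not validate the rest of the accession.

```shell
# Print the object type for an EGA accession, based on the prefix table above.
# ega_object_type is a hypothetical helper name used for illustration only.
ega_object_type() {
  case "$1" in
    EGAS*) echo "Study" ;;
    EGAC*) echo "DAC" ;;
    EGAP*) echo "Policy" ;;
    EGAN*) echo "Sample" ;;
    EGAR*) echo "Run" ;;
    EGAX*) echo "Experiment" ;;
    EGAZ*) echo "Analysis" ;;
    EGAD*) echo "Dataset" ;;
    EGAB*) echo "Submission" ;;
    EGAF*) echo "File" ;;
    *)     echo "Unknown"; return 1 ;;   # prefix not listed in the table
  esac
}
```

For example, calling ega_object_type with any accession that starts with EGAD prints Dataset, while an unrecognised prefix prints Unknown and returns a non-zero exit status.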

Dataset Mappings

For authorised datasets, the following mapping files describe how the objects in a dataset are linked:

Sample_file: links samples to the files available in the dataset.
Study_experiment_run_sample: links studies, experiments, runs, and samples within the dataset.
Study_analysis_sample: links studies, analyses, and samples within the dataset.
Run_sample: links runs to samples within the dataset.
Analysis_sample: links analyses to samples within the dataset.

An empty file indicates the absence of corresponding information.
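
Once the mapping files have been downloaded, a short check like the one below can report which of them are empty. This is a minimal sketch under assumptions of our own: the files are assumed to be saved as <mapping>.<extension> (for example run_sample.csv) in a single directory, and "empty" is taken to mean a zero-byte file. Neither the file naming nor the function name report_empty_mappings comes from the EGA documentation.

```shell
# Report, for each of the five mappings, whether its file is ok, empty, or missing.
# Arguments: directory holding the files, file extension (default: csv).
# Assumption: files are named <mapping>.<ext>; this naming is hypothetical.
report_empty_mappings() {
  dir=$1
  ext=${2:-csv}
  for name in sample_file study_experiment_run_sample study_analysis_sample \
              run_sample analysis_sample; do
    file="$dir/$name.$ext"
    if [ ! -e "$file" ]; then
      echo "$name: missing"
    elif [ ! -s "$file" ]; then
      echo "$name: empty"      # zero bytes: no such links in this dataset
    else
      echo "$name: ok"
    fi
  done
}
```

A dataset that contains only analyses, for instance, might be expected to show empty run-related mappings.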

Website Download

Metadata can also be downloaded from our website. Navigate to the dataset page, where you will find a blue Metadata button. Once authenticated, you can click this button for any dataset you are authorised to access. If you lack permissions for a particular dataset, request access by clicking the 'Request Access' button.

[Figure: Metadata button displayed on EGA dataset page]

For authorised datasets, choose your preferred metadata format: CSV, TSV, or JSON.

Metadata API - Private

Metadata can also be downloaded programmatically. First authenticate with your credentials to obtain an access token, then use this token to query private information.

Queries follow the same structure as the Public Metadata API. Behind the login, however, you can also query the mapping information described under Dataset Mappings above, in addition to the object-level queries.

Authentication

An active session is required to work with the API. Each time you log in with your credentials, a new session is started, which is identified by an access_token. Below is an example of how to obtain one using curl:

  curl https://idp.ega-archive.org/realms/EGA/protocol/openid-connect/token \
  -d 'client_id=metadata-api' \
  -d 'username=...' \
  -d 'password=...' \
  -d 'grant_type=password'

All responses from the API are in JSON format. A successful response should include a new token to be used for the session:

  {"access_token":"eyJhbGciOiJSUzI1NiIsInR5cCIgOiA...TNw",
  "expires_in":300,
  "refresh_expires_in":1800,
  "refresh_token":"eyJhbGciOiJIUzI1NiIsInR5cCIgOiA...pTX10",
  "token_type":"Bearer",
  ...
  }

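Save the access_token value and include it in the Authorization header of each API call. The expires_in field gives the token lifetime in seconds (300 in the example above), so a new token is needed once it has expired.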
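
To avoid copying the token by hand, the response can be piped through a small helper that prints only the access_token field. The sketch below uses sed pattern matching rather than a real JSON parser, so it assumes the field appears as a plain quoted string, as in the example response; extract_access_token is a name of our own. If a JSON tool such as jq is installed, it would be a more robust choice.

```shell
# Print the value of the access_token field from a token response on stdin.
# Sketch only: relies on sed pattern matching, not on a full JSON parser.
extract_access_token() {
  tr -d '\n' |
    sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage (performs a network request; fill in the elided credentials first):
#   TOKEN=$(curl -s https://idp.ega-archive.org/realms/EGA/protocol/openid-connect/token \
#     -d 'client_id=metadata-api' -d 'username=...' -d 'password=...' \
#     -d 'grant_type=password' | extract_access_token)
#   curl https://metadata.ega-archive.org/datasets/{datasetID}/mappings/run_sample \
#     -H "Authorization: Bearer $TOKEN"
```

If the response does not contain an access_token (for example, because the login failed), the helper prints nothing, so it is worth checking that the variable is non-empty before using it.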

Example query usage

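Below are some examples of the queries available behind authentication and authorisation. In each example, replace {datasetID} with the accession of the dataset and access_token with the token value obtained above.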

Querying study-experiment-run-sample mappings:

  curl https://metadata.ega-archive.org/datasets/{datasetID}/mappings/study_experiment_run_sample \
  -H 'Authorization: Bearer access_token'

Querying run-sample mappings:

  curl https://metadata.ega-archive.org/datasets/{datasetID}/mappings/run_sample \
  -H 'Authorization: Bearer access_token'

Querying study-analysis-sample mappings:

  curl https://metadata.ega-archive.org/datasets/{datasetID}/mappings/study_analysis_sample \
  -H 'Authorization: Bearer access_token'

Querying analysis-sample mappings:

  curl https://metadata.ega-archive.org/datasets/{datasetID}/mappings/analysis_sample \
  -H 'Authorization: Bearer access_token'

Querying sample-file mappings:

  curl https://metadata.ega-archive.org/datasets/{datasetID}/mappings/sample_file \
  -H 'Authorization: Bearer access_token'

You can get a different output format by adding one of these options to the curl command:

  -H 'Accept: text/tsv'
  -H 'Accept: application/json'
  -H 'Accept: text/csv'
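
The five queries above differ only in the mapping name, so they can be wrapped in a loop. The following is a minimal sketch, not an official client: the function names (mapping_url, accept_type, fetch_all_mappings) and the output file naming are our own, while the URL pattern and Accept values are the ones shown above.

```shell
EGA_METADATA_BASE="https://metadata.ega-archive.org"

# Build the URL for one mapping of one dataset.
mapping_url() {
  printf '%s/datasets/%s/mappings/%s\n' "$EGA_METADATA_BASE" "$1" "$2"
}

# Translate a short format name into the matching Accept header value.
accept_type() {
  case "$1" in
    json) echo "application/json" ;;
    csv)  echo "text/csv" ;;
    tsv)  echo "text/tsv" ;;
    *)    echo "unsupported format: $1" >&2; return 1 ;;
  esac
}

# Download all five mappings of a dataset into the current directory.
# Arguments: dataset ID, access token, format (json, csv or tsv; default json).
# Output files are named <mapping>.<format> (a hypothetical naming choice).
fetch_all_mappings() {
  dataset=$1 token=$2 format=${3:-json}
  accept=$(accept_type "$format") || return 1
  for name in study_experiment_run_sample run_sample study_analysis_sample \
              analysis_sample sample_file; do
    # -f makes curl return an error on HTTP failures such as an expired token.
    curl -sS -f "$(mapping_url "$dataset" "$name")" \
      -H "Authorization: Bearer $token" \
      -H "Accept: $accept" \
      -o "$name.$format" || echo "failed: $name" >&2
  done
}
```

A call such as fetch_all_mappings "$DATASET_ID" "$TOKEN" tsv would then write one TSV file per mapping, where DATASET_ID and TOKEN are variables you set yourself.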

For more detailed information, refer to the Metadata API Specification.