Standards

The EGA is a long-standing supporter of the Global Alliance for Genomics & Health (GA4GH) and its work to enhance responsible sharing of human genetic data through the development of interoperable global standards for human data access. The EGA is one of the founding GA4GH Driver Projects and has contributed to the development and implementation of several GA4GH standards and APIs.

Below is a list of the GA4GH standards and APIs that are currently available or planned for implementation at EGA.

| Category | Technical Standard | Purpose | Specification Version | Supported Version | Implementation |
| --- | --- | --- | --- | --- | --- |
| Large Scale Genomics | htsget | A protocol for secure, efficient, and reliable access to sequencing read and variation data. | V1.3.0 | V1.0.0 | Specification; Documentation; Endpoint |
| Large Scale Genomics | Read File Formats (SAM/BAM/CRAM) | Specifications for storing next-generation sequencing read data. | V3.0.0 | V3.0.0 | Implementation; Example of Usage |
| Large Scale Genomics | Variation File Formats (VCF/BCF) | The specifications for the Variant Call Format (VCF) and its binary counterpart, BCF. | V4.0.0; V2.0.0 | V4.0.0; V2.0.0 | Implementation; Example of Usage |
| Large Scale Genomics | Crypt4GH v1.0 | Enables direct byte-level compatible random access to encrypted genetic data stored in community standards (e.g. CRAM, VCF). | V1.0 | V1.0 | Specification; Documentation; Endpoint |
| Large Scale Genomics | refget API | Enables access to reference sequences using an identifier derived from the sequence itself. | V1.2.6 | NA | Specification |
| Large Scale Genomics | RNAget API v1 | Provides a means of retrieving data from several types of RNA experiments, including (i) feature-level expression data from RNA-seq type measurements and (ii) coordinate-based signal/intensity data similar to a bigwig representation, via a client/server model. | V1.0.0 | NA | Documentation |
| Discovery | Beacon v2 | Supports discovery of genomic variants, phenotypes, and individuals. | V1.0.1 | V0.3 | Web UI; API; Source Code |
| Discovery | Service Info API v1 | An endpoint for describing GA4GH service metadata, designed for extension and inclusion in other APIs. Service Info is used to describe a single service, while Service Registry is used to describe multiple services. | V1.0.0 | NA | Documentation |
| Discovery | Service Registry API v1 | Provides information about other GA4GH services, primarily for the purpose of organizing services into networks or groups and for service discovery across organizational boundaries. | V1.0.0 | NA | Documentation |
| Data Use & Researcher Identities | Data Use Ontology (DUO) | Allows users to semantically tag genomic datasets with usage restrictions, so that the datasets become automatically discoverable based on a health, clinical, or biomedical researcher’s authorisation level or intended use. | 2021-02-23 | 2021-02-23 | Specification; Documentation; Endpoint |
| Data Use & Researcher Identities | Authentication & Authorization Infrastructure (AAI) | The GA4GH AAI specification leverages OpenID Connect (OIDC) servers to authenticate the identity of researchers who wish to access clinical and genomic resources from data holders adhering to GA4GH standards, and to enable data holders to obtain security-related attributes of those researchers. | V1.2.0 | V1.2.0 | API URI: ega.ebi.ac.uk:8443; Documentation Repository |
| Data Use & Researcher Identities | Researcher IDs (passport, visa) | Specifies the collection of researchers that may access a dataset at any given time, and the credentials they must supply. | V1.0.1 | V1.0.1 | Specification; Documentation; Endpoint |
| Cloud | Tool Registry Service API | TRS is a standard API for exchanging tools and workflows to analyze, read, and manipulate genomic data. | V2.0.1 | NA | Documentation; Repository |
| Cloud | Data Repository Service API | The DRS API is a standard for building data repositories and adapting access tools to work with those repositories. It works with other approved APIs from the GA4GH Cloud Work Stream to allow researchers to discover algorithms across different cloud environments and send them to the datasets they wish to analyse. | V1.0.3 | NA | Documentation; Repository |
| Cloud | Workflow Execution Service API | Lets users run a single workflow (defined using CWL or WDL) on multiple different platforms, clouds, and environments, and be confident that it will work the same way. The API provides methods to request that a workflow be run, pass parameters to that workflow, get information about running workflows, and cancel a running workflow. | V1.0.1 | NA | Documentation; Repository |
| Genomic Knowledge Standards | Variation Representation v1 | Provides a flexible framework of computational models, schemas, and algorithms to precisely and consistently exchange genetic variation data across communities. | V1.3.0 | EGA team is contributing to including it in the Beacon v2 Specification and Elixir Reference Implementation | Documentation; Repository |
| Clin/Pheno Data Capture | Phenopackets | Provides information models with different levels of complexity, so that both high-level and deep clinical phenotype information can be exchanged. | V2.0.0 | Included in ongoing submissions. EGA team is contributing to including it in the Beacon v2 Specification and Elixir Reference Implementation | Documentation; Repository |
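
The refget and Variation Representation entries above both rely on identifiers computed from the content they describe. As a rough illustration only (this is not EGA's implementation), the truncated digest that GA4GH specifications such as VRS call sha512t24u can be sketched in Python: hash the data with SHA-512, keep the first 24 bytes, and base64url-encode them. The helper `sequence_identifier`, its normalisation steps, and the use of an `SQ.` prefix here are assumptions made for the example; the exact algorithm and prefix depend on the specification version being targeted.

```python
import base64
import hashlib


def sha512t24u(data: bytes) -> str:
    """Base64url-encode the first 24 bytes of the SHA-512 digest of data."""
    digest = hashlib.sha512(data).digest()
    # 24 bytes encode to exactly 32 base64 characters, so no padding appears.
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


def sequence_identifier(sequence: str) -> str:
    """Hypothetical helper: derive an identifier from the sequence itself.

    Assumption: the sequence is trimmed and upper-cased before hashing.
    """
    normalised = sequence.strip().upper()
    return "SQ." + sha512t24u(normalised.encode("ascii"))


if __name__ == "__main__":
    # The same sequence always yields the same identifier, regardless of case.
    print(sequence_identifier("acgt"))
    print(sequence_identifier("ACGT"))
```

Because the identifier depends only on the sequence content, two services holding the same sequence can refer to it by the same name without coordinating with each other.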

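For the htsget entry, a client typically makes two steps: it requests a "ticket" for a genomic region, then downloads the data blocks listed in that ticket and concatenates them. The sketch below, written under stated assumptions, only builds the request URL and reads a ticket; it performs no network calls. The base URL `https://htsget.example.org`, the identifier `EXAMPLE_FILE_ID`, and the function names are placeholders invented for illustration and do not describe EGA's actual endpoint.

```python
import json
from urllib.parse import quote, urlencode


def build_reads_url(base_url, read_id, reference_name=None,
                    start=None, end=None, fmt="BAM"):
    """Build an htsget-style reads request URL (start is 0-based, end exclusive)."""
    params = {"format": fmt}
    if reference_name is not None:
        params["referenceName"] = reference_name
    if start is not None:
        params["start"] = start
    if end is not None:
        params["end"] = end
    path_id = quote(read_id, safe="")
    return f"{base_url.rstrip('/')}/reads/{path_id}?{urlencode(params)}"


def ticket_blocks(ticket_json):
    """Return (url, headers) pairs from a ticket, in download order."""
    ticket = json.loads(ticket_json)["htsget"]
    return [(block["url"], block.get("headers", {})) for block in ticket["urls"]]


if __name__ == "__main__":
    # Placeholder server and identifier, for illustration only.
    print(build_reads_url("https://htsget.example.org", "EXAMPLE_FILE_ID",
                          reference_name="chr1", start=0, end=1000))
```

A real client would then fetch each block URL with the supplied headers and write the responses, in order, into a single output file.
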
Driver Project

In 2017, the EGA, jointly coordinated by the EBI and the CRG, was announced as one of the 15 Driver Projects for GA4GH. Driver Projects are international genomic data initiatives focussed on real projects and challenges that guide development efforts, in order to accelerate and enable fully responsible and standardised data sharing by 2022. All chosen Driver Projects make a cross-sectional effort by playing an important role across the different workstreams.

Thomas Keane, Jordi Rambla, Mallory Freeberg, and Aina Jené have been named Driver Project Champions for the EGA. The Driver Project Champions will lead this ambitious initiative over the coming years.