Using the API
Base URLs for API v2.0:
https://ega.ebi.ac.uk/ega/rest/access/v2
http://ega.ebi.ac.uk/ega/rest/download/v2
Viewing data and making requests is secured with HTTPS. Once a request has been made, the already encrypted data stream is downloaded via plain HTTP to avoid the overhead of encrypting it a second time.
The EGA Download REST API performs all user interaction on an SSL secured HTTPS connection (port 443). For better performance, all data transfer operations are performed on a plain HTTP connection (port 80). All data is encrypted before transmission, so sending that data stream on a standard connection does not pose a security risk.
Following REST conventions, downloads are a two-step process: step one creates a REST resource (a URL), and step two uses that resource to download the requested data. The download resource URL is removed upon successful completion of the download, or after 7 days.
A request to download data includes providing user credentials so that the system can verify access rights. In this step the user specifies the data to be downloaded as well as an encryption key, which is used to initialize the outgoing encrypted stream. This ensures that the data arrives on the user's system encrypted with a key provided by the user. This information is combined with the user IP address to create a REST resource URL. The resource is identified by a "ticket", which is sent to the user in response to a download request.
The download ticket (the corresponding resource URL) is the only information required to start downloading the requested data. The EGA server matches the IP address of the request and the IP address of the download to ensure that data can only be downloaded by the requestor. The response to accessing the download URL is a binary stream, which is stored on the user's system. Upon successful download, that download resource is removed again. The download resource also is removed automatically after 7 days.
The API returns primarily simple JSON Arrays (containing a list of response items), occasionally JSON Objects (containing key-value pairs) as response. Download URLs return a binary stream as response.
Listing all authorised Datasets
Listing files in an authorized Dataset
Listing Requests (i.e. listing all current request tickets)
Listing tickets in one Request (which contains information about the requested files)
Using the API to Log In
Successful login produces a session id/token, which must be provided in subsequent REST calls. The session token times out and becomes invalid after 10 minutes of inactivity. There are three ways to log in via the API:
(1) Submitting a named form
(2) Basic Authentication
(3) Via URL + Parameter
Examples using “testuser@ebi.ac.uk” as username and “testpassword” for that user’s password:
1) Login by submitting a form named "loginrequest" with URLEncoded fields for "username" and "password":
curl -k -X POST -F loginrequest='{"username":"testuser%40ebi.ac.uk","password":"testpassword"}' -H "Accept: application/json" https://ega.ebi.ac.uk/ega/rest/access/v2/users/login
2) Login with Basic Authentication (using base64-encoded credentials “testuser%40ebi.ac.uk:testpassword”):
curl -k -H "Accept: application/json" -H "Authorization: Basic dGVzdHVzZXIlNDBlYmkuYWMudWs6dGVzdHBhc3N3b3Jk" https://ega.ebi.ac.uk/ega/rest/access/v2/users/login
3) Login via URL + Parameter; the username is part of the URL, the password is passed as URL parameter:
curl -k -H "Accept: application/json" https://ega.ebi.ac.uk/ega/rest/access/v2/users/testuser%40ebi.ac.uk?pass=testpassword
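For method 2, the header value can be derived rather than typed by hand. The following is a minimal Python sketch, not part of the EGA tooling; the function name basic_auth_header is made up for illustration. It assumes, as in the example credentials above, that the username is URL-encoded before base64 encoding; whether the password must also be URL-encoded is not stated here, so the sketch leaves it unchanged.

```python
import base64
from urllib.parse import quote


def basic_auth_header(username: str, password: str) -> str:
    """Build an 'Authorization' header value for Basic Authentication."""
    # Mirror the example above: the username is URL-encoded first,
    # so "@" becomes "%40". The password is used as given (assumption).
    credentials = f"{quote(username, safe='')}:{password}"
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return f"Basic {token}"


header = basic_auth_header("testuser@ebi.ac.uk", "testpassword")
print(header)
```

For the test user this reproduces the header value used in the curl call of method 2.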
Each of these three calls returns a JSON object whose "result" array contains two elements on success and one element on failure:
{"header":{"apiVersion":"v2","code":"200","docLink":"http://www.ebi.ac.uk/ega","errorCode":"200","errorStack":"","service":"access","technicalMessage":"","userMessage":"OK"},"response":{"numTotalResults":1,"result":["success","b195b0c5-b574-43f2-9910-37d5853826ba"],"resultType":"us.monoid.json.JSONArray"}}
The interesting part is the JSON object "response": "response":{"numTotalResults":1,"result":["success","b195"],"resultType":"us.monoid.json.JSONArray"}, which contains the JSON array "result":
"result":["success","b195"]: the first element is either "success" or "false". If it is "success", a second element contains the session token. In future REST calls this token is added to the REST URL as parameter ''.
In case of failure the array in element "result" contains a single failure error message.
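The token extraction described above can be sketched in Python as follows. This is an illustration only: the function name extract_session_token and the choice to raise RuntimeError on failure are assumptions, and the sample body is the success response shown above.

```python
import json

# Success response copied from the example above.
SAMPLE_LOGIN = r'''{"header":{"apiVersion":"v2","code":"200","docLink":"http://www.ebi.ac.uk/ega","errorCode":"200","errorStack":"","service":"access","technicalMessage":"","userMessage":"OK"},"response":{"numTotalResults":1,"result":["success","b195b0c5-b574-43f2-9910-37d5853826ba"],"resultType":"us.monoid.json.JSONArray"}}'''


def extract_session_token(body: str) -> str:
    """Return the session token from a login response, or raise on failure."""
    result = json.loads(body)["response"]["result"]
    # Two elements with a leading "success" means the login worked.
    if len(result) >= 2 and result[0] == "success":
        return result[1]
    # Otherwise the array holds a single failure message.
    message = result[0] if result else "empty result"
    raise RuntimeError(f"EGA login failed: {message}")


token = extract_session_token(SAMPLE_LOGIN)
print(token)
```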
Logging out
This is not required, as a session times out after 10 minutes anyway. But it is cleaner to explicitly end a session when API interaction has completed. This is done with a REST call to ‘logout’:
curl -k -H "Accept: application/json" https://ega.ebi.ac.uk/ega/rest/access/v2/users/logout
This should return:
{"header":{"apiVersion":"v2","code":"200","docLink":"http://www.ebi.ac.uk/ega","errorCode":"200","errorStack":"","service":"access","technicalMessage":"","userMessage":"OK"},"response":{"numTotalResults":1,"result":["logged out"],"resultType":"us.monoid.json.JSONArray"}}
The important part is the message "logged out", indicating a successful logout. Sessions expire after 10 minutes of inactivity.
Listing all authorised Datasets
Authorisation information is available for a valid user session. A call to ‘datasets’ lists all authorized datasets for the user:
curl -k -H "Accept: application/json" https://ega.ebi.ac.uk/ega/rest/access/v2/datasets
The result in the "result" element is a JSONArray of Strings, containing authorized Dataset IDs.
Listing files in an authorized Dataset
A call to ‘files’ requires specification of an authorized dataset_stable_id in the URL (otherwise no results are returned):
curl -k -H "Accept: application/json" https://ega.ebi.ac.uk/ega/rest/access/v2/datasets/{dataset}/files
Where {dataset} is the dataset_stable_id of a dataset.
The result in the "result" element is a JSONArray of EgaFile objects, containing file information about all available and pending files in the specified authorized Dataset.
Example:
curl -k -H "Accept: application/json" https://ega.ebi.ac.uk/ega/rest/access/v2/datasets/EGAD00010000805/files
This produces an ArrayList of EgaFile JSON objects, in this case containing 12 elements (only one is shown in this example):
{"header":{"apiVersion":"v2","code":"200","docLink":"http://www.ebi.ac.uk/ega","errorCode":"200","errorStack":"","service":"access","technicalMessage":"","userMessage":"OK"},"response":{"numTotalResults":12,"result":[{"fileDataset":"EGAD00010000805","fileID":"EGAF00000867414","fileIndex":"keke.txt","fileMD5":"TODO: MD5","fileName":"/arrays/331-01-3TD.CEL.cip","fileSize":"69084299","fileStatus":"available"}, […] ],"resultType":"us.monoid.json.JSONArray"}}
The EgaFile object contains information about one file:
{ "fileDataset":"EGAD00010000805", "fileID":"EGAF00000867414", "fileIndex":"keke.txt", "fileMD5":"TODO: MD5", "fileName":"/arrays/331-01-3TD.CEL.cip", "fileSize":"69084299", "fileStatus":"available" }
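A file listing can be filtered on the client side before any request is made. The Python sketch below picks the files whose fileStatus is "available" and sums their sizes; note that fileSize arrives as a string. The function name available_files is made up, and the second entry in the sample (ID, name, size and the status string "pending") is hypothetical, added only to exercise the filter.

```python
import json

# First entry: the file shown above. Second entry: hypothetical values.
SAMPLE_FILES = """
{"header": {"code": "200", "userMessage": "OK"},
 "response": {"numTotalResults": 2, "result": [
   {"fileDataset": "EGAD00010000805", "fileID": "EGAF00000867414",
    "fileIndex": "keke.txt", "fileMD5": "TODO: MD5",
    "fileName": "/arrays/331-01-3TD.CEL.cip", "fileSize": "69084299",
    "fileStatus": "available"},
   {"fileDataset": "EGAD00010000805", "fileID": "EGAF_HYPOTHETICAL",
    "fileIndex": "", "fileMD5": "", "fileName": "/arrays/example.cip",
    "fileSize": "1024", "fileStatus": "pending"}
 ], "resultType": "us.monoid.json.JSONArray"}}
"""


def available_files(body: str) -> list:
    """Return (fileID, size in bytes) for every file marked 'available'."""
    files = json.loads(body)["response"]["result"]
    return [
        (entry["fileID"], int(entry["fileSize"]))  # fileSize is a string
        for entry in files
        if entry["fileStatus"] == "available"
    ]


files = available_files(SAMPLE_FILES)
total_bytes = sum(size for _, size in files)
print(files, total_bytes)
```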
Listing Requests (i.e. listing all current request tickets)
A call to ‘requests’ lists all requests containing files that have not been downloaded yet.
curl -k -H "Accept: application/json" https://ega.ebi.ac.uk/ega/rest/access/v2/requests
This produces an ArrayList of EgaTicket JSON objects, listing all information about currently requested files (this can be a very long list):
{"header":{"apiVersion":"v2","code":"200","docLink":"http://www.ebi.ac.uk/ega","errorCode":"200","errorStack":"","service":"access","technicalMessage":"","userMessage":"OK"},"response":{"numTotalResults":2413,"result":[{"encryptionKey":"","fileID":"EGAF00000098885","fileName":"/WTCCC2_PE/raw/a520532-00-791321-072009-4059643-01970.CEL.gz.gpg ","fileSize":"3693067150","fileType":"EBI","label":"ltest3","ticket":"0ab20948-cba2-48a7-baf7-da871b738665","transferTarget":"","transferType":"","user":"asenf@ebi.ac.uk"}, [...] ],"resultType":"us.monoid.json.JSONObject"}}
The EgaTicket object contains information about one requested ticket:
{ "encryptionKey":"", "fileID":"EGAF00000098885", "fileName":"/WTCCC2_PE/raw/a520532-00-791321-072009-4059643-01970.CEL.gz.gpg ", "fileSize":"3693067150", "fileType":"EBI", "label":"ltest3", "ticket":"0ab20948-cba2-48a7-baf7-da871b738665", "transferTarget":"", "transferType":"", "user":"testuser@ebi.ac.uk" }
The encryptionKey field is always empty, for security reasons. transferTarget and transferType are also empty; they are reserved for future functionality. The ticket is used to form the download URL.
Listing tickets in one Request (which contains information about the requested files)
Specifying a request in the URL lists all the individual request tickets (which are part of the download URL) in that request:
curl -k -H "Accept: application/json" https://ega.ebi.ac.uk/ega/rest/access/v2/requests/{requestlabel}
Where {requestlabel} is the label of one user request. The result has the same format as in the previous section, but only tickets whose ‘label’ matches the provided {requestlabel} are included.
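The same label filter can be applied on the client side to the full ‘requests’ listing. A minimal Python sketch, assuming the response layout shown in the previous section; the function name tickets_for_label is made up, and the second ticket in the sample (all-zero UUID, label "other") is hypothetical.

```python
import json

# First ticket: taken from the example above. Second: hypothetical values.
SAMPLE_REQUESTS = """
{"response": {"numTotalResults": 2, "result": [
  {"encryptionKey": "", "fileID": "EGAF00000098885",
   "fileSize": "3693067150", "fileType": "EBI", "label": "ltest3",
   "ticket": "0ab20948-cba2-48a7-baf7-da871b738665",
   "transferTarget": "", "transferType": "", "user": "testuser@ebi.ac.uk"},
  {"encryptionKey": "", "fileID": "EGAF_HYPOTHETICAL",
   "fileSize": "1", "fileType": "EBI", "label": "other",
   "ticket": "00000000-0000-0000-0000-000000000000",
   "transferTarget": "", "transferType": "", "user": "testuser@ebi.ac.uk"}
]}}
"""


def tickets_for_label(body: str, label: str) -> list:
    """Return the ticket UUIDs of all entries whose 'label' matches."""
    entries = json.loads(body)["response"]["result"]
    return [entry["ticket"] for entry in entries if entry["label"] == label]


tickets = tickets_for_label(SAMPLE_REQUESTS, "ltest3")
print(tickets)
```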
Making a Dataset Request
Before any data can be downloaded that data has to be requested. This creates download links for all specified files (for example, all files in a dataset) and deposits the encryption key to be used for this data on the server. At the moment the “downloadType” must always be “STREAM”:
curl -k -X POST -F downloadrequest='{"rekey":"{user_re_encryption_key}","downloadType":"STREAM","descriptor":"{requestlabel}"}' -H "Accept: application/json" https://ega.ebi.ac.uk/ega/rest/access/v2/requests/new/datasets/{datasetid}
Where {datasetid} is a dataset_stable_id. The “descriptor” is the label by which the request is listed later, and by which it can be downloaded (later also referred to as ‘request’ or ‘request label’).
Making a File Request
Individual files can be requested by specifying the file stable ID in the request URL:
curl -k -X POST -F downloadrequest='{"rekey":"{user_re_encryption_key}","downloadType":"STREAM","descriptor":"{requestlabel}"}' -H "Accept: application/json" https://ega.ebi.ac.uk/ega/rest/access/v2/requests/new/files/{fileid}
Where {fileid} is a file_stable_id.
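Hand-written JSON in a shell command is easy to get wrong, so it may help to generate the "downloadrequest" form field programmatically. The Python sketch below builds the field and the two request URLs described above. The helper names and the example key "my-secret-key" are made up for illustration; the sketch only prepares the values and does not send the request.

```python
import json

ACCESS_BASE = "https://ega.ebi.ac.uk/ega/rest/access/v2"


def download_request_form(rekey: str, label: str) -> dict:
    """Build the form field for a new dataset or file request."""
    payload = {
        "rekey": rekey,            # user-chosen re-encryption key
        "downloadType": "STREAM",  # currently the only supported type
        "descriptor": label,       # request label used for later lookups
    }
    return {"downloadrequest": json.dumps(payload)}


def dataset_request_url(dataset_id: str) -> str:
    """URL for requesting every file in one authorized dataset."""
    return f"{ACCESS_BASE}/requests/new/datasets/{dataset_id}"


def file_request_url(file_id: str) -> str:
    """URL for requesting one individual authorized file."""
    return f"{ACCESS_BASE}/requests/new/files/{file_id}"


form = download_request_form("my-secret-key", "ltest3")
print(dataset_request_url("EGAD00010000805"))
print(form["downloadrequest"])
```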
Deleting a Request
A call to ‘delete’ removes the specified request from the server:
curl -k -H "Accept: application/json" https://ega.ebi.ac.uk/ega/rest/access/v2/requests/delete/{requestlabel}
Where {requestlabel} is the label of one user request.
Deleting a Request Ticket
If a specific request ticket is given in the delete call, only that ticket is deleted from the server:
curl -k -H "Accept: application/json" https://ega.ebi.ac.uk/ega/rest/access/v2/requests/delete/{requestlabel}/{ticket}
Where {requestlabel} is the label of a user request, and {ticket} is the uuid of a ticket in that request.
Downloading a Ticket
Each requested file has its own download URL, identified by the request ticket for that file. Downloading the data in a request means streaming each download URL associated with that request. Note that this is plain HTTP:
curl -H "Accept: application/octet-stream" http://ega.ebi.ac.uk/ega/rest/ds/v2/downloads/{downloadticket}
This produces a binary data stream, which is the file specified by the ticket, encrypted using the encryption key specified at the time the request was made.
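Because downloaded files can be several gigabytes, the stream is best written to disk in chunks rather than read into memory at once. The following Python sketch uses only the standard library. The function names are made up, the URL prefix is taken from the curl example above, and download_ticket is defined but not called, since it needs a valid ticket and network access; the final lines exercise the chunked writer against an in-memory stream instead.

```python
import io
import os
import tempfile
import urllib.request

# Prefix taken from the curl example above.
DOWNLOAD_BASE = "http://ega.ebi.ac.uk/ega/rest/ds/v2/downloads"


def download_url(ticket: str) -> str:
    """Form the download URL for one request ticket."""
    return f"{DOWNLOAD_BASE}/{ticket}"


def save_stream(stream, path: str, chunk_size: int = 1024 * 1024) -> int:
    """Copy a binary stream to 'path' chunk by chunk; return bytes written."""
    written = 0
    with open(path, "wb") as out:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return written


def download_ticket(ticket: str, path: str) -> int:
    """Stream one ticket to disk (requires network and a valid ticket)."""
    request = urllib.request.Request(
        download_url(ticket), headers={"Accept": "application/octet-stream"}
    )
    with urllib.request.urlopen(request) as response:
        return save_stream(response, path)


# Offline check of the chunked writer using an in-memory stream.
payload = b"\x00\x01" * 1500
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.cip")
    bytes_written = save_stream(io.BytesIO(payload), target, chunk_size=1024)
    with open(target, "rb") as fh:
        round_trip_ok = fh.read() == payload
```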
Verifying a Download
Downloads are verified at each stage by MD5 checksums. If a file has been successfully streamed to completion from the download server, the MD5 of the stream that was sent is stored temporarily. Calling the ‘results’ URL for a download ticket after the download is complete retrieves that MD5 (optionally, the local MD5 can be submitted as well). This way the MD5 of the received data file can be verified:
curl -H "Accept: application/json" http://ega.ebi.ac.uk/ega/rest/ds/v2/results/{downloadticket}[?md5={local_md5}]
If the local MD5 is provided in the request (optional), then the server will also know if the download was correct. The result (in the "result" element) is a JSONArray containing the server MD5 and the size of the file that was sent.
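The local side of this check can be sketched in Python as follows: compute the MD5 of the received (still encrypted) file in chunks, build the ‘results’ URL with the optional md5 parameter, and compare the two checksums. The function names are made up, and the URL prefix is taken from the curl example above. Parsing of the server response is left out because the exact layout of the "result" array is not shown here.

```python
import hashlib
import io

# Prefix taken from the curl example above.
RESULTS_BASE = "http://ega.ebi.ac.uk/ega/rest/ds/v2/results"


def md5_of_stream(stream, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex MD5 of a binary stream, reading it chunk by chunk."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def results_url(ticket: str, local_md5: str = None) -> str:
    """Build the 'results' URL, optionally reporting the local MD5."""
    url = f"{RESULTS_BASE}/{ticket}"
    if local_md5:
        url += f"?md5={local_md5}"
    return url


def md5_matches(local_md5: str, server_md5: str) -> bool:
    """Compare two hex digests, ignoring case and surrounding whitespace."""
    return local_md5.strip().lower() == server_md5.strip().lower()


# Offline demonstration on an in-memory stream instead of a real file.
local = md5_of_stream(io.BytesIO(b"hello"), chunk_size=2)
print(results_url("0ab20948-cba2-48a7-baf7-da871b738665", local))
```

For a real download, the stream would be the file opened with open(path, "rb").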
Decrypt files using the EGA Download client
All files downloaded are encrypted (.cip) with the key specified in your original download request. To decrypt the files please use the EGA download client.
Full API Overview
Access
Access - https://ega.ebi.ac.uk/ega/rest/access/v2
POST /users/login Preferred way to log in
["loginrequest":{"username":"{username}","password":"{password}"}]
GET /users/{user}?pass= 'user' and 'pass' URLEncoded
GET /users/login Log in via Basic Auth header
GET /users/logout Log out of specified session
GET /datasets List all authorized datasets
GET /datasets/{dataset}/files List all available/pending files in an authorized dataset
GET /files/{fileid} List details about one authorized file
GET /requests List all Requests
GET /requests/{requestlabel} List specified Request
GET /requests/ticket/{ticket} List details on specified request Ticket
GET /requests/ticket/delete/{ticket} Delete specified request Ticket
GET /requests/delete/{requestlabel} Delete specified Request
GET /requests/delete/{requestlabel}/{ticket} Delete specified request Ticket
POST /requests/new/datasets/{datasetid} New Request: one entire authorized dataset
["downloadrequest":{"rekey":"{user_re_encryption_key}","downloadType":"STREAM","descriptor":"{requestlabel}"}]
POST /requests/new/files/{fileid} New Request: one individual authorized file
["downloadrequest":{"rekey":"{user_re_encryption_key}","downloadType":"STREAM","descriptor":"{requestlabel}"}]
Download
Download - http://ega.ebi.ac.uk/ega/rest/download/v2
GET /downloads/{downloadticket} Start downloading a re-encrypted file
GET /results/{downloadticket}[?md5={local_md5}] Obtain server statistics (size, md5) of the download after completion
GET /metadata/{dataset} Obtain metadata .tar.gz packet for selected Dataset