EGA Statistics
Welcome to the EGA statistics page.
The aim of this page is to present a number of regularly updated statistics about the European Genome-phenome Archive (EGA).
The statistics cover five main topics:
- Bibliography: Here, we show, both yearly and cumulatively, the publications, citations, journal publishers and impact factor of EGA-related studies.
- Growth: Here, we show the growth of objects registered in the EGA, such as studies, datasets and DACs.
- Community: Here, we show the number and global distribution of submitters and requesters.
- Archive: Here, we show the overall volume of data available to download and the different archived file types.
- Distribution: Here, we show the amount of data distributed by the EGA.
Publications citing data stored at the EGA
Studies that use EGA datasets are asked to cite the EGA in their bibliography. These citations help spread awareness of the role and purpose of the EGA, which is to make study data available for testability and reusability.
To track these studies, we use the unique study accession, which the EGA provides when a new study is submitted. It consists of the keyword EGAS followed by 11 digits, such as EGAS00000000001. We use the Europe PMC RESTful Web Service to query and index the studies in which an EGAS accession was used.
Below, you can find a chart showing the journal publishers of the studies citing data stored in the EGA:
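The accession format described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the EGA's actual pipeline: the helper names `find_study_accessions` and `build_search_url` are hypothetical, and the query parameters (`query`, `format`, `pageSize`) are an assumption about how the Europe PMC search endpoint might be called. The code only builds the URL and does not contact the service.

```python
import re
from urllib.parse import urlencode

# Accession format as described on this page: the keyword EGAS followed by 11 digits.
EGAS_PATTERN = re.compile(r"\bEGAS\d{11}\b")

# Europe PMC REST search endpoint. The parameters used below are an assumption
# about a typical query, not a description of the EGA's own indexing code.
EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def find_study_accessions(text: str) -> list[str]:
    """Return the unique EGAS accessions found in a piece of text, sorted."""
    return sorted(set(EGAS_PATTERN.findall(text)))


def build_search_url(accession: str, page_size: int = 25) -> str:
    """Build (but do not send) a search URL for one study accession."""
    if not EGAS_PATTERN.fullmatch(accession):
        raise ValueError(f"not a valid EGAS accession: {accession!r}")
    params = {"query": f'"{accession}"', "format": "json", "pageSize": page_size}
    return f"{EUROPE_PMC_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    sample = "Data are available under EGAS00000000001; EGAS123 is not a valid accession."
    for acc in find_study_accessions(sample):
        print(build_search_url(acc))
```

Validating the accession before building the URL keeps malformed identifiers, such as one with too few digits, out of the query.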
Number of publications citing data stored in the EGA by year
Here, you can find a chart showing the number of publications citing data stored in the EGA by year. These publications may have deposited or reused the data. The study accession can be found in the publication. The values are non-cumulative.
Cumulative number of publications citing data stored in the EGA by year
Here, you can find a chart showing the cumulative number of publications citing data stored in the EGA by year. These publications may have deposited or reused the data. The study accession can be found in the publication. The values are cumulative.
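The difference between the non-cumulative and cumulative charts is a running sum over the yearly counts. The sketch below illustrates this with a hypothetical helper, `to_cumulative`. The counts in the usage example are invented for illustration and are not EGA figures.

```python
from itertools import accumulate


def to_cumulative(yearly_counts: dict[int, int]) -> dict[int, int]:
    """Turn per-year counts into running totals, ordered by year."""
    years = sorted(yearly_counts)
    totals = accumulate(yearly_counts[year] for year in years)
    return dict(zip(years, totals))


if __name__ == "__main__":
    # Made-up example counts, NOT real EGA statistics.
    example = {2019: 3, 2020: 5, 2021: 2}
    print(to_cumulative(example))  # {2019: 3, 2020: 8, 2021: 10}
```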
Journal Impact Factor of Publications Citing Data Stored in the EGA
Here, you can find a chart showing the journal impact factor of publications citing data stored in the EGA. We have used the SCImago Journal Rank for this purpose.
- Low impact: values from 0.0 up to, but excluding, 5.0;
- Medium impact: values from 5.0 up to, but excluding, 10.0;
- High impact: values equal to or higher than 10.0.
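The three ranges above can be expressed as a small classification function. This is a minimal sketch that assumes the word "excluding" refers to the upper bound of each range, so a value of exactly 5.0 counts as medium and exactly 10.0 counts as high. The function name `classify_impact` is hypothetical.

```python
def classify_impact(journal_rank: float) -> str:
    """Map a journal rank value to the low / medium / high ranges listed above.

    Assumption: upper bounds are exclusive, i.e. low is [0.0, 5.0),
    medium is [5.0, 10.0) and high is 10.0 or above.
    """
    if journal_rank < 0.0:
        raise ValueError("journal rank cannot be negative")
    if journal_rank < 5.0:
        return "low"
    if journal_rank < 10.0:
        return "medium"
    return "high"


if __name__ == "__main__":
    for value in (0.0, 4.99, 5.0, 9.99, 10.0):
        print(value, classify_impact(value))
```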
Publications citing a publication containing an EGA study accession
Publications that state an EGA study accession in their bibliography can themselves be cited for testability or reusability purposes. Below, you can find a chart showing the publishers of the publications being cited:
Number of publications citing a publication containing an EGA study accession by year
Here, you can find a chart showing the number of publications citing a publication containing an EGA study accession by year. The values are non-cumulative:
Released studies, datasets and DACs
Cumulative released studies, datasets and DACs
Released studies, datasets and DACs by year
The figure below represents the number of released studies, datasets and DACs per year.
Cumulative released studies, datasets and DACs by year
The figure below represents the cumulative sum of released studies, datasets and DACs by year.
Last updated: May 23, 2023
Created studies, datasets and DACs by year
This figure is available for authenticated users.
Cumulative created studies, datasets and DACs by year
This figure is available for authenticated users.
Requester accounts created by country
The figure below represents the number of requester accounts by country. The country was inferred from the requester's email domain. "Unassigned" covers requesters whose email domain could not be assigned to a country, such as @gmail.com.
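One simple way to infer a country from an email domain is to look at its country-code top-level domain and fall back to "Unassigned" when there is none. The sketch below is an assumption about how such an inference could work, not a description of the EGA's actual method. The function `infer_country` and the small lookup table are hypothetical.

```python
# Hypothetical, deliberately tiny lookup table; a real one would cover all
# country-code top-level domains.
CCTLD_TO_COUNTRY = {
    "uk": "United Kingdom",
    "de": "Germany",
    "es": "Spain",
    "fr": "France",
}


def infer_country(email: str) -> str:
    """Guess a country from the last label of the email domain."""
    domain = email.rsplit("@", 1)[-1].strip().lower()
    top_level = domain.rsplit(".", 1)[-1]
    # Generic domains such as .com or .org carry no country information.
    return CCTLD_TO_COUNTRY.get(top_level, "Unassigned")


if __name__ == "__main__":
    for address in ("researcher@example.ac.uk", "someone@gmail.com"):
        print(address, "->", infer_country(address))
```

With this approach, addresses on generic providers end up in the "Unassigned" group, which matches the behaviour described for @gmail.com.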
Requester accounts created by year
The figure below represents the number of requester accounts by year.
The figure below represents the top 15 requesting companies from industry.
Submitter accounts created by country
The figure below represents the number of submitter accounts by country. The location was inferred from the submitter's email domain.
Submitter accounts created by year
The figure below represents the number of submitter accounts by year.
EGA Archive growth in size and number of files
The figure below represents the growth of the EGA archive in size (GB) and number of files.
The figure below represents, at the first level, the percentage of archived files by data technology. The second level, accessible by clicking on the required extension, displays the number of files archived in the EGA by extension. Please click on an extension to learn more about the files.
The figure below represents the cumulative EGA data distribution via the download clients (v2 and v3.x API versions) using HTTPS, FTP, and download boxes using Aspera.
The figure below represents the EGA data distribution via the download clients (v2 and v3.x API versions) using HTTPS, FTP, and download boxes using Aspera.
Cumulative EGA distribution of the UK Biobank dataset
Genomic data from the 500,000 people participating in the UK Biobank initiative is being distributed via the European Genome–phenome Archive (EGA). UK Biobank provides extremely detailed, high-quality datasets on individuals. It is an unprecedented collection that offers endless possibility and substantial efficiency savings for biomedical research and understanding the causes of disease.
Around 500,000 people from across the UK, between the ages of 40 and 69, participated in UK Biobank between 2006 and 2010, undergoing extensive measurements and genotyping. They provided blood, urine and saliva samples for future analysis – including genetic – and gave detailed information about themselves. They also agreed to allow UK Biobank to integrate information from their electronic health records. To learn more, please visit the UK Biobank study page.
Daily EGA distribution of the UK Biobank dataset
The figure below represents the UK Biobank daily data distribution from the EGA.