Variant calling dataset from the whole-exome study of familial pulmonary fibrosis in the Canary Islands-VCF files

Sequence reads were obtained from a set of 13 families with familial pulmonary fibrosis in the Canary Islands, Spain, using Illumina paired-end sequencing at the Institute of Technology and Renewable Energy (ITER). Briefly, we used bcl2fastq v2.18 to demultiplex samples and BWA-MEM v0.7.15 (https://github.com/lh3/bwa) to align reads to the GRCh37/hg19 reference. The resulting BAM files were assessed for quality control with SAMtools v1.3 (http://www.htslib.org) and Picard v2.10.10 (https://broadinstitute.github.io/picard/). Small insertions/deletions (< 50 bp) and single nucleotide variants (SNVs) were identified using an in-house bioinformatics pipeline based on GATK HaplotypeCaller v3.8 (https://gatk.broadinstitute.org/hc/en-us/articles/360037225632-HaplotypeCaller). This pipeline follows the Best Practices recommendations for germline variant calling, and its description is publicly available (https://github.com/genomicsITER/benchmarking/tree/master/WES).
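The align, check, and call sequence described above can be sketched as a list of command lines. This is an illustrative sketch only, not the authors' pipeline, which is documented at the repository linked above. The sample name, file names, thread count, and GATK jar path are hypothetical, and the duplicate-marking and recalibration steps that Best Practices workflows usually include are omitted. The function only builds the commands; it does not run them.

```python
import shlex


def pipeline_commands(sample, ref, fq1, fq2,
                      gatk_jar="GenomeAnalysisTK.jar", threads=4):
    """Return per-sample command lines in order (nothing is executed).

    All paths and the jar name are hypothetical placeholders.
    """
    # bwa expands the literal "\t" sequences in the read-group string.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    return [
        # Align paired-end reads; bwa mem writes SAM to stdout,
        # so a real run would redirect its output to `sam`.
        ["bwa", "mem", "-t", str(threads), "-R", read_group, ref, fq1, fq2],
        # Coordinate-sort and index the alignments.
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
        # Basic alignment statistics for quality control.
        ["samtools", "flagstat", bam],
        # GATK 3.x command-line style: the tool is selected with -T.
        ["java", "-jar", gatk_jar, "-T", "HaplotypeCaller",
         "-R", ref, "-I", bam, "-o", vcf],
    ]


if __name__ == "__main__":
    for cmd in pipeline_commands("sample01", "hg19.fa",
                                 "sample01_R1.fastq.gz",
                                 "sample01_R2.fastq.gz"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping each command as an argument list rather than a single string makes it straightforward to pass to `subprocess.run` later without shell-quoting problems.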

ITER-FIISC Data Access Committee (FPF)

DATA ACCESS AGREEMENT

These terms and conditions govern access to the managed access datasets (details of which are set out in Appendix I) to which the User Institution has requested access. The User Institution agrees to be bound by these terms and conditions.

1. The User Institution agrees to only use these Data for the purpose of the Project (described in Appendix II) and only for Research Purposes. The User Institution further agrees that it will only use these Data for Research Purposes which are within the limitations (if any) set out in Appendix I.

2. The User Institution agrees to preserve, at all times, the confidentiality of these Data. In particular, it undertakes not to use, or attempt to use these Data to compromise or otherwise infringe the confidentiality of information on Research Participants. Without prejudice to the generality of the foregoing, the User Institution agrees to use at least the measures set out in Appendix I to protect these Data.

3. The User Institution agrees to protect the confidentiality of Research Participants in any research papers or publications that they prepare by taking all reasonable care to limit the possibility of identification.

4. The User Institution agrees not to link or combine these Data to other information or archived data available in a way that could re-identify the Research Participants, even if access to that data has been formally granted to the User Institution or is freely available without restriction.

5. The User Institution agrees only to transfer or disclose these Data, in whole or part, or any material derived from these Data, to the Authorized Personnel. Should the User Institution wish to share these Data with an External Collaborator, the External Collaborator must complete a separate application for access to these Data.

6. The User Institution agrees that the Data Producers, and all other parties involved in the creation, funding or protection of these Data:
a) make no warranty or representation, express or implied as to the accuracy, quality or comprehensiveness of these Data;
b) exclude to the fullest extent permitted by law all liability for actions, claims, proceedings, demands, losses (including but not limited to loss of profit), costs, awards, damages and payments made by the Recipient that may arise (whether directly or indirectly) in any way whatsoever from the Recipient’s use of these Data or from the unavailability of, or break in access to, these Data for whatever reason; and
c) bear no responsibility for the further analysis or interpretation of these Data.

7. The User Institution agrees to follow the Fort Lauderdale Guidelines (http://www.wellcome.ac.uk/stellent/groups/corporatesite/@policy_communications/documents/web_document/wtd003207.pdf) and the Toronto Statement (http://www.nature.com/nature/journal/v461/n7261/full/461168a.html). This includes but is not limited to recognizing the contribution of the Data Producers and including a proper acknowledgement in all reports or publications resulting from the use of these Data.

8. The User Institution agrees to follow the Publication Policy in Appendix III. This includes respecting the moratorium period for the Data Producers to publish the first peer-reviewed report describing and analyzing these Data.

9. The User Institution agrees not to make intellectual property claims on these Data and not to use intellectual property protection in ways that would prevent or block access to, or use of, any element of these Data, or conclusion drawn directly from these Data.

10. The User Institution can elect to perform further research that would add intellectual and resource capital to these data and decide to obtain intellectual property rights on these downstream discoveries. In this case, the User Institution agrees to implement licensing policies that will not obstruct further research and to follow the U.S. National Institutes of Health Best Practices for the Licensing of Genomic Inventions (2005) (https://www.icgc.org/files/daco/NIH_BestPracticesLicensingGenomicInventions_2005_en.pdf) in conformity with the Organization for Economic Co-operation and Development Guidelines for the Licensing of the Genetic Inventions (2006) (http://www.oecd.org/science/biotech/36198812.pdf).

11. The User Institution agrees to destroy/discard the Data held, once it is no longer used for the Project, unless obliged to retain the data for archival purposes in conformity with audit or legal requirements.

12. The User Institution will notify FIISC-ITER within 30 days of any changes or departures of Authorized Personnel.

13. The User Institution will notify FIISC-ITER prior to any significant changes to the protocol for the Project.

14. The User Institution will notify FIISC-ITER as soon as it becomes aware of a breach of the terms or conditions of this agreement.

15. ITER-FIISC may terminate this agreement by written notice to the User Institution. If this agreement terminates for any reason, the User Institution will be required to destroy any Data held, including copies and backup copies. This clause does not prevent the User Institution from retaining these data for archival purposes in conformity with audit or legal requirements.

16. The User Institution accepts that it may be necessary for the Data Producers to alter the terms of this agreement from time to time. As an example, this may include specific provisions relating to the Data required by Data Producers other than ITER-FIISC. In the event that changes are required, the Data Producers or their appointed agent will contact the User Institution to inform it of the changes and the User Institution may elect to accept the changes or terminate the agreement.

17. If requested, the User Institution will allow data security and management documentation to be inspected to verify that it is complying with the terms of this agreement.

18. The User Institution agrees to distribute a copy of these terms to the Authorized Personnel. The User Institution will procure that the Authorized Personnel comply with the terms of this agreement.

Agreed for User Institution
Signature:
Name:
Title:
Date:

Principal Investigator
I confirm that I have read and understood this Agreement.
Signature:
Name:
Title:
Date:

Agreed for ITER-FIISC
Signature:
Name:
Title:
Date:

APPENDIX I – DATASET DETAILS (to be completed by the data producer before passing to applicant)

- Dataset reference (EGA Study ID and Dataset Details)
- Name of project that created the dataset
- Names of other data producers/collaborators
- Specific limitations on areas of research
- Minimum protection measures required
- File access: Data can be held in unencrypted files on an institutional compute system, with Unix user group read/write access for one or more appropriate groups but not Unix world read/write access behind a secure firewall. Laptops holding these data should have password protected logins and screenlocks (set to lock after 5 min of inactivity). If held on USB keys or other portable hard drives, the data must be encrypted.

APPENDIX II – PROJECT DETAILS (to be completed by the Requestor)

Details of dataset requested, i.e., EGA Study and Dataset Accession Number

Brief abstract of the Project in which the Data will be used (500 words max)

All individuals whom the User Institution wishes to be named as registered users

+----------------------------------------------------------------------------+
| Name of Registered User | Email | Job Title | Supervisor* |
+----------------------------------------------------------------------------+

All individuals who should have an account created at the EGA

+----------------------------------------------------------+
| Name of Registered User | Email | Job Title |
+----------------------------------------------------------+

APPENDIX III – PUBLICATION POLICY

ITER-FIISC intend to publish the results of their analysis of this dataset and do not consider its deposition into public databases to be the equivalent of such publications. ITER-FIISC anticipate that the dataset could be useful to other qualified researchers for a variety of purposes. However, some areas of work are subject to a publication moratorium. The publication moratorium covers any publications (including oral communications) that describe the use of the dataset. For research papers, submission for publication should not occur until 6 months after these data were first made available on the relevant hosting database, unless ITER-FIISC has provided written consent to earlier submission. In any publications based on these data, please describe how the data can be accessed, including the name of the hosting database (e.g., The European Genome-phenome Archive at the European Bioinformatics Institute) and its accession numbers (e.g., EGAS00000000029), and acknowledge its use in a form agreed by the User Institution with ITER-FIISC.
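The file-access measure in Appendix I (group read/write allowed, world read/write not allowed) can be checked mechanically on a Unix system. The following is a minimal sketch using only the Python standard library. The function names and the directory layout are hypothetical, and the check covers file permission bits only, not the firewall, laptop, or encryption requirements.

```python
import os
import stat


def world_accessible(path):
    """Return True if 'other' users have read or write permission on path.

    Uses os.stat, so symbolic links are followed; a broken link raises OSError.
    """
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IROTH | stat.S_IWOTH))


def find_world_accessible(root):
    """Walk root and return sorted paths of entries open to 'other' users."""
    flagged = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if world_accessible(full):
                flagged.append(full)
    return sorted(flagged)


if __name__ == "__main__":
    # Hypothetical location of the downloaded dataset.
    for path in find_world_accessible("/data/fpf_vcf"):
        print("world-accessible:", path)
```

A mode such as 0660 on files and 0770 on directories passes this check, while 0664 or 0755 is flagged.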

Studies are experimental investigations of a particular phenomenon, e.g., case-control studies on a particular trait or cancer research projects reporting matching cancer normal genomes from patients.

Study ID Study Title Study Type
EGAS50000000782 Exome Sequencing

This table displays only public information pertaining to the files in the dataset. If you wish to access this dataset, please submit a request. If you already have access to these data files, please consult the download documentation.

ID File Type Size Located in
EGAF00008620723 tbi 332.2 kB
EGAF00008620725 vcf.gz 4.9 MB
EGAF00008620726 vcf.gz 4.6 MB
EGAF00008620727 vcf.gz 4.9 MB
EGAF00008620729 tbi 332.1 kB
EGAF00008620731 vcf.gz 4.9 MB
EGAF00008620735 vcf.gz 4.1 MB
EGAF00008620744 vcf.gz 4.1 MB
EGAF00008620747 tbi 321.6 kB
EGAF00008620748 vcf.gz 4.3 MB
EGAF00008620753 vcf.gz 5.0 MB
EGAF00008620754 tbi 336.7 kB
EGAF00008620757 vcf.gz 4.2 MB
EGAF00008620768 vcf.gz 4.3 MB
EGAF00008620769 vcf.gz 4.2 MB
EGAF00008620771 vcf.gz 4.1 MB
EGAF00008620772 vcf.gz 4.9 MB
EGAF00008620777 tbi 325.5 kB
EGAF00008620795 vcf.gz 4.8 MB
EGAF00008620796 tbi 331.9 kB
EGAF00008620797 vcf.gz 5.0 MB
EGAF00008620800 vcf.gz 4.3 MB
EGAF00008620801 tbi 314.6 kB
EGAF00008620807 vcf.gz 4.9 MB
EGAF00008620809 tbi 319.8 kB
EGAF00008620810 tbi 327.1 kB
EGAF00008620811 tbi 317.8 kB
EGAF00008620814 tbi 312.9 kB
EGAF00008620818 tbi 306.2 kB
EGAF00008620823 vcf.gz 4.4 MB
EGAF00008620824 vcf.gz 4.5 MB
EGAF00008620825 vcf.gz 4.3 MB
EGAF00008620827 vcf.gz 4.2 MB
EGAF00008620829 vcf.gz 4.7 MB
EGAF00008620839 tbi 311.9 kB
EGAF00008620841 tbi 306.6 kB
EGAF00008620845 vcf.gz 4.1 MB
EGAF00008620846 vcf.gz 4.4 MB
EGAF00008620847 tbi 308.5 kB
EGAF00008620850 vcf.gz 4.4 MB
EGAF00008620853 tbi 302.1 kB
EGAF00008620863 vcf.gz 4.4 MB
EGAF00008620871 vcf.gz 4.0 MB
EGAF00008620874 vcf.gz 4.3 MB
EGAF00008620875 vcf.gz 4.4 MB
EGAF00008620881 tbi 332.7 kB
EGAF00008620887 vcf.gz 4.3 MB
EGAF00008620899 tbi 315.8 kB
EGAF00008620908 vcf.gz 4.8 MB
EGAF00008620910 tbi 317.2 kB
EGAF00008620916 tbi 327.7 kB
EGAF00008620918 vcf.gz 4.8 MB
EGAF00008620948 tbi 312.1 kB
EGAF00008620953 tbi 317.0 kB
EGAF00008620958 tbi 318.9 kB
EGAF00008620960 tbi 316.2 kB
EGAF00008620966 vcf.gz 4.5 MB
EGAF00008620971 tbi 316.5 kB
EGAF00008620980 tbi 328.1 kB
EGAF00008620986 tbi 333.9 kB
EGAF00008620990 tbi 320.6 kB
EGAF00008620992 vcf.gz 4.8 MB
EGAF00008620993 tbi 334.3 kB
EGAF00008620994 tbi 326.7 kB
EGAF00008620997 vcf.gz 3.9 MB
EGAF00008620999 tbi 307.8 kB
EGAF00008621000 tbi 311.9 kB
EGAF00008621014 tbi 332.1 kB
EGAF00008621020 vcf.gz 4.2 MB
EGAF00008621021 vcf.gz 4.5 MB
EGAF00008621027 tbi 318.7 kB
EGAF00008621055 tbi 328.0 kB
EGAF00008621056 tbi 330.2 kB
EGAF00008621057 vcf.gz 4.6 MB
EGAF00008621085 tbi 330.7 kB
EGAF00008621143 tbi 316.2 kB
EGAF00008621151 tbi 316.8 kB
EGAF00008621153 tbi 306.5 kB
EGAF00008621162 vcf.gz 4.1 MB
EGAF00008621164 vcf.gz 4.6 MB
EGAF00008621180 tbi 329.5 kB
EGAF00008621181 tbi 326.9 kB
EGAF00008621182 vcf.gz 4.8 MB
EGAF00008621185 tbi 332.9 kB
EGAF00008621186 tbi 334.2 kB
EGAF00008621200 tbi 330.2 kB
EGAF00008621201 vcf.gz 4.3 MB
EGAF00008621214 tbi 292.0 kB
EGAF00008621215 vcf.gz 4.3 MB
EGAF00008621216 tbi 319.3 kB
EGAF00008621217 vcf.gz 4.7 MB
EGAF00008621218 vcf.gz 5.0 MB
EGAF00008621254 tbi 315.6 kB
EGAF00008621279 tbi 331.2 kB
EGAF00008621287 vcf.gz 4.9 MB
EGAF00008621309 tbi 306.3 kB
EGAF00008621338 vcf.gz 4.7 MB
EGAF00008621339 tbi 329.9 kB
EGAF00008621350 tbi 313.4 kB
EGAF00008621352 vcf.gz 4.8 MB
EGAF00008621353 vcf.gz 4.9 MB
EGAF00008621358 vcf.gz 4.5 MB
EGAF00008621381 tbi 318.5 kB
EGAF00008621382 vcf.gz 4.6 MB
EGAF00008621384 tbi 331.5 kB
EGAF00008621385 tbi 335.8 kB
EGAF00008621411 vcf.gz 3.6 MB
EGAF00008621412 vcf.gz 4.5 MB
EGAF00008621413 tbi 335.0 kB
EGAF00008621415 vcf.gz 5.0 MB
EGAF00008621416 vcf.gz 4.6 MB
EGAF00008621440 vcf.gz 4.8 MB
EGAF00008621496 tbi 333.0 kB
EGAF00008621530 tbi 331.1 kB
EGAF00008621568 vcf.gz 4.9 MB
EGAF00008621570 vcf.gz 4.9 MB
EGAF00008621571 vcf.gz 4.6 MB
EGAF00008621572 tbi 331.8 kB
EGAF00008621608 tbi 325.9 kB
EGAF00008621632 tbi 333.6 kB
EGAF00008621701 vcf.gz 5.1 MB
EGAF00008621704 vcf.gz 4.7 MB
122 Files (296.6 MB)
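A listing in the format above (file ID, type, size, unit per row) can be summarized per file type with a few lines of Python. This is a sketch under stated assumptions: rows are whitespace-separated with exactly four fields, and kB and MB are treated as decimal units, which the page does not specify. Only the first rows of the table are used as sample input.

```python
from collections import defaultdict

# Assumption: decimal units; the listing does not state kB vs KiB.
UNIT_BYTES = {"kB": 1_000, "MB": 1_000_000}


def summarize(listing):
    """Return {file_type: (file_count, approx_total_bytes)} for a listing."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in listing.strip().splitlines():
        file_id, file_type, size, unit = line.split()
        totals[file_type][0] += 1
        totals[file_type][1] += float(size) * UNIT_BYTES[unit]
    return {ftype: (count, total) for ftype, (count, total) in totals.items()}


# First rows of the table above, used as sample input.
SAMPLE = """
EGAF00008620723 tbi 332.2 kB
EGAF00008620725 vcf.gz 4.9 MB
EGAF00008620726 vcf.gz 4.6 MB
"""

if __name__ == "__main__":
    for ftype, (count, total) in sorted(summarize(SAMPLE).items()):
        print(f"{ftype}: {count} files, about {total / 1e6:.1f} MB")
```

Because the displayed sizes are rounded, totals computed this way are approximate and may differ slightly from the figure reported on the page.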