The IPM BioMe Biobank, founded in September 2007, is an ongoing, broadly-consented electronic health record (EHR)-linked clinical care biobank that enrolls participants non-selectively from the Mount Sinai Medical Center patient population. BioMe currently comprises >42,000 participants from diverse ancestries, characterized by a broad spectrum of longitudinal biomedical traits. Participants are enrolled through an opt-in process and consent to be followed throughout their clinical care (past, present, and future) in real time, allowing us to integrate their genomic information with their EHRs for discovery research and clinical care implementation. BioMe participants consent to recall, based on their genotype and/or phenotype, permitting in-depth follow-up and functional studies for selected participants at any time. Phenotypic and genomic data are stored in a secure database and made available to investigators, contingent on approval by the BioMe Governing Board. BioMe uses a "data-broker" system to protect confidentiality. Ancestral diversity - BioMe participants represent broad racial, ethnic and socioeconomic diversity with a distinct and population-specific disease burden. Specifically, BioMe participants are predominantly of African (AA, 24%), Hispanic/Latino (HL, 35%), European (EA, 32%), and other ancestry (OA, 10%). Participants who self-identify as Hispanic/Latino further report being of Puerto Rican (39%), Dominican (23%), Central/South American (17%), Mexican (5%) or other Hispanic (16%) ancestry. More than 40% of European ancestry participants are genetically determined to be of Ashkenazi Jewish ancestry. With this broad ancestral diversity, BioMe is uniquely positioned to examine the impact of demographic and evolutionary forces that have shaped common disease risk.
Phenotypes available in BioMe - BioMe has a high-quality and validated set of fully implemented clinical phenotype data that has been culled by a multi-disciplinary team of experienced investigators, clinicians, information technologists, data managers, and programmers who apply advanced medical informatics and data mining tools to extract and harmonize EHRs. BioMe, as a cohort, offers great versatility for designing nested case-control sample sets, particularly for studying longitudinal traits and co-morbidity in disease burden. Biomedical and clinical outcomes: The BioMe Biobank is linked to Mount Sinai's system-wide Epic EHR, which captures a full spectrum of biomedical phenotypes, including clinical outcomes, covariate and exposure data from past, present and future health care encounters. As such, the BioMe Biobank has a longitudinal design, as participants consent to make all of their EHR data from past (dating back as far as 2003), present and future inpatient or outpatient encounters available for research, without restriction. The median number of outpatient encounters is 21 per participant, reflecting predominant enrollment of participants with common chronic conditions from primary care facilities. Environmental data: The clinical and EHR information is complemented by detailed demographic and lifestyle information, including ancestry, residence history, country of origin, personal and familial medical history, education, socio-economic status, physical activity, smoking, dietary habits, alcohol intake, and body weight history, which is collected in a systematic manner by interview-based questionnaire at the time of enrollment. The IPM BioMe Biobank contributed ~10,600 DNA samples for whole genome sequencing to the TOPMed program. Samples were selected for the Coronary Artery Disease (CAD) and the Chronic Obstructive Pulmonary Disease (COPD) working groups.
Using a Case-Definition-Algorithm (CDA), we identified ~4,100 individuals with CAD (~50% women) and ~3,000 individuals as controls (65% women). In addition, we identified ~800 individuals with COPD (62% women) and 1,800 individuals as controls (72% women). Another 600 BioMe participants with Atrial Fibrillation, all of African ancestry, were included.
Uploading files

Users who hold an ega-box-XXX account can upload files using either INBOX or FTP. Users who have a Submitter role associated with their email will only be able to upload files using INBOX.

Before uploading your files, please make sure that the names of any files that will be uploaded to EGA do not contain special characters such as # ? ( ) [ ] / \ = + < > : ; " ' , * ^ | &. These can cause issues with the archiving process, leading to problems for end users.

The EGA is a shared, public service with limited storage. In order to manage the available resources, we enforce a limit of 10 TB per submission account at any one time. Please do not exceed this limit.

FTP

The FTP is only compatible with files encrypted using the EGACryptor tool.

Before uploading

Once your submission files have been prepared using the EGACryptor, the resulting encrypted files and associated md5sum files can be uploaded to your submission account using Aspera or FTP. The EGA is a shared, public service with limited resources. In order to manage the available resources, EGA submission boxes should not exceed 8 TB in size, and cannot exceed 12 TB. If you are approaching this limit, please contact the EGA Helpdesk so that we can advise on how to register the associated metadata and trigger the archiving of files, so that you can continue with your submission. If we note that your submission account consistently rises above 10 TB, your password will be changed until metadata is associated.

Aspera Download

Aspera is a commercial file transfer protocol that may provide faster transfer speeds than FTP, especially over longer distances. To obtain the Aspera ascp command line client, please select Aspera Connect.
The ascp command line client is distributed as part of the Aspera Connect high-performance transfer browser plugin and is free to use, without registration. The minimum required version of the IBM Aspera CLI is V4.

Using the Aspera ascp command line program

The location of the ascp program in the filesystem:

Mac: go to
    cd /Applications/Aspera\ Connect.app/Contents/Resources/
where you will find the command line utilities, including ascp.

Windows: the downloaded files are a bit hidden. For instance, in Windows 7, ascp.exe is located in the user's home directory in:
    AppData\Local\Programs\Aspera\Aspera Connect\bin\ascp.exe

Linux: ascp should be in your user's home directory:
    cd /home/username/.aspera/connect/bin/
where you will find the command line utilities, including ascp.

Your command should look similar to this:
    ascp -P33001 -O33001 -QT -l300M -L- /path/file ega-box-N@fasp.ega.ebi.ac.uk:/path

If you wish to upload several files without being asked for the password each time, please use the command below:
    ASPERA_SCP_PASS=ega-box-password ascp -P33001 -O33001 -QT -l300M /path/file ega-box-N@fasp.ega.ebi.ac.uk:/path/

Explanation of parameters

- The -l300M option caps the upload rate at 300 Mbit/s (about 37.5 MB/s). You may wish to lower this value to increase the reliability of the transfer.
- The -L- option prints logs while transferring.
- The files to upload can be given as a file mask (e.g. '/homes/submitter/*.srf') or as a list of files.
- ega-box-N is your submission account login.
- Add the -k2 switch to enable transfer restarts.
- Check the command line transfer usage for more configuration details.

Using FTP to upload your prepared files

- Use your preferred FTP client. For example, lftp is a popular choice for Linux and Mac users.
- Use binary mode for file transfers.
- Use ftp.ega.ebi.ac.uk as the target host.
- Log in with your ega-box username and password.
- Upload files to your private ega-box upload area.
Depending on your network settings, you might wish to start FTP in passive or active mode.

Using default FTP command line client in Windows

1. Start the command line interpreter: press Win+R, type cmd, hit Enter.
2. Enter ftp ftp.ega.ebi.ac.uk
3. Enter your submission username.
4. Enter your submission password.
5. Type binary to enter binary mode for transfer.
6. To see a list of available ftp commands, type help.
7. Type ls to check the content of your submission account.
8. Type prompt to switch off confirmation for each file uploaded.
9. Use the mput command to upload files: mput *.bam*
10. Use the bye command to exit the ftp client.
11. Use the exit command to exit the command line interpreter.

Using default FTP command line client in Linux / Unix

1. Open a terminal and type ftp ftp.ega.ebi.ac.uk
2. Enter your submission username.
3. Enter your submission password.
4. Type binary to enter binary mode for transfer.
5. To see a list of available ftp commands, type help.
6. Type ls to check the content of your submission account.
7. Type prompt to switch off confirmation for each file uploaded.
8. Use the mput command to upload files: mput *.bam*
9. Use the bye command to exit the ftp client.

Using FTP client FileZilla

We recommend the use of FileZilla, a free FTP client. FileZilla is open source software distributed free of charge under the terms of the GNU General Public License. Enter the connection details under File - Site Manager and add your submission account username and password. Then select the files you wish to upload and select upload.

Using LFTP with TLS

We recommend the following to force the use of a secure connection:
    lftp > set ftp:ssl-force yes

We also recommend the following setting, which leaves the bulk data itself unencrypted for performance reasons (the authentication will still be encrypted):
    lftp > set ftp:ssl-protect-data no

In order to verify the certificate, the recommended way is to use the CA certificates from your machine.
To do that, use this command in lftp, adjusting the path to your ca-certificates location:
    lftp > set ssl:ca-file "/etc/ssl/certs/ca-certificates.crt"

If that is not possible, or the certificates are old and you can't update them, you can download the certificates needed from the QuoVadis Digital Repository. The two certificates to download in PEM format are:
- QuoVadis Root CA2 G3
- QuoVadis EV SSL ICA G3

Then concatenate them one after the other in a single file and save it as lftp-certificates.pem. Once this is done, point the ssl:ca-file variable to its path:
    lftp > set ssl:ca-file "/path/to/lftp-certificates.pem"

Also note that you can save this configuration in ~/.lftp/rc.

Another option is to download the certificates and add them to the ca-certificates of your machine. For example, on RHEL 7, CentOS and similar systems, the process to add the certificates globally is:
1. Download the two certificates (in PEM or DER format, it doesn't matter) and save them to /etc/pki/ca-trust/source/anchors/
2. Run update-ca-trust extract

Another, less secure option is to turn off certificate verification with the following command:
    lftp > set ssl:verify-certificate false

Troubleshooting

If you are having problems with Aspera connection timeouts, the cause is likely one of the following:
- Transfers cannot start; the connection fails instantly. Ensure that TCP traffic on port 33001 is allowed (open) for outbound connections through your computer's firewall and your network's firewall.
- The connection is made and transfers are started, but 0 bytes (0%) are uploaded for each file. Ensure that UDP traffic on port 33001 is allowed (open) for outbound connections through your computer's firewall and your network's firewall.
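The file-naming rule and the FTP settings described in the upload instructions can be combined into a small pre-flight script. The following is a minimal sketch in Python, not an EGA-provided tool: the helper names forbidden_in and upload are hypothetical, the character set is taken from the "such as" list in the naming guidance and so may not be exhaustive, and the choice of a clear data channel by default is an assumption that mirrors the lftp setting ftp:ssl-protect-data no.

```python
import ftplib
import ssl
from pathlib import Path

# Characters named in the upload guidance ("such as ..."); possibly not exhaustive.
FORBIDDEN_CHARS = set("#?()[]/\\=+<>:;\"',*^|&")


def forbidden_in(filename: str) -> list[str]:
    """Return the listed special characters found in a file name,
    in order of first appearance."""
    found = []
    for ch in filename:
        if ch in FORBIDDEN_CHARS and ch not in found:
            found.append(ch)
    return found


def upload(paths, user, password, host="ftp.ega.ebi.ac.uk", encrypt_data=False):
    """Upload files over FTP with TLS (hypothetical helper).

    Raises ValueError before connecting if any file name contains
    one of the listed special characters.
    """
    bad = {str(p): forbidden_in(Path(p).name) for p in paths}
    bad = {p: chars for p, chars in bad.items() if chars}
    if bad:
        raise ValueError(f"rename before upload: {bad}")

    # Verify the server certificate against the machine's CA store.
    context = ssl.create_default_context()
    with ftplib.FTP_TLS(host, context=context) as ftp:
        ftp.login(user, password)  # login() secures the control channel first
        if encrypt_data:
            ftp.prot_p()  # also encrypt the data channel
        else:
            ftp.prot_c()  # clear data channel; authentication stays encrypted
        ftp.set_pasv(True)  # passive mode; use False if your network needs active mode
        for p in paths:
            with open(p, "rb") as fh:
                # storbinary performs a binary-mode transfer
                ftp.storbinary(f"STOR {Path(p).name}", fh)
```

Calling forbidden_in on each file name before a large submission gives a quick list of files to rename; upload applies the same check and then transfers the files in binary mode.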
We wish to understand the degree of copy number alterations in normal human sun-exposed skin. We have collected small punches of epithelia and, using a bait set design, have shown that these span a clone. WGS will provide copy number information on these.
This dataset contains RNA-seq raw data in fastq format from 14 tumor samples. The samples are from primary tumors or metastases and represent various cancer entities. The samples are formalin-fixed, paraffin-embedded (FFPE) material. For target enrichment, SureSelect XT Human All Exon V6 was used. The libraries were sequenced in paired-end mode (2 x 50 nt) on a NovaSeq6000 S2 flow cell.
Analysis of cocaine use disorder (CUD) associated epigenome-wide DNA methylation (DNAm) alterations in human postmortem brain tissue of Brodmann Area 9. Tissue samples from N=21 CUD cases and N=21 individuals without CUD originating from the Douglas Bell Canada Brain Bank (DBCBB) were included. Epigenome-wide DNAm was investigated using the Illumina Infinium MethylationEPIC array.
RNA sequencing of 168 pulmonary samples including lung preneoplasia atypical adenomatous hyperplasia (AAH, N=38), adenocarcinoma in situ (AIS, N=22), minimally invasive adenocarcinoma (MIA, N=19) and invasive lung adenocarcinoma (ADC, N=38) and adjacent lung tissues (Normal, N=62).
In order to perform comprehensive transcriptomics and gene fusion analyses of a cohort of BCP-LBL patients, we performed RNA sequencing of 49 tissue samples from BCP-LBL patients. Because the material was available as FFPE and was of relatively low quality, we used a capture-based approach in which NGS libraries obtained from total RNA were captured in 4-plex using a whole exome capture panel.