File repositories on the Platform

Before you can analyze data on the Platform, the files need to be in a specific project. In other words, an analysis can't be performed on data that is outside the project. You can copy files from a file repository to your project, or you can upload your own data.

For instance, if you'd like to use an annotation file from the Public Reference Files repository for a task execution, you first need to copy it to your project.

There are several file repositories on the Platform.

  • Every project has its own Project Files. This repository is located within the project and contains the input and output files for workflows in that project. You can upload files or copy them from other projects and repositories.

  • BioData Catalyst hosts TOPMed studies, which can be accessed via the Data Browser from the Data tab on the top navigation bar.

  • Public Reference Files is a repository maintained by BioData Catalyst powered by Seven Bridges. It contains the latest and most frequently used reference genomes and annotation files, so you don't have to upload your own reference files every time you run a task. Many bioinformatics tools and workflows require reference and annotation files to work properly. Files stored in this repository can be copied to your Project Files for use in analyses.

  • Public Test Files is also a repository maintained by Seven Bridges. It contains common test samples.

You can copy files from any file repository to your project, or you can upload your data directly to the project. For instance, to use a file from the Public Reference Files or Public Test Files repository, you first need to copy it to your project. You can access these repositories from the Data menu.
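
The rule described above (a file must be in the project before it can be used in an analysis) can be illustrated with a small toy model. This is only a sketch and does not use the Platform's API: the Repository and Project classes, their methods, and the file names annotation.gtf and sample.fastq are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Hypothetical stand-in for a file repository such as Public Reference Files."""
    name: str
    files: set = field(default_factory=set)


@dataclass
class Project:
    """Hypothetical stand-in for a project and its Project Files."""
    name: str
    files: set = field(default_factory=set)

    def copy_from(self, repo, filename):
        # Copying only works for files that exist in the source repository.
        if filename not in repo.files:
            raise FileNotFoundError(f"{filename} is not in {repo.name}")
        self.files.add(filename)

    def upload(self, filename):
        # Uploading places your own data directly into Project Files.
        self.files.add(filename)

    def run_task(self, inputs):
        # An analysis can only use files that are already in the project.
        missing = sorted(set(inputs) - self.files)
        if missing:
            raise ValueError(f"copy or upload these files first: {missing}")
        return f"task started with {len(inputs)} input file(s)"


# Hypothetical usage: upload your own sample, copy the annotation, then run.
reference_files = Repository("Public Reference Files", {"annotation.gtf"})
project = Project("my-project")
project.upload("sample.fastq")
project.copy_from(reference_files, "annotation.gtf")
result = project.run_task(["sample.fastq", "annotation.gtf"])
```

In this model, calling run_task before copy_from raises an error that names the missing file, which mirrors the requirement that repository files be copied into the project first.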