Projects on BioData Catalyst powered by Seven Bridges


Projects are the core building blocks of BioData Catalyst powered by Seven Bridges. Each project corresponds to a distinct scientific investigation and serves as a container for its data, analysis workflows, and results. Multiple workflow executions can be carried out within a project.

Access to a project is restricted to the collaborators in the investigation. Each project has at least one administrator, who controls the project members' permissions to execute analyses.

You can be a member of multiple projects, each with a different team of researchers.

Project types

BioData Catalyst powered by Seven Bridges hosts both Open Data and Controlled Data, which require different levels of access permissions. To protect Controlled Data, there are two types of projects: Open Data and Controlled Data projects.

Open Data Projects

Open Data Projects are designed to host both Open Data and your private data.

Open Data is available to all the users on the Platform upon sign up. Open Data contains data which is not unique to an individual, such as de-identified clinical data, gene expression data, copy number alterations in regions of the genome, epigenetic data, and summaries of data compiled across individuals.

Note that you cannot copy Controlled Data into an Open Data Project.

Controlled Data Projects

Controlled Data Projects host both Open and Controlled Data as well as your private data.

Access to Controlled Data must be obtained through dbGaP. After obtaining permission, Controlled Data users need to register for BioData Catalyst powered by Seven Bridges with their eRA Commons credentials and agree to the data use and publication guidelines for the datasets. Learn more about signing up for the Platform or about dbGaP controlled data access.

Controlled Data contains data which may allow individuals to be identified, such as primary sequence data (CRAM files) and VCFs.

The Platform restricts access to Controlled Data following dbGaP's model. This security ensures that data is as widely available as possible while protecting the privacy of study participants. Only users with Database of Genotypes and Phenotypes (dbGaP) permissions can access Controlled Data.

Controlled Data Projects are labeled CONTROLLED with a red tag and a lock symbol so you can recognize them easily.

If a collaborator loses dbGaP Controlled Data access at any point, all Controlled Data Project resources become read-only for that collaborator: they can still see project resources and file metadata but cannot access or copy files or execute analyses.
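
The rules above can be summarized as a small decision model. The following Python sketch is illustrative only: it does not use the Platform's API, and every name in it (DataType, ProjectType, Member, can_copy_into, allowed_actions, and the action labels) is hypothetical. It also assumes, for simplicity, that the project administrator has granted the member every permission, so dbGaP access is the only limiting factor.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    OPEN = "open"
    CONTROLLED = "controlled"
    PRIVATE = "private"


class ProjectType(Enum):
    OPEN = "open"
    CONTROLLED = "controlled"


@dataclass
class Member:
    # Hypothetical model of a collaborator; not a Platform object.
    name: str
    has_dbgap_access: bool


def can_copy_into(project: ProjectType, data: DataType) -> bool:
    """Controlled Data may only be copied into a Controlled Data Project."""
    if data is DataType.CONTROLLED:
        return project is ProjectType.CONTROLLED
    # Open Data and private data can live in either project type.
    return True


def allowed_actions(project: ProjectType, member: Member) -> set[str]:
    """Return the (hypothetical) actions a member may perform in a project."""
    read_only = {"view_resources", "view_metadata"}
    full = read_only | {"access_files", "copy_files", "run_analyses"}
    # Losing dbGaP access makes a Controlled Data Project read-only.
    if project is ProjectType.CONTROLLED and not member.has_dbgap_access:
        return read_only
    return full
```

For example, can_copy_into(ProjectType.OPEN, DataType.CONTROLLED) evaluates to False, and a member without dbGaP access in a Controlled Data Project is left with only the two viewing actions.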

Project locations

BioData Catalyst powered by Seven Bridges currently works with three cloud providers: Amazon Web Services (AWS), Azure, and Google Cloud Platform (GCP). Learn more about project locations.