File archiving overview

Storing large amounts of data in the cloud is a convenient way to have it available for computation. Instant availability is, however, not always our primary concern.

Sometimes we have to deal with "cold" data: files that are not required for processing and are unlikely to be accessed for a long time, but that must nonetheless remain available to comply with local and federal laws, best practice guidelines, and internal processes. Input and output files belonging to completed analyses are a typical example.

Data that will not be used for some time (typically over three months) can be moved into archival storage. Archived files are billed at a significantly lower price than data that is always available. This makes archival a good solution for infrequently accessed files.

📘

BioData Catalyst powered by Seven Bridges currently offers Amazon Glacier as the archiving back-end. For up-to-date pricing information on the storage services that the Platform supports, please refer to the official Amazon Glacier pricing plans.

Cost Savings

Depending on the storage service used, storing data in an archive typically costs around a third as much as storing data that is always available.

As with all other costs for user-uploaded data hosted on the Platform, the charges that we incur for archiving data are passed directly to the customer without markup.

In addition to data hosting charges, Amazon Glacier may charge archival, restoration, or early deletion fees. If you incur these costs, we will pass them on to you without markup.

However, if archived files are accessed only infrequently over a number of months, these fees are not expected to reduce the projected cost savings significantly.
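To make the arithmetic concrete, here is a minimal sketch of how the savings could be estimated. All prices and fees in it are hypothetical placeholders chosen so that archival storage costs about a third of standard storage; they are not actual Amazon Glacier or Platform rates, and the function name `archive_savings` is introduced here only for illustration.

```python
def archive_savings(size_gb, months, standard_price, archive_price, one_off_fees=0.0):
    """Estimate the money saved by archiving instead of keeping data always available.

    Prices are per GB per month. one_off_fees lumps together any archival,
    restoration, or early deletion fees incurred over the period.
    """
    standard_total = size_gb * standard_price * months
    archive_total = size_gb * archive_price * months + one_off_fees
    return standard_total - archive_total


# Hypothetical example: 1000 GB kept for 6 months.
# 0.03 and 0.01 are placeholder prices (archive is roughly a third of standard),
# and 5.0 is a placeholder for one-off archival and restoration fees.
savings = archive_savings(1000, 6, standard_price=0.03, archive_price=0.01, one_off_fees=5.0)
print(f"Estimated savings: {savings:.2f}")
```

With these placeholder numbers, the one-off fees are small compared with the difference in storage charges accumulated over six months, which is why they matter less the longer the data stays archived.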

Limitations of Archiving

Moving data to and from archival storage is not instantaneous. Depending on the type of archival storage used, archiving or restoring large files can take anywhere from several hours to a day or more.

While archived, files cannot be used as inputs to tasks, downloaded, or visualized in the Genome Browser, nor can their content be obtained in any other way. Archived files must first be restored.