Compute costs

Compute costs depend on the cloud infrastructure provider. BioData Catalyst powered by Seven Bridges runs on two underlying cloud infrastructure providers: Amazon Web Services (AWS), with its Elastic Compute Cloud (EC2) instances, and Google Cloud Platform (GCP), with its Google Compute Engine instances. Which provider is used depends on the project location (cloud provider and region) selected during project creation.

The following table shows the compute service and charging unit for AWS and GCP:

Cloud service provider/region	Service name	Charging unit
Amazon Web Services (AWS US East)	EC2	per Second
Amazon Web Services (AWS US East)	EBS	per Second
Google Cloud Platform	Google Compute Engine	per Second (minimum 1 minute + 1 second increments)
Google Cloud Platform	Persistent SSD disk	per Second

Compute Costs: Amazon Web Services

If you are using an AWS-based location for your project, Amazon EC2 virtual computing environments, known as instances, are used to execute your analyses. There are various types of instances that have different configurations of CPU, memory, storage and networking capacity. See the list of AWS instances used on the Platform.

AWS charges for the use of their compute instances on a per second basis, but the rate depends on the AWS pricing model. The Platform uses two AWS pricing models:

On-Demand: you pay for compute capacity at a fixed hourly rate.
Spot: the hourly rate is dictated by the market (supply and demand) for AWS EC2 spare compute capacity.

All public workflows on the Platform are set up to use instances that offer the optimal ratio of price to compute power. You can also perform your own optimization for the applications that you want to use on the Platform.

Amazon EC2 On-Demand instances

The rate of On-Demand instances depends on the instance type used and is shown in the following price list (select US East (N. Virginia) from the Region dropdown menu, based on the region where your project is created or where you run the Platform). AWS expresses all rates per hour, but the calculation of the price is done on a per-second basis: the per-hour price is divided by 3600 and then multiplied by the actual number of seconds the instance was running.
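The per-second calculation described above can be sketched in Python as follows. This is only an illustration: the function name `per_second_cost` and the $0.40 hourly rate are assumptions made for the example, not Platform or AWS values.

```python
def per_second_cost(hourly_rate_usd: float, seconds_running: int) -> float:
    """Convert a rate quoted per hour into a charge for the seconds actually used."""
    return hourly_rate_usd / 3600 * seconds_running


# Hypothetical instance at $0.40 per hour that ran for 45 minutes (2700 seconds).
print(f"${per_second_cost(0.40, 2700):.2f}")  # prints $0.30
```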

Amazon EC2 Spot Instances

When your task runs using Spot Instances, you get charged the current market price for that instance for the time period the instance was running. AWS expresses all rates per hour, but the calculation of the price is done on a per-second basis: the per-hour price is divided by 3600 and then multiplied by the actual number of seconds the instance was running.

The maximum hourly price that you will be charged for an instance is the bid price of that instance. Because the bid price used by the Platform is the On-Demand price for that particular instance, you will never pay more than the On-Demand hourly rate for an instance you're using.

If the market price of a Spot Instance you're using exceeds the bid price (i.e. the On-Demand price for that instance type), the Spot Instance is terminated and the task continues running on an On-Demand instance. If the Spot Instance is terminated during the first hour of running the task on that instance, you are not charged for using it. However, if it is terminated at any point after the first 60 minutes, you are charged for the full number of seconds the instance was running.
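The Spot charging rules above can be approximated with the rough sketch below. It assumes, for simplicity, that the market price stays constant while the instance runs (in practice it fluctuates). The function name, its parameters, and the choice to treat exactly 3600 seconds as falling outside the first hour are assumptions made for illustration only.

```python
def spot_charge(market_rate_usd: float, on_demand_rate_usd: float,
                seconds_running: int, terminated_by_price: bool) -> float:
    """Estimate the charge for one Spot Instance run (simplified, constant market price)."""
    # A Spot Instance terminated within its first hour is not charged.
    if terminated_by_price and seconds_running < 3600:
        return 0.0
    # The bid price equals the On-Demand rate, so the hourly rate is capped at it.
    hourly_rate = min(market_rate_usd, on_demand_rate_usd)
    return hourly_rate / 3600 * seconds_running


# Hypothetical rates: $0.20 market price, $0.44 On-Demand price.
print(spot_charge(0.20, 0.44, 1800, terminated_by_price=True))            # 0.0
print(f"{spot_charge(0.20, 0.44, 7200, terminated_by_price=True):.2f}")   # 0.40
```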

Amazon EBS

Amazon EBS volumes are storage volumes that can be attached to compute instances to provide additional space for file storage while files are being used in computation tasks. EBS volumes can be attached to the following types of instances:

Instances that do not include any storage space (EBS-only instances), such as c4 and m4 instances. EBS storage is mandatory for these instances and you can define any size from 2 GB to 4096 GB depending on your task's storage requirements during computation.
Instances that already include storage space, in order to increase the available storage on the computation instance. In this case, the instance storage is completely replaced by the attached EBS storage, up to the maximum of 4096 GB. The option to increase storage size for instances with their own storage is especially convenient for bioinformatics workflows as the files that are used as inputs for computation tasks and the files produced as results can be very large.

EBS is billed on a per second basis, while AWS expresses prices for EBS disk space per GB-month (one GB provisioned for one month). Please note that Seven Bridges passes through EBS storage costs.

Cost example

Learn more about how EBS is charged from the example below. Note that the example only serves as an illustration of how EBS costs are calculated. For current EBS prices, refer to the official Amazon EBS pricing chart. Make sure to select US East (N. Virginia) from the Region dropdown menu.

Running RNA-seq Alignment - TopHat on a c4.2xlarge instance with 1TB of EBS:

c4.2xlarge = 8 CPUs, 15GB RAM (EBS Only) at $0.44 per Hour
Additional 1TB of EBS disk space at ~$0.10 per GB-month

Assuming that the workflow took 14 hours and 25 minutes (51900 seconds) to complete:

c4.2xlarge x 14h25m = $6.34
1TB EBS x 14h25m = ($0.10 per GB-month * 1024 GB * 51900 seconds) / (3600 seconds/hour * 24 hours/day * 30 days/month) = $2.05
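The arithmetic in this example can be reproduced with a short Python sketch. The prices are the illustrative figures from the example above, not current AWS rates, and the helper names are hypothetical. A month is taken to be 30 days, as in the formula above.

```python
SECONDS_PER_MONTH = 3600 * 24 * 30  # 30-day month, as assumed in the example


def instance_cost(hourly_rate_usd: float, seconds_running: int) -> float:
    """Per-second charge derived from a rate quoted per hour."""
    return hourly_rate_usd / 3600 * seconds_running


def ebs_cost(rate_per_gb_month_usd: float, size_gb: int, seconds_running: int) -> float:
    """Per-second charge derived from a rate quoted per GB-month."""
    return rate_per_gb_month_usd * size_gb * seconds_running / SECONDS_PER_MONTH


runtime = 14 * 3600 + 25 * 60               # 14h25m = 51900 seconds
compute = instance_cost(0.44, runtime)      # c4.2xlarge at the example rate
storage = ebs_cost(0.10, 1024, runtime)     # 1TB of EBS at the example rate

print(f"compute: ${compute:.2f}")            # compute: $6.34
print(f"storage: ${storage:.2f}")            # storage: $2.05
print(f"total:   ${compute + storage:.2f}")  # total:   $8.39
```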

Compute Costs: Google Cloud Platform

Google Compute Engine

If you selected a Google-based location for your project, Google Cloud Platform (GCP) offers a range of compute instances that can be used to run computation tasks. You will be able to select from n1-standard, n1-highmem and n1-highcpu instances to run tasks on the Platform. The full list of GCP instances that can be used on the Platform is available on this page.

The use of Google Compute Engine instances is charged on a per second basis. All instances are charged for a minimum of 1 minute. After the initial minute, instances are charged in 1 second increments.
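The one-minute minimum can be illustrated with a minimal Python sketch. The function names and the $0.19 hourly rate are assumptions made for the example, and the sketch assumes the running time is already given in whole seconds.

```python
def gce_billable_seconds(seconds_running: int) -> int:
    """Apply the 1-minute minimum; beyond that, billing follows actual seconds."""
    return max(60, seconds_running)


def gce_cost(hourly_rate_usd: float, seconds_running: int) -> float:
    """Charge for one instance run at a rate quoted per hour."""
    return hourly_rate_usd / 3600 * gce_billable_seconds(seconds_running)


print(gce_billable_seconds(20))    # 60  (short runs are billed as a full minute)
print(gce_billable_seconds(135))   # 135 (after the first minute, per-second billing)
print(gce_cost(0.19, 20) == gce_cost(0.19, 60))  # True
```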

Google SSD persistent disks

Google Cloud Platform instances come without any integrated storage, so attached storage must be defined for them. If you are running a task in a project whose location is set to a Google region, SSD persistent disks are used as attached storage for computation instances during execution of computation tasks. You can configure the attached storage size using the sbg:GoogleInstanceType hint. The size of attached storage can be from 2 GB to 4096 GB.

Cost example

Learn more about how Google SSD persistent disks are charged from the example below. Note that the example only serves as an illustration of how costs are calculated. For current and detailed information and pricing, please refer to the official GCP documentation on persistent disks.

Running RNA-seq Alignment - TopHat on an n1-standard-4 instance with 1TB of attached persistent SSD storage:

n1-standard-4 = 4 CPUs, 15GB RAM at $0.19 per Hour
Additional 1TB SSD persistent disk at $0.17 per GB-month

Assuming that the workflow took 14 hours and 25 minutes (51900 seconds) to complete:

n1-standard-4 x 14h 25m = $2.74
1TB persistent SSD x 14h 25m = ($0.17 per GB-month * 1024 GB * 51900 seconds) / (3600 seconds/hour * 24 hours/day * 30 days/month) = $3.49