JavaScript Cookbook
This page contains expressions written in sbg:draft-2 version of the Common Workflow Language. For CWL v1.0 expressions, please see this page.
This page provides examples of expressions that are entered in the Tool Editor when wrapping a tool for use on the CGC. Expressions can be used to dynamically set tool properties such as the base command, arguments, secondary files, metadata, and output file names. Dynamic expressions commonly use the predefined $self and $job objects, which are determined at runtime: $self denotes properties of the tool's inputs or outputs in a given execution, and $job denotes the ongoing tool execution (the job). Expressions can be entered anywhere in the Tool Editor where the </> symbol is present. For general information about expressions, please refer to dynamic expressions in tool descriptions.
Capture the name and content of an input file
The following expression picks out the name of the data file input for each execution of the tool. The expression is based on the $job object; in particular, it uses the inputs property of $job. An input has no property for the file name, but it does have a path property, which holds the file path of the input to the tool in a given execution. String manipulation on path yields the file name. In particular, we will use the following JavaScript expression:
$job.inputs.<input_port_ID_for_data_file>.path.split('/').slice(-1)[0]
In this expression, replace <input_port_ID_for_data_file> with the ID of the input port that receives the data file. The expression takes the path of the input on the specified port, splits it on '/', and selects the last element of the resulting list, which is the file name at the end of the path.
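The path manipulation can be tried outside the Tool Editor with a mock object. The following is a minimal sketch, assuming a hypothetical input port ID, input_file, and an invented path; on the CGC, $job is supplied by the platform at runtime.

```javascript
// Mock of the $job object. The port ID "input_file" and the path are
// hypothetical values used only for illustration.
const $job = {
  inputs: {
    input_file: { path: "/sbgenomics/Projects/demo/sample1.fastq" }
  }
};

// Split the path on '/' and keep the last element, which is the file name.
const parts = $job.inputs.input_file.path.split('/');
const fileName = parts.slice(-1)[0];

console.log(fileName); // sample1.fastq
```

Because slice(-1) returns a one-element array, the trailing [0] is what turns the result into a plain string.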
To refer to the content of the file, use the following expression:
$job.inputs.<input_ID_for_data_file>
As before, <input_ID_for_data_file> refers to the ID of the input port that receives the data file. This expression evaluates to the file object supplied on that port, which stands for the content of the file named by the expression above.
Name output files based on the Sample ID metadata field of input files
The expression below is used to name an output file of a tool based on the value in the Sample ID metadata field obtained from the input files. As some tools allow you to specify the output file name as a command line argument, this expression can be used to define the argument value in the Arguments section of the Tool Editor's General tab.
{
input_files = [].concat($job.inputs.<input_port_ID>)
filename = input_files[0].path.split('/').slice(-1)[0];
if (input_files[0].metadata && input_files[0].metadata.sample_id)
{
filebase = input_files[0].metadata.sample_id
}
else
{
filebase = "sample_unknown"
}
return filebase.concat("<file_extension>")
}
The expression works as follows:
The following two lines of code get the file name(s) from the input file or array of files.
input_files = [].concat($job.inputs.<input_port_ID>)
filename = input_files[0].path.split('/').slice(-1)[0];
Make sure to replace <input_port_ID> with the corresponding port ID of your app.
The next part of the expression checks whether the input file has a value set in the Sample ID metadata field. If there is a value, it is used as the base name of the output file. Otherwise, the base name will be sample_unknown.
if (input_files[0].metadata && input_files[0].metadata.sample_id)
{
filebase = input_files[0].metadata.sample_id
}
else
{
filebase = "sample_unknown"
}
Finally, the extension is appended to the base name of the file:
return filebase.concat("<file_extension>")
Make sure to replace <file_extension> with the extension of the tool's output file, including the dot (for example .bam).
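To see both branches of the expression in action, the logic can be wrapped in a plain function and run against mock data. This is only a sketch: the function name, the port ID input_reads, the paths, and the sample ID are all hypothetical.

```javascript
// Hypothetical wrapper around the expression body so that it can be run
// with mock data; "input_reads" is an assumed port ID.
function nameFromSampleId($job, extension) {
  // [].concat() turns a single file into a one-element array and leaves
  // an array of files unchanged.
  const inputFiles = [].concat($job.inputs.input_reads);
  const first = inputFiles[0];
  const filebase = (first.metadata && first.metadata.sample_id)
    ? first.metadata.sample_id
    : "sample_unknown";
  return filebase.concat(extension);
}

// Case 1: an array of files whose first file has a Sample ID.
const withSampleId = {
  inputs: {
    input_reads: [{ path: "/data/a.fastq", metadata: { sample_id: "TCGA-01" } }]
  }
};

// Case 2: a single file without any metadata.
const withoutMetadata = {
  inputs: { input_reads: { path: "/data/b.fastq" } }
};

console.log(nameFromSampleId(withSampleId, ".bam"));    // TCGA-01.bam
console.log(nameFromSampleId(withoutMetadata, ".bam")); // sample_unknown.bam
```

Only the first input file is consulted, so when several files are supplied the output name follows the Sample ID of the first one.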
Name output files based on input file names
The following expression will retrieve the name of the input file for the job and return it as the name of the output file. This expression can also be used with tools that allow the output file name to be defined through a command line argument. It is entered as the argument value in the Arguments section of the Tool Editor's General tab.
{
reads = [].concat($job.inputs.<input_port_ID>)
file_path = reads[0].path
filename = file_path.split('/').slice(-1)[0]
filebase = filename.split('.').slice(0, -1).join('.')
out_name = filebase.concat("<file_extension>")
return out_name
}
The expression works as follows:
The following two lines of code get the file name(s) from the file or array of files that have been provided as the input.
reads = [].concat($job.inputs.<input_port_ID>)
file_path = reads[0].path
In this part of the expression, make sure to replace <input_port_ID> with the corresponding port ID of your app.
The next part of the expression extracts the name of the input file. This will be used as the base name of the output file:
filename = file_path.split('/').slice(-1)[0]
filebase = filename.split('.').slice(0, -1).join('.')
The last part of the expression appends the extension to the base name and returns the full name of the output file:
out_name = filebase.concat("<file_extension>")
return out_name
You need to replace <file_extension> in the code to match the extension you need for your output file. The extension should also include the dot (for example .fastq).
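The base-name step removes only the last dot-separated part of the file name, which matters for names with several dots or none. The sketch below illustrates this with a hypothetical helper, baseName, and invented paths.

```javascript
// Hypothetical helper that mirrors the two base-name lines of the expression.
function baseName(path) {
  const filename = path.split('/').slice(-1)[0];
  // Drop the last dot-separated part and re-join the rest.
  return filename.split('.').slice(0, -1).join('.');
}

console.log(baseName("/data/sample1.fastq"));    // sample1
console.log(baseName("/data/sample1.fastq.gz")); // sample1.fastq
console.log(baseName("/data/README"));           // prints an empty line
```

A file name without any dot yields an empty base name, so the output would consist of the extension alone; for compound extensions such as .fastq.gz, a larger negative index in slice (for example slice(0, -2)) removes both parts.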
Set metadata fields of output files based on inputs for paired end 1 and 2
This expression is used to copy the values in the Paired-end metadata field from input files for a given job to their corresponding paired-end output files.
{
filename = $self.path.split('/').slice(-1)[0]
filebase = filename.split('.').slice(0, -3).join('.')
reads = [].concat($job.inputs.<input_port_ID>)
for (i=0; i<reads.length; i++)
{
input_filename = reads[i].path.split('/').slice(-1)[0]
input_filebase = input_filename.split('.').slice(0, -1).join('.')
if (filebase==input_filebase && $job.inputs.<input_port_ID>[i].metadata && $job.inputs.<input_port_ID>[i].metadata.paired_end)
{
return $job.inputs.<input_port_ID>[i].metadata.paired_end
}
}
}
The expression is entered in the Metadata Value field on the output port's setup screen. The Metadata Key value needs to be paired_end.
The code is analyzed below:
The following two lines extract the base name of the output file on the output port:
filename = $self.path.split('/').slice(-1)[0]
filebase = filename.split('.').slice(0, -3).join('.')
The expression assumes that the output file extension consists of three dot-separated parts, such as <base_name>.pe_1.fastq.gz. The slice(0, -3) call in the second line above extracts the part of the file name before .pe_1.fastq.gz, but it can be adjusted to match the naming convention of your tool's paired-end output files. For example, if the naming convention is <base_name>.fastq, the call needs to be slice(0, -1).
The next line creates an array containing the input file(s):
reads = [].concat($job.inputs.<input_port_ID>)
Make sure to replace <input_port_ID> with the actual ID of the input port for paired-end files.
Finally, a for loop iterates through the array of input files and gets the base name of each file:
for (i=0; i<reads.length; i++)
{
input_filename = reads[i].path.split('/').slice(-1)[0]
input_filebase = input_filename.split('.').slice(0, -1).join('.')
if (filebase==input_filebase && $job.inputs.<input_port_ID>[i].metadata && $job.inputs.<input_port_ID>[i].metadata.paired_end)
{
return $job.inputs.<input_port_ID>[i].metadata.paired_end
}
}
Once it finds the input file whose name matches the name of the output file, it checks whether the input file has metadata and has a value in the Paired-end metadata field. If there is a value, the expression evaluates to that value, which becomes the Paired-end metadata value for the output file.
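The matching logic can be checked locally by wrapping it in a function and passing mock objects for $self and $job. This is a sketch under stated assumptions: the function name, the port ID input_reads, and all file names are hypothetical, and the output naming convention is taken to be <base_name>.pe_N.fastq.gz.

```javascript
// Hypothetical wrapper: $self is the output file, $job holds the inputs.
function pairedEndFor($self, $job) {
  const filename = $self.path.split('/').slice(-1)[0];
  // Drop the last three dot-separated parts, e.g. ".pe_2.fastq.gz".
  const filebase = filename.split('.').slice(0, -3).join('.');
  const reads = [].concat($job.inputs.input_reads);
  for (let i = 0; i < reads.length; i++) {
    const inputFilename = reads[i].path.split('/').slice(-1)[0];
    // Input files are assumed to carry a single extension, e.g. ".fastq".
    const inputFilebase = inputFilename.split('.').slice(0, -1).join('.');
    if (filebase === inputFilebase && reads[i].metadata && reads[i].metadata.paired_end) {
      return reads[i].metadata.paired_end;
    }
  }
  // No matching input: the function returns undefined.
}

const $job = {
  inputs: {
    input_reads: [
      { path: "/in/sampleA_R1.fastq", metadata: { paired_end: "1" } },
      { path: "/in/sampleA_R2.fastq", metadata: { paired_end: "2" } }
    ]
  }
};

console.log(pairedEndFor({ path: "/out/sampleA_R2.pe_2.fastq.gz" }, $job)); // 2
console.log(pairedEndFor({ path: "/out/other.pe_1.fastq.gz" }, $job));      // undefined
```

When no input base name matches the output base name, nothing is returned, so the metadata value is left unset for that output file.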
Even if you have entered the expression correctly and replaced all placeholders with proper values, you might still get an error message when you save the expression. The message is incorrect in this case; it is a known bug on the CGC.
Order input reads based on paired-end metadata
Some tools, such as BWA MEM Bundle, require paired-end input reads to be ordered in the correct sequence (paired-end 1 first, followed by paired-end 2). This expression will automatically order the input reads based on the values entered in the paired_end metadata field:
{
if($job.inputs.input_reads[0] instanceof Array){
input_reads = $job.inputs.input_reads[0]
} else {
input_reads = $job.inputs.input_reads = [].concat($job.inputs.input_reads)
}
read_metadata = input_reads[0].metadata
if(!read_metadata) read_metadata = []
order = 0
if(read_metadata == []){ order = 0 }
else if('paired_end' in read_metadata){
pe1 = read_metadata.paired_end
if(pe1 != 1) order = 1
}
if (input_reads.length == 1){
return input_reads[0].path
}
else if (input_reads.length == 2){
if (order == 0) return input_reads[0].path + ' ' + input_reads[1].path
else return input_reads[1].path + ' ' + input_reads[0].path
}
}
The expression works as follows:
The first code block checks whether the input is an array or array of arrays and returns the correct value containing the input reads:
{
if($job.inputs.input_reads[0] instanceof Array){
input_reads = $job.inputs.input_reads[0]
} else {
input_reads = $job.inputs.input_reads = [].concat($job.inputs.input_reads)
}
The following part reads metadata from the first supplied input read. If there is no metadata, it assigns an empty array to the read_metadata variable:
read_metadata = input_reads[0].metadata
if(!read_metadata) read_metadata = []
The order flag is used to mark the order of input reads. The starting assumption is that the reads are ordered correctly, which is denoted by assigning the value 0 to order.
order = 0
The following code block starts by checking whether the first input read has any assigned metadata values. If it finds no values, no sorting can be done based on metadata and it assumes that the reads are in correct order (order = 0). Otherwise, it checks whether paired-end 1 corresponds to the first given read:
if(read_metadata == []){ order = 0 }
else if('paired_end' in read_metadata){
pe1 = read_metadata.paired_end
if(pe1 != 1) order = 1 // change order
}
Finally, the expression checks how many reads there are and returns them in the correct order. If only one input read is present, this read is returned as there is no need for ordering. If there are two input reads, they are returned in the correct order based on the value of the order flag.
if (input_reads.length == 1){
return input_reads[0].path
}
else if (input_reads.length == 2){
if (order == 0) return input_reads[0].path + ' ' + input_reads[1].path
else return input_reads[1].path + ' ' + input_reads[0].path
}
}
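The same ordering logic can be sketched as a standalone function for local testing. Note that in JavaScript a comparison such as read_metadata == [] compares object references and is always false, so this sketch tests for the paired_end key directly; the outcome is the same, because order stays 0 unless the first read is explicitly marked as something other than paired-end 1. The function name and file paths are hypothetical.

```javascript
// Hypothetical standalone version of the ordering logic.
function orderReads(inputReads) {
  // Accept a single file, an array of files, or an array of arrays.
  let reads = [].concat(inputReads);
  if (Array.isArray(reads[0])) reads = reads[0];

  const metadata = reads[0].metadata || {};
  // Swap only when the first read is explicitly marked as something other
  // than paired-end 1; the loose != lets both "1" and 1 count as 1.
  const swap = 'paired_end' in metadata && metadata.paired_end != 1;

  if (reads.length === 1) return reads[0].path;
  if (reads.length === 2) {
    return swap
      ? reads[1].path + ' ' + reads[0].path
      : reads[0].path + ' ' + reads[1].path;
  }
  // Any other number of reads: undefined, as in the expression above.
}

const reversed = [
  { path: "/in/r2.fastq", metadata: { paired_end: "2" } },
  { path: "/in/r1.fastq", metadata: { paired_end: "1" } }
];
console.log(orderReads(reversed)); // /in/r1.fastq /in/r2.fastq
```

Only the metadata of the first read is inspected, so the result is reliable only when both reads of a pair are annotated consistently.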
Configure a tool to unpack a TAR archive provided as its input
In some cases, input files taken by a tool come in the form of a TAR archive. TAR archives can be produced by e.g. aligner indexers, which are a set of indexing tools that index reference files and output an archive containing the reference and the index file(s). A tool that needs to use the files from a TAR archive can be configured to unpack it.
Prerequisite: To simplify unpacking the archive with the expression below, the tool's input port that takes the archive file must have the Stage Input > Link option configured. This makes the TAR archive available directly in the tool's working directory. Learn more about Stage Input and see how to configure the Stage Input option for a tool.
1. Navigate to the Apps tab in your project.
2. Click the pencil icon next to the tool you want to configure.
3. Navigate to the General tab in the Tool Editor.
4. Click + in the Base Command section. If the field(s) in the Base Command section have already been populated, copy the content of each field to the first blank field below it, until the very first field in the section becomes blank.
5. Click </> next to the first field.
6. Paste the following code:
{
var index_files_bundle = $job.inputs.<input_port_ID>.path.split('/').slice(-1)[0]
return 'tar -xf ' + index_files_bundle + ' ; '
}
The first line of the expression retrieves the name of the archive file using the $job object. The second line appends the retrieved file name to the command that will unpack the archive file.
Please make sure to replace <input_port_ID> in the above code with the ID value of your tool's input port that takes the archive file.
7. Click Save.
8. Click Save in the top-right corner of the Tool Editor.
Your tool is now configured to unpack a TAR archive it receives as its input.
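To preview the string that the base command expression produces, it can be evaluated against a mock $job object. The sketch below assumes a hypothetical port ID, reference_archive, and an invented archive path.

```javascript
// Mock of the $job object; the port ID and the path are hypothetical.
const $job = {
  inputs: {
    reference_archive: { path: "/sbgenomics/Projects/demo/bwa_index.tar" }
  }
};

// Take the archive file name from the end of the path.
const archiveName = $job.inputs.reference_archive.path.split('/').slice(-1)[0];

// The trailing ' ; ' separates the unpacking step from the rest of the
// base command that follows it on the same command line.
const prefix = 'tar -xf ' + archiveName + ' ; ';

console.log(prefix); // tar -xf bwa_index.tar ;
```

Because the file name is inserted into a shell command as-is, archive names containing spaces or shell metacharacters would need quoting.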