Release notes - January 3, 2024

Recently updated apps

We have updated the following tools on BioData Catalyst powered by Seven Bridges: 

  • Exomiser 13.3.0 – used to identify candidate causative variants from WES or WGS patient VCF data and phenotype HPO terms. 
  • PharmCAT 2.8.3 toolkit: 
    • PharmCAT VCF Preprocess – prepares an input VCF file for PharmCAT
    • PharmCAT – takes a single-sample VCF file and returns a report with guideline variants. 
  • Sambamba 1.0.1 toolkit: 
    • Sambamba Index – creates a BAI or FAI index for the provided BAM/FASTA file. 
    • Sambamba Slice – copies a slice (region) of the coordinate-sorted and indexed input file in BAM or FASTA format.
    • Sambamba Sort – sorts alignments in BAM format. 
    • Sambamba Markdup – marks or removes duplicate reads from an input BAM file. 
    • Sambamba Flagstat – creates read flag statistics from a BAM file. 
    • Sambamba Merge – merges alignments in BAM format. 
    • Sambamba View – inspects and filters alignments in SAM/BAM format. 
  • Clair3 1.0.4 – calls small germline variants from data generated by Nanopore, PacBio or Illumina sequencing technologies. 
  • cuteSV 2.1.0 – calls structural variation from sorted long-read alignments.
  • Twelve tools from the Bismark 0.24.1 toolkit: 
    • Bismark – takes files with bisulfite-treated reads and aligns them to a specified bisulfite genome. 
    • Bismark Methylation Extractor – extracts the methylation call for every Cytosine in Bismark result files. 
    • Bismark Genome Preparation – converts the specified reference genome into a bisulfite-converted genome.
    • Bismark Deduplicate – removes duplicate Bisulfite-Sequencing (BS-Seq) reads from an alignment file. 
    • Bismark2BedGraph – generates bedGraph and coverage files sorted by chromosomal position. 
    • Bismark2Report – uses Bismark alignment, deduplication and methylation reports to generate a graphical HTML report. 
    • Bismark2Summary – uses Bismark report files of several samples to generate a graphical summary HTML report. 
    • Bismark Bam2Nuc – calculates the mono- and di-nucleotide coverage and compares it to the average genomic sequence composition.
    • Bismark Coverage2Cytosine – generates a cytosine methylation report for a genome of interest. 
    • Bismark Filter Non Conversion – filters incomplete bisulfite conversion in non-CG context in Bismark BAM files. 
    • Bismark Methylation Consistency – splits BAM files based on methylation consistency. 
    • Bismark NOMe Filtering – filters reads in a yacht file (output of Bismark Methylation Extractor). 
  • Bismark Analysis 0.24.1 workflow for analyzing DNA methylation, a type of epigenetic modification. The workflow processes reads from Whole Genome Bisulfite Sequencing (WGBS) and Reduced Representation Bisulfite Sequencing (RRBS). While suitable for any input size, it excels with larger input samples. 
  • Three tools from the Salmon 1.10.1 toolkit: 
    • Salmon Index – builds an index necessary for the Salmon Quant – Reads and Salmon Alevin tools. 
    • Salmon Quant – Reads – infers transcript abundance estimates from RNA-seq data, using Selective Alignment (SA) for mapping. 
    • Salmon Quant – Alignment – estimates transcript abundance from aligned RNA-seq data using the Variational Bayesian EM algorithm. 
  • Salmon Workflow 1.10.1 for estimating transcript abundances from RNA-Seq data using Selective Alignment for mapping. The workflow enables creation of the necessary index for quantification and provides the capability to process multiple samples at once. It also creates an expression matrix at both the transcript and gene levels, aggregating expression results across all samples. 
  • ENCODE ChIP-Seq Pipeline (v2.2.1). This workflow represents the ENCODE transcription factor and histone ChIP-Seq analysis pipelines. ChIP-Seq analysis studies chromatin modifications and binding patterns of transcription factors and other proteins. It combines chromatin immunoprecipitation (ChIP) assays with next-generation sequencing (NGS). The workflow consists of read mapping with duplicate removal, post-alignment QC, cross-correlation analysis, peak calling with blacklist filtering, and a statistical framework applied to the replicated peaks at the end to assess the concordance of biological replicates.
  • ENCODE ATAC-Seq Pipeline. ATAC-Seq analysis performs quality control and signal processing, producing alignments and measures of enrichment. The Assay for Transposase-Accessible Chromatin followed by sequencing (ATAC-Seq) experiment provides genome-wide profiles of chromatin accessibility. Briefly, the ATAC-Seq method works as follows: loaded transposase inserts sequencing primers into open chromatin sites across the genome, and reads are then sequenced. The ends of the reads mark open chromatin sites. The workflow is based on the ENCODE ATAC-Seq pipeline, developed by the ENCODE Consortium. The four major steps of the ATAC-Seq analysis are: pre-alignment quality control; alignment; post-alignment processing and advanced ATAC-Seq-specific quality control; and peak calling to identify accessible regions, which is the basis for advanced downstream analysis.