Release notes - April 3, 2023

New Public Projects view

The new Public Projects gallery view is now available on BioData Catalyst powered by Seven Bridges. The interface resembles our Public Apps gallery and gives an overview of the purpose and content of each project on a single page, making projects easier to browse and helping you judge how useful they are for your specific use cases.

Recently published apps

We have published the following new and updated apps in our Public Apps gallery:

  • ABySS 2.3.5 - a de novo sequence assembler intended for short paired-end reads and genomes of all sizes.

  • Minia 3.2.6 - a short-read assembler based on a de Bruijn graph.

  • IDBA 1.1.3 toolkit:

    • IDBA-Hybrid - a de novo assembler for hybrid sequencing data.
    • IDBA-UD - a de novo assembler for short-read data.
    • fq2fa - converts FASTQ read data to the FASTA format required by the IDBA tools.

  • ABACAS 1.3.1 - used for contiguating (aligning, ordering, and orienting) assembled contigs against a reference sequence.

  • Viralrecon Illumina De novo assembly workflow - designed for assembly of amplicon and metagenomics short reads. It can analyze metagenomics data obtained from shotgun sequencing (e.g. directly from clinical samples) and from enrichment-based library preparation methods (e.g. amplicon-based or probe-capture-based data). It takes Illumina short reads from one or more samples and performs read trimming, host-read removal, assembly with one of the five included assemblers, BLAST searches, and calculation of various QC metrics.

  • Picard 3.0.0 toolkit:

    • Picard CollectMultipleMetrics - collects BAM statistics by running multiple Picard modules at once.
    • Picard ValidateSamFile - validates an alignment file against the SAM specification.
    • Picard SortSam - sorts alignment files (BAM or SAM).
    • Picard RevertSam - reverts a BAM/SAM file to a previous state.
    • Picard MarkDuplicates - marks duplicate reads in alignment files.
    • Picard GenotypeConcordance - calculates genotype concordance between two VCF files.
    • Picard GatherBamFiles - merges BAM files after a scattered analysis.
    • Picard FixMateInformation - verifies and fixes mate-pair information.
    • Picard FastqToSam - converts FASTQ files to an unaligned SAM or BAM file.
    • Picard CrosscheckFingerprints - checks a set of data files for sample identity.
    • Picard CreateSequenceDictionary - creates a sequence dictionary (DICT) file for a reference sequence.
    • Picard CollectWgsMetricsWithNonZeroCoverage - evaluates the coverage and performance of WGS experiments.
    • Picard CollectVariantCallingMetrics - collects variant call statistics after variant calling.
    • Picard CollectSequencingArtifactMetrics - collects metrics to quantify single-base sequencing artifacts.
    • Picard CollectHsMetrics - collects hybrid-selection metrics for alignments in SAM or BAM format.
    • Picard CollectAlignmentSummaryMetrics - produces a summary of alignment metrics from a SAM or BAM file.
    • Picard CheckFingerprint - checks sample identity of provided data against known genotypes.
    • Picard BedToIntervalList - converts a BED file to the Picard INTERVAL_LIST format.
    • Picard AddOrReplaceReadGroups - assigns all reads to the specified read group.

  • SnpEff 5.1d toolkit:

    • SnpSift Filter - filters SnpEff-annotated VCF files using arbitrary expressions.
    • SnpEff - a variant annotation and effect prediction tool.
    • SnpSift Annotate - annotates VCF files.
    • SnpSift dbNSFP - annotates VCF files with dbNSFP (an integrated database of functional predictions from multiple algorithms, including SIFT, Polyphen2, LRT, MutationTaster, PhyloP and GERP++).

We have also published the following IEDB tools:

  • MHC-I Binding Prediction (MHC I 3.1.2 toolkit) - predicts peptides that bind to MHC I molecules.
  • MHC-II Binding Prediction (MHC II 3.1.6 toolkit) - predicts peptides that bind to MHC II molecules.
  • MHCflurry Predict (MHCflurry 2.0.4 toolkit) - used for peptide/MHC I binding affinity prediction.
  • MHCflurry Scan (MHCflurry 2.0.4 toolkit) - designed to scan protein sequences and predict MHC-I ligands.
  • AXEL-F: Antigen eXpression based Epitope Likelihood-Function (AXEL-F 1.0.0 toolkit) - used for MHC-I epitope prediction.
  • NetChop (NetChop 3.0 toolkit) - a predictor of proteasomal processing based upon a neural network.
  • NetCTL (NetCTL 3.0 toolkit) - a T cell epitope predictor.
  • NetCTLpan (NetCTLpan 3.0 toolkit) - a T cell epitope predictor.
  • Class I Immunogenicity (Class I Immunogenicity 3.0 toolkit) - predicts the immunogenicity of a peptide MHC (pMHC) complex.
  • TCRMatch (TCRMatch 1.0.2 toolkit) - predicts T cell receptor specificity based on sequence similarity to characterized receptors.
  • BCell (BCell 3.1 toolkit) - predicts linear B cell epitopes based on antigen characteristics.
  • ElliPro (ElliPro 1.0 toolkit) - predicts antibody epitopes based upon solvent-accessibility and flexibility.
  • Population Coverage (Population Coverage 3.0 toolkit) - calculates the fraction of individuals predicted to respond to a given set of epitopes.
  • Epitope Cluster Analysis (Epitope Cluster Analysis 1.0 toolkit) - groups epitopes into clusters based on sequence identity.