Genome binning with CONCOCT

CONCOCT is automated genome-binning software that can use three sources of information: coverage across multiple samples, sequence composition, and paired-end linkage.

The manuscript describing the software is available on arXiv:

CONCOCT: Clustering cONtigs on COverage and ComposiTion

The GitHub repository is here:

The basic steps are as follows:

  • Generate a co-assembly from all the reads in your dataset (or a subset if too large)
  • Map reads back on a per-sample basis to the contigs to generate coverage information
  • Run CONCOCT to produce clusters
  • (Optionally) Link clusters with paired-end reads
  • Evaluate results, e.g. using taxonomic assignments of contigs and presence of conserved genes in clusters
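To make the "coverage and composition" idea in the steps above more concrete, here is a minimal Python sketch of how one contig could be turned into a numeric feature vector before clustering. It is an illustration only, not CONCOCT's own code: the function names (`kmer_profile`, `mean_coverage`, `feature_vector`), the choice of k = 4, the pseudo-count, and the log transform are all assumptions made for this example, and the mapped-base counts in the usage note are made up.

```python
from itertools import product
from math import log


def kmer_profile(sequence, k=4):
    """Return pseudo-counted, normalised k-mer frequencies for one contig."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {kmer: 1 for kmer in kmers}  # pseudo-count of 1 avoids log(0) later
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing N or other symbols
            counts[window] += 1
    total = sum(counts.values())
    return [counts[kmer] / total for kmer in kmers]


def mean_coverage(mapped_bases_per_sample, contig_length):
    """Mean per-base coverage of one contig in each sample."""
    return [bases / contig_length for bases in mapped_bases_per_sample]


def feature_vector(sequence, mapped_bases_per_sample, k=4):
    """Concatenate log-transformed coverage and composition for one contig."""
    coverage = mean_coverage(mapped_bases_per_sample, len(sequence))
    composition = kmer_profile(sequence, k)
    return [log(c + 1) for c in coverage] + [log(f) for f in composition]
```

For a hypothetical 100 bp contig with 500, 0 and 1500 mapped bases in three samples, `feature_vector("ACGT" * 25, [500, 0, 1500])` yields 3 coverage values followed by 256 tetranucleotide values. Contigs from the same genome are expected to have similar vectors, which is what makes clustering on these features possible.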

To install CONCOCT, follow the instructions here:

A complete walk-through is available here:

A repository containing the input and output files for a simple, worked example is available here:

A recent presentation by Chris Quince on CONCOCT is available to “watch again” from Beatles and Bioinformatics at this link:

