Our list of good practices in (meta)genome assembly

Lex Nederbragt (corr) et al.

  • talk to the bioinformatician(s) before doing anything
  • QC your reads, e.g. with FastQC or preqc (khmer?)
  • try different programs for assembly (not too many, but more than one)
  • map the reads back to the assembly and use QC/Validation programs
  • use orthogonal data for QC/validation:
      * known genes
      * CEGMA/PhyloSift
      * RNA-seq data
      * linkage map data, optical mapping data, fosmid or BAC data
  • do blobology to figure out what you actually assembled (for any genome/metagenome)
  • make reads, mapped reads and validation results available upon release of the genome (or before)
  • make the genome assembly work reproducible
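
As a small illustration of the "what did you actually assemble" step, here is a minimal sketch in Python that tabulates the length and GC fraction of each contig in a FASTA file. It is not the blobology pipeline itself: blobology-style plots also need per-contig read coverage (from mapping the reads back to the assembly) and a taxonomic assignment, neither of which is computed here. The function names read_fasta and contig_summary and the example contigs are hypothetical, chosen only for this sketch.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first word of the header as the contig name.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def contig_summary(lines: Iterable[str]) -> Dict[str, Tuple[int, float]]:
    """Map contig name -> (length, GC fraction).

    The GC fraction is computed over A/C/G/T only, so runs of N
    (gaps in scaffolds) do not dilute it. This is an assumption of
    the sketch, not a convention taken from any particular tool.
    """
    summary = {}
    for name, seq in read_fasta(lines):
        seq = seq.upper()
        gc = seq.count("G") + seq.count("C")
        acgt = gc + seq.count("A") + seq.count("T")
        summary[name] = (len(seq), gc / acgt if acgt else 0.0)
    return summary


# Made-up contigs, for illustration only.
example = [">contig_1 demo", "ACGTACGT", "GGCC", ">contig_2", "ATATNNAT"]
for name, (length, gc) in contig_summary(example).items():
    print(f"{name}\t{length}\t{gc:.2f}")
```

For a real assembly, pass an open file handle instead of the example list, e.g. contig_summary(open("assembly.fasta")), where the file name is a placeholder. Contigs whose GC fraction differs strongly from the bulk of the assembly are candidates for a closer look, for instance as possible contamination.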

LICENSE: This documentation and all textual/graphic site content is licensed under the Creative Commons Zero (CC0) license. Presentations (PPT/PDF) and PDFs are the property of their respective owners and are under the terms indicated within the presentation.

Development and posting of this material, and the associated workshop, were supported by Grant Number R25HG006243 from the National Human Genome Research Institute and an NSF OCI supplement to NSF DBI-0939454.

