Assembly QC

Enable HTTPS in the security group.

Installing software

BWA aligner

Do:

cd /root
wget -O bwa-0.7.5.tar.bz2 http://sourceforge.net/projects/bio-bwa/files/bwa-0.7.5a.tar.bz2/download

tar xvfj bwa-0.7.5.tar.bz2
cd bwa-0.7.5a
make

cp bwa /usr/local/bin

samtools

Do:

cd /root
curl -L 'http://sourceforge.net/projects/samtools/files/latest/download?source=files' > samtools.tar.bz2
tar xjf samtools.tar.bz2
mv samtools-* samtools-latest
cd samtools-latest/
make
cp samtools bcftools/bcftools misc/* /usr/local/bin

FRCAlign

First:

apt-get -y install libbz2-dev libboost-dev libboost-iostreams-dev libboost-program-options-dev libboost-thread-dev

Do:

cd /root
git clone https://github.com/vezzi/FRC_align
cd FRC_align
cd src/samtools
make
cd ..
cd ..
./configure
make install

REAPR

Do:

cd /root
curl -O ftp://ftp.sanger.ac.uk/pub4/resources/software/reapr/Reapr_1.0.16.tar.gz
tar xzf Reapr_1.0.16.tar.gz
cd Reapr_1.0.16

export PERL_MM_USE_DEFAULT=1
export PERL_EXTUTILS_AUTOINSTALL=--defaultdeps
export MAKEFLAGS='-j4'
perl -MCPAN -e 'install File::Spec::Link'

./install.sh
ln -s /root/Reapr_1.0.16/reapr /usr/local/bin/reapr
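
Once the builds above have finished, it can be worth confirming that the binaries actually landed on the PATH before starting the practicals. The following is only a minimal sketch: check_tools is a hypothetical helper name, not part of any of the packages, and the tool names in the example call are assumed from the cp and ln -s steps above.

```shell
# Hypothetical helper: report which of the given commands are on the PATH.
# Returns 0 if every command was found, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "MISSING: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (names assumed from the install steps above):
# check_tools bwa samtools bcftools reapr
```

If anything is reported as MISSING, re-run the corresponding install step and check its output for compile errors.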

Python modules

Do:

pip install biopython
pip install pysam

Install R

Do:

apt-get -y install r-base

scaffoldgap2bed

Do:

cd /usr/local/bin
curl -O https://raw.github.com/lexnederbragt/sequencetools/master/scaffoldgap2bed.py
chmod 770 scaffoldgap2bed.py

An IPython notebook

Do:

cd /usr/local/notebooks
curl -O https://raw.github.com/lexnederbragt/INF-BIO9120_fall2013_de_novo_assembly/master/practicals/Plot_insertsizes.ipynb

Data files

Create a new volume based on snapshot snap-78cf1764 and attach it to your running instance via the Amazon EC2 management interface. Once it is attached, note the device name it was given (a volume attached as sdf appears on the instance as xvdf) and mount it like this:

mkdir /data2
mount /dev/xvdf /data2
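
The sdf-to-xvdf renaming can be written down as a small shell function, which is handy if you attach the volume under a different letter. This is a sketch only: ec2_device_path is a hypothetical helper name, and the mapping is assumed from the single example given above; if in doubt, list the block devices on the instance (for example with lsblk) and use the name shown there.

```shell
# Hypothetical helper: map an EC2 attachment name (sdf or /dev/sdf) to the
# device path seen on the instance (/dev/xvdf), following the
# "sdf is xvdf" example above. Other names are passed through unchanged.
ec2_device_path() {
    name="${1#/dev/}"                      # drop a leading /dev/ if present
    case "$name" in
        sd*) echo "/dev/xvd${name#sd}" ;;  # sdf -> /dev/xvdf
        *)   echo "/dev/$name" ;;          # already xvdf, or something else
    esac
}

# Example: mount the volume that was attached as sdf
# mkdir /data2
# mount "$(ec2_device_path sdf)" /data2
```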

Practicals handouts

Use the following practicals from https://github.com/lexnederbragt/INF-BIO9120_fall2013_de_novo_assembly/tree/master/practicals

Mapping reads to an assembly

Evaluating assemblies with FRCbam

Assembly improvement using REAPR

Note that you’ll have to adjust file paths, and that we’re skipping a few things (e.g. SNP calling).

Downloading data from the VM

Use scp:

scp -i /path/to/keyfile.pem root@ec-xx-xx-xx-.compute-1.amazonaws.com:/path/to/data ./

For IGV, download:

  • velvet fasta file
  • bam and bai files
  • bed and gff files once you have them
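
Since IGV needs several files, the scp command above can be wrapped in a loop. This is a rough sketch under stated assumptions: fetch_for_igv, KEY_FILE, VM_HOST and DRY_RUN are hypothetical names introduced here, the placeholder values must be replaced with your own key file and instance hostname, and the file paths in the example call are made up for illustration.

```shell
# Placeholders (assumptions): set these to your own key file and hostname.
KEY_FILE="${KEY_FILE:-/path/to/keyfile.pem}"
VM_HOST="${VM_HOST:-your-instance-hostname}"

# Hypothetical helper: copy each given remote path into the current directory.
# With DRY_RUN=1 the scp commands are only printed, not executed.
fetch_for_igv() {
    for remote_path in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "scp -i $KEY_FILE root@$VM_HOST:$remote_path ./"
        else
            scp -i "$KEY_FILE" "root@$VM_HOST:$remote_path" ./
        fi
    done
}

# Example call (paths are illustrative only):
# DRY_RUN=1 fetch_for_igv /path/to/assembly.fa /path/to/map.bam /path/to/map.bam.bai
```

Running it first with DRY_RUN=1 lets you check the paths before any data is transferred.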

To connect to the IPython Notebook interface, connect to your instance over HTTPS (ignore/accept the security warning) and use the password ‘beacon’.

LICENSE: This documentation and all textual/graphic site content is licensed under the Creative Commons - 0 License (CC0) -- fork @ github. Presentations (PPT/PDF) and PDFs are the property of their respective owners and are under the terms indicated within the presentation.

Development and posting of this material, and the associated workshop, were supported by Grant Number R25HG006243 from the National Human Genome Research Institute and an NSF OCI supplement to NSF DBI-0939454.
