Alignment Demo

Overview

Coverage Analysis - observe coverage over annotated genes

1. Use Bowtie to run a sample alignment of Illumina reads to the target organism's genome
2. Use SAMtools to transform SAM files into BAM
3. Use IGV to visualize results

shell $> cp /ecg/seqprg/scripts/profile ~/.profile # the leading "." in ".profile" is important

Close the terminal and open a new one so that the new profile is loaded.
Then check that the path is set properly:

shell $> echo $PATH
* should see .:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/ecg/seqprg/bin

shell $> which bowtie
* should see /ecg/seqprg/bin/bowtie
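
The same check can be scripted for all of the command-line tools used in this demo. This is only a minimal sketch: the function name check_tools is made up for illustration and is not part of the course scripts.

```shell
#!/bin/sh
# Sketch: report which of the named commands are missing from PATH.
# check_tools is a hypothetical helper name, not part of the course scripts.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "MISSING: $tool"
            missing=1
        fi
    done
    # Exit status is 0 only when every tool was found
    return "$missing"
}

# Usage (tools used later in this demo):
#   check_tools bowtie bowtie-build samtools
```

If any tool is reported as MISSING, the profile was probably not copied or the terminal was not reopened.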

shell $> /ecg/seqprg/scripts/init_meta2.sh

shell $> cd ~/meta2_work/01_alignment/

Using Bowtie to align sequences to a reference

* Run bowtie-build to create an index representing the reference sequence

shell $> bowtie-build -f reference/Bacteroides_vulgatus_ATCC_8482.fasta reference/Bacteroides_vulgatus_ATCC_8482

   -f = input is in FastA format
   first argument = input file
   second argument = filename prefix for index to be built
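
bowtie-build (Bowtie 1) writes six index files named after the prefix: .1.ebwt, .2.ebwt, .3.ebwt, .4.ebwt, .rev.1.ebwt and .rev.2.ebwt. A quick way to confirm that the build finished is to check that all six exist. The sketch below assumes a regular (small) index; the helper name index_complete is hypothetical.

```shell
#!/bin/sh
# Sketch: confirm that the six Bowtie 1 index files exist for a prefix.
# index_complete is a hypothetical helper name. Assumes the regular
# .ebwt suffix; other Bowtie versions or very large genomes may differ.
index_complete() {
    prefix=$1
    for part in 1 2 3 4 rev.1 rev.2; do
        # -s is true only if the file exists and is not empty
        if [ ! -s "$prefix.$part.ebwt" ]; then
            echo "missing or empty: $prefix.$part.ebwt"
            return 1
        fi
    done
    echo "index complete: $prefix"
}

# Usage:
#   index_complete reference/Bacteroides_vulgatus_ATCC_8482
```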

* Run Bowtie on reads from an Illumina RNA-Seq run against the Bacteroides vulgatus reference

shell $> bowtie -X 600 -v 2 -m 1 -S reference/Bacteroides_vulgatus_ATCC_8482 \
            -1 illumina_reads/inputSeqs_1.fastq -2 illumina_reads/inputSeqs_2.fastq output/alignments.sam

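   -X = maximum insert size for paired-end alignment
   -v = number of mismatches allowed anywhere in the read (3 max)
   -m = suppress reads with more than this many reportable alignments (m = 1 keeps only unique alignments)
   -S = produce output in SAM format
   -1 / -2 = matching mate files for paired-end reads
   last argument = output file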
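
Because -1 and -2 must be matching mate files, it can be worth checking that both files hold the same number of reads before aligning. The following is a rough sketch that assumes every FASTQ record is exactly four lines (no wrapped sequence lines); count_reads and check_mates are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: check that two paired-end FASTQ files hold the same number of reads.
# Assumes each record is exactly four lines (header, sequence, "+", qualities).
# count_reads and check_mates are hypothetical helper names.
count_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

check_mates() {
    n1=$(count_reads "$1")
    n2=$(count_reads "$2")
    echo "$1: $n1 reads"
    echo "$2: $n2 reads"
    # Exit status is 0 only when the counts agree
    [ "$n1" -eq "$n2" ]
}

# Usage:
#   check_mates illumina_reads/inputSeqs_1.fastq illumina_reads/inputSeqs_2.fastq
```

Equal counts do not prove that the mates are in the same order, but unequal counts are a clear sign that the files do not belong together.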

Using SAMtools on Bowtie output to create BAM files that can be viewed in IGV

* Make a BAM file from the existing SAM file

shell $> samtools view -bS -o output/alignments.bam output/alignments.sam

   view = extract all alignments from SAM or BAM file
   -b = output in BAM format
   -S = input is a SAM file
   -o = output file
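
Since the SAM file is plain text, a quick sanity check on how many records aligned is possible without any extra tools. In SAM, header lines start with "@" and bit 0x4 of the FLAG column (the second field) marks an unmapped read. The sketch below relies only on those two facts; sam_mapped_counts is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count mapped and unmapped records in a SAM file.
# Header lines start with "@"; bit 0x4 of the FLAG column (field 2)
# marks an unmapped read. sam_mapped_counts is a hypothetical name.
sam_mapped_counts() {
    awk -F '\t' '
        /^@/ { next }                              # skip header lines
        int($2 / 4) % 2 == 1 { unmapped++; next }  # FLAG bit 0x4 is set
        { mapped++ }
        END { printf "mapped %d unmapped %d\n", mapped, unmapped }
    ' "$1"
}

# Usage:
#   sam_mapped_counts output/alignments.sam
```

The counts are per record, so each mate of a pair is counted separately.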

* Sort the BAM file by leftmost alignment coordinate (improves access speed and reduces file size)

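shell $> samtools sort output/alignments.bam output/alignments.sorted

   first argument = input BAM file
   second argument = output filename prefix (".bam" is appended, giving output/alignments.sorted.bam)
   note: this is the older SAMtools syntax; newer SAMtools releases use "samtools sort -o output/alignments.sorted.bam output/alignments.bam"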

* Index the sorted BAM file (enables fast random access; IGV needs the resulting output/alignments.sorted.bam.bai file to load the alignments)

shell $> samtools index output/alignments.sorted.bam
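
The three SAMtools steps above can be chained in a small helper so that a failure in one step stops the rest. This is a sketch, not part of the course scripts: the names run and sam_to_sorted_bam and the DRY_RUN variable are made up here, and the sort command uses the same older syntax as the command above.

```shell
#!/bin/sh
# Sketch: chain the three SAMtools steps for one SAM file.
# run, sam_to_sorted_bam and DRY_RUN are hypothetical names added here.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

sam_to_sorted_bam() {
    prefix=$1    # path without the .sam extension, e.g. output/alignments
    # "&&" stops the chain as soon as one step fails
    run samtools view -bS -o "$prefix.bam" "$prefix.sam" &&
    run samtools sort "$prefix.bam" "$prefix.sorted" &&
    run samtools index "$prefix.sorted.bam"
}

# Usage:
#   DRY_RUN=1; sam_to_sorted_bam output/alignments   # preview the commands
#   DRY_RUN=0; sam_to_sorted_bam output/alignments   # actually run them
```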

Visualization of Reads Along Genome

* Launch IGV

shell $> /ecg/seqprg/IGV/igv_mac-intel.command

* Import a new genome: File -> Import Genome

   o Name: bac_vul
   o Sequence File: meta2_work/01_alignment/reference/Bacteroides_vulgatus_ATCC_8482.fasta (fasta file of reference genome)
   o Cytoband File: N/A
   o Gene File: meta2_work/01_alignment/reference/Bacteroides_vulgatus_ATCC_8482.gff (gff3 file containing gene annotations)

   o Save to meta2_work/01_alignment/genome_repository/

* Import the BAM alignments file: File -> Load from File

   o File: meta2_work/01_alignment/output/alignments.sorted.bam

