Alignment Demo
Overview
Coverage Analysis - observe coverage over annotated genes
1. Use Bowtie to run a sample alignment of Illumina reads to the target organism's genome
2. Use SAMtools to transform SAM files into BAM
3. Use IGV to visualize results
shell $> cp /ecg/seqprg/scripts/profile ~/.profile # the "." is important
Close the terminal and open a new one so that the new profile is loaded.
Check that the path is set properly:
shell $> echo $PATH
* should see .:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/ecg/seqprg/bin
shell $> which bowtie
* should see /ecg/seqprg/bin/bowtie
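The two checks above can also be scripted. The following is a minimal sketch, not part of the course scripts: path_contains is a hypothetical helper name, and it only tests whether a directory is one of the colon-separated entries in a PATH-style string.

```shell
# Hypothetical helper: succeed if directory $1 is an entry in the
# colon-separated list $2 (defaults to the current $PATH).
path_contains() {
    dir=$1
    list=${2:-$PATH}
    case ":$list:" in
        *":$dir:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether the course bin directory is on the PATH.
if path_contains /ecg/seqprg/bin; then
    echo "PATH ok"
else
    echo "PATH is missing /ecg/seqprg/bin"
fi

# Print the location of bowtie if the shell can resolve it.
command -v bowtie || echo "bowtie not found on PATH"
```

If either message reports a problem, re-copy the profile and open a fresh terminal before continuing.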
shell $> /ecg/seqprg/scripts/init_meta2.sh
shell $> cd ~/meta2_work/01_alignment/
Using Bowtie to align sequences to a reference
* Run bowtie-build to create an index representing the reference sequence
shell $> bowtie-build -f reference/Bacteroides_vulgatus_ATCC_8482.fasta reference/Bacteroides_vulgatus_ATCC_8482
-f = input is in FastA format
first argument = input file
second argument = filename prefix for index to be built
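bowtie-build writes six index files that share the given prefix (.1.ebwt, .2.ebwt, .3.ebwt, .4.ebwt, .rev.1.ebwt and .rev.2.ebwt). A quick way to confirm the build finished is to check that all six exist and are non-empty. The sketch below assumes this standard Bowtie 1 small-index layout; bowtie_index_complete is a hypothetical helper name, and the large-index format that newer Bowtie releases use for very large genomes is not covered.

```shell
# Hypothetical helper: succeed only if all six Bowtie 1 index files
# exist and are non-empty for the index prefix given as $1.
bowtie_index_complete() {
    prefix=$1
    for suffix in 1 2 3 4 rev.1 rev.2; do
        if [ ! -s "$prefix.$suffix.ebwt" ]; then
            echo "missing: $prefix.$suffix.ebwt" >&2
            return 1
        fi
    done
    return 0
}

# Check the index built in this step.
if bowtie_index_complete reference/Bacteroides_vulgatus_ATCC_8482; then
    echo "index ok"
else
    echo "index incomplete, re-run bowtie-build"
fi
```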
* Run Bowtie on reads from an Illumina RNA-Seq run against the Bacteroides vulgatus reference
shell $> bowtie -X 600 -v 2 -m 1 -S reference/Bacteroides_vulgatus_ATCC_8482 \
-1 illumina_reads/inputSeqs_1.fastq -2 illumina_reads/inputSeqs_2.fastq output/alignments.sam
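-X = maximum insert size for a valid paired-end alignment
-v = number of mismatches allowed anywhere in the read (0 to 3)
-m = suppress all alignments for reads with more than m reportable alignments ( m = 1 keeps only uniquely aligning reads )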
-S = produce output in SAM format
-1 / -2 = matching mate files for paired-end reads
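Paired-end alignment expects the two mate files to list the same reads in the same order. Before running Bowtie it can be worth confirming that both files at least hold the same number of reads. This is a minimal sketch that assumes plain four-line FASTQ records (no wrapped sequence lines); fastq_read_count and mates_match are hypothetical helper names.

```shell
# Hypothetical helper: print the number of reads in FASTQ file $1,
# assuming exactly four lines per record.
fastq_read_count() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}

# Hypothetical helper: succeed if both mate files hold the same number of reads.
mates_match() {
    [ "$(fastq_read_count "$1")" -eq "$(fastq_read_count "$2")" ]
}

# Compare the two mate files used in this demo, if they are present.
m1=illumina_reads/inputSeqs_1.fastq
m2=illumina_reads/inputSeqs_2.fastq
if [ -f "$m1" ] && [ -f "$m2" ]; then
    if mates_match "$m1" "$m2"; then
        echo "mate files match"
    else
        echo "mate files differ in read count"
    fi
fi
```

Equal counts do not prove that the mates are in matching order, but unequal counts are a reliable sign that one of the files is truncated or mismatched.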
Using SAMtools on Bowtie output to create BAM files that can be viewed in IGV
* Make a BAM file from the existing SAM file
shell $> samtools view -bS -o output/alignments.bam output/alignments.sam
view = extract all alignments from SAM or BAM file
-b = output in BAM format
-S = input is a SAM file
-o = output file
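Because a SAM file is plain tab-delimited text, it can be sanity-checked without SAMtools. Header lines start with "@", and the second column of each alignment record is the FLAG field, where bit 0x4 marks an unmapped read. The sketch below counts records whose unmapped bit is not set; sam_mapped_count is a hypothetical helper name.

```shell
# Hypothetical helper: print the number of mapped records in SAM file $1.
# Skips header lines and tests FLAG bit 0x4 with integer arithmetic,
# since not every awk provides bitwise functions.
sam_mapped_count() {
    awk -F '\t' '!/^@/ && int($2 / 4) % 2 == 0 { n++ } END { print n + 0 }' "$1"
}

# Report the mapped count for the demo output, if it exists.
if [ -f output/alignments.sam ]; then
    echo "mapped records: $(sam_mapped_count output/alignments.sam)"
fi
```

A count of zero usually means the alignment step failed or the wrong index prefix was given, so it is worth checking before converting to BAM.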
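* Sort the BAM file by leftmost alignment coordinate (improves access speed, reduces file size, and is required before indexing)
shell $> samtools sort output/alignments.bam output/alignments.sorted
second argument = output prefix; SAMtools appends .bam, giving output/alignments.sorted.bam
Note: this is the legacy sort syntax of the SAMtools version used in this demo. SAMtools 1.3 and later expect: samtools sort -o output/alignments.sorted.bam output/alignments.bam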
* Index the sorted BAM file (allows fast random access by genomic region; IGV needs the resulting .bai file next to the BAM)
shell $> samtools index output/alignments.sorted.bam
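The three SAMtools steps can be chained so that a failure in one step stops the rest. The following is a minimal sketch that assumes the same legacy SAMtools syntax used in this demo (newer releases need the sort step adjusted to use -o); sam_to_indexed_bam is a hypothetical helper name.

```shell
# Hypothetical helper: convert SAM file $1 into a sorted, indexed BAM.
# $2 is the output prefix, so the result is $2.sorted.bam plus its index.
sam_to_indexed_bam() {
    sam=$1
    prefix=$2
    # Stop early if there is nothing to convert.
    if [ ! -s "$sam" ]; then
        echo "input SAM missing or empty: $sam" >&2
        return 1
    fi
    # Same commands as above; each step must succeed before the next runs.
    samtools view -bS -o "$prefix.bam" "$sam" || return 1
    samtools sort "$prefix.bam" "$prefix.sorted" || return 1
    samtools index "$prefix.sorted.bam" || return 1
    echo "$prefix.sorted.bam"
}
```

For this demo the call would be:
shell $> sam_to_indexed_bam output/alignments.sam output/alignments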
Visualization of Reads Along Genome
* Launch IGV
shell $> /ecg/seqprg/IGV/igv_mac-intel.command
* Import a new genome: File -> Import Genome
o Name: bac_vul
o Sequence File: meta2_work/01_alignment/reference/Bacteroides_vulgatus_ATCC_8482.fasta (FASTA file of the reference genome)
o Cytoband File: N/A
o Gene File: meta2_work/01_alignment/reference/Bacteroides_vulgatus_ATCC_8482.gff (GFF3 file containing gene annotations)
o Save to meta2_work/01_alignment/genome_repository/
* Import the BAM alignments file: File -> Load from File
o File: meta2_work/01_alignment/output/alignments.sorted.bam