These exercises use programs on the class-XX compute cluster. To run the exercises from the command line, you must first log in: ssh username@class-xx (02 - 04, 06, 07, 090)
To make certain the programs are set up properly, type:
fasta36 or fasta36 -help
To do a similarity search from the command line, we need: (1) a program; (2) a query sequence; and (3) a database. The programs and databases have been installed. Today, we will use five search programs: fasta36, ssearch36, blastp (protein:protein), and fastx36/blastx (DNA:protein).
Before using blastdbcmd, you must type:
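As a quick way to confirm that all five programs are reachable before starting, the check can be sketched as below. The function name check_programs is made up for this illustration; it relies only on the standard shell built-in command -v. If the BLAST programs are reported missing, it may be that the module providing them has not been loaded yet.

```shell
# Report whether each named program can be found on the PATH.
# check_programs is a hypothetical helper name, not part of FASTA or BLAST.
check_programs() {
    for prog in "$@"; do
        if command -v "$prog" > /dev/null 2>&1; then
            echo "found: $prog"
        else
            echo "missing: $prog"
        fi
    done
}

# The five search programs used in today's exercises.
check_programs fasta36 ssearch36 fastx36 blastp blastx
```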
module load bioware
blastdbcmd -entry gstt1_drome > gstt1_drome.aa
This finds the SwissProt entry gstt1_drome and writes it to the file gstt1_drome.aa.
fasta36 gstt1_drome.aa /class/shared/seq_db/pir1.lseg > gstt1_drome.fa_out
This compares the gstt1_drome.aa sequence to the pir1.lseg database. You can also search SwissProt with '/class/shared/seq_db/swissprot.lseg'. (You can abbreviate the pir1 database as 'a' and the SwissProt database as 's'.)
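Before searching, it can help to confirm that the retrieval produced a sequence file. A minimal sketch, assuming blastdbcmd wrote the entry in FASTA format (its default), where each record starts with a '>' header line. The helper name count_fasta_records is hypothetical.

```shell
# Count the records in a FASTA file by counting header lines (those starting with '>').
# count_fasta_records is a hypothetical helper name used only for this illustration.
count_fasta_records() {
    grep -c '^>' "$1"
}
```

Running count_fasta_records gstt1_drome.aa should print 1 if exactly one entry was retrieved; 0 suggests the entry was not found and the file is empty.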
To see a complete list of fasta36 options, type:
fasta36 -help
fasta36 -s BP62 gstt1_drome.aa s > gstt1_drome_swissprot_BP62.fa_out
(Note that the scoring matrix option -s BP62 MUST precede the query and library file names.)
Try a shallow scoring matrix, e.g. -s MD40 or -s MD20.
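To compare several scoring matrices without retyping the command, the searches can be wrapped in a loop. This is only a sketch: the function name search_matrices is made up, and the output names follow the pattern of the BP62 example above. It uses the 's' abbreviation for SwissProt and keeps -s ahead of the query and library names.

```shell
# Run fasta36 against SwissProt ('s') once for each scoring matrix given.
# search_matrices is a hypothetical helper; the first argument is the query file,
# the remaining arguments are matrix names accepted by -s.
search_matrices() {
    query=$1
    shift
    for matrix in "$@"; do
        # -s MATRIX must come before the query and library names.
        fasta36 -s "$matrix" "$query" s > "${query%.aa}_swissprot_${matrix}.fa_out"
    done
}
```

For example, search_matrices gstt1_drome.aa BP62 MD40 MD20 would write gstt1_drome_swissprot_BP62.fa_out, gstt1_drome_swissprot_MD40.fa_out and gstt1_drome_swissprot_MD20.fa_out, which can then be compared side by side.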
A complete list of options is available here.
ssearch36 gstt1_drome.aa /class/shared/seq_db/pir1.lseg > gstt1_drome.ss_out
How do the results compare with fasta36? Do you find more significant scores?
BLAST uses a different command line syntax; every argument has a name, e.g. -query gstt1_drome.aa -db swissprot. To see a complete list of blastp options, type:
blastp -help
blastp -query gstt1_drome.aa -db /class/shared/seq_db/pir1 > gstt1_drome.bl_out
blastp -query gstt1_drome.aa -db swissprot > gstt1_drome_swissprot.bl_out
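Because every BLAST argument is named, further options can be added in any order. As a sketch, the standard BLAST+ options -outfmt 6 (tab-separated output, one hit per line) and -evalue (an E-value cutoff) can be combined with the search above. The helper name blast_tabular and the 0.001 cutoff are assumptions chosen only for illustration.

```shell
# Run blastp and write tab-separated hits (-outfmt 6) below an E-value cutoff.
# blast_tabular is a hypothetical helper; the arguments are query file, database, output file.
# The 0.001 threshold is an illustrative choice, not a recommendation from the exercise.
blast_tabular() {
    blastp -query "$1" -db "$2" -outfmt 6 -evalue 0.001 > "$3"
}
```

For example, blast_tabular gstt1_drome.aa swissprot gstt1_drome_swissprot.bl_tab would produce a table whose last two columns are the E-value and bit score, which is easier to sort and compare than the default report.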