Command line unix (Linux) (20-Jan-2017)
This will be your homework for Monday, Jan 23. My goal is to be
certain that you can (1) log into a Unix machine, (2) run some simple
commands to create files and directories, (3) edit a file using
emacs (or a different editor, if you prefer), (4) transfer a file from
your laptop to Unix, and (5) use the 'curl' command to transfer files
from a web site to a Unix file.
Get an ITS Unix account.
On your computer, log in to your account on interactive.hpc.virginia.edu.
Outside UVA, you MUST use Cisco AnyConnect.
Look at your $PATH variable (used for finding programs)
Send the output of echo $PATH to a file:
echo $PATH > path.file
List the contents of the file:
less path.file # type "q" to return to the shell prompt
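$PATH is a single line of directory names separated by colons, which can be hard to read. A minimal sketch of two ways to inspect it follows; the variable name ls_path is made up for this example and is not part of the assignment.

```shell
# Print each directory in $PATH on its own line (entries are separated by ':').
echo "$PATH" | tr ':' '\n'

# Ask the shell where it finds a program, here 'ls', by searching those directories.
ls_path=$(command -v ls)
echo "ls is found at: $ls_path"
```

The shell searches the directories in the order they are listed, so an earlier entry wins if two directories hold a program with the same name.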
Make a copy of the file
cp path.file path.copy
Make a sub-directory (folder) called "biol4230"
Move into that directory with cd biol4230
Make a new subdirectory in biol4230 called hwk1
Move the path.copy file into the biol4230/hwk1 folder
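The copy, mkdir, cd, and mv steps above can be sketched as follows. This is only one way to do it, and it assumes you want to practice in a throwaway directory first: the scratch variable and the use of mktemp are additions for this example, so nothing in your home directory is touched.

```shell
# Work in a throwaway directory so nothing in your home directory changes.
scratch=$(mktemp -d)
cd "$scratch"

echo "$PATH" > path.file      # write $PATH to a file
cp path.file path.copy        # copy the file

mkdir biol4230                # make the course directory
cd biol4230                   # move into it
mkdir hwk1                    # make the homework subdirectory
mv ../path.copy hwk1/         # path.copy is one level up, so refer to it with ../

ls hwk1                       # should show path.copy
```

Because you have already moved into biol4230, path.copy has to be named relative to where you are now (../path.copy), not relative to where you started.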
List the contents of the biol4230/hwk1 folder
What is the extra information in the second listing?
Save the contents (directory listing) of the data folder to a file
using the same strategy you used to create path.file.
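As a sketch of the listing and redirect steps, and assuming the two listings meant above are a plain ls followed by ls -l, something like the following could be used. The directory demo_dir and the file name dir.listing are made up for illustration; substitute the folder and file name you actually want.

```shell
# Build a small example directory in a throwaway location.
scratch=$(mktemp -d)
mkdir "$scratch/demo_dir"
echo "example" > "$scratch/demo_dir/path.copy"

ls "$scratch/demo_dir"        # short listing: names only
ls -l "$scratch/demo_dir"     # long listing: permissions, owner, size, date, name

# Send the long listing to a file instead of the screen, as with echo $PATH > path.file.
ls -l "$scratch/demo_dir" > "$scratch/dir.listing"
cat "$scratch/dir.listing"
```

The > redirect works the same way for any command that prints to the screen, not just echo.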
Use the emacs text editor to edit the directory listing file. Make several copies of some of the lines in the file, and save it.
Check the contents of the biol4230/hwk1 directory.
Transfer a file of accessions from your laptop to interactive.hpc.virginia.edu.
At the NCBI web site, look up: glutathione S-transferase AND human[orgn] AND srcdb_refseq[prop] in the protein database.
Use the "Send to" link on the search results page to save the accessions to a file.
Use scp (Mac) or SecureFX (Windows) to copy the file of accessions to interactive.hpc.
Search for glutathione S-transferase in the SwissProt database by using srcdb_swiss_prot in place of srcdb_refseq, and then download a file of accessions. Transfer this file to interactive.hpc.
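For Mac users, the scp transfers can be sketched as below, run in a Terminal window on your laptop (not on interactive.hpc). The computing ID mst3k and the two file names are hypothetical placeholders; use your own ID and whatever names you gave the downloaded files. The sketch prints each scp command instead of running it, so you can check it before copying anything.

```shell
user="mst3k"                           # hypothetical computing ID; use your own
host="interactive.hpc.virginia.edu"    # host named in this handout
dest="biol4230/hwk1/"                  # path relative to your home directory on the host

# Hypothetical names for the two accession files saved from NCBI.
for file in gst_refseq.acc gst_swissprot.acc; do
    # 'echo' prints the command instead of running it; remove 'echo' to really copy.
    echo scp "$file" "$user@$host:$dest"
done
```

The part after the colon is the destination on the remote machine; a path that does not start with / is taken relative to your home directory there.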
Use the curl command (on interactive.hpc) to download a sequence (P09488) from UniProt.
Check to see if the sequence has appeared in your directory.
Look at the sequence (file). Is it in FASTA format?
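A minimal sketch of the download and the FASTA check might look like this. The URL pattern is an assumption about UniProt's current web address and may differ from the one intended for this assignment, so confirm it on the UniProt site. The helper is_fasta is made up for this example; it only checks that the first line starts with ">", which is how a FASTA header begins.

```shell
# Return success if the first line of the file starts with '>', as a FASTA header does.
is_fasta() {
    head -n 1 "$1" | grep -q '^>'
}

acc="P09488"                                            # accession named in this handout
url="https://rest.uniprot.org/uniprotkb/${acc}.fasta"   # assumed URL pattern; check uniprot.org

# -f: fail on HTTP errors, -sS: quiet but still show errors, -L: follow redirects,
# -o: write to the named file, --max-time: give up after 30 seconds.
if curl -f -sS -L --max-time 30 -o "${acc}.fasta" "$url"; then
    ls -l "${acc}.fasta"
    if is_fasta "${acc}.fasta"; then
        echo "${acc}.fasta looks like FASTA"
    else
        echo "${acc}.fasta does not start with '>'"
    fi
else
    echo "download failed; check the URL and your network connection" >&2
fi
```

Without -o, curl prints the sequence to the screen; you could also redirect that output to a file with >, as you did for path.file.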
Put all the scripts and result files in a directory on interactive.hpc called "biol4230/hwk1" and be certain that I can read it.
cd # go to home directory
chmod go+rx . # make it readable by others
chmod go+rx biol4230/ # make biol4230 readable
chmod go+rx biol4230/hwk1 # and hwk1
chmod go+r biol4230/hwk1/* # make all files in hwk1 readable by others (me)
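To confirm that the permissions took effect, you can look at the long listing. The sketch below rebuilds the same layout in a throwaway directory (the scratch variable and the example file are additions for illustration), applies the same chmod commands, and then displays the result.

```shell
# Recreate the directory layout in a throwaway location.
scratch=$(mktemp -d)
mkdir -p "$scratch/biol4230/hwk1"
echo "example" > "$scratch/biol4230/hwk1/path.file"

chmod go+rx "$scratch/biol4230" "$scratch/biol4230/hwk1"   # directories need r and x
chmod go+r "$scratch/biol4230/hwk1"/*                      # files need only r

# -d lists the directories themselves rather than their contents.
# The group and other columns should now include r and x.
ls -ld "$scratch/biol4230" "$scratch/biol4230/hwk1"

# The group and other columns for each file should now include r.
ls -l "$scratch/biol4230/hwk1"
```

A directory needs x as well as r because x is what allows someone to move into it and reach the files inside.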
When your homework is done, I should see:
- Your biol4230/hwk1/path.file
- a directory listing with several duplicated lines in it (from the emacs editing step)
- a list of accessions for human refseq glutathione transferases
- a list of accessions for human SwissProt glutathione transferases from NCBI
- a list of accessions for human SwissProt glutathione transferases from UniProt
- a FASTA-format file for P09488 from UniProt
- a FASTA-format file for P09488 from NCBI