Command line unix (Linux) (20-Jan-2017)

This will be your homework for Monday, Jan 23. My goal is to be certain that you can (1) log into a Unix machine, (2) run some simple commands to create files and directories, (3) edit a file using emacs (you may use a different editor), (4) transfer a file from your laptop to Unix, and (5) use the 'curl' command to transfer files from a web site to a Unix file.

  1. Get an ITS unix account
  2. On your computer, log in to your account on or (use whichever one works; both computers access the same account directories).

    Outside UVA, you MUST first connect through the Cisco AnyConnect VPN.

  3. Look at your $PATH variable (used for finding programs)
    echo $PATH
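
    The shell looks for programs in the colon-separated directories of $PATH, from left to right. The sketch below prints one directory per line and then saves the variable to a file. It assumes that path.file (used in the next step) is made by redirecting output with '>'; the scratch directory from mktemp is only there so the example cannot overwrite anything.

```shell
# Work in a scratch directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Show each $PATH directory on its own line (tr swaps ':' for newlines).
echo "$PATH" | tr ':' '\n'

# Assumption: path.file is created by redirecting the output of echo with '>'.
echo "$PATH" > path.file

# Read the file back to confirm it holds the same text as the variable.
saved=$(cat path.file)
echo "saved: $saved"
```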

  4. List the contents of the file path.file
    cat path.file
    more path.file
    less path.file  # type "q" to return to the shell prompt

  5. Make a copy of the file
    cp path.file path.copy

  6. Make a sub-directory (folder) called "biol4230"
    mkdir biol4230
    Move into that directory with cd biol4230
    Make a new subdirectory in biol4230 called hwk1

  7. Move the path.copy file into the biol4230/hwk1 folder
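
    Steps 5 to 7 can be rehearsed in a scratch directory before you do them in your real account. This is a sketch, not the required answer: the scratch directory stands in for your home directory, and mkdir -p is used as a shortcut that creates biol4230 and hwk1 in one call.

```shell
# A scratch directory stands in for your home directory.
home_demo=$(mktemp -d)
cd "$home_demo"
echo "$PATH" > path.file          # stand-in for the file listed in step 4

cp path.file path.copy            # step 5: copy; the original stays in place
mkdir -p biol4230/hwk1            # step 6: -p creates biol4230 and hwk1 together
mv path.copy biol4230/hwk1/       # step 7: move the copy into hwk1

ls biol4230/hwk1                  # lists: path.copy
```

    Unlike cp, mv leaves nothing behind: after step 7 there is no path.copy in the starting directory.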

  8. List the contents of the biol4230/hwk1 folder
    cd ~/biol4230/hwk1
    ls -l
    What extra information does the long (-l) listing show compared with a plain ls?

  9. Save the contents (directory listing) of the data folder to a file using the same strategy you used to create path.file.
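
    Redirection with '>' works for any command that prints to the screen, including ls. A small sketch in a scratch directory; the file names a.txt, b.txt, and listing.txt are made up for the illustration.

```shell
demo=$(mktemp -d)
cd "$demo"
touch a.txt b.txt                 # two empty example files (hypothetical names)

ls -l > listing.txt               # '>' sends the listing to a file instead of the screen
cat listing.txt                   # show what was saved
```

    Note that listing.txt appears in its own listing, because the shell creates the output file before ls starts reading the directory.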

  10. Use the emacs text editor to edit the directory listing file. Make several copies of some of the lines in the file, and save it.

    Check the contents of the biol4230/hwk1 directory.

  11. Transfer a file of accessions from your laptop to interactive.hpc
    1. At the NCBI web site, look up: glutathione S-transferase AND human[orgn] AND srcdb_refseq[prop] in the protein database.
    2. Use the "Send to" link on the search results page to save the accessions to a file.
    3. Use scp (Mac) or SecureFX (Windows) to copy the file of accessions to interactive.hpc or interactive.hpc.
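
    For Mac users, the general form of an scp upload is: scp LOCAL_FILE USER@HOST:REMOTE_PATH. The sketch below only builds and prints the command so you can check it before running it. The user name your_id and the file name gst_refseq.acc are hypothetical, and the host is written as it appears in this handout, so substitute the full host name you were given.

```shell
# Hypothetical values: replace with your own account and the full host name.
user="your_id"
host="interactive.hpc"            # as written in this handout; may need a full domain
file="gst_refseq.acc"             # hypothetical name for the downloaded accession file

# A remote path without a leading '/' is relative to your home directory.
cmd="scp $file $user@$host:biol4230/hwk1/"
echo "$cmd"                       # print first; run it yourself once it looks right
```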

  12. Search for glutathione S-transferase in the SwissProt database by using srcdb_swiss_prot in place of srcdb_refseq, and then download a file of accessions. Transfer this file to interactive.hpc.

  13. Use the curl command (on interactive.hpc) to download a sequence from uniprot:
    1. Check to see if the sequence has appeared in your directory.
    2. Look at the sequence (file). Is it in FASTA format?
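
    A FASTA record starts with a header line beginning with '>', followed by one or more lines of sequence. One way to check a downloaded file is to test its first line, as in this sketch. The function name is_fasta and both demo files are made up for the illustration; the sequence shown is not a real database entry.

```shell
# Succeeds (exit status 0) when the first line of the file starts with '>'.
is_fasta() {
    head -n 1 "$1" | grep -q '^>'
}

demo=$(mktemp -d)
# Made-up record for illustration only; not a real database entry.
printf '>demo_id hypothetical protein\nMKTAYIAKQR\n' > "$demo/demo.fasta"
# A plain accession list has no '>' header, so it should fail the check.
printf 'P09488\n' > "$demo/accessions.txt"

is_fasta "$demo/demo.fasta"     && fasta_result=yes || fasta_result=no
is_fasta "$demo/accessions.txt" && acc_result=yes   || acc_result=no
echo "demo.fasta: $fasta_result, accessions.txt: $acc_result"
```

    This only checks the header line. A file that begins with '>' but contains an error page instead of sequence would still pass, so looking at the file with less remains worthwhile.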

  14. Put all the scripts and results files in a directory on interactive.hpc (or interactive.hpc) called "biol4230/hwk1" and be certain that I can read it.
    cd                # go to home directory
    chmod go+rx .     # make it readable by others
    chmod go+rx biol4230/  # make biol4230 readable
    chmod go+rx biol4230/hwk1  # and hwk1
    cd biol4230/hwk1
    chmod go+r * # make all files in the directory readable by others (me) 
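
    To see what these chmod commands change, compare the permission string from ls before and after. The sketch below runs in a scratch directory rather than your home directory, and it sets umask 077 only so that everything starts out private and the change is visible.

```shell
umask 077                              # start private so the effect of chmod is visible
demo=$(mktemp -d)
cd "$demo"
mkdir -p biol4230/hwk1
echo "example" > biol4230/hwk1/path.file

before=$(ls -ld biol4230 | cut -c1-10)
chmod go+rx biol4230 biol4230/hwk1     # directories need r (list) and x (enter)
chmod go+r biol4230/hwk1/*             # files only need r
after=$(ls -ld biol4230 | cut -c1-10)
file_mode=$(ls -l biol4230/hwk1/path.file | cut -c1-10)

echo "directory: $before -> $after"
echo "file:      $file_mode"
```

    In the permission string, the three groups of rwx letters after the leading type character apply to the owner, the group, and everyone else. "go+rx" adds read and execute to the last two groups and leaves the owner's permissions alone.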

  15. When your homework is done, I should see:
    1. Your biol4230/hwk1/path.file
    2. A directory listing with several duplicated lines in it (from part 10, emacs)
    3. A list of accessions for human RefSeq glutathione transferases
    4. A list of accessions for human SwissProt glutathione transferases from NCBI
    5. A list of accessions for human SwissProt glutathione transferases from UniProt
    6. A FASTA format file for P09488 from UniProt
    7. A FASTA format file for P09488 from NCBI
