Genome Browsers -- UCSC and IGV

The goal of this exercise is to gain some experience with the UCSC Genome Browser (genome.ucsc.edu) and the Integrative Genomics Viewer (IGV).

  1. Go to the UCSC Genome Browser and find the human GSTM1 gene.
    1. How many different versions of the human genome are available?
    2. Which one are you using?
    3. For compatibility with IGV, select hg18.

  2. Zooming in and out:
    1. How much human genomic DNA are you seeing? How many GSTM genes?
    2. Use the zoom out option to see the next closest GSTM gene. Now how much DNA are you seeing?
    3. Use the << button to move to the left (towards the GSTM2 gene). How much DNA are you seeing?
    4. All of the GSTM1-5 genes have 8 exons, with the termination codon in the 8th exon. GSTM2 is also annotated with a 9th exon. Click on that exon to see the evidence.
    5. Zoom out from the GSTM1 cluster until you can see all five GSTM1-5 genes. How long is the DNA sequence range being displayed?
  3. Use the options below the genome display to turn some of the lanes on and off.
    1. In the section titled Genes and Gene Predictions, turn on Augustus, CCDS, Geneid Genes, and Genscan Genes. Do you see any additional genes? Do some of the genes have a different structure? What has changed? CCDS entries are probably the most reliable gene predictions. Which gene models differ from CCDS?
    2. Take a look at the GSTM1/GSTM2 predictions using the NCBI genome browser. Which GSTM2 transcripts does NCBI support? Which GSTM1 transcripts?
    3. In the Regulation section, turn on ENCODE Regulation. What new lanes appear?
  4. Download some sets of data from the UCSC browser.
    1. Select the Tools menu option from the top of the page, and select Table Browser.
    2. Use the group: drop-down menu to select Regulation.
    3. Without specifying an output file, use get output to display the coordinates of the CpG islands in the browser. How many are there? Do they agree with the map you were looking at? Then download the CpG islands to a file in GTF format (be certain the file name ends in ".gtf").
    4. Also look at the Layered H3K4Me1 track. This data is in a different format (wiggle) for displaying continuous curves. Download it to a wiggle (".wig") file.
  5. Download the Integrative Genomics Viewer (IGV) from the IGV Downloads page.
    1. Which version of the Human Genome assembly are you using?
    2. Again, look up the GSTM1 gene. How many tracks do you see?
    3. Add some additional tracks using Load from Server in the File menu. Click on the Annotation box, then add some Annotation tracks (e.g. Genetic Association Studies (GAD) from Phenotype and Disease Associations, and PhastCons from Comparative Genomics). Make sure to add one or two lanes at a time, not whole sets of lanes.
    4. Add some of the data files you downloaded from UCSC by using the File and Load from File menu.
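
Two of the questions above can be double-checked with a few lines of Python: "how much DNA are you seeing?" (step 2) and "how many CpG islands are there?" (step 4). The sketch below is only an illustration, not part of the exercise. The function names, the example coordinates, and the file name cpg_islands.gtf are all made up for this example. It assumes that the browser position is written as chrom:start-end with 1-based, inclusive coordinates, and that the downloaded GTF file has one tab-separated, nine-column record per feature. Whether each CpG island is written as exactly one record is worth checking against the count shown by the Table Browser.

```python
import re


def region_span(position: str) -> int:
    """Return the number of bases covered by a position such as 'chr1:1,001-2,000'.

    Assumes 1-based coordinates that include both the start and the end base.
    """
    match = re.fullmatch(r"\s*([^:\s]+):([\d,]+)-([\d,]+)\s*", position)
    if match is None:
        raise ValueError(f"not a chrom:start-end position: {position!r}")
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if end < start:
        raise ValueError("end coordinate is smaller than start coordinate")
    return end - start + 1


def count_gtf_features(lines) -> int:
    """Count records in GTF text, given an iterable of lines (e.g. an open file).

    Blank lines, '#' comments, and 'track'/'browser' header lines are skipped;
    only lines with the nine tab-separated GTF columns are counted.
    """
    count = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith(("#", "track", "browser")):
            continue
        if len(line.split("\t")) == 9:
            count += 1
    return count


if __name__ == "__main__":
    # Made-up coordinates for illustration only, not the GSTM1 locus.
    print(region_span("chr1:1,001-2,000"))  # prints 1000

    # Hypothetical file name; replace with the file you saved from the Table Browser.
    # with open("cpg_islands.gtf") as handle:
    #     print(count_gtf_features(handle))
```

To use it, copy the position shown in the browser's position box into region_span, and pass the opened GTF file to count_gtf_features before loading the file into IGV.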
