Visualizing histone modification data

The purpose of this exercise is to become familiar with how histone modification data can be displayed in the UCSC Genome Browser, and to understand some simple properties of histone modifications that can be observed directly in the browser. We will use histone modification data produced by ENCODE. We assume that the initial analysis tasks of mapping reads and identifying "peaks" or broader enriched domains have already been done. We will look at three different histone modifications: H3K4me3, H3K36me3 and H3K27me3. K4 and K36 methylation are usually associated with activation (I will call them activating marks), and K27 methylation is most commonly associated with repressed transcription (a repressive mark). The underlying biology is far from clear, so names like "activating" might ultimately prove not to be completely accurate.


We begin by looking at these marks over an entire chromosome, using the data produced by the Broad Institute for the ENCODE project in normal human skeletal muscle myoblast (HSMM) cells. The steps to set up the browser tracks are as follows:

  1. Go to http://genome.ucsc.edu
  2. Reset the browser with the "reset" button
  3. Select hg18 and click "submit"
  4. Select "hide all tracks"
  5. Select chr2 and zoom all the way out (hit the 10X zoom out button a few times)
  6. Scroll down to the "regulation" section and click the "Broad Histone" link
  7. We will focus on HSMM cells (tough to pick a favorite ENCODE cell line -- most are boring!)
  8. Check only the boxes for H3K4me3, H3K27me3 and H3K36me3 in HSMM
  9. Set the "peaks" display mode to "dense"
  10. Looking at the 3 "profile" tracks, right click each (in turn) and select "configure HSMM H3..."
  11. Configure the appearance of the tracks. Set vertical viewing range max=4, smoothing window=8 pixels
  12. Turn on the "UCSC Genes" track by setting its visibility to "squish", and at the same time uncheck the "show splice variants" box to remove clutter.
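Part of this setup can also be expressed as a browser URL rather than a series of clicks. The sketch below is only an illustration: it assumes the standard hgTracks URL parameters (db, position, hideTracks, and a track name mapped to a visibility), and it assumes that "knownGene" is the internal name of the UCSC Genes track. The helper browser_url is hypothetical, and the Broad Histone subtracks are left out because their internal names are not given here.

```python
from urllib.parse import urlencode

# Base address of the UCSC track display (assumed stable).
UCSC_HGTRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def browser_url(db, position, tracks=None, hide_all=True):
    """Build an hgTracks URL for an assembly, a position and track visibilities.

    `tracks` maps an internal track name to a visibility such as
    "hide", "dense", "squish", "pack" or "full".
    """
    params = {"db": db, "position": position}
    if hide_all:
        # Equivalent to pressing "hide all" before turning tracks on.
        params["hideTracks"] = "1"
    for name, visibility in (tracks or {}).items():
        params[name] = visibility
    return UCSC_HGTRACKS + "?" + urlencode(params)


# hg18, all of chr2, only the UCSC Genes track shown in "squish" mode.
# "knownGene" as the track name is an assumption.
url = browser_url("hg18", "chr2", tracks={"knownGene": "squish"})
print(url)
```

Pasting the printed URL into a browser should reproduce steps 3 to 5 and part of step 12; the histone tracks and their display settings would still need to be configured by hand as described above.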

At this point you should be able to see profiles of scores along the genome (called "wiggle" tracks) for each of the 3 histone modifications. If you don't think you were successful in following the configuration steps, you can turn on the above setup by simply following this link. The three profiles should not look too different from each other at this level of resolution. Notice also that the profile track scores frequently hit the ceiling we have imposed (i.e. a max score of 4, after smoothing). These scores are proportional to the amount of data, so in many cases it will not be a good idea to put different tracks on the same scale. At this point we can also see that these three histone modifications tend to be enriched where genes are also enriched.
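To make the effect of the viewing range and the smoothing window more concrete, here is a minimal sketch of what the two display settings do to a score profile. It assumes that smoothing behaves like a simple moving average and that the viewing range simply clips the displayed values; the browser's actual implementation may differ. The scores are made up for illustration and are not ENCODE data.

```python
def smooth(scores, window):
    """Moving average over `window` consecutive bins (fewer at the edges)."""
    before = window // 2
    after = window - before
    out = []
    for i in range(len(scores)):
        lo = max(0, i - before)
        hi = min(len(scores), i + after)
        out.append(sum(scores[lo:hi]) / (hi - lo))
    return out


def clip(scores, ymax):
    """Cap every score at the top of the vertical viewing range."""
    return [min(s, ymax) for s in scores]


def at_ceiling(scores, ymax):
    """Count the bins that are drawn at the very top of the track."""
    return sum(1 for s in clip(scores, ymax) if s == ymax)


# Made-up profile with one strong peak.
raw = [0, 1, 3, 12, 18, 6, 2, 0]
smoothed = smooth(raw, window=2)

print(at_ceiling(smoothed, ymax=4))   # bins flattened with a low ceiling
print(at_ceiling(smoothed, ymax=20))  # bins flattened with a high ceiling
```

With a low maximum, the whole peak is drawn as a flat plateau and its shape is lost; with a higher maximum the summit becomes visible again. This is why a single scale is rarely appropriate for tracks with different amounts of data.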


Now we will look a bit more closely so we can actually see some differences between the different tracks. We will reconfigure the browser as follows:

  1. Zoom in around a 10Mb region either using the cursor or entering the coordinates in the text box. I will use the region chr2:20,000,000-30,000,000, but you should see similar features anywhere that has a good density of genes.
  2. Change the display mode for the UCSC Genes to "pack".
  3. Notice that the "peaks" for H3K4me3 don't seem to match the tops of the profiles very well. Change the H3K4me3 "S" track vertical display range max to 20, and the smoothing window to 2 pixels.
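If you prefer to work out regions before typing them into the position box, the coordinate strings used in this exercise are easy to handle programmatically. The sketch below assumes the usual browser convention that positions are written as chrom:start-end, with optional commas, and that both ends are included in the region. The function parse_region is a hypothetical helper, not part of any UCSC tool.

```python
import re


def parse_region(text):
    """Split a position string such as 'chr2:20,000,000-30,000,000'."""
    match = re.fullmatch(r"(\w+):([\d,]+)-([\d,]+)", text.strip())
    if match is None:
        raise ValueError("not a chrom:start-end position: %r" % text)
    chrom = match.group(1)
    # Commas are only there for readability, so drop them.
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start is greater than end: %r" % text)
    return chrom, start, end


chrom, start, end = parse_region("chr2:20,000,000-30,000,000")
# Both ends are included, hence the +1.
width = end - start + 1
print(chrom, start, end, "%.1f Mb" % (width / 1e6))
```

The same helper can be used to check that any other region you pick covers roughly 10Mb before pasting it into the browser.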

Again, if you had some problems configuring the browser, this link should set you up. There are several things to notice here:


Now we will look even more closely at the relationships between the three marks, and how each relates to genes. The only reconfiguration step we will take is to:

  1. Zoom in around a 1Mb region. I will select chr2:26,000,000-27,000,000

And if needed use this link. There are several things to observe:


We have looked at differences between histone modifications, but what about differences between cell types? The ENCODE project also produced histone modification data for H1 ESCs. This link will turn on those tracks and take us to the HOXD cluster. There are several interesting observations we can make about the histone modifications around these genes:

