zPicture and MULAN produce alignments between chromosome regions
(up to 1 Mb each) that can be used to identify exons and other
conserved functional regions.
The ECR browser has precomputed alignments for many vertebrate
species, with access to rVISTA (see below).
Conserved regions from Drosophila - eyegone (eyg)
At the MULAN site, enter 3 for the number of sequences to align.
MULAN can align two types of data: FINISHED sequences, where each
input sequence is contiguous, and DRAFT sequences, where the input
consists of multiple sequences that are not necessarily ordered or
aligned.
Click the vertical SELECT link on the left center for FINISHED sequences.
On the sequence entry page, MULAN can download genome sequences from UCSC. Use the upload link under SEQUENCE 1 (and later SEQUENCE 2 and SEQUENCE 3) to enter the following coordinates:
    Organism                             Coordinates
 1. D. melanogaster (dm3, April 2006)    chr3L:12,450,477-12,474,838
 2. D. simulans (droSim1)                chr3L:11832451-11856670
 3. D. virilis (droVir2)                 scaffold_13049:9,296,946-9,329,999
You must press Submit TWICE for each coordinate entry.
After you upload the third sequence, the program will begin its alignment.
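Before pasting coordinates into the upload form, it can help to check that each region parses cleanly and is within the 1 Mb limit. The following is a minimal sketch, not part of MULAN; the function names are hypothetical, and it assumes both coordinates are inclusive.

```python
import re

# The three regions listed above; the labels are only dictionary keys.
REGIONS = {
    "D. melanogaster (dm3)": "chr3L:12,450,477-12,474,838",
    "D. simulans (droSim1)": "chr3L:11832451-11856670",
    "D. virilis (droVir2)": "scaffold_13049:9,296,946-9,329,999",
}

MAX_LENGTH = 1_000_000  # the 1 Mb per-region limit mentioned in the text


def parse_region(text):
    """Split 'chrom:start-end' (commas allowed) into (chrom, start, end)."""
    match = re.fullmatch(r"\s*([^:\s]+):([\d,]+)-([\d,]+)\s*", text)
    if match is None:
        raise ValueError(f"not a chrom:start-end region: {text!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError(f"start is after end in {text!r}")
    return chrom, start, end


def region_length(text):
    """Length in bases, assuming both coordinates are inclusive."""
    _, start, end = parse_region(text)
    return end - start + 1


for label, region in REGIONS.items():
    length = region_length(region)
    status = "ok" if length <= MAX_LENGTH else "too long"
    print(f"{label}: {length} bp ({status})")
```

All three regions are a few tens of kilobases long, well under the limit.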
Before showing the multiple alignment output, MULAN asks you to check the tree. The default tree is appropriate (for three taxa, there is only one unrooted tree), but you can get a more sensible-looking tree by entering (seq3, (seq1, seq2)). Approve the tree and continue.
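The remark that three taxa have only one unrooted tree follows from the standard count of unrooted bifurcating trees, (2n-5)!! for n taxa. A short sketch of that count (the function name is hypothetical and has nothing to do with MULAN itself):

```python
def count_unrooted_trees(n_taxa):
    """Number of unrooted bifurcating trees for n_taxa labeled taxa: (2n-5)!!"""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5 (an empty product for n = 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (3, 4, 5, 6):
    print(n, count_unrooted_trees(n))
```

With three sequences the tree choice therefore only affects how the tree is drawn, but the number of possible trees grows quickly once more species are added.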
After the analysis is done, you have the option to view the 3-way alignment (Dynamic visualization, top panel), or each of the two 2-way alignments. 2-way alignments can be visualized using either a PIP-plot (Pairwise dynamic plots) or a Dot-plot.
Look at the Dot-plot first, and compare D. melanogaster (seq1) to D. virilis (seq3). Identify regions in D. melanogaster that are missing in D. virilis. You can also compare D. melanogaster and D. simulans, but they are very similar.
Now go back and examine the Dynamic visualization (click on the PIP plot). Identify each of the lines in the graphic (where are the ECRs plotted? where are the genes?).
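To get a feel for what a percent-identity plot summarizes, the sketch below computes percent identity in sliding windows along a gapped pairwise alignment. This is a simplified illustration, not MULAN's own calculation; the window and step sizes and the function names are assumptions chosen for the example.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        return 0.0
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"  # a gap never counts as a match
    )
    return 100.0 * matches / len(aligned_a)


def windowed_identity(aligned_a, aligned_b, window=100, step=50):
    """Return (window_start, percent_identity) pairs along the alignment."""
    results = []
    for start in range(0, len(aligned_a) - window + 1, step):
        end = start + window
        results.append(
            (start, percent_identity(aligned_a[start:end], aligned_b[start:end]))
        )
    return results


# Toy alignment: the first half is identical, the second half mostly differs.
print(windowed_identity("AAAACCCC", "AAAACGGG", window=4, step=4))
```

Windows with high values correspond to the peaks you see in the plot; windows that span a gap or a diverged stretch drop toward zero.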
In addition to highlighting conservation between two sequences, MULAN
(and zPicture) can search for transcription factor binding motifs (but
only for sequences that are less than 1 Mb). Go back to the main
results page and click MultiTF. Select the insects
transcription factor set, select all, and submit.
Look at how transcription factor binding sites line up with conserved regions.
Conserved regions from vertebrates - Tbx18
At the zPicture site, click on SEQUENCE 1 / Upload, select the human genome, hg18 (March 2006) version, and enter the Position:
chr6:85500876-85530618
and Submit and then Submit again.
For SEQUENCE 2, select Upload and select mouse genome mm9 (July 2007), Position:
chr9:87,599,034-87,626,095
If you do not know the coordinates of your region of interest, you can
identify homologous (syntenic) regions by using BLAT at UCSC to
align your own DNA sequences to a UCSC genome (human, mouse, rat,
etc.). BLAT is limited to 25 Kb, so you may need to truncate the
sequence coordinates.
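Truncating a region to fit the 25 Kb BLAT limit is simple arithmetic on the coordinates. The sketch below keeps the first 25,000 bases of an inclusive region; keeping the start rather than the middle or the end is an arbitrary choice for illustration, and the function names are hypothetical.

```python
BLAT_LIMIT = 25_000  # the 25 Kb limit mentioned in the text


def truncate_region(chrom, start, end, limit=BLAT_LIMIT):
    """Keep at most the first `limit` bases of an inclusive region."""
    if end - start + 1 <= limit:
        return chrom, start, end
    return chrom, start, start + limit - 1


def format_region(chrom, start, end):
    """Format a region as 'chrom:start-end' for pasting into a form."""
    return f"{chrom}:{start}-{end}"


# The human Tbx18 region used in this exercise is longer than 25 Kb.
print(format_region(*truncate_region("chr6", 85500876, 85530618)))
```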
Again, use the Dot-plot option to examine the overall synteny between the two regions, focussing on insertions and deletions. Then use the Dynamic display to examine the PIP plot. Try to identify the insertion/deletion regions in the PIP plot.
Note the correspondence/non-correspondence between ECRs
(evolutionarily conserved regions) and exons. You will not see
conservation in repeat regions, because they have been masked out.
Invert (Base-top switch) the reference genome and note the
changes in the plot.
You can also use rVISTA to look for regulatory sites in your
alignment. rVISTA is like MultiTF in MULAN. For
the human/mouse comparison (or human/chicken, below), you should use
the vertebrate set of transcription factors.
For a more distant comparison, examine chicken.
For SEQUENCE 2, select Upload and select chicken, Position:
chr3:79,654,026-80,251,165
and Submit and then Submit again.
How well are exons preserved between human and chicken?
Are there well-conserved regions that do not map to human exons?
Select one of the longer conserved regions, click on it, and see a
FASTA file of the aligned region. Try taking that file and
blastx/fastx'ing it against the swissprot, human, or refseqprotein
database, to see if this conserved region might code for a protein.
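Before running a translated search, a quick local check is to ask whether any of the six reading frames of the conserved sequence has a long run of codons without a stop. The sketch below is a crude first look only and is no substitute for blastx/fastx; it strips alignment gaps, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_open_stretch(seq):
    """Return (frame, codons) for the longest stop-free codon run in six frames.

    Frames are labeled '+1'..'+3' for the given strand and '-1'..'-3' for the
    reverse complement. Alignment gaps ('-') are removed first.
    """
    seq = seq.replace("-", "").upper()
    best = (None, 0)
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            run = 0
            for i in range(offset, len(s) - 2, 3):
                if s[i:i + 3] in STOP_CODONS:
                    run = 0  # a stop codon ends the current open stretch
                else:
                    run += 1
                    if run > best[1]:
                        best = (f"{strand}{offset + 1}", run)
    return best


print(longest_open_stretch("ATGAAATAGCCC"))
```

A conserved region whose longest open stretch is only a handful of codons in every frame is unlikely to be protein coding, although short exons can still be missed this way.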
The zPicture program can also be used to align/order/orient
reads against a reference genome. Pick your favorite gene region of
interest, and compare it to an annotated region from a related genome.
For bacterial sequences, you should be able to download gene sequences
from the ENSEMBL site, and upload them to zPicture or MULAN.