Multiple Sequence Alignments:
Orthology and Paralogy - Solution
  1. What is the problem with the initial alignment?

    • The initial alignment is obviously correct [aln]

    • It can easily be turned into a [tree]

    • On this tree, Human appears more closely related to Xenopus than to Mouse.

    • The reason for this wrong phylogeny is that although the proteins are homologous, they are not orthologous.

    • To better understand what is going on, one solution is to add more sequences.
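
    The effect described above can be illustrated with a minimal sketch. The three aligned fragments below are invented for illustration (they are not the sequences from the exercise), and the names "copyA" and "copyB" are hypothetical labels for two paralogous gene copies. The sketch assumes a simple p-distance (fraction of differing columns) as the measure of relatedness; a real analysis would use a proper tree-building method.

```python
from itertools import combinations

# Toy, pre-aligned protein fragments (invented for illustration only).
# "copyA"/"copyB" are hypothetical labels for two paralogous copies created
# by a gene duplication that predates the split between the three species.
aligned = {
    "Human_copyA":   "MASSDIQVKELE",
    "Xenopus_copyA": "MASSDIQVKDLE",
    "Mouse_copyB":   "MACKDMEVKQLD",
}


def p_distance(seq1, seq2):
    """Fraction of aligned columns at which the two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(a != b for a, b in zip(seq1, seq2))
    return mismatches / len(seq1)


def closest_pair(seqs):
    """Return the pair of sequence names with the smallest p-distance."""
    return min(
        combinations(seqs, 2),
        key=lambda pair: p_distance(seqs[pair[0]], seqs[pair[1]]),
    )


if __name__ == "__main__":
    for a, b in combinations(aligned, 2):
        print(f"{a} vs {b}: {p_distance(aligned[a], aligned[b]):.2f}")
    print("closest pair:", closest_pair(aligned))
```

    In this toy case the Human and Xenopus sequences pair up because both are the same gene copy, while the Mouse sequence is the other copy. The grouping reflects the duplication, not the species history.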

  2. Adding in new sequences

    • This may be done by running a Blast or PSI-Blast against SwissProt.

      From the MSA Hub, send the MSA of the 3 sequences to PSI-Blast, run PSI-Blast against SwissProt, and select cluster match at 100% identity (i.e., no clustering). Send the results to the MSA Hub, then to ClustalW and Jalview.

    • Here is a sample alignment obtained this way: [aln]

    • That alignment gave the following tree: [tree]

    • There is still a problem: STMN2_HUMAN (Q93045), STMN2_RAT (P21818) and STMN2_MOUSE (P55821) are not in the right order (mouse should be closer to rat than to human).

    • The reason is that their sequences are too similar at the protein level to resolve the branching order reliably.

    • A possible solution would be to use the nucleotide sequences rather than the protein sequences.

    • Coding nucleotide sequences contain more neutral sites (for example, synonymous codon positions) than the proteins they encode. They therefore accumulate changes faster, which makes phylogenetic analysis of closely related sequences easier.
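
    The point about neutral sites can be shown with a short sketch. The two coding sequences below are invented for illustration and are not the STMN2 sequences; the helper names `translate` and `count_differences` are likewise hypothetical. The sketch assumes the standard genetic code and in-frame sequences of equal length.

```python
# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(cds):
    """Translate an in-frame coding sequence; trailing partial codons are ignored."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def count_differences(seq1, seq2):
    """Count positions at which two equal-length sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq1, seq2))


# Toy coding sequences (invented): they differ only at third codon positions.
cds_a = "ATGGCTTCTTCTGATATTCAG"
cds_b = "ATGGCCTCATCCGACATCCAA"

if __name__ == "__main__":
    prot_a, prot_b = translate(cds_a), translate(cds_b)
    print("protein differences:   ", count_differences(prot_a, prot_b))
    print("nucleotide differences:", count_differences(cds_a, cds_b))
```

    Here the two proteins are identical, so a protein alignment carries no signal to separate them, while the nucleotide sequences still differ at several synonymous positions that a tree-building method can use.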

  3. CONCLUSION

    The first part of this exercise shows that while Blast is useful for judging whether two sequences are homologous, more complex analyses, such as distinguishing orthology from paralogy, require a tree as support.