Comparing protein sequences

It is much more difficult and time-consuming to sequence proteins than DNA. Thanks to the genetic code, the protein sequence can be deduced from the DNA sequence. The reverse is not possible, because most amino acids are encoded by more than one codon (see earlier section). Thus, protein sequences are usually obtained by translating the DNA sequence rather than by sequencing the protein directly.
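The one-way nature of this deduction can be illustrated with a minimal Python sketch. It builds the standard genetic code and translates two short DNA sequences. The sequences and the helper names `translate` and `codons_for` are made up for illustration and do not come from any real gene or library.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence, codon by codon, into one-letter amino acids."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def codons_for(amino_acid):
    """Return every codon that encodes the given amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


# Two hypothetical DNA sequences that differ at several positions.
gene_a = "ATGCTTTCACGT"
gene_b = "ATGTTGAGCAGA"

print(translate(gene_a))  # DNA -> protein is unambiguous
print(translate(gene_b))  # a different DNA sequence can give the same protein
print(codons_for("L"))    # protein -> DNA is ambiguous: several codons per amino acid
```

Both sequences translate to the same peptide, while leucine (L) alone can be written with several different codons, so the original DNA cannot be recovered from the protein sequence.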

Using protein sequences for distantly related species

Nevertheless, comparing protein sequences rather than DNA sequences is sometimes useful. Because of the redundancy in the genetic code (see slide 7), two orthologs can differ markedly in DNA sequence but still encode the same amino acid sequence. Therefore, two distantly related organisms may show many differences in their DNA sequences but fewer differences in their amino acid sequences, and so can be compared more easily at the protein level.
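As a rough illustration of this point, the following Python sketch counts the differences between two sequences first at the DNA level and then at the protein level. The two "orthologs" are invented, already aligned, and of equal length; real comparisons would require a proper alignment step, which is not shown here. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G codon order ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into a string of one-letter amino acids."""
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_differences(seq_1, seq_2):
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(seq_1) != len(seq_2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_1, seq_2))


# Hypothetical orthologous coding sequences from two distantly related species.
ortholog_a = "ATGCTTTCACGTGGAAAA"
ortholog_b = "ATGTTGAGCAGAGGCAGA"

protein_a = translate(ortholog_a)
protein_b = translate(ortholog_b)

dna_diffs = count_differences(ortholog_a, ortholog_b)
protein_diffs = count_differences(protein_a, protein_b)

print(f"DNA:     {dna_diffs} of {len(ortholog_a)} positions differ")
print(f"Protein: {protein_diffs} of {len(protein_a)} positions differ")
```

In this made-up example, half of the nucleotide positions differ (9 of 18), yet the encoded peptides differ at only one of six amino acids, which is why the protein-level comparison is easier to interpret.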

Over many generations, a lineage may accumulate point mutations, but many of these, especially those at the third position of a codon, do not result in any change in the protein sequence.