1.7: Gene Evolution - Fast or Slow?
The genomes of eukaryotes are punctuated by long stretches of sequence that do not code for proteins or RNAs. Although some of these regions do contain crucial regulatory sequences, the vast majority of this DNA serves no known function. These regions are typically the ones in which the fastest evolutionary change is observed, because there is little to no selection pressure acting to preserve their sequences.
In contrast, regions that code for a protein may experience high selection pressure, because changes in their sequence are likely to result in a protein that is less capable of performing its function optimally. Occasionally, however, a mutation in one of these regions has a beneficial outcome that contributes to the overall fitness of the organism, and such mutations often persist and may even become fixed in populations. Compared with the relatively regular changes seen in non-coding sequences, such beneficial mutations are exceedingly rare, and so coding regions are generally considered to evolve slowly.
The level of sequence conservation also varies measurably within coding sequences, and this is seen across all organisms. Take the example of a receptor protein. Such proteins typically have different regions that perform functions such as ligand binding, intracellular signaling, or membrane integration. A mutation in the region involved in ligand binding may produce a protein that binds its ligand less efficiently, so selection pressure would likely be high on the nucleotides coding for this part of the protein. In the section of the protein that spans the membrane, however, an amino acid substitution may have less effect, and selection pressure may therefore be lower. Under these conditions, two regions of the same protein-coding gene can have different rates of evolution.
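The idea that two regions of one gene can evolve at different rates can be illustrated with a small calculation. The sketch below is only a toy example: the two aligned sequences, the species labels, and the region boundaries are all invented for illustration, and the measure used (the proportion of aligned sites that differ, often called the p-distance) is one of the simplest possible ways to quantify divergence.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned, ungapped sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return differing / compared


# Invented, pre-aligned sequences of the same gene from two hypothetical species.
species_1 = "ATGGCCAAGTTCGGAATTCTGGTGCTCATC"
species_2 = "ATGGCCAAGTTCGGAATCCTCGTACTGATT"

# Hypothetical region boundaries (0-based start, end-exclusive).
regions = {
    "ligand_binding": (0, 15),
    "transmembrane": (15, 30),
}

# Divergence measured separately for each region of the gene.
region_distances = {
    name: p_distance(species_1[start:end], species_2[start:end])
    for name, (start, end) in regions.items()
}

for name, distance in region_distances.items():
    print(f"{name}: {distance:.3f}")
```

In this invented data the ligand-binding region is identical in both species while the membrane-spanning region differs at a third of its sites, which is the pattern the receptor example describes. Real comparisons would start from a proper sequence alignment and would usually use many more sites.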
Sequencing Genes or Genomic Regions to Build Phylogenies
This variation in the speed of evolution across different regions of the genome can be used to answer questions about evolutionary relationships. Genes and gene regions can be selected and sequenced across groups of individuals to answer questions as narrow as “are these populations potentially different species?” or as broad as “how do these phyla place into the tree of life?”. For the former, selecting a gene with a relatively weakly conserved region would help to identify population-level differences. Conversely, for groups as diverse as phyla, a highly conserved gene region may retain enough homology to produce a phylogeny of such groups. Commonly used regions for molecular phylogenetic analyses such as these include ribosomal RNA (rRNA) genes (such as 16S, 18S, or 28S rRNA) and the genomic regions known as internal transcribed spacers (ITS1 and ITS2), which sit between the rRNA genes.
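The reasoning behind marker choice can be sketched with a pairwise distance calculation. Everything in the example below is assumed for illustration: the population names, the two "markers", and their sequences are invented, and the helper functions are hypothetical rather than part of any library. Distance is again the simple proportion of differing sites; published analyses generally rely on dedicated alignment and tree-building software and on explicit substitution models.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned, ungapped sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"  # ignore alignment gaps
    ]
    if not pairs:
        raise ValueError("no ungapped sites to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def pairwise_distances(alignment):
    """Map each pair of sample names to the p-distance between their sequences."""
    return {
        (name_a, name_b): p_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(sorted(alignment), 2)
    }


# Invented data: a highly conserved marker shows no differences
# among three closely related populations...
conserved_marker = {
    "pop_A": "ACGTACGTACGTACGTACGT",
    "pop_B": "ACGTACGTACGTACGTACGT",
    "pop_C": "ACGTACGTACGTACGTACGT",
}

# ...whereas a more variable marker separates the same populations.
variable_marker = {
    "pop_A": "ACGTACGTACGTACGTACGT",
    "pop_B": "ACGTACGAACGTACGTACGT",
    "pop_C": "ACGTTCGAACGTACCTACGT",
}

conserved_distances = pairwise_distances(conserved_marker)
variable_distances = pairwise_distances(variable_marker)

for pair, distance in variable_distances.items():
    print(pair, f"{distance:.2f}")
```

With these toy sequences the conserved marker gives a distance of zero for every pair, so it carries no signal at the population level, while the variable marker yields distinct distances that a tree-building method could use. The reverse problem arises for very distant groups, where a fast-evolving region may have changed so much that it can no longer be aligned reliably.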