10.7: The Eukaryotic Promoter Region
The eukaryotic promoter region is a segment of DNA located upstream of a gene. It contains an RNA polymerase binding site, a transcription start site, and several cis-regulatory sequences. The proximal promoter region lies in the immediate vicinity of the gene and comprises the core promoter together with nearby cis-regulatory sequences. The core promoter is the binding site for RNA polymerase and usually spans positions -35 to +35 relative to the transcription start site. Distal promoter regions contain cis-regulatory sequences that can lie thousands of base pairs away from the gene. The length of a promoter region varies significantly from gene to gene.
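The -35 to +35 window can be illustrated with a short Python sketch. It assumes the common coordinate convention in which the transcription start site is position +1 and there is no position 0; the function names and the `tss_index` parameter are hypothetical and chosen only for this example.

```python
def position_to_index(pos: int, tss_index: int) -> int:
    """Convert a promoter coordinate (e.g. -35 or +1) to a 0-based string index.

    Assumes the transcription start site is position +1 and that there is
    no position 0, so -1 is the base immediately upstream of +1.
    """
    if pos == 0:
        raise ValueError("promoter coordinates have no position 0")
    return tss_index + pos - 1 if pos > 0 else tss_index + pos


def core_promoter(seq: str, tss_index: int, upstream: int = 35, downstream: int = 35) -> str:
    """Return the bases from -upstream to +downstream around the start site.

    tss_index is the 0-based index of the +1 nucleotide in seq. The window
    is clipped if it would run past either end of the sequence.
    """
    start = max(0, tss_index - upstream)         # index of position -upstream
    end = min(len(seq), tss_index + downstream)  # one past position +downstream
    return seq[start:end]
```

With the defaults, a start site far enough from both ends of the sequence yields a 70-base window: 35 bases upstream plus 35 bases beginning at the start site itself.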
The core promoter contains characteristic motifs where general transcription factors bind and recruit RNA polymerase. The TATA box is a motif located 25-30 base pairs upstream of the transcription start site. Because of its high A-T content, it is more flexible and less thermodynamically stable than other promoter motifs, which allows efficient binding of the transcription machinery. TATA boxes are found in genes that require high levels of expression under specific conditions, such as genes involved in cell differentiation. The TATA box is often flanked by short nucleotide motifs known as B-recognition elements (BREs). Transcription Factor II B (TFIIB), an important component in the assembly of the transcription machinery at the TATA box, binds to these elements.
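As a rough illustration of the TATA box position and its A-T richness, the Python sketch below looks for a motif that starts 25-30 bases upstream of a known start site and reports the A-T fraction of a stretch of sequence. Several things here are assumptions made for the example rather than statements from the text: the literal string "TATA" stands in for the real, more variable motif; the stated distance is taken to refer to the first base of the motif; and `find_tata`, `at_fraction`, and `tss_index` are hypothetical names.

```python
def find_tata(seq: str, tss_index: int, motif: str = "TATA",
              window: tuple = (-30, -25)) -> list:
    """Return the promoter coordinates (negative integers) at which motif starts.

    tss_index is the 0-based index of the +1 nucleotide. For upstream
    positions, the string index is simply tss_index + position.
    """
    seq = seq.upper()
    first, last = window
    hits = []
    for pos in range(first, last + 1):
        i = tss_index + pos
        # Skip positions that fall before the start of the sequence.
        if i >= 0 and seq.startswith(motif, i):
            hits.append(pos)
    return hits


def at_fraction(segment: str) -> float:
    """Fraction of bases in segment that are A or T (0.0 for an empty string)."""
    segment = segment.upper()
    if not segment:
        return 0.0
    return (segment.count("A") + segment.count("T")) / len(segment)
```

A real promoter scan would normally use a position weight matrix rather than an exact string match; the exact match is used here only to keep the example short.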
The initiator element (Inr), defined by the degenerate sequence YYANWYY*, contains the transcription start site. Downstream of the initiator element lies another characteristic motif, the downstream promoter element (DPE), defined by the degenerate sequence RGWYVT. The TATA box and the DPE regulate similar types of genes, and a eukaryotic promoter can have either a TATA box or a DPE. The initiator element can function synergistically with either a TATA box or a DPE to regulate transcription.
CpG islands are another type of core promoter element. They regulate the expression of other classes of genes, such as housekeeping genes, which require constant expression at low levels. They are called CpG islands because they are rich in sites where a cytosine is immediately followed by a guanine. The “p” represents the phosphodiester bond that links C to G. CpG islands are also known to occur in distal promoter regions.
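One way to make "rich in CpG" concrete is to compare the number of CG dinucleotides observed in a sequence with the number expected from its C and G counts alone. The Python sketch below computes the G+C fraction and this observed/expected ratio. The ratio is a widely used heuristic that is not described in the text above, and the function name `cpg_stats` is hypothetical; no classification thresholds are applied here.

```python
def cpg_stats(seq: str) -> tuple:
    """Return (gc_fraction, cpg_observed_over_expected) for a DNA sequence.

    The expected CpG count is (number of C * number of G) / length, i.e. the
    count predicted if C and G were placed independently of each other.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    expected = c * g / n
    ratio = cpg / expected if expected else 0.0
    return gc_fraction, ratio
```

A ratio well below 1 indicates that CG dinucleotides are rarer than the base composition alone would predict, while a CpG island would be expected to score comparatively high on both measures.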
*R stands for A or G; W stands for A or T; Y stands for C or T; V stands for A, C, or G; and N stands for any of the four bases.
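The degenerate codes in the footnote map directly onto regular-expression character classes, so motifs such as YYANWYY and RGWYVT can be searched for in a sequence. The Python sketch below shows one way to do this; the names `IUPAC`, `motif_to_regex`, and `find_motif` are hypothetical, and only the codes listed in the footnote are included.

```python
import re

# Degenerate nucleotide codes from the footnote, as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # A or G
    "W": "[AT]",    # A or T
    "Y": "[CT]",    # C or T
    "V": "[ACG]",   # A, C, or G
    "N": "[ACGT]",  # any base
}


def motif_to_regex(motif: str) -> str:
    """Translate a degenerate motif such as 'YYANWYY' into a regex string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(seq: str, motif: str) -> list:
    """Return (0-based start, matched text) for every match, including overlaps."""
    # The lookahead lets the scan report matches that overlap one another.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]
```

Because these motifs are short and degenerate, matches occur frequently by chance, so a hit is only a candidate site and says nothing on its own about promoter function. The search covers a single strand as written.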