A protein domain is a self-contained section of a protein that is capable of independently folding into its three-dimensional structure.
Protein domains can fold in only a limited number of ways, because certain three-dimensional arrangements of alpha helices, beta sheets, and loops are more energetically favorable than others. There are tens of thousands of different proteins but fewer than 1,500 known domains, because similar domains appear in many proteins.
Proteins with short amino acid sequences will likely have only one domain, but the majority of proteins contain multiple domains with distinct functions.
The multiple domains of a protein work together to allow it to perform its role, and often one domain regulates the function of another. For example, when a ligand binds to the ligand-binding domain of a receptor, this can enhance the enzymatic activity of the catalytic domain.
Protein domains have evolved as modules that can be rearranged to create new proteins with unique functions, a process called domain shuffling. These units combine in different orientations depending on the location of the N- and C-terminal tails of the amino acid chain.
If the N and C termini of a domain are in close proximity, the protein ends up as a compact, globular structure; if the N and C termini are on opposite ends of the domain, the overall structure ends up elongated and linear.
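The geometric idea above can be illustrated with a minimal sketch. It assumes each terminus is given as an (x, y, z) coordinate in ångströms. The function names and the 10 Å cutoff are hypothetical choices made for illustration only, not an established structural criterion.

```python
import math


def termini_distance(n_term, c_term):
    """Return the straight-line distance between two (x, y, z) points."""
    return math.dist(n_term, c_term)


def expected_shape(n_term, c_term, close_cutoff=10.0):
    """Label a domain arrangement from the spacing of its termini.

    close_cutoff is a hypothetical threshold (in the same units as the
    coordinates) chosen only to make the example concrete.
    """
    if termini_distance(n_term, c_term) <= close_cutoff:
        return "compact"
    return "elongated"


# Termini 5 units apart fall under the illustrative cutoff.
print(expected_shape((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
# Termini 40 units apart on opposite ends exceed it.
print(expected_shape((0.0, 0.0, 0.0), (40.0, 0.0, 0.0)))
```

In practice the coordinates would come from a solved structure, and any cutoff would need to be justified against real data.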
Src is a protein containing three distinct protein domains. Two of these, the Src homology domains SH2 and SH3, allow proteins that contain them to bind to specific amino acid sequences.
Each of these domains is conserved across different proteins. The SH2 domain binds to phosphorylated tyrosines and is found in over 115 different proteins. The SH3 domain binds to proline-rich sequences and is found 300 times in the human genome.
These protein domains and others are genetic building blocks for creating proteins that combine multiple functions in unique ways.
Protein domains are small structurally independent units that are part of a single amino acid chain. Although these domains are often structurally independent, they may rely on synergistic effects to perform their functions as part of a larger protein. Protein domains may be conserved within the same organism, as well as across different organisms.
A limited set of protein domains often duplicates and recombines during evolution. These domains can be organized in different combinations to form functionally distinct molecules in a process known as domain shuffling. The tertiary structure of evolutionarily related proteins is often more similar than their primary amino acid sequences; therefore, analyzing the three-dimensional structure of a protein domain, in addition to its sequence, is essential for studying protein domain conservation.
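The combinatorial reach of domain shuffling can be sketched in a few lines. This is a toy model: it treats a protein as an ordered tuple of domain labels from the N terminus to the C terminus and allows repeats to stand in for duplication. The function name and the three-label pool are illustrative assumptions, and the model ignores whether a given arrangement would fold or function.

```python
from itertools import product


def shuffled_architectures(domain_pool, max_domains):
    """Yield every ordered arrangement of 1..max_domains domains.

    Repeats are allowed, which loosely models domain duplication.
    """
    for length in range(1, max_domains + 1):
        for arrangement in product(domain_pool, repeat=length):
            yield arrangement


# Illustrative pool of domain labels.
pool = ["SH2", "SH3", "catalytic"]
architectures = list(shuffled_architectures(pool, 3))

# 3 single-domain + 9 two-domain + 27 three-domain arrangements.
print(len(architectures))  # 39
```

Even a pool of three modules yields dozens of possible architectures, which gives a rough sense of how a limited domain repertoire can underlie a much larger number of proteins.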
The Argonaute protein family has three essential, conserved domains: PAZ, MID, and PIWI. These proteins have highly specialized binding modules that associate with small RNA components, including microRNAs, short interfering RNAs, and Piwi-interacting RNAs, to participate in gene silencing. These small RNAs silence gene function only when they associate with Argonaute proteins. The characteristic feature of the PAZ domain is its binding pocket for the 3′ protruding end of the small RNA. The PIWI domain, which exhibits slicer activity, is structurally similar to bacterial RNase H, a protein responsible for hydrolyzing RNA in an RNA-DNA complex. The MID domain lies between the PAZ and PIWI domains and has a binding pocket for the 5′ phosphate of the small RNA. One of the conserved motifs in these domains is the aspartic acid-aspartic acid-histidine (DDH) motif, which participates in the protein's catalytic function.
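The domain layout described above can be captured as a small lookup structure. This is a sketch for organizing the information in the paragraph, not a representation used by any particular bioinformatics library; the `Domain` class and `domains_matching` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    name: str
    role: str


# Order follows the text: MID sits between PAZ and PIWI.
ARGONAUTE_DOMAINS = (
    Domain("PAZ", "binds the 3' protruding end of the small RNA"),
    Domain("MID", "binds the 5' phosphate of the small RNA"),
    Domain("PIWI", "slicer activity; structurally similar to RNase H"),
)


def domains_matching(keyword, domains=ARGONAUTE_DOMAINS):
    """Return names of domains whose role mentions keyword (case-insensitive)."""
    needle = keyword.lower()
    return [d.name for d in domains if needle in d.role.lower()]


print(domains_matching("slicer"))  # ['PIWI']
print(domains_matching("binds"))   # ['PAZ', 'MID']
```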
Argonaute proteins are conserved across organisms, and the number of family members varies by species: five in Drosophila, eight in humans, ten in Arabidopsis, and twenty-seven in C. elegans.