The active sites of enzymes consist of residues necessary for catalysis and structurally important noncatalytic residues that together maintain the architecture and function of the active site. Examples of evolutionary interactions between catalytic and noncatalytic residues have been difficult to define and experimentally validate due to a general intolerance of these residues to substitution. Here, using computational methods to predict coevolving residues, we identify a network of positions consisting of two catalytic metal-binding residues and two adjacent noncatalytic residues in LAGLIDADG homing endonucleases (LHEs). Distinct combinations of the four residues in the network map to distinct LHE subfamilies, with a striking distribution of the metal-binding Asp (D) and Glu (E) residues. Mutation of these four positions in three LHEs (I-LtrI, I-OnuI, and I-HjeMI) indicates that the combinations of residues tolerated are specific to each enzyme. Kinetic analyses under single-turnover conditions revealed that I-LtrI activity could be modulated over an approximately 100-fold range by mutation of residues in the coevolving network. I-LtrI catalytic site variants with low activity could be rescued by compensatory mutations at adjacent noncatalytic sites that restore an optimal coevolving network and vice versa. Our results demonstrate that LHE activity is constrained by an evolutionary barrier of residues with strong context-dependent effects. Creation of optimal coevolving active-site networks is therefore an important consideration in engineering of LHEs and other enzymes.
Positions in a protein are thought to coevolve to maintain important structural and functional interactions over evolutionary time. The detection of putative coevolving positions can provide important new insights into a protein family in the same way that knowledge is gained by recognizing evolutionarily conserved characters and characteristics. Putatively coevolving positions can be detected with statistical methods that identify covarying positions. However, positions in protein alignments can covary for many reasons other than coevolution; thus, it is crucial to create high-quality multiple sequence alignments for coevolution inference. Furthermore, it is important to understand common signs and sources of error. When confounding factors are accounted for, coevolution is a rich resource for protein engineering information.
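As a rough illustration of what a statistical covariation measure looks like, the sketch below computes the mutual information between two columns of a multiple sequence alignment. Mutual information is only one commonly used covariation statistic; the abstract does not say which method the authors used, and the function names and the toy alignment here are hypothetical.

```python
from collections import Counter
from math import log2


def column(alignment, i):
    """Return the residues found at position i across all aligned sequences."""
    return [seq[i] for seq in alignment]


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    Assumes every sequence in `alignment` has the same length. No correction
    for phylogeny, gaps, or sampling bias is applied, so a high score here
    indicates covariation, not necessarily coevolution.
    """
    n = len(alignment)
    col_i = column(alignment, i)
    col_j = column(alignment, j)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))

    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 0 and 1 vary together, column 2 is invariant.
toy_alignment = ["DKA", "DKA", "ERA", "ERA"]
```

In this toy case, `mutual_information(toy_alignment, 0, 1)` is high because D always pairs with K and E with R, whereas any pairing with the fully conserved column 2 scores zero. Real alignments would need the confounding factors mentioned above to be handled before such scores are interpreted.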
Homing endonucleases mobilize their own genes by generating double-strand breaks at individual target sites within potential host DNA. Because of their high specificity, these proteins are used for "genome editing" in higher eukaryotes. However, alteration of homing endonuclease specificity is quite challenging. Here we describe the identification and phylogenetic analysis of over 200 naturally occurring LAGLIDADG homing endonucleases (LHEs). Biochemical and structural characterization of endonucleases from one clade within the phylogenetic tree demonstrates strong conservation of protein structure contrasted against highly diverged DNA target sites and indicates that a significant fraction of these proteins are sufficiently stable and active to serve as engineering scaffolds. This information was exploited to create a targeting enzyme to disrupt the endogenous monoamine oxidase B gene in human cells. The ubiquitous presence and diversity of LHEs described in this study may facilitate the creation of many tailored nucleases for genome editing.
We developed a low-cost, high-throughput microbiome profiling method that uses combinatorial sequence tags attached to PCR primers that amplify the rRNA V6 region. Amplified PCR products are sequenced using an Illumina paired-end protocol to generate millions of overlapping reads. Combinatorial sequence tagging can be used to examine hundreds of samples with far fewer primers than are required when sequence tags are incorporated at only a single end. The number of reads generated permitted saturating or near-saturating analysis of samples of the vaginal microbiome. The large number of reads allowed an in-depth analysis of errors, and we found that PCR-induced errors made up the vast majority of non-organism-derived species variants, an observation that has significant implications for sequence clustering of similar high-throughput data. We show that the short reads are sufficient to assign organisms to the genus or species level in most cases. We suggest that this method will be useful for the deep sequencing of any short nucleotide region that is taxonomically informative; these include the V3 and V5 regions of the bacterial 16S rRNA genes and the eukaryotic V9 region that is gaining popularity for sampling protist diversity.
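The primer savings from combinatorial tagging follow from simple counting: tagging both ends lets every forward tag pair with every reverse tag. The sketch below shows one way such an assignment could be made. The tag sequences, sample names, and the function `assign_tag_pairs` are hypothetical and are not taken from the study.

```python
from itertools import product


def assign_tag_pairs(forward_tags, reverse_tags, samples):
    """Give each sample a unique (forward tag, reverse tag) combination.

    With F forward and R reverse tags there are F * R distinct pairs,
    so F + R tagged primers can label up to F * R samples.
    """
    pairs = list(product(forward_tags, reverse_tags))
    if len(samples) > len(pairs):
        raise ValueError("not enough tag combinations for the number of samples")
    return dict(zip(samples, pairs))


# Hypothetical tags: 3 forward + 2 reverse primers label up to 6 samples.
forward_tags = ["ACGT", "TGCA", "GATC"]
reverse_tags = ["CTAG", "AGTC"]
samples = [f"sample_{k}" for k in range(6)]

assignment = assign_tag_pairs(forward_tags, reverse_tags, samples)
```

The gain grows with scale: under the same counting, 16 forward and 16 reverse tagged primers (32 in total) would distinguish 256 samples, whereas tagging a single end would need one tagged primer per sample.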
Women living with HIV and co-infected with bacterial vaginosis (BV) are at higher risk for transmitting HIV to a partner or newborn. It is poorly understood which bacterial communities constitute BV or the normal vaginal microbiota among this population and how the microbiota associated with BV responds to antibiotic treatment.
There is currently no way to verify the quality of a multiple sequence alignment that is independent of the assumptions used to build it. Sequence alignments are typically evaluated by a number of established criteria: sequence conservation, the number of aligned residues, the frequency of gaps, and the probable correct gap placement. Covariation analysis is used to find putatively important residue pairs in a sequence alignment. Different alignments of the same protein family give different results, demonstrating that covariation depends on the quality of the sequence alignment. We thus hypothesized that current criteria are insufficient to build alignments for use with covariation analyses.
The use of sequence alignments to understand protein families is ubiquitous in molecular biology. High-quality alignments are difficult to build, and protein alignment remains one of the largest open problems in computational biology. Misalignments can lead to inferential errors about protein structure, folding, function, phylogeny, and residue importance. Identifying alignment errors is difficult because alignments are built and validated on the same primary criterion: sequence conservation. Local covariation identifies systematic misalignments and is independent of conservation. We demonstrate an alignment curation tool, LoCo, that integrates local covariation scores with the Jalview alignment editor. Using LoCo, we illustrate how local covariation is capable of identifying alignment errors due to the reduction of positional independence in the region of misalignment. We highlight three alignments from the benchmark database, BAliBASE 3, that contain regions of high local covariation, and investigate the causes to illustrate these types of scenarios. Two alignments contain sequential and structural shifts that cause elevated local covariation. Realignment of these misaligned segments reduces local covariation; these alternative alignments are supported with structural evidence. We also show that local covariation identifies active site residues in a validated alignment of paralogous structures. LoCo is available at https://sourceforge.net/projects/locoprotein/files/.
Journal of Visualized Experiments