In vivo methods such as ChIP-chip are well-established techniques used to determine global gene targets for transcription factors. However, they are of limited use in exploring bacterial two-component regulatory systems with uncharacterized activation conditions. Such systems regulate transcription only when activated in the presence of unique signals. Since these signals are often unknown, the in vitro microarray-based method described in this video article can be used to determine gene targets and binding sites for response regulators. This DNA-affinity-purified-chip method may be used for any purified regulator in any organism with a sequenced genome. The protocol involves allowing the purified tagged protein to bind to sheared genomic DNA and then affinity purifying the protein-bound DNA, followed by fluorescent labeling of the DNA and hybridization to a custom tiling array. Preceding steps that may be used to optimize the assay for specific regulators are also described. The peaks generated by the array data analysis are used to predict binding site motifs, which are then experimentally validated. The motif predictions can be further used to determine gene targets of orthologous response regulators in closely related species. We demonstrate the applicability of this method by determining the gene targets and binding site motifs, and thus predicting the function, of the sigma54-dependent response regulator DVU3023 in the environmental bacterium Desulfovibrio vulgaris Hildenborough.
26 Related JoVE Articles!
TransFLP — A Method to Genetically Modify Vibrio cholerae Based on Natural Transformation and FLP-recombination
Institutions: Ecole Polytechnique Fédérale de Lausanne (EPFL).
Several methods are available to manipulate bacterial chromosomes 1-3. Most of these protocols rely on the insertion of conditionally replicative plasmids (e.g. harboring pir-dependent or temperature-sensitive replicons 1,2). These plasmids are integrated into bacterial chromosomes based on homology-mediated recombination. Such insertional mutants are often directly used in experimental settings. Alternatively, selection for plasmid excision followed by its loss can be performed, which for Gram-negative bacteria often relies on the counter-selectable levan sucrase enzyme encoded by sacB. The excision can either restore the pre-insertion genotype or result in an exchange between the chromosome and the plasmid-encoded copy of the modified gene. A disadvantage of this technique is that it is time-consuming. The plasmid has to be cloned first; it requires horizontal transfer into V. cholerae (most notably by mating with an E. coli donor strain) or artificial transformation of the latter; and the excision of the plasmid is random and can either restore the initial genotype or create the desired modification if no positive selection is exerted. Here, we present a method for rapid manipulation of V. cholerae. This TransFLP method is based on the recently discovered chitin-mediated induction of natural competence in this organism 6 and other representatives of the genus Vibrio such as V. fischeri 7. Natural competence allows the uptake of free DNA including PCR-generated DNA fragments. Once taken up, the DNA recombines with the chromosome given the presence of a minimum of 250-500 bp of flanking homologous region 8. Including a selection marker in-between these flanking regions allows easy detection of frequently occurring transformants.
This method can be used for different genetic manipulations of V. cholerae and potentially also other naturally competent bacteria. We provide three novel examples of what can be accomplished by this method in addition to our previously published study on single gene deletions and the addition of affinity-tag sequences 5. Several optimization steps concerning the initial protocol of chitin-induced natural transformation 6 are incorporated in this TransFLP protocol. These include, among others, the replacement of crab shell fragments by commercially available chitin flakes 8, the donation of PCR-derived DNA as transforming material 9, and the addition of FLP-recombination target sites (FRT) 5. FRT sites allow site-directed excision of the selection marker mediated by the Flp recombinase 10.
Immunology, Issue 68, Microbiology, Genetics, natural transformation, DNA uptake, FLP recombination, chitin, Vibrio cholerae
Genome-wide Analysis using ChIP to Identify Isoform-specific Gene Targets
Institutions: University of Illinois Chicago - UIC, Universitat Pompeu Fabra, Whitehead Institute for Biomedical Research.
Recruitment of transcriptional and epigenetic factors to their targets is a key step in their regulation. Prominently featured in recruitment are the protein domains that bind to specific histone modifications. One such domain is the plant homeodomain (PHD), found in several chromatin-binding proteins. The epigenetic factor RBP2 has multiple PHD domains; however, they have different functions (Figure 4). In particular, the C-terminal PHD domain, found in an RBP2 oncogenic fusion in human leukemia, binds to trimethylated lysine 4 in histone H3 (H3K4me3) 1. The transcript corresponding to the RBP2 isoform containing the C-terminal PHD accumulates during differentiation of promonocytic, lymphoma-derived, U937 cells into monocytes 2. Consistent with both sets of data, genome-wide analysis showed that in differentiated U937 cells, the RBP2 protein localizes to genomic regions highly enriched for H3K4me3 3. Localization of RBP2 to its targets correlates with a decrease in H3K4me3 due to RBP2 histone demethylase activity and a decrease in transcriptional activity. In contrast, two other PHDs of RBP2 are unable to bind H3K4me3. Notably, the C-terminal PHD domain of RBP2 is absent in the smaller RBP2 isoform 4. It is conceivable that the small isoform of RBP2, which lacks interaction with H3K4me3, differs from the larger isoform in genomic location. The difference in genomic location of RBP2 isoforms may account for the observed diversity in RBP2 function. Specifically, RBP2 is a critical player in cellular differentiation mediated by the retinoblastoma protein (pRB). Consistent with these data, previous genome-wide analysis, without distinction between isoforms, identified two distinct groups of RBP2 target genes: 1) genes bound by RBP2 in a manner that is independent of differentiation; 2) genes bound by RBP2 in a differentiation-dependent manner.
To identify differences in localization between the isoforms, we performed genome-wide location analysis by ChIP-Seq. Using antibodies that detect both RBP2 isoforms, we located all RBP2 targets. Additionally, we have antibodies that bind only the large, and not the small, RBP2 isoform (Figure 4). After identifying the large isoform targets, one can then subtract them from all RBP2 targets to reveal the targets of the small isoform. These data show the contribution of the chromatin-interacting domain to protein recruitment to its binding sites in the genome.
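The subtraction step described above amounts to a set difference. The following is a minimal sketch, assuming ChIP-Seq peaks have already been assigned to gene identifiers; the gene names and the function name `small_isoform_targets` are hypothetical placeholders, not data or code from the study.

```python
def small_isoform_targets(all_targets, large_targets):
    """Return targets detected with the pan-RBP2 antibody but not with
    the large-isoform-specific antibody (a plain set difference)."""
    return set(all_targets) - set(large_targets)


# Hypothetical gene identifiers, for illustration only.
all_rbp2 = {"geneA", "geneB", "geneC", "geneD"}  # antibody recognizing both isoforms
large_only = {"geneB", "geneD"}                  # antibody recognizing the large isoform

candidates = small_isoform_targets(all_rbp2, large_only)
print(sorted(candidates))  # ['geneA', 'geneC']
```

Note that a gene bound by both isoforms is also removed by this subtraction, so the result is best read as candidate targets bound by the small isoform alone.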
Biochemistry, Issue 41, chromatin immunoprecipitation, ChIP-Seq, RBP2, JARID1A, KDM5A, isoform-specific recruitment
Primer-Free Aptamer Selection Using A Random DNA Library
Institutions: Pennsylvania State University.
Aptamers are highly structured oligonucleotides (DNA or RNA) that can bind to targets with affinities comparable to antibodies 1. They are identified through an in vitro selection process called Systematic Evolution of Ligands by EXponential enrichment (SELEX) to recognize a wide variety of targets, from small molecules to proteins and other macromolecules 2-4. Aptamers have properties that are well suited for in vivo diagnostic and/or therapeutic applications: besides good specificity and affinity, they are easily synthesized, survive more rigorous processing conditions, are poorly immunogenic, and their relatively small size can result in facile penetration of tissues.
Aptamers that are identified through the standard SELEX process usually comprise ~80 nucleotides (nt), since they are typically selected from nucleic acid libraries with ~40 nt long randomized regions plus fixed primer sites of ~20 nt on each side. The fixed primer sequences thus can comprise nearly 50% of the library sequences, and therefore may positively or negatively affect identification of aptamers in the selection process 3, although bioinformatics approaches suggest that the fixed sequences do not contribute significantly to aptamer structure after selection 5. To address these potential problems, primer sequences have been blocked by complementary oligonucleotides or switched to different sequences midway during the rounds of SELEX 6, or they have been trimmed to 6-9 nt 7, 8. Wen and Gray 9 designed a primer-free genomic SELEX method, in which the primer sequences were completely removed from the library before selection and were then regenerated to allow amplification of the selected genomic fragments. However, to employ the technique, a unique genomic library has to be constructed, which possesses limited diversity, and regeneration after rounds of selection relies on a linear reamplification step. Alternatively, efforts to circumvent problems caused by fixed primer sequences using high efficiency partitioning are met with problems regarding PCR amplification 10.
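The "nearly 50%" figure follows directly from the library layout given above. A minimal sketch of the arithmetic, where the function name `fixed_fraction` is hypothetical and the pairing of trimmed 6-9 nt primers with a 40 nt random region is an assumption made only for illustration:

```python
def fixed_fraction(random_nt, primer_nt_each_side):
    """Fraction of a library molecule occupied by the two fixed primer sites."""
    total = random_nt + 2 * primer_nt_each_side
    return 2 * primer_nt_each_side / total


# Standard layout from the text: ~40 nt random region, ~20 nt primer per side.
print(fixed_fraction(40, 20))  # 0.5

# Assumed layout: primers trimmed to 6 or 9 nt around the same 40 nt region.
print(round(fixed_fraction(40, 6), 3))  # 0.231
print(round(fixed_fraction(40, 9), 3))  # 0.31
```

Even trimmed primers leave a noticeable fixed fraction under this assumption, whereas a library with no flanking sequence during binding has a fixed fraction of zero.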
We have developed a primer-free (PF) selection method that significantly simplifies SELEX procedures and effectively eliminates primer-interference problems 11, 12. The protocols work in a straightforward manner. The central random region of the library is purified without extraneous flanking sequences and is bound to a suitable target (for example to a purified protein or complex mixtures such as cell lines). Then the bound sequences are obtained, reunited with flanking sequences, and re-amplified to generate selected sub-libraries. As an example, here we selected aptamers to S100B, a protein marker for melanoma. Binding assays showed Kd values in the 10^-7 M range after a few rounds of selection, and we demonstrate that the aptamers function effectively in a sandwich binding format.
Cellular Biology, Issue 41, aptamer, selection, S100B, sandwich
Transmembrane Domain Oligomerization Propensity determined by ToxR Assay
Institutions: University of Colorado at Boulder.
The oversimplified view of protein transmembrane domains as merely anchors in phospholipid bilayers has long since been disproven. In many cases membrane-spanning proteins have evolved highly sophisticated mechanisms of action.1-3
One way in which membrane proteins can modulate their structures and functions is by direct and specific contact of hydrophobic helices, forming structured transmembrane oligomers.4,5
Much recent work has focused on the distribution of amino acids preferentially found in the membrane environment in comparison to aqueous solution and the different intermolecular forces that drive protein association.6,7
Nevertheless, studies of molecular recognition at the transmembrane domain of proteins still lag behind those of water-soluble regions. A major hurdle remains: despite the remarkable specificity and affinity that transmembrane oligomerization can achieve,8 direct measurement of their association is challenging. Traditional methodologies applied to the study of integral membrane protein function can be hampered by the inherent insolubility of the sequences under examination. Biophysical insights gained from studying synthetic peptides representing transmembrane domains can provide useful structural insight. However, the biological relevance of the detergent micellar or liposome systems used in these studies to mimic cellular membranes is often questioned; do peptides adopt a native-like structure under these conditions and does their functional behaviour truly reflect the mode of action within a native membrane? In order to study the interactions of transmembrane sequences in natural phospholipid bilayers, the Langosch lab developed ToxR transcriptional reporter assays.9
The transmembrane domain of interest is expressed as a chimeric protein with maltose binding protein for location to the periplasm and ToxR to provide a report of the level of oligomerization (Figure 1).
In the last decade, several other groups (e.g. Engelman, DeGrado, Shai) further optimized and applied this ToxR reporter assay.10-13
The various ToxR assays have become a gold standard to test protein-protein interactions in cell membranes. We herein demonstrate a typical experimental operation conducted in our laboratory that primarily follows protocols developed by Langosch. This generally applicable method is useful for the analysis of transmembrane domain (TMD) self-association in E. coli, where β-galactosidase production is used to assess the TMD oligomerization propensity. Upon TMD-induced dimerization, ToxR binds to the ctx promoter, causing up-regulation of the lacZ gene for β-galactosidase. A colorimetric readout is obtained by addition of ONPG to lysed cells. Hydrolytic cleavage of ONPG by β-galactosidase results in the production of the light-absorbing species o-nitrophenolate (ONP) (Figure 2).
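One common way to turn an ONPG readout into a comparable activity value is the classic Miller unit calculation. The abstract does not state which normalization is used, so the sketch below is an assumption rather than the authors' procedure; the function name `miller_units` and all readings are hypothetical.

```python
def miller_units(a420, a550, a600, minutes, volume_ml):
    """Classic Miller units for an ONPG-based beta-galactosidase assay.

    a420: absorbance of o-nitrophenolate after the reaction
    a550: absorbance used to correct for light scattering by cell debris
    a600: culture density before lysis
    minutes: reaction time; volume_ml: culture volume assayed
    """
    if minutes <= 0 or volume_ml <= 0 or a600 <= 0:
        raise ValueError("minutes, volume_ml and a600 must be positive")
    # 1.75 * A550 estimates the scattering contribution at 420 nm.
    return 1000 * (a420 - 1.75 * a550) / (minutes * volume_ml * a600)


# Hypothetical readings, for illustration only.
activity = miller_units(a420=0.9, a550=0.02, a600=0.5, minutes=10, volume_ml=0.1)
print(round(activity, 1))
```

Normalizing to culture density and reaction time lets constructs with different growth be compared; a higher value would indicate stronger ToxR-driven reporter expression and therefore a higher TMD oligomerization propensity.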
Cellular Biology, Issue 51, Transmembrane domain, oligomerization, transcriptional reporter, ToxR, latent membrane protein-1
Microwave-assisted Functionalization of Poly(ethylene glycol) and On-resin Peptides for Use in Chain Polymerizations and Hydrogel Formation
Institutions: University of Rochester, University of Rochester Medical Center.
One of the main benefits to using poly(ethylene glycol) (PEG) macromers in hydrogel formation is synthetic versatility. The ability to draw from a large variety of PEG molecular weights and configurations (arm number, arm length, and branching pattern) affords researchers tight control over resulting hydrogel structures and properties, including Young’s modulus and mesh size. This video will illustrate a rapid, efficient, solvent-free, microwave-assisted method to methacrylate PEG precursors into poly(ethylene glycol) dimethacrylate (PEGDM). This synthetic method provides much-needed starting materials for applications in drug delivery and regenerative medicine. The demonstrated method is superior to traditional methacrylation methods as it is significantly faster and simpler, as well as more economical and environmentally friendly, using smaller amounts of reagents and solvents. We will also demonstrate an adaptation of this technique for on-resin methacrylamide functionalization of peptides. This on-resin method allows the N-terminus of peptides to be functionalized with methacrylamide groups prior to deprotection and cleavage from resin. This allows for selective addition of methacrylamide groups to the N-termini of the peptides while amino acids with reactive side groups (e.g. primary amine of lysine, primary alcohol of serine, secondary alcohols of threonine, and phenol of tyrosine) remain protected, preventing functionalization at multiple sites. This article will detail common analytical methods (proton Nuclear Magnetic Resonance spectroscopy (1H-NMR) and Matrix Assisted Laser Desorption Ionization Time of Flight mass spectrometry (MALDI-ToF)) to assess the efficiency of the functionalizations. Common pitfalls and suggested troubleshooting methods will be addressed, as will modifications of the technique which can be used to further tune macromer functionality and resulting hydrogel physical and chemical properties. Use of synthesized products for the formation of hydrogels for drug delivery and cell-material interaction studies will be demonstrated, with particular attention paid to modifying hydrogel composition to affect mesh size, controlling hydrogel stiffness and drug release.
Chemistry, Issue 80, Poly(ethylene glycol), peptides, polymerization, polymers, methacrylation, peptide functionalization, 1H-NMR, MALDI-ToF, hydrogels, macromer synthesis
Specificity Analysis of Protein Lysine Methyltransferases Using SPOT Peptide Arrays
Institutions: Stuttgart University.
Lysine methylation is an emerging post-translational modification that has been identified on several histone and non-histone proteins, where it plays crucial roles in cell development and many diseases. Approximately 5,000 lysine methylation sites have been identified on different proteins, and these are set by a few dozen protein lysine methyltransferases (PKMTs). This suggests that each PKMT methylates multiple proteins; however, until now only one or two substrates have been identified for several of these enzymes. To approach this problem, we have introduced peptide array based substrate specificity analyses of PKMTs. Peptide arrays are powerful tools to characterize the specificity of PKMTs because methylation of several substrates with different sequences can be tested on one array. We synthesized peptide arrays on cellulose membrane using an Intavis SPOT synthesizer and analyzed the specificity of various PKMTs. Based on the results, novel substrates could be identified for several of these enzymes. For example, by employing peptide arrays we showed that NSD1 methylates K44 of H4 instead of the reported H4K20, and that H1.5K168 is highly preferred as a substrate over the previously known H3K36. Hence, peptide arrays are powerful tools to biochemically characterize the PKMTs.
Biochemistry, Issue 93, Peptide arrays, solid phase peptide synthesis, SPOT synthesis, protein lysine methyltransferases, substrate specificity profile analysis, lysine methylation
Split-and-pool Synthesis and Characterization of Peptide Tertiary Amide Library
Institutions: The Scripps Research Institute.
Peptidomimetics are great sources of protein ligands. The oligomeric nature of these compounds enables us to access large synthetic libraries on solid phase by using combinatorial chemistry. One of the most well studied classes of peptidomimetics is peptoids. Peptoids are easy to synthesize and have been shown to be proteolysis-resistant and cell-permeable. Over the past decade, many useful protein ligands have been identified through screening of peptoid libraries. However, most of the ligands identified from peptoid libraries do not display high affinity, with rare exceptions. This may be due, in part, to the lack of chiral centers and conformational constraints in peptoid molecules. Recently, we described a new synthetic route to access peptide tertiary amides (PTAs). PTAs are a superfamily of peptidomimetics that include but are not limited to peptides, peptoids and N-methylated peptides. With side chains on both α-carbon and main chain nitrogen atoms, the conformation of these molecules is greatly constrained by steric hindrance and allylic 1,3 strain (Figure 1). Our study suggests that these PTA molecules are highly structured in solution and can be used to identify protein ligands. We believe that these molecules can be a future source of high-affinity protein ligands. Here we describe the synthetic method combining the power of both split-and-pool and sub-monomer strategies to synthesize a sample one-bead one-compound (OBOC) library of PTAs.
Chemistry, Issue 88, Split-and-pool synthesis, peptide tertiary amide, PTA, peptoid, high-throughput screening, combinatorial library, solid phase, triphosgene (BTC), one-bead one-compound, OBOC
Methods to Identify the NMR Resonances of the 13C-Dimethyl N-terminal Amine on Reductively Methylated Proteins
Institutions: Louisiana State University.
Nuclear magnetic resonance (NMR) spectroscopy is a proven technique for protein structure and dynamic studies. To study proteins with NMR, stable magnetic isotopes are typically incorporated metabolically to improve the sensitivity and allow for sequential resonance assignment. Reductive 13C-methylation is an alternative labeling method for proteins that are not amenable to bacterial host over-expression, the most common method of isotope incorporation. Reductive 13C-methylation is a chemical reaction performed under mild conditions that modifies a protein's primary amino groups (lysine ε-amino groups and the N-terminal α-amino group) to 13C-dimethylamino groups. The structure and function of most proteins are not altered by the modification, making it a viable alternative to metabolic labeling. Because reductive 13C-methylation adds sparse, isotopic labels, traditional methods of assigning the NMR signals are not applicable. An alternative assignment method using mass spectrometry (MS) to aid in the assignment of protein 13C-dimethylamine NMR signals has been developed. The method relies on partial and different amounts of 13C-labeling at each primary amino group. One limitation of the method arises when the protein's N-terminal residue is a lysine because the α- and ε-dimethylamino groups of Lys1 cannot be individually measured with MS. To circumvent this limitation, two methods are described to identify the NMR resonance of the 13C-dimethylamines associated with both the N-terminal α-amine and the side chain ε-amine. The NMR signals of the N-terminal α-dimethylamine and the side chain ε-dimethylamine of hen egg white lysozyme, Lys1, are identified in 1H-13C heteronuclear single-quantum coherence spectra.
Chemistry, Issue 82, Boranes, Formaldehyde, Dimethylamines, Tandem Mass Spectrometry, nuclear magnetic resonance, MALDI-TOF, Reductive methylation, lysozyme, dimethyllysine, mass spectrometry, NMR
Protease- and Acid-catalyzed Labeling Workflows Employing 18O-enriched Water
Institutions: Boston Biomedical Research Institute.
Stable isotopes are essential tools in biological mass spectrometry. Historically, 18O-stable isotopes have been extensively used to study the catalytic mechanisms of proteolytic enzymes 1-3. With the advent of mass spectrometry-based proteomics, the enzymatically-catalyzed incorporation of 18O-atoms from stable isotopically enriched water has become a popular method to quantitatively compare protein expression levels (reviewed by Fenselau and Yao 4, Miyagi and Rao 5 and Ye et al. 6). 18O-labeling constitutes a simple and low-cost alternative to chemical (e.g. iTRAQ, ICAT) and metabolic (e.g. SILAC) labeling techniques 7. Depending on the protease utilized, 18O-labeling can result in the incorporation of up to two 18O-atoms in the C-terminal carboxyl group of the cleavage product 3. The labeling reaction can be subdivided into two independent processes, the peptide bond cleavage and the carboxyl oxygen exchange reaction 8. In our PALeO (protease-assisted labeling employing 18O-enriched water) adaptation of enzymatic 18O-labeling, we utilized 50% 18O-enriched water to yield distinctive isotope signatures. In combination with high-resolution matrix-assisted laser desorption ionization time-of-flight tandem mass spectrometry (MALDI-TOF/TOF MS/MS), the characteristic isotope envelopes can be used to identify cleavage products with a high level of specificity. We previously have used the PALeO-methodology to detect and characterize endogenous proteases 9 and monitor proteolytic reactions 10-11. Since PALeO encodes the very essence of the proteolytic cleavage reaction, the experimental setup is simple and biochemical enrichment steps of cleavage products can be circumvented. The PALeO-method can easily be extended to (i) time course experiments that monitor the dynamics of proteolytic cleavage reactions and (ii) the analysis of proteolysis in complex biological samples that represent physiological conditions. PALeO-TimeCourse experiments help identify rate-limiting processing steps and reaction intermediates in complex proteolytic pathway reactions. Furthermore, the PALeO-reaction allows us to identify proteolytic enzymes, such as the serine protease trypsin, that are capable of rebinding their cleavage products and catalyzing the incorporation of a second 18O-atom. Such "double-labeling" enzymes can be used for postdigestion 18O-labeling, in which peptides are exclusively labeled by the carboxyl oxygen exchange reaction. Our third strategy extends labeling employing 18O-enriched water beyond enzymes and uses acidic pH conditions to introduce 18O-stable isotope signatures into peptides.
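To see why 50% 18O-enriched water gives a distinctive signature, the expected labeling pattern can be worked out with a binomial model. This is an idealized sketch, assuming each incorporated oxygen is drawn independently from the water pool and that incorporation runs to completion; the function name `o18_distribution` is hypothetical and natural isotope abundance of the peptide is ignored.

```python
from math import comb


def o18_distribution(n_sites, enrichment):
    """Probability of carrying 0..n_sites 18O atoms when each of n_sites
    carboxyl oxygens is drawn independently from water with the given
    18O fraction (binomial model)."""
    return [
        comb(n_sites, k) * enrichment**k * (1 - enrichment) ** (n_sites - k)
        for k in range(n_sites + 1)
    ]


# One oxygen incorporated (cleavage only): unlabeled and +1 18O in equal parts.
print(o18_distribution(1, 0.5))  # [0.5, 0.5]

# Two oxygens incorporated ("double-labeling" enzyme): a 1:2:1 pattern.
print(o18_distribution(2, 0.5))  # [0.25, 0.5, 0.25]
```

Since each 18O atom adds roughly 2 Da, these fractions would appear as peaks spaced about 2 Da apart superimposed on the peptide's natural isotope envelope, which is what makes cleavage products stand out from unlabeled background peptides.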
Biochemistry, Issue 72, Molecular Biology, Proteins, Proteomics, Chemistry, Physics, MALDI-TOF mass spectrometry, proteomics, proteolysis, quantification, stable isotope labeling, labeling, catalyst, peptides, 18-O enriched water
Expression and Purification of the Cystic Fibrosis Transmembrane Conductance Regulator Protein in Saccharomyces cerevisiae
Institutions: University of Manchester.
The cystic fibrosis transmembrane conductance regulator (CFTR) is a chloride channel that, when mutated, can give rise to cystic fibrosis in humans. There is therefore considerable interest in this protein, but efforts to study its structure and activity have been hampered by the difficulty of expressing and purifying sufficient amounts of the protein 1-3. As with many 'difficult' eukaryotic membrane proteins, expression in a fast-growing organism is desirable but challenging; in the yeast S. cerevisiae, only low amounts have been obtained so far, and rapid degradation of the recombinant protein was observed 4-9. Proteins involved in the processing of recombinant CFTR in yeast have been described 6-9. In this report we describe a methodology for expression of CFTR in yeast and its purification in significant amounts. The protocol describes how the earlier proteolysis problems can be overcome and how expression levels of CFTR can be greatly improved by modifying the cell growth conditions and by controlling the induction conditions, in particular the time period prior to cell harvesting. The reagents associated with this protocol (murine CFTR-expressing yeast cells or yeast plasmids) will be distributed via the US Cystic Fibrosis Foundation, which has sponsored the research. An article describing the design and synthesis of the CFTR construct employed in this report will be published separately (Urbatsch, I.; Thibodeau, P. et al., unpublished). In this article we will explain our method beginning with the transformation of the yeast cells with the CFTR construct-containing yeast plasmid (Fig. 1). The construct has a green fluorescent protein (GFP) sequence fused to CFTR at its C-terminus and follows the system developed by Drew et al. The GFP allows the expression and purification of CFTR to be followed relatively easily. The JoVE visualized protocol finishes after the preparation of microsomes from the yeast cells, although we include some suggestions for purification of the protein from the microsomes. Readers may wish to add their own modifications to the microsome purification procedure, dependent on the final experiments to be carried out with the protein and the local equipment available to them. The yeast-expressed CFTR protein can be partially purified using metal ion affinity chromatography, using an intrinsic polyhistidine purification tag. Subsequent size-exclusion chromatography yields a protein that appears to be >90% pure, as judged by SDS-PAGE and Coomassie-staining of the gel.
Molecular Biology, Issue 61, Membrane protein, cystic fibrosis, CFTR, protein expression, Cystic Fibrosis Foundation, expression system, green fluorescent protein
Analysis of Fatty Acid Content and Composition in Microalgae
Institutions: Wageningen University and Research Center.
A method to determine the content and composition of total fatty acids present in microalgae is described. Fatty acids are a major constituent of microalgal biomass. These fatty acids can be present in different acyl-lipid classes. Especially the fatty acids present in triacylglycerol (TAG) are of commercial interest, because they can be used for production of transportation fuels, bulk chemicals, nutraceuticals (ω-3 fatty acids), and food commodities. To develop commercial applications, reliable analytical methods for quantification of fatty acid content and composition are needed. Microalgae are single cells surrounded by a rigid cell wall. A fatty acid analysis method should provide sufficient cell disruption to liberate all acyl lipids and the extraction procedure used should be able to extract all acyl lipid classes.
With the method presented here all fatty acids present in microalgae can be accurately and reproducibly identified and quantified using small amounts of sample (5 mg) independent of their chain length, degree of unsaturation, or the lipid class they are part of.
This method does not provide information about the relative abundance of different lipid classes, but can be extended to separate lipid classes from each other.
The method is based on a sequence of mechanical cell disruption, solvent based lipid extraction, transesterification of fatty acids to fatty acid methyl esters (FAMEs), and quantification and identification of FAMEs using gas chromatography (GC-FID). A TAG internal standard (tripentadecanoin) is added prior to the analytical procedure to correct for losses during extraction and incomplete transesterification.
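The role of the internal standard can be illustrated with a short calculation. This is a minimal sketch, assuming each fatty acid is quantified from its GC-FID peak area relative to the C15:0 peak derived from tripentadecanoin, with an equal detector response unless a relative response factor is supplied; the function names, peak areas, and masses are hypothetical and do not come from the article.

```python
def fatty_acid_mass_mg(area_fame, area_is, is_fatty_acid_mg, rel_response=1.0):
    """Mass of one fatty acid from its FAME peak area, scaled to the
    internal-standard (C15:0) peak area.

    is_fatty_acid_mg: mass of C15:0 fatty acid supplied by the added standard
    rel_response: detector response of the analyte relative to the standard
                  (assumed 1.0 here)
    """
    if area_is <= 0 or rel_response <= 0:
        raise ValueError("area_is and rel_response must be positive")
    return (area_fame / area_is) * is_fatty_acid_mg / rel_response


def total_content(peaks, area_is, is_fatty_acid_mg, biomass_mg):
    """Total fatty acid content as a fraction of dry biomass."""
    total = sum(fatty_acid_mass_mg(a, area_is, is_fatty_acid_mg) for a in peaks.values())
    return total / biomass_mg


# Hypothetical peak areas for a 5 mg sample, for illustration only.
peaks = {"C16:0": 2000.0, "C18:1": 1000.0}
content = total_content(peaks, area_is=1000.0, is_fatty_acid_mg=0.05, biomass_mg=5.0)
print(round(content, 4))  # 0.03
```

Because the standard is a TAG added before cell disruption, it passes through the same extraction and transesterification as the sample lipids, so taking the ratio to its peak cancels losses that affect both alike.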
Environmental Sciences, Issue 80, chemical analysis techniques, Microalgae, fatty acid, triacylglycerol, lipid, gas chromatography, cell disruption
Preparation of the Mgm101 Recombination Protein by MBP-based Tagging Strategy
Institutions: State University of New York Upstate Medical University.
The MGM101 gene was identified 20 years ago for its role in the maintenance of mitochondrial DNA. Studies from several groups have suggested that the Mgm101 protein is involved in the recombinational repair of mitochondrial DNA. Recent investigations have indicated that Mgm101 is related to the Rad52-type recombination protein family. These proteins form large oligomeric rings and promote the annealing of homologous single stranded DNA molecules. However, the characterization of Mgm101 has been hindered by the difficulty in producing the recombinant protein. Here, a reliable procedure for the preparation of recombinant Mgm101 is described. Maltose Binding Protein (MBP)-tagged Mgm101 is first expressed in Escherichia coli. The fusion protein is initially purified by amylose affinity chromatography. After being released by proteolytic cleavage, Mgm101 is separated from MBP by cation exchange chromatography. Monodispersed Mgm101 is then obtained by size exclusion chromatography. A yield of ~0.87 mg of Mgm101 per liter of bacterial culture can be routinely obtained. The recombinant Mgm101 has minimal DNA contamination. The prepared samples are successfully used for biochemical, structural and single particle image analyses of Mgm101. This protocol may also be used for the preparation of other large oligomeric DNA-binding proteins that may be misfolded and toxic to bacterial cells.
Biochemistry, Issue 76, Genetics, Molecular Biology, Cellular Biology, Microbiology, Bacteria, Proteins, Mgm101, Rad52, mitochondria, recombination, mtDNA, maltose-binding protein, MBP, E. coli., yeast, Saccharomyces cerevisiae, chromatography, electron microscopy, cell culture
Using SecM Arrest Sequence as a Tool to Isolate Ribosome Bound Polypeptides
Institutions: Cleveland State University.
Extensive research has provided ample evidence suggesting that protein folding in the cell is a co-translational process 1-5. However, the exact pathway that the polypeptide chain follows during co-translational folding to achieve its functional form is still an enigma. In order to understand this process and to determine the exact conformation of the co-translational folding intermediates, it is essential to develop techniques that allow the isolation of ribosome-nascent chain complexes (RNCs) carrying nascent chains of predetermined sizes to allow their further structural analysis.
SecM (secretion monitor) is a 170 amino acid E. coli protein that regulates expression of the downstream SecA (secretion driving) ATPase in the secM-secA operon. Nakatogawa and Ito originally found that a 17 amino acid long sequence (150-FSTPVWISQAQGIRAGP-166) in the C-terminal region of the SecM protein is sufficient and necessary to cause stalling of SecM elongation at Gly165, thereby producing peptidyl-glycyl-tRNA stably bound to the ribosomal P-site7-9. More importantly, it was found that this 17 amino acid long sequence can be fused to the C-terminus of virtually any full-length and/or truncated protein, thus allowing the production of RNCs carrying nascent chains of predetermined sizes7. Thus, when fused to or inserted into the target protein, the SecM stalling sequence arrests polypeptide chain elongation and generates stable RNCs both in vivo in E. coli cells and in vitro in a cell-free system. Sucrose gradient centrifugation is then used to isolate the RNCs.
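The fusion design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the arrest sequence and residue numbering are taken from the abstract, while the helper names (`stall_residue`, `fuse_arrest`) and the target sequence are hypothetical placeholders, not a real protein or the authors' construct design.

```python
# Arrest sequence quoted in the abstract (SecM residues 150-166).
SECM_ARREST = "FSTPVWISQAQGIRAGP"
SECM_START = 150  # SecM residue number of the first arrest-sequence residue


def stall_residue(position=165, arrest=SECM_ARREST, start=SECM_START):
    """Return the one-letter amino acid code at a given SecM residue number."""
    return arrest[position - start]


def fuse_arrest(target, nascent_length, arrest=SECM_ARREST):
    """Truncate `target` to `nascent_length` residues and append the arrest sequence."""
    if not 0 < nascent_length <= len(target):
        raise ValueError("nascent_length must be between 1 and len(target)")
    return target[:nascent_length] + arrest


# Hypothetical placeholder target (51 residues), NOT a real protein sequence.
target = "M" + "ACDEFGHIKL" * 5

# One construct per desired nascent-chain length (lengths chosen arbitrarily).
constructs = {n: fuse_arrest(target, n) for n in (20, 35, 51)}

print(len(SECM_ARREST))   # 17 residues, matching positions 150-166
print(stall_residue())    # residue 165, where elongation stalls
for n, seq in constructs.items():
    print(n, len(seq), seq)
```

Each construct ends with the same 17 residues, so the length of the target-derived portion is set entirely by the truncation point.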
The isolated RNCs can be used to analyze structural and functional features of the co-translational folding intermediates. Recently, this technique has been successfully used to gain insights into the structure of several ribosome bound nascent chains10,11. Here we describe the isolation of bovine Gamma-B Crystallin RNCs fused to SecM and generated in an in vitro
Molecular Biology, Issue 64, Ribosome, nascent polypeptides, co-translational protein folding, translational arrest, in vitro translation
Isolation of Translating Ribosomes Containing Peptidyl-tRNAs for Functional and Structural Analyses
Institutions: University of Alabama Huntsville, Stanford University.
Recently, structural and biochemical studies have detailed many of the molecular events that occur in the ribosome during inhibition of protein synthesis by antibiotics and during nascent polypeptide synthesis. Some of these antibiotics, and regulatory nascent polypeptides mostly in the form of peptidyl-tRNAs, inhibit either peptide bond formation or translation termination1-7. These inhibitory events can stop the movement of the ribosome, a phenomenon termed "translational arrest". Translational arrest induced by either an antibiotic or a nascent polypeptide has been shown to regulate the expression of genes involved in diverse cellular functions such as cell growth, antibiotic resistance, protein translocation and cell metabolism8-13. Knowledge of how antibiotics and regulatory nascent polypeptides alter ribosome function is essential if we are to understand the complete role of the ribosome in translation, in every organism.
Here, we describe a simple methodology that can be used to purify, for analysis, exclusively those ribosomes translating a specific mRNA and containing a specific peptidyl-tRNA14. This procedure is based on selective isolation of translating ribosomes bound to a biotin-labeled mRNA. These translational complexes are separated from other ribosomes in the same mixture using streptavidin paramagnetic beads (SMB) and a magnetic field (MF). Biotin-labeled mRNAs are synthesized by run-off transcription assays using as templates PCR-generated DNA fragments that contain T7 transcriptional promoters. T7 RNA polymerase incorporates biotin-16-UMP from biotin-UTP; under our conditions approximately ten biotin-16-UMP molecules are incorporated in a 600 nt mRNA with a 25% UMP content. These biotin-labeled mRNAs are then isolated and used in in vitro translation assays performed with release factor 2 (RF2)-depleted cell-free extracts obtained from Escherichia coli strains containing wild type or mutant ribosomes. Ribosomes translating the biotin-labeled mRNA sequences are stalled at the stop codon region, due to the absence of the RF2 protein, which normally accomplishes translation termination. Stalled ribosomes containing the newly synthesized peptidyl-tRNA are isolated and removed from the translation reactions using SMB and an MF. These beads only bind biotin-containing messages.
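The labeling density implied by these figures can be worked out with simple arithmetic. The sketch below uses only the numbers quoted in the abstract (600 nt, 25% UMP, about ten biotin-16-UMP per mRNA); the function names are hypothetical, and uniform incorporation across uridine positions is an assumption made for illustration.

```python
def uridine_count(length_nt, ump_fraction):
    """Number of uridine positions in an mRNA of the given length and UMP content."""
    return round(length_nt * ump_fraction)


def biotin_labeling_fraction(length_nt, ump_fraction, biotin_per_mrna):
    """Fraction of uridine positions carrying biotin-16-UMP (assumes uniform incorporation)."""
    return biotin_per_mrna / uridine_count(length_nt, ump_fraction)


uridines = uridine_count(600, 0.25)
fraction = biotin_labeling_fraction(600, 0.25, 10)

print(uridines)             # 150 uridine positions
print(round(fraction, 3))   # 0.067, i.e. roughly 1 in 15 uridines is biotinylated
```

On these figures, only a small minority of uridines carry the biotin label.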
The isolated translational complexes can be used to analyze the structural and functional features of wild type or mutant ribosomal components, or peptidyl-tRNA sequences, as well as to determine ribosome interactions with antibiotics or other molecular factors1,14-16. To examine the function of these isolated ribosome complexes, peptidyl-transferase assays can be performed in the presence of the antibiotic puromycin1. To study structural changes in translational complexes, well established procedures can be used, such as i) crosslinking to specific amino acids14 and/or ii) alkylation protection assays1,14,17.
Molecular Biology, Issue 48, Ribosome stalling, ribosome isolation, peptidyl-tRNA, in vitro translation, RNA chemical modification, puromycin, antibiotics.
A Toolkit to Enable Hydrocarbon Conversion in Aqueous Environments
Institutions: Delft University of Technology, Delft University of Technology.
This work puts forward a toolkit that enables the conversion of alkanes by Escherichia coli and presents a proof of principle of its applicability. The toolkit consists of multiple standard interchangeable parts (BioBricks)9 addressing the conversion of alkanes, regulation of gene expression and survival in toxic hydrocarbon-rich environments.
A three-step pathway for alkane degradation was implemented in E. coli to enable the conversion of medium- and long-chain alkanes to their respective alkanols, alkanals and ultimately alkanoic acids. The latter were metabolized via the native β-oxidation pathway. To facilitate the oxidation of medium-chain alkanes (C5-C13) and cycloalkanes (C5-C8), four genes (alkB2) of the alkane hydroxylase system from Gordonia were transformed into E. coli. For the conversion of long-chain alkanes (C15-C36), the ladA gene from Geobacillus thermodenitrificans was implemented. For the required further steps of the degradation process, ADH and ALDH (originating from G. thermodenitrificans) were introduced10,11. The activity was measured by resting cell assays. For each oxidative step, enzyme activity was observed.
To optimize the process efficiency, the expression was only induced under low glucose conditions: a substrate-regulated promoter, pCaiF, was used. pCaiF is present in E. coli K12 and regulates the expression of the genes involved in the degradation of non-glucose carbon sources.
The last part of the toolkit, targeting survival, was implemented using solvent tolerance genes, PhPFDα and β, both from Pyrococcus horikoshii OT3. Organic solvents can induce cell stress and decrease survival by negatively affecting protein folding. As chaperones, PhPFDα and β improve the protein folding process, e.g. in the presence of alkanes. Expression of these genes improved hydrocarbon tolerance, shown by an increased growth rate (up to 50%) in the presence of 10% n-hexane in the culture medium.
In summary, the results indicate that the toolkit enables E. coli to convert and tolerate hydrocarbons in aqueous environments. As such, it represents an initial step towards a sustainable solution for oil-remediation using a synthetic biology approach.
Bioengineering, Issue 68, Microbiology, Biochemistry, Chemistry, Chemical Engineering, Oil remediation, alkane metabolism, alkane hydroxylase system, resting cell assay, prefoldin, Escherichia coli, synthetic biology, homologous interaction mapping, mathematical model, BioBrick, iGEM
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances in drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (https://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
Characterization of Complex Systems Using the Design of Experiments Approach: Transient Protein Expression in Tobacco as a Case Study
Institutions: RWTH Aachen University, Fraunhofer Gesellschaft.
Plants provide multiple benefits for the production of biopharmaceuticals including low costs, scalability, and safety. Transient expression offers the additional advantage of short development and production times, but expression levels can vary significantly between batches thus giving rise to regulatory concerns in the context of good manufacturing practice. We used a design of experiments (DoE) approach to determine the impact of major factors such as regulatory elements in the expression construct, plant growth and development parameters, and the incubation conditions during expression, on the variability of expression between batches. We tested plants expressing a model anti-HIV monoclonal antibody (2G12) and a fluorescent marker protein (DsRed). We discuss the rationale for selecting certain properties of the model and identify its potential limitations. The general approach can easily be transferred to other problems because the principles of the model are broadly applicable: knowledge-based parameter selection, complexity reduction by splitting the initial problem into smaller modules, software-guided setup of optimal experiment combinations and step-wise design augmentation. Therefore, the methodology is not only useful for characterizing protein expression in plants but also for the investigation of other complex systems lacking a mechanistic description. The predictive equations describing the interconnectivity between parameters can be used to establish mechanistic models for other complex systems.
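To make the idea of "experiment combinations" concrete, the sketch below enumerates a full factorial design in Python. This is a deliberately simpler technique than the software-guided optimal designs the abstract refers to, which select a subset of such runs. The factor names and levels are hypothetical placeholders, not the conditions used in the study.

```python
from itertools import product

# Hypothetical factors and levels, for illustration only.
factors = {
    "promoter": ["promoter_A", "promoter_B"],
    "five_prime_utr": ["utr_1", "utr_2", "utr_3"],
    "incubation_temp_c": [22, 25],
}


def full_factorial(factors):
    """Return every combination of factor levels as a list of run dictionaries."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


runs = full_factorial(factors)
print(len(runs))  # 2 * 3 * 2 = 12 runs
for run in runs[:3]:
    print(run)
```

The run count grows multiplicatively with each added factor, which is the practical motivation for splitting a problem into smaller modules and using optimal designs that cover the space with fewer runs.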
Bioengineering, Issue 83, design of experiments (DoE), transient protein expression, plant-derived biopharmaceuticals, promoter, 5'UTR, fluorescent reporter protein, model building, incubation conditions, monoclonal antibody
Efficient Production and Purification of Recombinant Murine Kindlin-3 from Insect Cells for Biophysical Studies
Institutions: University of Oxford.
Kindlins are essential coactivators, with talin, of the integrin cell surface receptors, and also participate in integrin outside-in signalling and the control of gene transcription in the cell nucleus. The kindlins are ~75 kDa multidomain proteins and bind to an NPxY motif and upstream T/S cluster of the integrin β-subunit cytoplasmic tail. The hematopoietically-important kindlin isoform, kindlin-3, is critical for platelet aggregation during thrombus formation, leukocyte rolling in response to infection and inflammation, and osteoclast podosome formation in bone resorption. Kindlin-3's role in these processes has resulted in extensive cellular and physiological studies. However, there is a need for an efficient method of acquiring high quality milligram quantities of the protein for further studies. We have developed a protocol, described here, for the efficient expression and purification of recombinant murine kindlin-3 using a baculovirus-driven expression system in Sf9 cells, yielding sufficient amounts of high purity full-length protein to allow its biophysical characterization. The same approach could be taken in the study of the other mammalian kindlin isoforms.
Virology, Issue 85, Heterologous protein expression, insect cells, Spodoptera frugiperda, baculovirus, protein purification, kindlin, cell adhesion
Generation of Enterobacter sp. YSU Auxotrophs Using Transposon Mutagenesis
Institutions: Youngstown State University.
Prototrophic bacteria grow on M-9 minimal salts medium supplemented with glucose (M-9 medium), which is used as a carbon and energy source. Auxotrophs can be generated using a transposome. The commercially available, Tn5-derived transposome used in this protocol consists of a linear segment of DNA containing an R6Kγ replication origin, a gene for kanamycin resistance and two mosaic sequence ends, which serve as transposase binding sites. The transposome, provided as a DNA/transposase protein complex, is introduced by electroporation into the prototrophic strain, Enterobacter sp. YSU, and randomly incorporates itself into this host’s genome. Transformants are replica plated onto Luria-Bertani agar plates containing kanamycin (LB-kan) and onto M-9 medium agar plates containing kanamycin (M-9-kan). The transformants that grow on LB-kan plates but not on M-9-kan plates are considered to be auxotrophs. Purified genomic DNA from an auxotroph is partially digested, ligated and transformed into a pir+ Escherichia coli strain. The R6Kγ replication origin allows the plasmid to replicate in pir+ E. coli strains, and the kanamycin resistance marker allows for plasmid selection. Each transformant possesses a new plasmid containing the transposon flanked by the interrupted chromosomal region. Sanger sequencing and the Basic Local Alignment Search Tool (BLAST) suggest a putative identity of the interrupted gene. There are three advantages to using this transposome mutagenesis strategy. First, it does not rely on the expression of a transposase gene by the host. Second, the transposome is introduced into the target host by electroporation, rather than by conjugation or by transduction, and therefore is more efficient. Third, the R6Kγ replication origin makes it easy to identify the mutated gene, which is partially recovered in a recombinant plasmid. This technique can be used to investigate the genes involved in other characteristics of Enterobacter sp. YSU or of a wider variety of bacterial strains.
Microbiology, Issue 92, Auxotroph, transposome, transposon, mutagenesis, replica plating, glucose minimal medium, complex medium, Enterobacter
A Protocol for Computer-Based Protein Structure and Function Prediction
Institutions: University of Michigan, University of Kansas.
Genome sequencing projects have deciphered millions of protein sequences, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules, which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score that estimates how accurate the predictions are in the absence of experimental data. To accommodate the special requests of end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any protein as a template, or to exclude any template proteins during the structure assembly simulations. The structural information could be collected by the users based on experimental evidence or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was ranked as the best program for protein structure and function prediction in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
Biochemistry, Issue 57, On-line server, I-TASSER, protein structure prediction, function prediction
The ChroP Approach Combines ChIP and Mass Spectrometry to Dissect Locus-specific Proteomic Landscapes of Chromatin
Institutions: European Institute of Oncology.
Chromatin is a highly dynamic nucleoprotein complex made of DNA and proteins that controls various DNA-dependent processes. Chromatin structure and function at specific regions is regulated by the local enrichment of histone post-translational modifications (hPTMs) and variants, chromatin-binding proteins, including transcription factors, and DNA methylation. The proteomic characterization of chromatin composition at distinct functional regions has been so far hampered by the lack of efficient protocols to enrich such domains at the appropriate purity and amount for the subsequent in-depth analysis by Mass Spectrometry (MS). We describe here a newly designed chromatin proteomics strategy, named ChroP (Chromatin Proteomics), whereby a preparative chromatin immunoprecipitation is used to isolate distinct chromatin regions whose features, in terms of hPTMs, variants and co-associated non-histonic proteins, are analyzed by MS. We illustrate here the setting up of ChroP for the enrichment and analysis of transcriptionally silent heterochromatic regions, marked by the presence of tri-methylation of lysine 9 on histone H3. The results achieved demonstrate the potential of ChroP in thoroughly characterizing the heterochromatin proteome and prove it as a powerful analytical strategy for understanding how the distinct protein determinants of chromatin interact and synergize to establish locus-specific structural and functional configurations.
Biochemistry, Issue 86, chromatin, histone post-translational modifications (hPTMs), epigenetics, mass spectrometry, proteomics, SILAC, chromatin immunoprecipitation, histone variants, chromatome, hPTMs cross-talks
Rapid Synthesis and Screening of Chemically Activated Transcription Factors with GFP-based Reporters
Institutions: Princeton University, Princeton University, California Institute of Technology.
Synthetic biology aims to rationally design and build synthetic circuits with desired quantitative properties, as well as provide tools to interrogate the structure of native control circuits. In both cases, the ability to program gene expression in a rapid and tunable fashion, with no off-target effects, can be useful. We have constructed yeast strains containing the ACT1 promoter upstream of a URA3 cassette followed by the ligand-binding domain of the human estrogen receptor and VP16. By transforming this strain with a linear PCR product containing a DNA binding domain and selecting against the presence of URA3, a constitutively expressed artificial transcription factor (ATF) can be generated by homologous recombination. ATFs engineered in this fashion can activate a unique target gene in the presence of inducer, thereby eliminating both the off-target activation and nonphysiological growth conditions found with commonly used conditional gene expression systems. A simple method for the rapid construction of GFP reporter plasmids that respond specifically to a native or artificial transcription factor of interest is also provided.
Genetics, Issue 81, transcription, transcription factors, artificial transcription factors, zinc fingers, Zif268, synthetic biology
Peptide-based Identification of Functional Motifs and their Binding Partners
Institutions: Morehouse School of Medicine, Institute for Systems Biology, Universiti Sains Malaysia.
Specific short peptides derived from motifs found in full-length proteins, in our case HIV-1 Nef, not only retain their biological function, but can also competitively inhibit the function of the full-length protein. A set of 20 Nef scanning peptides, 20 amino acids in length with each overlapping 10 amino acids of its neighbor, were used to identify motifs in Nef responsible for its induction of apoptosis. Peptides containing these apoptotic motifs induced apoptosis at levels comparable to the full-length Nef protein. A second peptide, derived from the Secretion Modification Region (SMR) of Nef, retained the ability to interact with cellular proteins involved in Nef's secretion in exosomes (exNef). This SMRwt peptide was used as the "bait" protein in co-immunoprecipitation experiments to isolate cellular proteins that bind specifically to Nef's SMR motif. Protein transfection and antibody inhibition were used to physically disrupt the interaction between Nef and mortalin, one of the isolated SMR-binding proteins, and the effect was measured with a fluorescence-based exNef secretion assay. The SMRwt peptide's ability to outcompete full-length Nef for cellular proteins that bind the SMR motif makes it the first inhibitor of exNef secretion. Thus, by employing the techniques described here, which utilize the unique properties of specific short peptides derived from motifs found in full-length proteins, one may accelerate the identification of functional motifs in proteins and the development of peptide-based inhibitors of pathogenic functions.
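The scanning-peptide layout (20-residue peptides, each overlapping its neighbor by 10 residues) can be illustrated with a short Python sketch. The function name `scanning_peptides` is hypothetical, and the input is a 210-residue placeholder chosen so that the tiling yields 20 peptides; it is not the Nef sequence.

```python
def scanning_peptides(sequence, length=20, overlap=10):
    """Tile `sequence` with peptides of `length` residues that overlap by `overlap`."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    step = length - overlap
    last_start = len(sequence) - length
    return [sequence[i:i + length] for i in range(0, last_start + 1, step)]


# Placeholder sequence (210 residues), for illustration only; NOT HIV-1 Nef.
placeholder = ("ACDEFGHIKLMNPQRSTVWY" * 11)[:210]

peptides = scanning_peptides(placeholder)
print(len(peptides))  # 20 peptides
for number, peptide in enumerate(peptides[:3], start=1):
    print(number, peptide)
```

With a step of 10 residues, every interior position is covered by two peptides, so a short motif that straddles one peptide boundary still falls entirely within a neighboring peptide.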
Virology, Issue 76, Biochemistry, Immunology, Infection, Infectious Diseases, Molecular Biology, Medicine, Genetics, Microbiology, Genomics, Proteins, Exosomes, HIV, Peptides, Exocytosis, protein trafficking, secretion, HIV-1, Nef, Secretion Modification Region, SMR, peptide, AIDS, assay
Orthogonal Protein Purification Facilitated by a Small Bispecific Affinity Tag
Institutions: Royal Institute of Technology.
Due to the high costs associated with purification of recombinant proteins, the protocols need to be rationalized. For high-throughput efforts there is a demand for general methods that do not require target protein specific optimization1. To achieve this, purification tags that can be genetically fused to the gene of interest are commonly used2. The most widely used affinity handle is the hexa-histidine tag, which is suitable for purification under both native and denaturing conditions3. The metabolic burden for producing the tag is low, but it does not provide as high specificity as competing affinity chromatography based strategies1,2.
Here, a bispecific purification tag with two different binding sites on a 46 amino acid, small protein domain has been developed. The albumin-binding domain is derived from Streptococcal protein G and has a strong inherent affinity to human serum albumin (HSA). Eleven surface-exposed amino acids, not involved in albumin-binding4, were genetically randomized to produce a combinatorial library. The protein library with the novel randomly arranged binding surface (Figure 1) was expressed on phage particles to facilitate selection of binders by phage display technology. Through several rounds of biopanning against a dimeric Z-domain derived from Staphylococcal protein A5, a small, bispecific molecule with affinity for both HSA and the novel target was identified6.
The novel protein domain, referred to as ABDz1, was evaluated as a purification tag for a selection of target proteins with different molecular weight, solubility and isoelectric point. Three target proteins were expressed in Escherichia coli with the novel tag fused to their N-termini and thereafter affinity purified. Initial purification on either a column with immobilized HSA or Z-domain resulted in relatively pure products. Two-step affinity purification with the bispecific tag resulted in substantial improvement of protein purity. Chromatographic media with the Z-domain immobilized, for example MabSelect SuRe, are readily available for purification of antibodies, and HSA can easily be chemically coupled to media to provide the second matrix.
This method is especially advantageous when there is a high demand on purity of the recovered target protein. The bifunctionality of the tag allows two different chromatographic steps to be used while the metabolic burden on the expression host is limited due to the small size of the tag. It provides a competitive alternative to so-called combinatorial tagging, where multiple tags are used in combination1,7.
Molecular Biology, Issue 59, Affinity chromatography, albumin-binding domain, human serum albumin, Z-domain
Biochemical and High Throughput Microscopic Assessment of Fat Mass in Caenorhabditis Elegans
Institutions: Massachusetts General Hospital and Harvard Medical School, Massachusetts Institute of Technology.
The nematode C. elegans has emerged as an important model for the study of conserved genetic pathways regulating fat metabolism as it relates to human obesity and its associated pathologies. Several previous methodologies developed for the visualization of C. elegans triglyceride-rich fat stores have proven to be erroneous, highlighting cellular compartments other than lipid droplets. Other methods require specialized equipment, are time-consuming, or yield inconsistent results. We introduce a rapid, reproducible, fixative-based Nile red staining method for the accurate detection of neutral lipid droplets in C. elegans. A short fixation step in 40% isopropanol makes animals completely permeable to Nile red, which is then used to stain animals. Spectral properties of this lipophilic dye allow it to strongly and selectively fluoresce in the yellow-green spectrum only when in a lipid-rich environment, but not in more polar environments. Thus, lipid droplets can be visualized on a fluorescent microscope equipped with simple GFP imaging capability after only a brief Nile red staining step in isopropanol. The speed, affordability, and reproducibility of this protocol make it ideally suited for high throughput screens. We also demonstrate a paired method for the biochemical determination of triglycerides and phospholipids using gas chromatography mass-spectrometry. This more rigorous protocol should be used as confirmation of results obtained from the Nile red microscopic lipid determination. We anticipate that these techniques will become new standards in the field of C. elegans
Genetics, Issue 73, Biochemistry, Cellular Biology, Molecular Biology, Developmental Biology, Physiology, Anatomy, Caenorhabditis elegans, Obesity, Energy Metabolism, Lipid Metabolism, C. elegans, fluorescent lipid staining, lipids, Nile red, fat, high throughput screening, obesity, gas chromatography, mass spectrometry, GC/MS, animal model
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1. In this article, we utilize a web version of SCOPE2 to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4 and has been used in other studies5-8.
The three algorithms that comprise SCOPE are BEAM9, which finds non-degenerate motifs (ACCGGT), PRISM10, which finds degenerate motifs (ASCGWT), and SPACER11, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
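To show what the three motif classes look like in practice, the following Python sketch converts each example motif to a regular expression using the standard IUPAC nucleotide codes and scans a sequence for matches. This only illustrates motif matching; it is not how BEAM, PRISM or SPACER discover motifs, and the helper names and the test sequence are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Translate an IUPAC motif (case-insensitive, 'n' = any base) into a regex string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(motif, sequence):
    """Return 0-based start positions of all (possibly overlapping) motif matches."""
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Made-up test sequence containing one instance of each example motif.
sequence = "TT" + "ACCGGT" + "AA" + "AGCGAT" + "CC" + "ACC" + "T" * 8 + "GGT" + "AA"

print(find_motif("ACCGGT", sequence))          # non-degenerate motif -> [2]
print(find_motif("ASCGWT", sequence))          # degenerate motif -> [10]
print(find_motif("ACCnnnnnnnnGGT", sequence))  # bipartite motif -> [18]
```

The lookahead wrapper `(?=...)` lets overlapping occurrences be reported, which a plain `finditer` on the pattern would skip.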
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
SCOPE has a friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes or FASTA sequences. These can be entered in browser text fields or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11.
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif