Many researchers, across remarkably diverse fields, are applying phylogenetics to their research questions. However, many of them are new to the topic, which presents inherent problems. Here we compile a practical introduction to phylogenetics for nonexperts. We outline, step by step, a pipeline for generating reliable phylogenies from gene sequence datasets. We begin with a user guide for similarity search tools via online interfaces as well as local executables. Next, we explore programs for generating multiple sequence alignments, followed by protocols for using software to determine best-fit models of evolution. We then outline protocols for reconstructing phylogenetic relationships via maximum likelihood and Bayesian criteria, and finally describe tools for visualizing phylogenetic trees. While this is by no means an exhaustive description of phylogenetic approaches, it does provide the reader with practical starting information on key software applications commonly used by phylogeneticists. We intend this article to serve as a practical training tool for researchers embarking on phylogenetic studies and as an educational resource that can be incorporated into a classroom or teaching lab.
26 Related JoVE Articles!
Discovery of New Intracellular Pathogens by Amoebal Coculture and Amoebal Enrichment Approaches
Institutions: University Hospital Center and University of Lausanne.
Intracellular pathogens such as legionella, mycobacteria and Chlamydia-like organisms are difficult to isolate because they often grow poorly or not at all on selective media that are usually used to cultivate bacteria. For this reason, many of these pathogens were discovered only recently or following important outbreaks. These pathogens are often associated with amoebae, which serve as host-cell and allow the survival and growth of the bacteria. We intend here to provide a demonstration of two techniques that allow isolation and characterization of intracellular pathogens present in clinical or environmental samples: the amoebal coculture and the amoebal enrichment. Amoebal coculture allows recovery of intracellular bacteria by inoculating the investigated sample onto an amoebal lawn that can be infected and lysed by the intracellular bacteria present in the sample. Amoebal enrichment allows recovery of amoebae present in a clinical or environmental sample. This can lead to discovery of new amoebal species but also of new intracellular bacteria growing specifically in these amoebae. Together, these two techniques help to discover new intracellular bacteria able to grow in amoebae. Because of their ability to infect amoebae and resist phagocytosis, these intracellular bacteria might also escape phagocytosis by macrophages and thus, be pathogenic for higher eukaryotes.
Immunology, Issue 80, Environmental Microbiology, Soil Microbiology, Water Microbiology, Amoebae, microorganisms, coculture, obligate intracellular bacteria
Substrate Generation for Endonucleases of CRISPR/Cas Systems
Institutions: Max-Planck-Institute for Terrestrial Microbiology.
The interaction of viruses and their prokaryotic hosts shaped the evolution of bacterial and archaeal life. Prokaryotes developed several strategies to evade viral attacks that include restriction modification, abortive infection and CRISPR/Cas systems. These adaptive immune systems found in many Bacteria and most Archaea consist of clustered regularly interspaced short palindromic repeat (CRISPR) sequences and a number of CRISPR associated (Cas) genes (Fig. 1) 1-3. Different sets of Cas proteins and repeats define at least three major divergent types of CRISPR/Cas systems 4. The universal proteins Cas1 and Cas2 are proposed to be involved in the uptake of viral DNA that will generate a new spacer element between two repeats at the 5' terminus of an extending CRISPR cluster 5. The entire cluster is transcribed into a precursor-crRNA containing all spacer and repeat sequences and is subsequently processed by an enzyme of the diverse Cas6 family into smaller crRNAs 6-8. These crRNAs consist of the spacer sequence flanked by a 5' terminal (8 nucleotides) and a 3' terminal tag derived from the repeat sequence 9. A repeated infection of the virus can now be blocked as the new crRNA will be directed by a Cas protein complex (Cascade) to the viral DNA and identify it as such via base complementarity 10. Finally, for CRISPR/Cas type 1 systems, the nuclease Cas3 will destroy the detected invader DNA 11,12.
These processes define CRISPR/Cas as an adaptive immune system of prokaryotes and opened a fascinating research field for the study of the involved Cas proteins. The function of many Cas proteins is still elusive and the causes for the apparent diversity of the CRISPR/Cas systems remain to be illuminated. Potential activities of most Cas proteins were predicted via detailed computational analyses. A major fraction of Cas proteins are either shown or proposed to function as endonucleases 4.
Here, we present methods to generate crRNAs and precursor-crRNAs for the study of Cas endoribonucleases. Different endonuclease assays require either short repeat sequences that can directly be synthesized as RNA oligonucleotides or longer crRNA and pre-crRNA sequences that are generated via in vitro T7 RNA polymerase run-off transcription. This methodology allows the incorporation of radioactive nucleotides for the generation of internally labeled endonuclease substrates and the creation of synthetic or mutant crRNAs. Cas6 endonuclease activity is utilized to mature pre-crRNAs into crRNAs with 5'-hydroxyl and 2',3'-cyclic phosphate termini.
Molecular biology, Issue 67, CRISPR/Cas, endonuclease, in vitro transcription, crRNA, Cas6
Bromodeoxyuridine (BrdU) Labeling and Subsequent Fluorescence Activated Cell Sorting for Culture-independent Identification of Dissolved Organic Carbon-degrading Bacterioplankton
Institutions: Kent State University, University of Georgia (UGA).
Microbes are major agents mediating the degradation of numerous dissolved organic carbon (DOC) substrates in aquatic environments. However, identification of bacterial taxa that transform specific pools of DOC in nature poses a technical challenge.
Here we describe an approach that couples bromodeoxyuridine (BrdU) incorporation, fluorescence activated cell sorting (FACS), and 16S rRNA gene-based molecular analysis that allows culture-independent identification of bacterioplankton capable of degrading a specific DOC compound in aquatic environments. Triplicate bacterioplankton microcosms are set up to receive both BrdU and a model DOC compound (DOC amendments), or only BrdU (no-addition control). BrdU substitutes for thymidine in newly synthesized bacterial DNA, and BrdU-labeled DNA can be readily immunodetected 1,2. Through a 24-hr incubation, bacterioplankton that are able to use the added DOC compound are expected to be selectively activated, and therefore have higher levels of BrdU incorporation (HI cells) than non-responsive cells in the DOC amendments and cells in no-addition controls (low BrdU incorporation cells, LI cells). After fluorescence immunodetection, HI cells are distinguished and physically separated from the LI cells by fluorescence activated cell sorting (FACS) 3. DNA is extracted from the sorted DOC-responsive cells (HI cells), which are then taxonomically identified through 16S rRNA gene-based analyses including PCR, clone library construction and sequencing.
Molecular Biology, Issue 55, BrdU incorporation, fluorescence-activated cell sorting, FACS, flow cytometry, microbial community, culture-independent, bacterioplankton
An Allelotyping PCR for Identifying Salmonella enterica serovars Enteritidis, Hadar, Heidelberg, and Typhimurium
Institutions: University of Georgia.
Current commercial PCR tests for identifying Salmonella target genes unique to this genus. However, there are two species, six subspecies, and over 2,500 different Salmonella serovars, and not all are equal in their significance to public health. For example, finding S. enterica subspecies IIIa Arizona on a table egg layer farm is insignificant compared to the isolation of S. enterica subspecies I serovar Enteritidis, the leading cause of salmonellosis linked to the consumption of table eggs. Serovars are identified based on antigenic differences in lipopolysaccharide (LPS) (O antigen) and flagellin (H1 and H2 antigens). These antigenic differences are the outward appearance of the diversity of genes and gene alleles associated with this phenotype.
We have developed an allelotyping, multiplex PCR that keys on genetic differences between four major S. enterica subspecies I serovars found in poultry and associated with significant human disease in the US. The PCR primer pairs were targeted to key genes or sequences unique to a specific Salmonella serovar and designed to produce an amplicon with a size specific for that gene or allele. A Salmonella serovar is assigned to an isolate based on the combination of PCR test results for specific LPS and flagellin gene alleles. The multiplex PCRs described in this article are specific for the detection of S. enterica subspecies I serovars Enteritidis, Hadar, Heidelberg, and Typhimurium.
Here we demonstrate how to use the multiplex PCRs to identify serovar for a Salmonella
Immunology, Issue 53, PCR, Salmonella, multiplex, Serovar
A Restriction Enzyme Based Cloning Method to Assess the In vitro Replication Capacity of HIV-1 Subtype C Gag-MJ4 Chimeric Viruses
Institutions: Emory University, Emory University.
The protective effect of many HLA class I alleles on HIV-1 pathogenesis and disease progression is, in part, attributed to their ability to target conserved portions of the HIV-1 genome that escape with difficulty. Sequence changes attributed to cellular immune pressure arise across the genome during infection, and if found within conserved regions of the genome such as Gag, can affect the ability of the virus to replicate in vitro. Transmission of HLA-linked polymorphisms in Gag to HLA-mismatched recipients has been associated with reduced set point viral loads. We hypothesized this may be due to a reduced replication capacity of the virus. Here we present a novel method for assessing the in vitro replication of HIV-1 as influenced by the gag gene isolated from acute time points from subtype C infected Zambians. This method uses restriction enzyme based cloning to insert the gag gene into a common subtype C HIV-1 proviral backbone, MJ4. This makes it more appropriate to the study of subtype C sequences than previous recombination based methods that have assessed the in vitro replication of chronically derived gag-pro sequences. Nevertheless, the protocol could be readily modified for studies of viruses from other subtypes. Moreover, this protocol details a robust and reproducible method for assessing the replication capacity of the Gag-MJ4 chimeric viruses on a CEM-based T cell line. This method was utilized for the study of Gag-MJ4 chimeric viruses derived from 149 subtype C acutely infected Zambians, and has allowed for the identification of residues in Gag that affect replication. More importantly, the implementation of this technique has facilitated a deeper understanding of how viral replication defines parameters of early HIV-1 pathogenesis such as set point viral load and longitudinal CD4+ T cell decline.
Infectious Diseases, Issue 90, HIV-1, Gag, viral replication, replication capacity, viral fitness, MJ4, CEM, GXR25
Single Read and Paired End mRNA-Seq Illumina Libraries from 10 Nanograms Total RNA
Institutions: Morgridge Institute for Research, University of Wisconsin, University of California.
Whole transcriptome sequencing by mRNA-Seq is now used extensively to perform global gene expression, mutation, allele-specific expression and other genome-wide analyses. mRNA-Seq even opens the gate for gene expression analysis of non-sequenced genomes. mRNA-Seq offers high sensitivity, a large dynamic range and allows measurement of transcript copy numbers in a sample. Illumina’s genome analyzer performs sequencing of a large number (> 10^7) of relatively short sequence reads (< 150 bp). The "paired end" approach, wherein a single long read is sequenced at both its ends, allows for tracking alternate splice junctions, insertions and deletions, and is useful for de novo
One of the major challenges faced by researchers is a limited amount of starting material. For example, in experiments where cells are harvested by laser micro-dissection, available starting total RNA may measure in nanograms. Preparation of mRNA-Seq libraries from such samples has been described 1, 2 but involves significant PCR amplification that may introduce bias. Other RNA-Seq library construction procedures with minimal PCR amplification have been published 3, 4 but require microgram amounts of starting total RNA.
Here we describe a protocol for mRNA-Seq library preparation for the Illumina Genome Analyzer II platform that avoids significant PCR amplification and requires only 10 nanograms of total RNA. While this protocol has been described previously and validated for single-end sequencing 5, where it was shown to produce directional libraries without introducing significant amplification bias, here we validate it further for use as a paired end protocol. We selectively amplify polyadenylated messenger RNAs from starting total RNA using the T7 based Eberwine linear amplification method, coined "T7LA" (T7 linear amplification). The amplified poly-A mRNAs are fragmented, reverse transcribed and adapter ligated to produce the final sequencing library. For both single read and paired end runs, sequences are mapped to the human transcriptome 6 and normalized so that data from multiple runs can be compared. We report the gene expression measurement in units of transcripts per million (TPM), which is a superior measure to RPKM when comparing samples 7.
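As a rough illustration of the TPM unit mentioned in the abstract above, the following Python sketch converts read counts to TPM and, for contrast, to RPKM. It is not the authors' pipeline: the function names are hypothetical, and the sketch assumes one read count and one length (in base pairs) per transcript.

```python
def tpm(counts, lengths_bp):
    """Convert read counts to transcripts per million (TPM).

    Hypothetical helper for illustration; assumes one count and one
    transcript length in base pairs per transcript.
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Step 1: reads per kilobase, which corrects for transcript length.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total_rpk = sum(rpk)
    if total_rpk == 0:
        return [0.0] * len(counts)
    # Step 2: rescale so the values of one sample sum to one million.
    return [r / total_rpk * 1e6 for r in rpk]


def rpkm(counts, lengths_bp):
    """Convert read counts to reads per kilobase per million mapped reads."""
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    total_reads = sum(counts)
    if total_reads == 0:
        return [0.0] * len(counts)
    # Depth is corrected first (per million reads), then transcript length.
    per_million = total_reads / 1e6
    return [c / per_million / (length / 1000.0)
            for c, length in zip(counts, lengths_bp)]


if __name__ == "__main__":
    # Toy numbers, invented purely for demonstration.
    sample_counts = [10, 20, 30]
    transcript_lengths = [1000, 2000, 3000]
    print(tpm(sample_counts, transcript_lengths))
    print(rpkm(sample_counts, transcript_lengths))
```

Because TPM values always sum to one million within a sample while RPKM sums vary from sample to sample, TPM values can be read as comparable proportions across runs.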
Molecular Biology, Issue 56, Genetics, mRNA-Seq, Illumina-Seq, gene expression profiling, high throughput sequencing
V3 Stain-free Workflow for a Practical, Convenient, and Reliable Total Protein Loading Control in Western Blotting
Institutions: Bio-Rad Laboratories.
The western blot is a very useful and widely adopted lab technique, but its execution is challenging. The workflow is often characterized as a "black box" because an experimentalist does not know if it has been performed successfully until the last of several steps. Moreover, the quality of western blot data is sometimes challenged due to a lack of effective quality control tools in place throughout the western blotting process. Here we describe the V3 western workflow, which applies stain-free technology to address the major concerns associated with the traditional western blot protocol. This workflow allows researchers: 1) to run a gel in about 20-30 min; 2) to visualize sample separation quality within 5 min after the gel run; 3) to transfer proteins in 3-10 min; 4) to verify transfer efficiency quantitatively; and most importantly 5) to validate changes in the level of the protein of interest using total protein loading control. This novel approach eliminates the need of stripping and reprobing the blot for housekeeping proteins such as β-actin, β-tubulin, GAPDH, etc.
The V3 stain-free workflow makes the western blot process faster, more transparent, more quantitative, and more reliable.
Basic Protocol, Issue 82, Biotechnology, Pharmaceutical, Protein electrophoresis, Western blot, Stain-Free, loading control, total protein normalization, stain-free technology
Determining Genetic Expression Profiles in C. elegans Using Microarray and Real-time PCR
Institutions: Southwestern Oklahoma State University.
Synapses are composed of a presynaptic active zone in the signaling cell and a postsynaptic terminal in the target cell. In the case of chemical synapses, messages are carried by neurotransmitters released from presynaptic terminals and received by receptors on postsynaptic cells. Our previous research in Caenorhabditis elegans has shown that VSM-1 negatively regulates exocytosis. Additionally, analysis of synapses in vsm-1 mutants showed that animals lacking a fully functional VSM-1 have increased synaptic connectivity. Based on these preliminary findings, we hypothesized that C. elegans VSM-1 may play a crucial role in synaptogenesis. To test this hypothesis, double-labeled microarray analysis was performed, and gene expression profiles were determined. First, total RNA was isolated, reverse transcribed to cDNA, and hybridized to the DNA microarrays. Then, in silico analysis of fluorescent probe hybridization revealed significant induction of many genes coding for members of the major sperm protein family (MSP) in mutants with enhanced synaptogenesis. MSPs are the major component of sperm in C. elegans and appear to signal nematode oocyte maturation and ovulation. In fruit flies, Chai and colleagues 1 demonstrated that MSP-like molecules regulate presynaptic bouton number and size at the neuromuscular junction. Moreover, analysis performed by Tsuda and coworkers 2 suggested that MSPs may act as ligands for Eph receptors and trigger receptor tyrosine kinase signaling cascades. Lastly, real time PCR analysis corroborated that the gene coding for MSP-32 is induced in vsm-1(ok1468) mutants. Taken together, research performed by our laboratory has shown that vsm-1 mutants have a significant increase in synaptic density, which could be mediated by MSP-32 signaling.
Molecular Biology, Issue 53, microarray, C. elegans, real-time PCR, neuroscience
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances in drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (https://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
A Protocol for Analyzing Hepatitis C Virus Replication
Institutions: Cedars-Sinai Medical Center, David Geffen School of Medicine at UCLA.
Hepatitis C Virus (HCV) affects 3% of the world’s population and causes serious liver ailments including chronic hepatitis, cirrhosis, and hepatocellular carcinoma. HCV is an enveloped RNA virus belonging to the family Flaviviridae. Current treatment is not fully effective and causes adverse side effects. There is no HCV vaccine available. Thus, continued effort is required for developing a vaccine and better therapy. An HCV cell culture system is critical for studying various stages of HCV growth including viral entry, genome replication, packaging, and egress. In the current procedure presented, we used a wild-type intragenotype 2a chimeric virus, FNX-HCV, and a recombinant FNX-Rluc virus carrying a Renilla luciferase reporter gene to study the virus replication. A human hepatoma cell line (Huh-7 based) was used for transfection of in vitro transcribed HCV genomic RNAs. Cell-free culture supernatants, protein lysates and total RNA were harvested at various time points post-transfection to assess HCV growth. HCV genome replication status was evaluated by quantitative RT-PCR and visualizing the presence of HCV double-stranded RNA. The HCV protein expression was verified by Western blot and immunofluorescence assays using antibodies specific for HCV NS3 and NS5A proteins. HCV RNA transfected cells released infectious particles into culture supernatant and the viral titer was measured. Luciferase assays were utilized to assess the replication level and infectivity of reporter HCV. In conclusion, we present various virological assays for characterizing different stages of the HCV replication cycle.
Infectious Diseases, Issue 88, Hepatitis C Virus, HCV, Tumor-virus, Hepatitis C, Cirrhosis, Liver Cancer, Hepatocellular Carcinoma
Establishment of Microbial Eukaryotic Enrichment Cultures from a Chemically Stratified Antarctic Lake and Assessment of Carbon Fixation Potential
Institutions: Miami University.
Lake Bonney is one of numerous permanently ice-covered lakes located in the McMurdo Dry Valleys, Antarctica. The perennial ice cover maintains a chemically stratified water column and, unlike other inland bodies of water, largely prevents external input of carbon and nutrients from streams. Biota are exposed to numerous environmental stresses, including year-round severe nutrient deficiency, low temperatures, extreme shade, hypersalinity, and 24-hour darkness during the winter 1. These extreme environmental conditions limit the biota in Lake Bonney almost exclusively to microorganisms 2.
Single-celled microbial eukaryotes (called "protists") are important players in global biogeochemical cycling 3 and play important ecological roles in the cycling of carbon in the dry valley lakes, occupying both primary and tertiary roles in the aquatic food web. In the dry valley aquatic food web, protists that fix inorganic carbon (autotrophy) are the major producers of organic carbon for organotrophic organisms 4, 2. Phagotrophic or heterotrophic protists capable of ingesting bacteria and smaller protists act as the top predators in the food web 5. Last, an unknown proportion of the protist population is capable of combined mixotrophic metabolism 6, 7. Mixotrophy in protists involves the ability to combine photosynthetic capability with phagotrophic ingestion of prey microorganisms. This form of mixotrophy differs from mixotrophic metabolism in bacterial species, which generally involves uptake of dissolved carbon molecules. There are currently very few protist isolates from permanently ice-capped polar lakes, and studies of protist diversity and ecology in this extreme environment have been limited 8, 4, 9, 10, 5. A better understanding of protist metabolic versatility in the simple dry valley lake food web will aid in the development of models for the role of protists in the global carbon cycle.
We employed an enrichment culture approach to isolate potentially phototrophic and mixotrophic protists from Lake Bonney. Sampling depths in the water column were chosen based on the location of primary production maxima and protist phylogenetic diversity 4, 11, as well as variability in major abiotic factors affecting protist trophic modes: shallow sampling depths are limited for major nutrients, while deeper sampling depths are limited by light availability. In addition, lake water samples were supplemented with multiple types of growth media to promote the growth of a variety of phototrophic organisms.
RubisCO catalyzes the rate limiting step in the Calvin Benson Bassham (CBB) cycle, the major pathway by which autotrophic organisms fix inorganic carbon and provide organic carbon for higher trophic levels in aquatic and terrestrial food webs 12. In this study, we applied a radioisotope assay modified for filtered samples 13 to monitor maximum carboxylase activity as a proxy for carbon fixation potential and metabolic versatility in the Lake Bonney enrichment cultures.
Microbiology, Issue 62, Antarctic lake, McMurdo Dry Valleys, Enrichment cultivation, Microbial eukaryotes, RubisCO
Genomic MRI - a Public Resource for Studying Sequence Patterns within Genomic DNA
Institutions: University of Toledo Health Science Campus.
Non-coding genomic regions in complex eukaryotes, including intergenic areas, introns, and untranslated segments of exons, are profoundly non-random in their nucleotide composition and consist of a complex mosaic of sequence patterns. These patterns include so-called Mid-Range Inhomogeneity (MRI) regions -- sequences 30-10000 nucleotides in length that are enriched by a particular base or combination of bases (e.g. (G+T)-rich, purine-rich, etc.). MRI regions are associated with unusual (non-B-form) DNA structures that are often involved in regulation of gene expression, recombination, and other genetic processes (Fedorova & Fedorov 2010). The existence of a strong fixation bias within MRI regions against mutations that tend to reduce their sequence inhomogeneity additionally supports the functionality and importance of these genomic sequences (Prakash et al.
Here we demonstrate a freely available Internet resource -- the Genomic MRI program package -- designed for computational analysis of genomic sequences in order to find and characterize various MRI patterns within them (Bechtel et al. 2008). This package also allows generation of randomized sequences with various properties and level of correspondence to the natural input DNA sequences. The main goal of this resource is to facilitate examination of vast regions of non-coding DNA that are still scarcely investigated and await thorough exploration and recognition.
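To make the idea of a base-enriched region concrete, here is a minimal Python sketch that scans a sequence with a sliding window and reports stretches where a chosen set of bases (for example G+T) exceeds a content threshold. This is only an illustration and not the Genomic MRI package's algorithm; the function name, the window size of 30, and the 0.8 threshold are assumptions chosen for the example.

```python
def enriched_regions(seq, bases="GT", window=30, min_fraction=0.8):
    """Return merged (start, end) intervals, 0-based and end-exclusive,
    where the windowed fraction of `bases` is at least `min_fraction`.

    Hypothetical helper; parameters are illustrative assumptions.
    """
    seq = seq.upper()
    target = set(bases.upper())
    n = len(seq)
    if window <= 0 or n < window:
        return []
    # 1 where the nucleotide belongs to the target base set, else 0.
    hits = [1 if ch in target else 0 for ch in seq]
    count = sum(hits[:window])  # target bases in the first window
    regions = []
    for start in range(n - window + 1):
        if start > 0:
            # Slide by one: add the entering base, drop the leaving base.
            count += hits[start + window - 1] - hits[start - 1]
        if count / window >= min_fraction:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous hit, so extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


if __name__ == "__main__":
    # Synthetic sequence: a (G+T)-rich block flanked by A and C runs.
    demo = "A" * 40 + "GT" * 20 + "C" * 40
    print(enriched_regions(demo, bases="GT"))
```

A fixed window is the simplest choice; because the abstract describes regions spanning 30-10000 nucleotides, a fuller analysis would presumably repeat the scan over several window sizes.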
Genetics, Issue 51, bioinformatics, computational biology, genomics, non-randomness, signals, gene regulation, DNA conformation
Rescue of Recombinant Newcastle Disease Virus from cDNA
Institutions: Icahn School of Medicine at Mount Sinai, Icahn School of Medicine at Mount Sinai, Icahn School of Medicine at Mount Sinai, University of Rochester.
Newcastle disease virus (NDV), the prototype member of the Avulavirus genus of the family Paramyxoviridae 1, is a non-segmented, negative-sense, single-stranded, enveloped RNA virus (Figure 1) with potential applications as a vector for vaccination and treatment of human diseases. In-depth exploration of these applications has only become possible after the establishment of reverse genetics techniques to rescue recombinant viruses from plasmids encoding their complete genomes as cDNA 2-5. Viral cDNA can be conveniently modified in vitro by using standard cloning procedures to alter the genotype of the virus and/or to include new transcriptional units. Rescue of such genetically modified viruses provides a valuable tool to understand factors affecting multiple stages of infection, as well as allows for the development and improvement of vectors for the expression and delivery of antigens for vaccination and therapy. Here we describe a protocol for the rescue of recombinant NDVs.
Immunology, Issue 80, Paramyxoviridae, Vaccines, Oncolytic Virotherapy, Immunity, Innate, Newcastle disease virus (NDV), MVA-T7, reverse genetics techniques, plasmid transfection, recombinant virus, HA assay
The ITS2 Database
Institutions: University of Würzburg, University of Würzburg.
The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. As ITS2 research mainly focused on the very variable ITS2 sequence, it confined this marker to low-level phylogenetics only. However, the combination of the ITS2 sequence and its highly conserved secondary structure improves the phylogenetic resolution 1 and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation 2-8.
The ITS2 Database 9 presents an exhaustive dataset of internal transcribed spacer 2 sequences from NCBI GenBank 11. Following an annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum energy based fold 12 (direct fold) results in a correct, four helix conformation. If this is not the case, the structure is predicted by homology modeling 13. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence, whose secondary structure was not able to fold correctly in a direct fold.
The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST 14 search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE 15,16 for multiple sequence-structure alignment calculation and Neighbor Joining 18 tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure.
In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
Genetics, Issue 61, alignment, internal transcribed spacer 2, molecular systematics, secondary structure, ribosomal RNA, phylogenetic tree, homology modeling, phylogeny
Mapping Bacterial Functional Networks and Pathways in Escherichia Coli using Synthetic Genetic Arrays
Institutions: University of Toronto, University of Toronto, University of Regina.
Phenotypes are determined by a complex series of physical (e.g.
protein-protein) and functional (e.g.
gene-gene or genetic) interactions (GI)1
. While physical interactions can indicate which bacterial proteins are associated as complexes, they do not necessarily reveal pathway-level functional relationships1. GI screens, in which the growth of double mutants bearing two deleted or inactivated genes is measured and compared to the corresponding single mutants, can illuminate epistatic dependencies between loci and hence provide a means to query and discover novel functional relationships2
. Large-scale GI maps have been reported for eukaryotic organisms like yeast3-7
, but GI information remains sparse for prokaryotes8
, which hinders the functional annotation of bacterial genomes. To this end, we and others have developed high-throughput quantitative bacterial GI screening methods9, 10
Here, we present the key steps required to perform quantitative E. coli
Synthetic Genetic Array (eSGA) screening procedure on a genome-scale9
, using natural bacterial conjugation and homologous recombination to systemically generate and measure the fitness of large numbers of double mutants in a colony array format.
Briefly, a robot is used to transfer, through conjugation, chloramphenicol (Cm)-marked mutant alleles from engineered Hfr (High frequency of recombination) 'donor strains' into an ordered array of kanamycin (Kan)-marked F- recipient strains. Typically, we use loss-of-function single mutants bearing non-essential gene deletions (e.g. the 'Keio' collection11) and essential gene hypomorphic mutations (i.e. alleles conferring reduced protein expression, stability, or activity9, 12, 13) to query the functional associations of non-essential and essential genes, respectively. After conjugation and ensuing genetic exchange mediated by homologous recombination, the resulting double mutants are selected on solid medium containing both antibiotics. After outgrowth, the plates are digitally imaged and colony sizes are quantitatively scored using an in-house automated image processing system14. GIs are revealed when the growth rate of a double mutant is either significantly better or worse than expected9. Aggravating (or negative) GIs often result between loss-of-function mutations in pairs of genes from compensatory pathways that impinge on the same essential process2. Here, the loss of a single gene is buffered, such that either single mutant is viable. However, the loss of both pathways is deleterious and results in synthetic lethality or sickness (i.e. slow growth). Conversely, alleviating (or positive) interactions can occur between genes in the same pathway or protein complex2 as the deletion of either gene alone is often sufficient to perturb the normal function of the pathway or complex such that additional perturbations do not reduce activity, and hence growth, further. Overall, systematically identifying and analyzing GI networks can provide unbiased, global maps of the functional relationships between large numbers of genes, from which pathway-level information missed by other approaches can be inferred9.
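The comparison of observed and expected double-mutant growth described above can be sketched in a few lines of Python. This is only an illustration, not the scoring used by the authors' image processing system: the multiplicative null model (expected fitness equals the product of the two single-mutant fitnesses), the function name `classify_gi`, and the cutoff of 0.1 are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class GIResult:
    expected: float  # double-mutant fitness predicted if the mutations act independently
    score: float     # observed minus expected double-mutant fitness
    call: str        # "aggravating", "alleviating", or "neutral"


def classify_gi(w_a: float, w_b: float, w_ab: float, threshold: float = 0.1) -> GIResult:
    """Classify a gene pair from colony sizes normalized so that wild type = 1.0.

    Assumption (not stated in the abstract): a multiplicative null model,
    expected = w_a * w_b, and a hypothetical fixed cutoff in place of a
    proper statistical test.
    """
    expected = w_a * w_b
    score = w_ab - expected
    if score <= -threshold:
        call = "aggravating"   # grows worse than expected (synthetic sickness or lethality)
    elif score >= threshold:
        call = "alleviating"   # grows better than expected (same pathway or complex)
    else:
        call = "neutral"
    return GIResult(expected, score, call)


if __name__ == "__main__":
    # Two viable single mutants whose combination barely grows.
    print(classify_gi(0.9, 0.8, 0.1))
    # Two mutants in one pathway: the second deletion adds no further defect.
    print(classify_gi(0.6, 0.6, 0.6))
```

In a real screen the fixed cutoff would be replaced by a significance test over replicate colonies, since the abstract speaks of growth that is "significantly" better or worse than expected.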
Genetics, Issue 69, Molecular Biology, Medicine, Biochemistry, Microbiology, Aggravating, alleviating, conjugation, double mutant, Escherichia coli, genetic interaction, Gram-negative bacteria, homologous recombination, network, synthetic lethality or sickness, suppression
Automated, Quantitative Cognitive/Behavioral Screening of Mice: For Genetics, Pharmacology, Animal Cognition and Undergraduate Instruction
Institutions: Rutgers University, Koç University, New York University, Fairfield University.
We describe a high-throughput, high-volume, fully automated, live-in 24/7 behavioral testing system for assessing the effects of genetic and pharmacological manipulations on basic mechanisms of cognition and learning in mice. A standard polypropylene mouse housing tub is connected through an acrylic tube to a standard commercial mouse test box. The test box has 3 hoppers, 2 of which are connected to pellet feeders. All are internally illuminable with an LED and monitored for head entries by infrared (IR) beams. Mice live in the environment, which eliminates handling during screening. They obtain their food during two or more daily feeding periods by performing in operant (instrumental) and Pavlovian (classical) protocols, for which we have written protocol-control software and quasi-real-time data analysis and graphing software. The data analysis and graphing routines are written in a MATLAB-based language created to simplify greatly the analysis of large time-stamped behavioral and physiological event records and to preserve a full data trail from raw data through all intermediate analyses to the published graphs and statistics within a single data structure. The data-analysis code harvests the data several times a day and subjects it to statistical and graphical analyses, which are automatically stored in the "cloud" and on in-lab computers. Thus, the progress of individual mice is visualized and quantified daily. The data-analysis code talks to the protocol-control code, permitting the automated advance from protocol to protocol of individual subjects. The behavioral protocols implemented are matching, autoshaping, timed hopper-switching, risk assessment in timed hopper-switching, impulsivity measurement, and the circadian anticipation of food availability. Open-source protocol-control and data-analysis code makes the addition of new protocols simple. 
Eight test environments fit in a 48 in x 24 in x 78 in cabinet; two such cabinets (16 environments) may be controlled by one computer.
Behavior, Issue 84, genetics, cognitive mechanisms, behavioral screening, learning, memory, timing
Using Coculture to Detect Chemically Mediated Interspecies Interactions
Institutions: University of North Carolina at Chapel Hill.
In nature, bacteria rarely exist in isolation; they are instead surrounded by a diverse array of other microorganisms that alter the local environment by secreting metabolites. These metabolites have the potential to modulate the physiology and differentiation of their microbial neighbors and are likely important factors in the establishment and maintenance of complex microbial communities. We have developed a fluorescence-based coculture screen to identify such chemically mediated microbial interactions. The screen involves combining a fluorescent transcriptional reporter strain with environmental microbes on solid media and allowing the colonies to grow in coculture. The fluorescent transcriptional reporter is designed so that the chosen bacterial strain fluoresces when it is expressing a particular phenotype of interest (e.g. biofilm formation, sporulation, or virulence factor production). Screening is performed under growth conditions where this phenotype is not expressed (and therefore the reporter strain is typically nonfluorescent). When an environmental microbe secretes a metabolite that activates this phenotype, the metabolite diffuses through the agar and activates the fluorescent reporter construct. This allows the microbes producing the inducing metabolite to be detected: they are the nonfluorescent colonies most proximal to the fluorescent colonies. Thus, this screen allows the identification of environmental microbes that produce diffusible metabolites that activate a particular physiological response in a reporter strain. This publication discusses how to: a) select appropriate coculture screening conditions, b) prepare the reporter and environmental microbes for screening, c) perform the coculture screen, d) isolate putative inducing organisms, and e) confirm their activity in a secondary screen. We developed this method to screen for soil organisms that activate biofilm matrix-production in Bacillus subtilis; however, we also discuss considerations for applying this approach to other genetically tractable bacteria.
Microbiology, Issue 80, High-Throughput Screening Assays, Genes, Reporter, Microbial Interactions, Soil Microbiology, Coculture, microbial interactions, screen, fluorescent transcriptional reporters, Bacillus subtilis
Unraveling the Unseen Players in the Ocean - A Field Guide to Water Chemistry and Marine Microbiology
Institutions: San Diego State University, University of California San Diego.
Here we introduce a series of thoroughly tested and well standardized research protocols adapted for use in remote marine environments. The sampling protocols include the assessment of resources available to the microbial community (dissolved organic carbon, particulate organic matter, inorganic nutrients), and a comprehensive description of the viral and bacterial communities (via direct viral and microbial counts, enumeration of autofluorescent microbes, and construction of viral and microbial metagenomes). We use a combination of methods drawn from a dispersed field of scientific disciplines, comprising already established protocols and some of the most recently developed techniques. Metagenomic sequencing techniques for viral and bacterial community characterization, in particular, have been established only in recent years and are thus still subject to constant improvement. This has led to a variety of sampling and sample processing procedures currently in use. The set of methods presented here provides an up-to-date approach to collecting and processing environmental samples. Parameters addressed with these protocols yield the minimum information essential to characterize and understand the underlying mechanisms of viral and microbial community dynamics. The protocols give easy-to-follow guidelines for conducting comprehensive surveys and discuss critical steps and potential caveats pertinent to each technique.
Environmental Sciences, Issue 93, dissolved organic carbon, particulate organic matter, nutrients, DAPI, SYBR, microbial metagenomics, viral metagenomics, marine environment
Characterization of Complex Systems Using the Design of Experiments Approach: Transient Protein Expression in Tobacco as a Case Study
Institutions: RWTH Aachen University, Fraunhofer Gesellschaft.
Plants provide multiple benefits for the production of biopharmaceuticals including low costs, scalability, and safety. Transient expression offers the additional advantage of short development and production times, but expression levels can vary significantly between batches thus giving rise to regulatory concerns in the context of good manufacturing practice. We used a design of experiments (DoE) approach to determine the impact of major factors such as regulatory elements in the expression construct, plant growth and development parameters, and the incubation conditions during expression, on the variability of expression between batches. We tested plants expressing a model anti-HIV monoclonal antibody (2G12) and a fluorescent marker protein (DsRed). We discuss the rationale for selecting certain properties of the model and identify its potential limitations. The general approach can easily be transferred to other problems because the principles of the model are broadly applicable: knowledge-based parameter selection, complexity reduction by splitting the initial problem into smaller modules, software-guided setup of optimal experiment combinations and step-wise design augmentation. Therefore, the methodology is not only useful for characterizing protein expression in plants but also for the investigation of other complex systems lacking a mechanistic description. The predictive equations describing the interconnectivity between parameters can be used to establish mechanistic models for other complex systems.
Bioengineering, Issue 83, design of experiments (DoE), transient protein expression, plant-derived biopharmaceuticals, promoter, 5'UTR, fluorescent reporter protein, model building, incubation conditions, monoclonal antibody
Isolation and Chemical Characterization of Lipid A from Gram-negative Bacteria
Institutions: The University of Texas at Austin, The University of Texas at Austin, The University of Texas at Austin.
Lipopolysaccharide (LPS) is the major cell surface molecule of gram-negative bacteria, deposited on the outer leaflet of the outer membrane bilayer. LPS can be subdivided into three domains: the distal O-polysaccharide, a core oligosaccharide, and the lipid A domain consisting of a lipid A molecular species and 3-deoxy-D-manno-oct-2-ulosonic acid residues (Kdo). The lipid A domain is the only component essential for bacterial cell survival. Following its synthesis, lipid A is chemically modified in response to environmental stresses such as pH or temperature, to promote resistance to antibiotic compounds, and to evade recognition by mediators of the host innate immune response. The following protocol details the small- and large-scale isolation of lipid A from gram-negative bacteria. Isolated material is then chemically characterized by thin layer chromatography (TLC) or mass-spectrometry (MS). In addition to matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) MS, we also describe tandem MS protocols for analyzing lipid A molecular species using electrospray ionization (ESI) coupled to collision induced dissociation (CID) and newly employed ultraviolet photodissociation (UVPD) methods. Our MS protocols allow for unequivocal determination of chemical structure, paramount to characterization of lipid A molecules that contain unique or novel chemical modifications. We also describe the radioisotopic labeling, and subsequent isolation, of lipid A from bacterial cells for analysis by TLC. Relative to MS-based protocols, TLC provides a more economical and rapid characterization method, but cannot be used to unambiguously assign lipid A chemical structures without the use of standards of known chemical structure. 
Over the last two decades isolation and characterization of lipid A has led to numerous exciting discoveries that have improved our understanding of the physiology of gram-negative bacteria, mechanisms of antibiotic resistance, the human innate immune response, and have provided many new targets in the development of antibacterial compounds.
Chemistry, Issue 79, Membrane Lipids, Toll-Like Receptors, Endotoxins, Glycolipids, Lipopolysaccharides, Lipid A, Microbiology, Lipids, lipid A, Bligh-Dyer, thin layer chromatography (TLC), lipopolysaccharide, mass spectrometry, Collision Induced Dissociation (CID), Photodissociation (PD)
From Voxels to Knowledge: A Practical Guide to the Segmentation of Complex Electron Microscopy 3D-Data
Institutions: Lawrence Berkeley National Laboratory, Lawrence Berkeley National Laboratory, Lawrence Berkeley National Laboratory.
Modern 3D electron microscopy approaches have recently allowed unprecedented insight into the 3D ultrastructural organization of cells and tissues, enabling the visualization of large macromolecular machines, such as adhesion complexes, as well as higher-order structures, such as the cytoskeleton and cellular organelles in their respective cell and tissue context. Given the inherent complexity of cellular volumes, it is essential to first extract the features of interest in order to allow visualization, quantification, and therefore comprehension of their 3D organization. Each data set is defined by distinct characteristics, e.g., signal-to-noise ratio, crispness (sharpness) of the data, heterogeneity of its features, crowdedness of features, presence or absence of characteristic shapes that allow for easy identification, and the percentage of the entire volume that a specific region of interest occupies. All these characteristics need to be considered when deciding on which approach to take for segmentation.
The six different 3D ultrastructural data sets presented were obtained by three different imaging approaches: resin embedded stained electron tomography, focused ion beam- and serial block face- scanning electron microscopy (FIB-SEM, SBF-SEM) of mildly stained and heavily stained samples, respectively. For these data sets, four different segmentation approaches have been applied: (1) fully manual model building followed solely by visualization of the model, (2) manual tracing segmentation of the data followed by surface rendering, (3) semi-automated approaches followed by surface rendering, or (4) automated custom-designed segmentation algorithms followed by surface rendering and quantitative analysis. Depending on the combination of data set characteristics, it was found that typically one of these four categorical approaches outperforms the others, but depending on the exact sequence of criteria, more than one approach may be successful. Based on these data, we propose a triage scheme that categorizes both objective data set characteristics and subjective personal criteria for the analysis of the different data sets.
Bioengineering, Issue 90, 3D electron microscopy, feature extraction, segmentation, image analysis, reconstruction, manual tracing, thresholding
Isolation of Fidelity Variants of RNA Viruses and Characterization of Virus Mutation Frequency
Institutions: Institut Pasteur.
RNA viruses use RNA dependent RNA polymerases to replicate their genomes. The intrinsically high error rate of these enzymes is a large contributor to the generation of extreme population diversity that facilitates virus adaptation and evolution. Increasing evidence shows that the intrinsic error rates, and the resulting mutation frequencies, of RNA viruses can be modulated by subtle amino acid changes to the viral polymerase. Although biochemical assays exist for some viral RNA polymerases that permit quantitative measure of incorporation fidelity, here we describe a simple method of measuring mutation frequencies of RNA viruses that has proven to be as accurate as biochemical approaches in identifying fidelity altering mutations. The approach uses conventional virological and sequencing techniques that can be performed in most biology laboratories. Based on our experience with a number of different viruses, we have identified the key steps that must be optimized to increase the likelihood of isolating fidelity variants and generating data of statistical significance. The isolation and characterization of fidelity altering mutations can provide new insights into polymerase structure and function1-3. Furthermore, these fidelity variants can be useful tools in characterizing mechanisms of virus adaptation and evolution4-7.
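As a rough illustration of what a sequencing-based mutation frequency might look like in practice, the sketch below counts mismatches between sequenced clones and a reference sequence and reports them per 10,000 nucleotides sequenced. The abstract does not give the authors' formula, so the function name `mutation_frequency`, the assumption of pre-aligned, equal-length sequences, and the per-10,000 scaling are all illustrative choices made here.

```python
from __future__ import annotations


def mutation_frequency(reference: str, clones: list[str], per: int = 10_000) -> float:
    """Return mutations per `per` nucleotides sequenced.

    Assumptions (hypothetical, for illustration only): every clone is already
    aligned to the reference and has the same length, and every mismatch is
    counted as one mutation.
    """
    mutations = 0
    sequenced = 0
    for clone in clones:
        if len(clone) != len(reference):
            raise ValueError("clone and reference must be aligned to the same length")
        # Count positions where the clone differs from the reference.
        mutations += sum(1 for ref_nt, nt in zip(reference, clone) if ref_nt != nt)
        sequenced += len(clone)
    if sequenced == 0:
        raise ValueError("no nucleotides sequenced")
    return mutations * per / sequenced


if __name__ == "__main__":
    ref = "ACGTACGTAC"
    # One clone matches the reference; the other carries a single T-to-A change.
    print(mutation_frequency(ref, ["ACGTACGTAC", "ACGAACGTAC"]))
```

Comparing this number between a wild-type virus and a polymerase variant, over enough clones to support a statistical test, is the kind of analysis the abstract alludes to when it mentions generating data of statistical significance.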
Immunology, Issue 52, Polymerase fidelity, RNA virus, mutation frequency, mutagen, RNA polymerase, viral evolution
In Situ Neutron Powder Diffraction Using Custom-made Lithium-ion Batteries
Institutions: University of Sydney, University of Wollongong, Australian Synchrotron, Australian Nuclear Science and Technology Organisation, University of Wollongong, University of New South Wales.
Li-ion batteries are widely used in portable electronic devices and are considered promising candidates for higher-energy applications such as electric vehicles.1,2 However, many challenges, such as energy density and battery lifetimes, need to be overcome before this particular battery technology can be widely implemented in such applications.3 This research is challenging, and we outline a method to address these challenges using in situ neutron powder diffraction (NPD) to probe the crystal structure of electrodes undergoing electrochemical cycling (charge/discharge) in a battery. NPD data help determine the underlying structural mechanism responsible for a range of electrode properties, and this information can direct the development of better electrodes and batteries.
We briefly review six types of battery designs custom-made for NPD experiments and detail the method to construct the ‘roll-over’ cell that we have successfully used on the high-intensity NPD instrument, WOMBAT, at the Australian Nuclear Science and Technology Organisation (ANSTO). The design considerations and materials used for cell construction are discussed in conjunction with aspects of the actual in situ NPD experiment, and initial directions are presented on how to analyze such complex in situ data.
Physics, Issue 93, In operando, structure-property relationships, electrochemical cycling, electrochemical cells, crystallography, battery performance
Extracting DNA from the Gut Microbes of the Termite (Zootermopsis Angusticollis) and Visualizing Gut Microbes
Institutions: California Institute of Technology - Caltech.
Termites are among the few animals known to have the capacity to subsist solely by consuming wood. The termite gut tract contains a dense and species-rich microbial population that assists in the degradation of lignocellulose predominantly into acetate, the key nutrient fueling termite metabolism (Odelson & Breznak, 1983). Within these microbial populations are bacteria, methanogenic archaea and, in some ("lower") termites, eukaryotic protozoa. Thus, termites are excellent research subjects for studying the interactions among microbial species and the numerous biochemical functions they perform to the benefit of their host. The species composition of microbial populations in termite guts, as well as key genes involved in various biochemical processes, have been explored using molecular techniques (Kudo et al., 1998; Schmit-Wagner et al., 2003; Salmassi & Leadbetter, 2003). These techniques depend on the extraction and purification of high-quality nucleic acids from the termite gut environment. The extraction technique described in this video is a modified compilation of protocols developed for extraction and purification of nucleic acids from environmental samples (Mor et al., 1994; Berthelet et al., 1996; Purdy et al., 1996; Salmassi & Leadbetter, 2003; Ottesen et al., 2006), and it produces DNA from termite hindgut material suitable for use as template for polymerase chain reaction (PCR).
Microbiology, Issue 4, microbial community, DNA, extraction, gut, termite
Molecular Evolution of the Tre Recombinase
Institutions: Max Planck Institute for Molecular Cell Biology and Genetics, Dresden.
Here we report the generation of Tre recombinase through directed, molecular evolution. Tre recombinase recognizes a pre-defined target sequence within the LTR sequences of the HIV-1 provirus, resulting in the excision and eradication of the provirus from infected human cells.
We started with Cre, a 38-kDa recombinase that recognizes a 34-bp double-stranded DNA sequence known as loxP. Because Cre can effectively eliminate genomic sequences, we set out to tailor a recombinase that could remove the sequence between the 5'-LTR and 3'-LTR of an integrated HIV-1 provirus. As a first step we identified sequences within the LTR sites that were similar to loxP and tested for recombination activity. Initially Cre and mutagenized Cre libraries failed to recombine the chosen loxLTR sites of the HIV-1 provirus. As the start of any directed molecular evolution process requires at least residual activity, the original asymmetric loxLTR sequences were split into subsets and tested again for recombination activity. The subsets acted as intermediates, and recombination activity was shown with them. Next, recombinase libraries were enriched through reiterative evolution cycles. Subsequently, enriched libraries were shuffled and recombined. The combination of different mutations proved synergistic and recombinases were created that were able to recombine loxLTR1 and loxLTR2. This was evidence that an evolutionary strategy through intermediates can be successful. After a total of 126 evolution cycles, individual recombinases were functionally and structurally analyzed. The most active recombinase -- Tre -- had 19 amino acid changes as compared to Cre. Tre recombinase was able to excise the HIV-1 provirus from the genome of HIV-1 infected HeLa cells (see "HIV-1 Proviral DNA Excision Using an Evolved Recombinase", Hauber J., Heinrich-Pette-Institute for Experimental Virology and Immunology, Hamburg, Germany). While still in its infancy, directed molecular evolution will allow the creation of custom enzymes that will serve as tools of "molecular surgery" and molecular medicine.
Cell Biology, Issue 15, HIV-1, Tre recombinase, Site-specific recombination, molecular evolution
Electroporation of Mycobacteria
Institutions: Barts and the London School of Medicine and Dentistry, Barts and the London School of Medicine and Dentistry.
The difficulty of achieving high efficiency transformation is a major limitation in the study of mycobacteria. The genus Mycobacterium can be difficult to transform; this is mainly caused by the thick and waxy cell wall, but is compounded by the fact that most molecular techniques have been developed for distantly-related species such as Escherichia coli and Bacillus subtilis. In spite of these obstacles, mycobacterial plasmids have been identified and DNA transformation of many mycobacterial species has now been described. The most successful method for introducing DNA into mycobacteria is electroporation. Many parameters contribute to successful transformation; these include the species/strain, the nature of the transforming DNA, the selectable marker used, the growth medium, and the conditions for the electroporation pulse. Optimized methods for the transformation of both slow- and fast-growers are detailed here. Transformation efficiencies for different mycobacterial species and with various selectable markers are reported.
Microbiology, Issue 15, Springer Protocols, Mycobacteria, Electroporation, Bacterial Transformation, Transformation Efficiency, Bacteria, Tuberculosis, M. Smegmatis