Many researchers, across incredibly diverse foci, are applying phylogenetics to their research question(s). However, many researchers are new to this topic and so it presents inherent problems. Here we compile a practical introduction to phylogenetics for nonexperts. We outline in a step-by-step manner, a pipeline for generating reliable phylogenies from gene sequence datasets. We begin with a user-guide for similarity search tools via online interfaces as well as local executables. Next, we explore programs for generating multiple sequence alignments followed by protocols for using software to determine best-fit models of evolution. We then outline protocols for reconstructing phylogenetic relationships via maximum likelihood and Bayesian criteria and finally describe tools for visualizing phylogenetic trees. While this is not by any means an exhaustive description of phylogenetic approaches, it does provide the reader with practical starting information on key software applications commonly utilized by phylogeneticists. The vision for this article would be that it could serve as a practical training tool for researchers embarking on phylogenetic studies and also serve as an educational resource that could be incorporated into a classroom or teaching-lab.
22 Related JoVE Articles!
Surface Renewal: An Advanced Micrometeorological Method for Measuring and Processing Field-Scale Energy Flux Density Data
Institutions: United States Department of Agriculture-Agricultural Research Service, University of California, Davis, University of Chile, University of California, Davis, URS Corporation Australia Pty. Ltd..
Advanced micrometeorological methods have become increasingly important in soil, crop, and environmental sciences. For many scientists without formal training in atmospheric science, these techniques are relatively inaccessible. Surface renewal and other flux measurement methods require an understanding of boundary layer meteorology and extensive training in instrumentation and multiple data management programs. To improve accessibility of these techniques, we describe the underlying theory of surface renewal measurements, demonstrate how to set up a field station for surface renewal with eddy covariance calibration, and utilize our open-source turnkey data logger program to perform flux data acquisition and processing. The new turnkey program returns to the user a simple data table with the corrected fluxes and quality control parameters, and eliminates the need for researchers to shuttle between multiple processing programs to obtain the final flux data. An example of data generated from these measurements demonstrates how crop water use is measured with this technique. The output information is useful to growers for making irrigation decisions in a variety of agricultural ecosystems. These stations are currently deployed in numerous field experiments by researchers in our group and the California Department of Water Resources in the following crops: rice, wine and raisin grape vineyards, alfalfa, almond, walnut, peach, lemon, avocado, and corn.
Environmental Sciences, Issue 82, Conservation of Natural Resources, Engineering, Agriculture, plants, energy balance, irrigated agriculture, flux data, evapotranspiration, agrometeorology
Diagnosing Pulmonary Tuberculosis with the Xpert MTB/RIF Test
Institutions: University of Bern, MCL Laboratories Inc..
Tuberculosis (TB) due to Mycobacterium tuberculosis
(MTB) remains a major public health issue: the infection affects up to one third of the world population1
, and almost two million people are killed by TB each year.2
Universal access to high-quality, patient-centered treatment for all TB patients is emphasized by WHO's Stop TB Strategy.3
The rapid detection of MTB in respiratory specimens and drug therapy based on reliable drug resistance testing results are a prerequisite for the successful implementation of this strategy. However, in many areas of the world, TB diagnosis still relies on insensitive, poorly standardized sputum microscopy methods. Ineffective TB detection and the emergence and transmission of drug-resistant MTB strains increasingly jeopardize global TB control activities.2
Effective diagnosis of pulmonary TB requires the availability - on a global scale - of standardized, easy-to-use, and robust diagnostic tools that would allow the direct detection of both the MTB complex and resistance to key antibiotics, such as rifampicin (RIF). The latter result can serve as marker for multidrug-resistant MTB (MDR TB) and has been reported in > 95% of the MDR-TB isolates.4, 5
The rapid availability of reliable test results is likely to directly translate into sound patient management decisions that, ultimately, will cure the individual patient and break the chain of TB transmission in the community.2
Cepheid's (Sunnyvale, CA, U.S.A.) Xpert MTB/RIF assay6, 7
meets the demands outlined above in a remarkable manner. It is a nucleic-acids amplification test for 1) the detection of MTB complex DNA in sputum or concentrated sputum sediments; and 2) the detection of RIF resistance-associated mutations of the rpoB
It is designed for use with Cepheid's GeneXpert Dx System that integrates and automates sample processing, nucleic acid amplification, and detection of the target sequences using real-time PCR and reverse transcriptase PCR. The system consists of an instrument, personal computer, barcode scanner, and preloaded software for running tests and viewing the results.9
It employs single-use disposable Xpert MTB/RIF cartridges that hold PCR reagents and host the PCR process. Because the cartridges are self-contained, cross-contamination between samples is eliminated.6
Current nucleic acid amplification methods used to detect MTB are complex, labor-intensive, and technically demanding. The Xpert MTB/RIF assay has the potential to bring standardized, sensitive and very specific diagnostic testing for both TB and drug resistance to universal-access point-of-care settings3
, provided that they will be able to afford it. In order to facilitate access, the Foundation for Innovative New Diagnostics (FIND) has negotiated significant price reductions. Current FIND-negotiated prices, along with the list of countries eligible for the discounts, are available on the web.10
Immunology, Issue 62, tuberculosis, drug resistance, rifampicin, rapid diagnosis, Xpert MTB/RIF test
A Next-generation Tissue Microarray (ngTMA) Protocol for Biomarker Studies
Institutions: University of Bern.
Biomarker research relies on tissue microarrays (TMA). TMAs are produced by repeated transfer of small tissue cores from a ‘donor’ block into a ‘recipient’ block and then used for a variety of biomarker applications. The construction of conventional TMAs is labor intensive, imprecise, and time-consuming. Here, a protocol using next-generation Tissue Microarrays (ngTMA) is outlined. ngTMA is based on TMA planning and design, digital pathology, and automated tissue microarraying. The protocol is illustrated using an example of 134 metastatic colorectal cancer patients. Histological, statistical and logistical aspects are considered, such as the tissue type, specific histological regions, and cell types for inclusion in the TMA, the number of tissue spots, sample size, statistical analysis, and number of TMA copies. Histological slides for each patient are scanned and uploaded onto a web-based digital platform. There, they are viewed and annotated (marked) using a 0.6-2.0 mm diameter tool, multiple times using various colors to distinguish tissue areas. Donor blocks and 12 ‘recipient’ blocks are loaded into the instrument. Digital slides are retrieved and matched to donor block images. Repeated arraying of annotated regions is automatically performed resulting in an ngTMA. In this example, six ngTMAs are planned containing six different tissue types/histological zones. Two copies of the ngTMAs are desired. Three to four slides for each patient are scanned; 3 scan runs are necessary and performed overnight. All slides are annotated; different colors are used to represent the different tissues/zones, namely tumor center, invasion front, tumor/stroma, lymph node metastases, liver metastases, and normal tissue. 17 annotations/case are made; time for annotation is 2-3 min/case. 12 ngTMAs are produced containing 4,556 spots. Arraying time is 15-20 hr. Due to its precision, flexibility and speed, ngTMA is a powerful tool to further improve the quality of TMAs used in clinical and translational research.
Medicine, Issue 91, tissue microarray, biomarkers, prognostic, predictive, digital pathology, slide scanning
Accuracy in Dental Medicine, A New Way to Measure Trueness and Precision
Institutions: University of Zürich.
Reference scanners are used in dental medicine to verify a lot of procedures. The main interest is to verify impression methods as they serve as a base for dental restorations. The current limitation of many reference scanners is the lack of accuracy scanning large objects like full dental arches, or the limited possibility to assess detailed tooth surfaces. A new reference scanner, based on focus variation scanning technique, was evaluated with regards to highest local and general accuracy. A specific scanning protocol was tested to scan original tooth surface from dental impressions. Also, different model materials were verified. The results showed a high scanning accuracy of the reference scanner with a mean deviation of 5.3 ± 1.1 µm for trueness and 1.6 ± 0.6 µm for precision in case of full arch scans. Current dental impression methods showed much higher deviations (trueness: 20.4 ± 2.2 µm, precision: 12.5 ± 2.5 µm) than the internal scanning accuracy of the reference scanner. Smaller objects like single tooth surface can be scanned with an even higher accuracy, enabling the system to assess erosive and abrasive tooth surface loss. The reference scanner can be used to measure differences for a lot of dental research fields. The different magnification levels combined with a high local and general accuracy can be used to assess changes of single teeth or restorations up to full arch changes.
Medicine, Issue 86, Laboratories, Dental, Calibration, Technology, Dental impression, Accuracy, Trueness, Precision, Full arch scan, Abrasion
Characterization of Surface Modifications by White Light Interferometry: Applications in Ion Sputtering, Laser Ablation, and Tribology Experiments
Institutions: Argonne National Laboratory, Argonne National Laboratory, MassThink LLC.
In materials science and engineering it is often necessary to obtain quantitative measurements of surface topography with micrometer lateral resolution. From the measured surface, 3D topographic maps can be subsequently analyzed using a variety of software packages to extract the information that is needed.
In this article we describe how white light interferometry, and optical profilometry (OP) in general, combined with generic surface analysis software, can be used for materials science and engineering tasks. In this article, a number of applications of white light interferometry for investigation of surface modifications in mass spectrometry, and wear phenomena in tribology and lubrication are demonstrated. We characterize the products of the interaction of semiconductors and metals with energetic ions (sputtering), and laser irradiation (ablation), as well as ex situ
measurements of wear of tribological test specimens.
Specifically, we will discuss:
Aspects of traditional ion sputtering-based mass spectrometry such as sputtering rates/yields measurements on Si and Cu and subsequent time-to-depth conversion.
Results of quantitative characterization of the interaction of femtosecond laser irradiation with a semiconductor surface. These results are important for applications such as ablation mass spectrometry, where the quantities of evaporated material can be studied and controlled via pulse duration and energy per pulse. Thus, by determining the crater geometry one can define depth and lateral resolution versus experimental setup conditions.
Measurements of surface roughness parameters in two dimensions, and quantitative measurements of the surface wear that occur as a result of friction and wear tests.
Some inherent drawbacks, possible artifacts, and uncertainty assessments of the white light interferometry approach will be discussed and explained.
Materials Science, Issue 72, Physics, Ion Beams (nuclear interactions), Light Reflection, Optical Properties, Semiconductor Materials, White Light Interferometry, Ion Sputtering, Laser Ablation, Femtosecond Lasers, Depth Profiling, Time-of-flight Mass Spectrometry, Tribology, Wear Analysis, Optical Profilometry, wear, friction, atomic force microscopy, AFM, scanning electron microscopy, SEM, imaging, visualization
Perceptual and Category Processing of the Uncanny Valley Hypothesis' Dimension of Human Likeness: Some Methodological Issues
Institutions: University of Zurich.
Mori's Uncanny Valley Hypothesis1,2
proposes that the perception of humanlike characters such as robots and, by extension, avatars (computer-generated characters) can evoke negative or positive affect (valence) depending on the object's degree of visual and behavioral realism along a dimension of human likeness
) (Figure 1
). But studies of affective valence of subjective responses to variously realistic non-human characters have produced inconsistent findings 3, 4, 5, 6
. One of a number of reasons for this is that human likeness is not perceived as the hypothesis assumes. While the DHL can be defined following Mori's description as a smooth linear change in the degree of physical humanlike similarity, subjective perception of objects along the DHL can be understood in terms of the psychological effects of categorical perception (CP) 7
. Further behavioral and neuroimaging investigations of category processing and CP along the DHL and of the potential influence of the dimension's underlying category structure on affective experience are needed. This protocol therefore focuses on the DHL and allows examination of CP. Based on the protocol presented in the video as an example, issues surrounding the methodology in the protocol and the use in "uncanny" research of stimuli drawn from morph continua to represent the DHL are discussed in the article that accompanies the video. The use of neuroimaging and morph stimuli to represent the DHL in order to disentangle brain regions neurally responsive to physical human-like similarity from those responsive to category change and category processing is briefly illustrated.
Behavior, Issue 76, Neuroscience, Neurobiology, Molecular Biology, Psychology, Neuropsychology, uncanny valley, functional magnetic resonance imaging, fMRI, categorical perception, virtual reality, avatar, human likeness, Mori, uncanny valley hypothesis, perception, magnetic resonance imaging, MRI, imaging, clinical techniques
A Protocol for Computer-Based Protein Structure and Function Prediction
Institutions: University of Michigan , University of Kansas.
Genome sequencing projects have ciphered millions of protein sequence, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score which tells how accurate the predictions are without knowing the experimental data. To facilitate the special requests of end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any proteins as template, or to exclude any template proteins during the structure assembly simulations. The structural information could be collected by the users based on experimental evidences or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was evaluated as the best programs for protein structure and function predictions in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
Biochemistry, Issue 57, On-line server, I-TASSER, protein structure prediction, function prediction
Utility of Dissociated Intrinsic Hand Muscle Atrophy in the Diagnosis of Amyotrophic Lateral Sclerosis
Institutions: Westmead Hospital, University of Sydney, Australia.
The split hand
phenomenon refers to predominant wasting of thenar muscles and is an early and specific feature of amyotrophic lateral sclerosis (ALS). A novel split hand index (SI) was developed to quantify the split hand phenomenon, and its diagnostic utility was assessed in ALS patients. The split hand index was derived by dividing the product of the compound muscle action potential (CMAP) amplitude recorded over the abductor pollicis brevis and first dorsal interosseous muscles by the CMAP amplitude recorded over the abductor digiti minimi muscle. In order to assess the diagnostic utility of the split hand index, ALS patients were prospectively assessed and their results were compared to neuromuscular disorder patients. The split hand index was significantly reduced in ALS when compared to neuromuscular disorder patients (P<0.0001). Limb-onset ALS patients exhibited the greatest reduction in the split hand index, and a value of 5.2 or less reliably differentiated ALS from other neuromuscular disorders. Consequently, the split hand index appears to be a novel diagnostic biomarker for ALS, perhaps facilitating an earlier diagnosis.
Medicine, Issue 85, Amyotrophic Lateral Sclerosis (ALS), dissociated muscle atrophy, hypothenar muscles, motor neuron disease, split-hand index, thenar muscles
A Novel Bayesian Change-point Algorithm for Genome-wide Analysis of Diverse ChIPseq Data Types
Institutions: Stony Brook University, Cold Spring Harbor Laboratory, University of Texas at Dallas.
ChIPseq is a widely used technique for investigating protein-DNA interactions. Read density profiles are generated by using next-sequencing of protein-bound DNA and aligning the short reads to a reference genome. Enriched regions are revealed as peaks, which often differ dramatically in shape, depending on the target protein1
. For example, transcription factors often bind in a site- and sequence-specific manner and tend to produce punctate peaks, while histone modifications are more pervasive and are characterized by broad, diffuse islands of enrichment2
. Reliably identifying these regions was the focus of our work.
Algorithms for analyzing ChIPseq data have employed various methodologies, from heuristics3-5
to more rigorous statistical models, e.g.
Hidden Markov Models (HMMs)6-8
. We sought a solution that minimized the necessity for difficult-to-define, ad hoc parameters that often compromise resolution and lessen the intuitive usability of the tool. With respect to HMM-based methods, we aimed to curtail parameter estimation procedures and simple, finite state classifications that are often utilized.
Additionally, conventional ChIPseq data analysis involves categorization of the expected read density profiles as either punctate or diffuse followed by subsequent application of the appropriate tool. We further aimed to replace the need for these two distinct models with a single, more versatile model, which can capably address the entire spectrum of data types.
To meet these objectives, we first constructed a statistical framework that naturally modeled ChIPseq data structures using a cutting edge advance in HMMs9
, which utilizes only explicit formulas-an innovation crucial to its performance advantages. More sophisticated then heuristic models, our HMM accommodates infinite hidden states through a Bayesian model. We applied it to identifying reasonable change points in read density, which further define segments of enrichment. Our analysis revealed how our Bayesian Change Point (BCP) algorithm had a reduced computational complexity-evidenced by an abridged run time and memory footprint. The BCP algorithm was successfully applied to both punctate peak and diffuse island identification with robust accuracy and limited user-defined parameters. This illustrated both its versatility and ease of use. Consequently, we believe it can be implemented readily across broad ranges of data types and end users in a manner that is easily compared and contrasted, making it a great tool for ChIPseq data analysis that can aid in collaboration and corroboration between research groups. Here, we demonstrate the application of BCP to existing transcription factor10,11
and epigenetic data12
to illustrate its usefulness.
Genetics, Issue 70, Bioinformatics, Genomics, Molecular Biology, Cellular Biology, Immunology, Chromatin immunoprecipitation, ChIP-Seq, histone modifications, segmentation, Bayesian, Hidden Markov Models, epigenetics
Test Samples for Optimizing STORM Super-Resolution Microscopy
Institutions: National Physical Laboratory.
STORM is a recently developed super-resolution microscopy technique with up to 10 times better resolution than standard fluorescence microscopy techniques. However, as the image is acquired in a very different way than normal, by building up an image molecule-by-molecule, there are some significant challenges for users in trying to optimize their image acquisition. In order to aid this process and gain more insight into how STORM works we present the preparation of 3 test samples and the methodology of acquiring and processing STORM super-resolution images with typical resolutions of between 30-50 nm. By combining the test samples with the use of the freely available rainSTORM processing software it is possible to obtain a great deal of information about image quality and resolution. Using these metrics it is then possible to optimize the imaging procedure from the optics, to sample preparation, dye choice, buffer conditions, and image acquisition settings. We also show examples of some common problems that result in poor image quality, such as lateral drift, where the sample moves during image acquisition and density related problems resulting in the 'mislocalization' phenomenon.
Molecular Biology, Issue 79, Genetics, Bioengineering, Biomedical Engineering, Biophysics, Basic Protocols, HeLa Cells, Actin Cytoskeleton, Coated Vesicles, Receptor, Epidermal Growth Factor, Actins, Fluorescence, Endocytosis, Microscopy, STORM, super-resolution microscopy, nanoscopy, cell biology, fluorescence microscopy, test samples, resolution, actin filaments, fiducial markers, epidermal growth factor, cell, imaging
From Voxels to Knowledge: A Practical Guide to the Segmentation of Complex Electron Microscopy 3D-Data
Institutions: Lawrence Berkeley National Laboratory, Lawrence Berkeley National Laboratory, Lawrence Berkeley National Laboratory.
Modern 3D electron microscopy approaches have recently allowed unprecedented insight into the 3D ultrastructural organization of cells and tissues, enabling the visualization of large macromolecular machines, such as adhesion complexes, as well as higher-order structures, such as the cytoskeleton and cellular organelles in their respective cell and tissue context. Given the inherent complexity of cellular volumes, it is essential to first extract the features of interest in order to allow visualization, quantification, and therefore comprehension of their 3D organization. Each data set is defined by distinct characteristics, e.g.
, signal-to-noise ratio, crispness (sharpness) of the data, heterogeneity of its features, crowdedness of features, presence or absence of characteristic shapes that allow for easy identification, and the percentage of the entire volume that a specific region of interest occupies. All these characteristics need to be considered when deciding on which approach to take for segmentation.
The six different 3D ultrastructural data sets presented were obtained by three different imaging approaches: resin embedded stained electron tomography, focused ion beam- and serial block face- scanning electron microscopy (FIB-SEM, SBF-SEM) of mildly stained and heavily stained samples, respectively. For these data sets, four different segmentation approaches have been applied: (1) fully manual model building followed solely by visualization of the model, (2) manual tracing segmentation of the data followed by surface rendering, (3) semi-automated approaches followed by surface rendering, or (4) automated custom-designed segmentation algorithms followed by surface rendering and quantitative analysis. Depending on the combination of data set characteristics, it was found that typically one of these four categorical approaches outperforms the others, but depending on the exact sequence of criteria, more than one approach may be successful. Based on these data, we propose a triage scheme that categorizes both objective data set characteristics and subjective personal criteria for the analysis of the different data sets.
Bioengineering, Issue 90, 3D electron microscopy, feature extraction, segmentation, image analysis, reconstruction, manual tracing, thresholding
Creating Objects and Object Categories for Studying Perception and Perceptual Learning
Institutions: Georgia Health Sciences University, Georgia Health Sciences University, Georgia Health Sciences University, Palo Alto Research Center, Palo Alto Research Center, University of Minnesota .
In order to quantitatively study object perception, be it perception by biological systems or by machines, one needs to create objects and object categories with precisely definable, preferably naturalistic, properties1
. Furthermore, for studies on perceptual learning, it is useful to create novel objects and object categories (or object classes
) with such properties2
Many innovative and useful methods currently exist for creating novel objects and object categories3-6
(also see refs. 7,8). However, generally speaking, the existing methods have three broad types of shortcomings.
First, shape variations are generally imposed by the experimenter5,9,10
, and may therefore be different from the variability in natural categories, and optimized for a particular recognition algorithm. It would be desirable to have the variations arise independently of the externally imposed constraints.
Second, the existing methods have difficulty capturing the shape complexity of natural objects11-13
. If the goal is to study natural object perception, it is desirable for objects and object categories to be naturalistic, so as to avoid possible confounds and special cases.
Third, it is generally hard to quantitatively measure the available information in the stimuli created by conventional methods. It would be desirable to create objects and object categories where the available information can be precisely measured and, where necessary, systematically manipulated (or 'tuned'). This allows one to formulate the underlying object recognition tasks in quantitative terms.
Here we describe a set of algorithms, or methods, that meet all three of the above criteria. Virtual morphogenesis (VM) creates novel, naturalistic virtual 3-D objects called 'digital embryos' by simulating the biological process of embryogenesis14
. Virtual phylogenesis (VP) creates novel, naturalistic object categories by simulating the evolutionary process of natural selection9,12,13
. Objects and object categories created by these simulations can be further manipulated by various morphing methods to generate systematic variations of shape characteristics15,16
. The VP and morphing methods can also be applied, in principle, to novel virtual objects other than digital embryos, or to virtual versions of real-world objects9,13
. Virtual objects created in this fashion can be rendered as visual images using a conventional graphical toolkit, with desired manipulations of surface texture, illumination, size, viewpoint and background. The virtual objects can also be 'printed' as haptic objects using a conventional 3-D prototyper.
We also describe some implementations of these computational algorithms to help illustrate the potential utility of the algorithms. It is important to distinguish the algorithms from their implementations. The implementations are demonstrations offered solely as a 'proof of principle' of the underlying algorithms. It is important to note that, in general, an implementation of a computational algorithm often has limitations that the algorithm itself does not have.
Together, these methods represent a set of powerful and flexible tools for studying object recognition and perceptual learning by biological and computational systems alike. With appropriate extensions, these methods may also prove useful in the study of morphogenesis and phylogenesis.
Neuroscience, Issue 69, machine learning, brain, classification, category learning, cross-modal perception, 3-D prototyping, inference
Flying Insect Detection and Classification with Inexpensive Sensors
Institutions: University of California, Riverside, University of California, Riverside, University of São Paulo - USP, ISCA Technologies.
An inexpensive, noninvasive system that could accurately classify flying insects would have important implications for entomological research, and allow for the development of many useful applications in vector and pest control for both medical and agricultural entomology. Given this, the last sixty years have seen many research efforts devoted to this task. To date, however, none of this research has had a lasting impact. In this work, we show that pseudo-acoustic optical sensors can produce superior data; that additional features, both intrinsic and extrinsic to the insect’s flight behavior, can be exploited to improve insect classification; that a Bayesian classification approach allows to efficiently learn classification models that are very robust to over-fitting, and a general classification framework allows to easily incorporate arbitrary number of features. We demonstrate the findings with large-scale experiments that dwarf all previous works combined, as measured by the number of insects and the number of species considered.
Bioengineering, Issue 92, flying insect detection, automatic insect classification, pseudo-acoustic optical sensors, Bayesian classification framework, flight sound, circadian rhythm
Characterization of Complex Systems Using the Design of Experiments Approach: Transient Protein Expression in Tobacco as a Case Study
Institutions: RWTH Aachen University, Fraunhofer Gesellschaft.
Plants provide multiple benefits for the production of biopharmaceuticals including low costs, scalability, and safety. Transient expression offers the additional advantage of short development and production times, but expression levels can vary significantly between batches thus giving rise to regulatory concerns in the context of good manufacturing practice. We used a design of experiments (DoE) approach to determine the impact of major factors such as regulatory elements in the expression construct, plant growth and development parameters, and the incubation conditions during expression, on the variability of expression between batches. We tested plants expressing a model anti-HIV monoclonal antibody (2G12) and a fluorescent marker protein (DsRed). We discuss the rationale for selecting certain properties of the model and identify its potential limitations. The general approach can easily be transferred to other problems because the principles of the model are broadly applicable: knowledge-based parameter selection, complexity reduction by splitting the initial problem into smaller modules, software-guided setup of optimal experiment combinations and step-wise design augmentation. Therefore, the methodology is not only useful for characterizing protein expression in plants but also for the investigation of other complex systems lacking a mechanistic description. The predictive equations describing the interconnectivity between parameters can be used to establish mechanistic models for other complex systems.
Bioengineering, Issue 83, design of experiments (DoE), transient protein expression, plant-derived biopharmaceuticals, promoter, 5'UTR, fluorescent reporter protein, model building, incubation conditions, monoclonal antibody
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo
protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo
protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (http://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
Detection of Invasive Pulmonary Aspergillosis in Haematological Malignancy Patients by using Lateral-flow Technology
Institutions: University of Exeter, Queen Mary University of London, St. Bartholomew's Hospital and The London NHS Trust.
Invasive pulmonary aspergillosis (IPA) is a leading cause of morbidity and mortality in haematological malignancy patients and hematopoietic stem cell transplant recipients1
. Detection of IPA represents a formidable diagnostic challenge and, in the absence of a 'gold standard', relies on a combination of clinical data and microbiology and histopathology where feasible. Diagnosis of IPA must conform to the European Organization for Research and Treatment of Cancer and the National Institute of Allergy and Infectious Diseases Mycology Study Group (EORTC/MSG) consensus defining "proven", "probable", and "possible" invasive fungal diseases2
. Currently, no nucleic acid-based tests have been externally validated for IPA detection and so polymerase chain reaction (PCR) is not included in current EORTC/MSG diagnostic criteria.
Identification of Aspergillus
in histological sections is problematic because of similarities in hyphal morphologies with other invasive fungal pathogens3
, and proven identification requires isolation of the etiologic agent in pure culture. Culture-based approaches rely on the availability of biopsy samples, but these are not always accessible in sick patients, and do not always yield viable propagules for culture when obtained.
An important feature in the pathogenesis of Aspergillus
is angio-invasion, a trait that provides opportunities to track the fungus immunologically using tests that detect characteristic antigenic signatures molecules in serum and bronchoalveolar lavage (BAL) fluids. This has led to the development of the Platelia enzyme immunoassay (GM-EIA) that detects Aspergillus
galactomannan and a 'pan-fungal' assay (Fungitell test) that detects the conserved fungal cell wall component (1 →3)-β-D-glucan, but not in the mucorales that lack this component in their cell walls1,4
. Issues surrounding the accuracy of these tests1,4-6
has led to the recent development of next-generation monoclonal antibody (MAb)-based assays that detect surrogate markers of infection1,5
recently described the generation of an Aspergillus
-specific MAb (JF5) using hybridoma technology and its use to develop an immuno-chromatographic lateral-flow device (LFD) for the point-of-care (POC) diagnosis of IPA. A major advantage of the LFD is its ability to detect activity since MAb JF5 binds to an extracellular glycoprotein antigen that is secreted during active growth of the fungus only5
. This is an important consideration when using fluids such as lung BAL for diagnosing IPA since Aspergillus
spores are a common component of inhaled air. The utility of the device in diagnosing IPA has been demonstrated using an animal model of infection, where the LFD displayed improved sensitivity and specificity compared to the Platelia GM and Fungitell (1 → 3)-β-D-glucan assays7
Here, we present a simple LFD procedure to detect Aspergillus
antigen in human serum and BAL fluids. Its speed and accuracy provides a novel adjunct point-of-care test for diagnosis of IPA in haematological malignancy patients.
Immunology, Issue 61, Invasive pulmonary aspergillosis, acute myeloid leukemia, bone marrow transplant, diagnosis, monoclonal antibody, lateral-flow technology
Cortical Source Analysis of High-Density EEG Recordings in Children
Institutions: UCL Institute of Child Health, University College London.
EEG is traditionally described as a neuroimaging technique with high temporal and low spatial resolution. Recent advances in biophysical modelling and signal processing make it possible to exploit information from other imaging modalities like structural MRI that provide high spatial resolution to overcome this constraint1
. This is especially useful for investigations that require high resolution in the temporal as well as spatial domain. In addition, due to the easy application and low cost of EEG recordings, EEG is often the method of choice when working with populations, such as young children, that do not tolerate functional MRI scans well. However, in order to investigate which neural substrates are involved, anatomical information from structural MRI is still needed. Most EEG analysis packages work with standard head models that are based on adult anatomy. The accuracy of these models when used for children is limited2
, because the composition and spatial configuration of head tissues changes dramatically over development3
In the present paper, we provide an overview of our recent work in utilizing head models based on individual structural MRI scans or age specific head models to reconstruct the cortical generators of high density EEG. This article describes how EEG recordings are acquired, processed, and analyzed with pediatric populations at the London Baby Lab, including laboratory setup, task design, EEG preprocessing, MRI processing, and EEG channel level and source analysis.
Behavior, Issue 88, EEG, electroencephalogram, development, source analysis, pediatric, minimum-norm estimation, cognitive neuroscience, event-related potentials
Automated, Quantitative Cognitive/Behavioral Screening of Mice: For Genetics, Pharmacology, Animal Cognition and Undergraduate Instruction
Institutions: Rutgers University, Koç University, New York University, Fairfield University.
We describe a high-throughput, high-volume, fully automated, live-in 24/7 behavioral testing system for assessing the effects of genetic and pharmacological manipulations on basic mechanisms of cognition and learning in mice. A standard polypropylene mouse housing tub is connected through an acrylic tube to a standard commercial mouse test box. The test box has 3 hoppers, 2 of which are connected to pellet feeders. All are internally illuminable with an LED and monitored for head entries by infrared (IR) beams. Mice live in the environment, which eliminates handling during screening. They obtain their food during two or more daily feeding periods by performing in operant (instrumental) and Pavlovian (classical) protocols, for which we have written protocol-control software and quasi-real-time data analysis and graphing software. The data analysis and graphing routines are written in a MATLAB-based language created to simplify greatly the analysis of large time-stamped behavioral and physiological event records and to preserve a full data trail from raw data through all intermediate analyses to the published graphs and statistics within a single data structure. The data-analysis code harvests the data several times a day and subjects it to statistical and graphical analyses, which are automatically stored in the "cloud" and on in-lab computers. Thus, the progress of individual mice is visualized and quantified daily. The data-analysis code talks to the protocol-control code, permitting the automated advance from protocol to protocol of individual subjects. The behavioral protocols implemented are matching, autoshaping, timed hopper-switching, risk assessment in timed hopper-switching, impulsivity measurement, and the circadian anticipation of food availability. Open-source protocol-control and data-analysis code makes the addition of new protocols simple. Eight test environments fit in a 48 in x 24 in x 78 in cabinet; two such cabinets (16 environments) may be controlled by one computer.
Behavior, Issue 84, genetics, cognitive mechanisms, behavioral screening, learning, memory, timing
Detection of Architectural Distortion in Prior Mammograms via Analysis of Oriented Patterns
Institutions: University of Calgary , University of Calgary .
We demonstrate methods for the detection of architectural distortion in prior mammograms of interval-cancer cases based on analysis of the orientation of breast tissue patterns in mammograms. We hypothesize that architectural distortion modifies the normal orientation of breast tissue patterns in mammographic images before the formation of masses or tumors. In the initial steps of our methods, the oriented structures in a given mammogram are analyzed using Gabor filters and phase portraits to detect node-like sites of radiating or intersecting tissue patterns. Each detected site is then characterized using the node value, fractal dimension, and a measure of angular dispersion specifically designed to represent spiculating patterns associated with architectural distortion.
Our methods were tested with a database of 106 prior mammograms of 56 interval-cancer cases and 52 mammograms of 13 normal cases using the features developed for the characterization of architectural distortion, pattern classification via
quadratic discriminant analysis, and validation with the leave-one-patient out procedure. According to the results of free-response receiver operating characteristic analysis, our methods have demonstrated the capability to detect architectural distortion in prior mammograms, taken 15 months (on the average) before clinical diagnosis of breast cancer, with a sensitivity of 80% at about five false positives per patient.
Medicine, Issue 78, Anatomy, Physiology, Cancer Biology, angular spread, architectural distortion, breast cancer, Computer-Assisted Diagnosis, computer-aided diagnosis (CAD), entropy, fractional Brownian motion, fractal dimension, Gabor filters, Image Processing, Medical Informatics, node map, oriented texture, Pattern Recognition, phase portraits, prior mammograms, spectral analysis
The Generation of Higher-order Laguerre-Gauss Optical Beams for High-precision Interferometry
Institutions: University of Birmingham.
Thermal noise in high-reflectivity mirrors is a major impediment for several types of high-precision interferometric experiments that aim to reach the standard quantum limit or to cool mechanical systems to their quantum ground state. This is for example the case of future gravitational wave observatories, whose sensitivity to gravitational wave signals is expected to be limited in the most sensitive frequency band, by atomic vibration of their mirror masses. One promising approach being pursued to overcome this limitation is to employ higher-order Laguerre-Gauss (LG) optical beams in place of the conventionally used fundamental mode. Owing to their more homogeneous light intensity distribution these beams average more effectively over the thermally driven fluctuations of the mirror surface, which in turn reduces the uncertainty in the mirror position sensed by the laser light.
We demonstrate a promising method to generate higher-order LG beams by shaping a fundamental Gaussian beam with the help of diffractive optical elements. We show that with conventional sensing and control techniques that are known for stabilizing fundamental laser beams, higher-order LG modes can be purified and stabilized just as well at a comparably high level. A set of diagnostic tools allows us to control and tailor the properties of generated LG beams. This enabled us to produce an LG beam with the highest purity reported to date. The demonstrated compatibility of higher-order LG modes with standard interferometry techniques and with the use of standard spherical optics makes them an ideal candidate for application in a future generation of high-precision interferometry.
Physics, Issue 78, Optics, Astronomy, Astrophysics, Gravitational waves, Laser interferometry, Metrology, Thermal noise, Laguerre-Gauss modes, interferometry
Analyzing and Building Nucleic Acid Structures with 3DNA
Institutions: Rutgers - The State University of New Jersey, Columbia University .
The 3DNA software package is a popular and versatile bioinformatics tool with capabilities to analyze, construct, and visualize three-dimensional nucleic acid structures. This article presents detailed protocols for a subset of new and popular features available in 3DNA, applicable to both individual structures and ensembles of related structures. Protocol 1 lists the set of instructions needed to download and install the software. This is followed, in Protocol 2, by the analysis of a nucleic acid structure, including the assignment of base pairs and the determination of rigid-body parameters that describe the structure and, in Protocol 3, by a description of the reconstruction of an atomic model of a structure from its rigid-body parameters. The most recent version of 3DNA, version 2.1, has new features for the analysis and manipulation of ensembles of structures, such as those deduced from nuclear magnetic resonance (NMR) measurements and molecular dynamic (MD) simulations; these features are presented in Protocols 4 and 5. In addition to the 3DNA stand-alone software package, the w3DNA web server, located at http://w3dna.rutgers.edu, provides a user-friendly interface to selected features of the software. Protocol 6 demonstrates a novel feature of the site for building models of long DNA molecules decorated with bound proteins at user-specified locations.
Genetics, Issue 74, Molecular Biology, Biochemistry, Bioengineering, Biophysics, Genomics, Chemical Biology, Quantitative Biology, conformational analysis, DNA, high-resolution structures, model building, molecular dynamics, nucleic acid structure, RNA, visualization, bioinformatics, three-dimensional, 3DNA, software
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1
. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1
. In this article, we utilize a web version of SCOPE2
to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4
and has been used in other studies5-8
The three algorithms that comprise SCOPE are BEAM9
, which finds non-degenerate motifs (ACCGGT), PRISM10
, which finds degenerate motifs (ASCGWT), and SPACER11
, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
Scope has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences. These can be entered in browser text fields, or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif