Genome sequencing projects have ciphered millions of protein sequence, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score which tells how accurate the predictions are without knowing the experimental data. To facilitate the special requests of end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any proteins as template, or to exclude any template proteins during the structure assembly simulations. The structural information could be collected by the users based on experimental evidences or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was evaluated as the best programs for protein structure and function predictions in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
24 Related JoVE Articles!
Fluorescence-microscopy Screening and Next-generation Sequencing: Useful Tools for the Identification of Genes Involved in Organelle Integrity
Institutions: Michigan State University.
This protocol describes a fluorescence microscope-based screening of Arabidopsis
seedlings and describes how to map recessive mutations that alter the subcellular distribution of a specific tagged fluorescent marker in the secretory pathway. Arabidopsis
is a powerful biological model for genetic studies because of its genome size, generation time, and conservation of molecular mechanisms among kingdoms. The array genotyping as an approach to map the mutation in alternative to the traditional method based on molecular markers is advantageous because it is relatively faster and may allow the mapping of several mutants in a really short time frame. This method allows the identification of proteins that can influence the integrity of any organelle in plants. Here, as an example, we propose a screen to map genes important for the integrity of the endoplasmic reticulum (ER). Our approach, however, can be easily extended to other plant cell organelles (for example see1,2
), and thus represents an important step toward understanding the molecular basis governing other subcellular structures.
Genetics, Issue 62, EMS mutagenesis, secretory pathway, mapping, confocal screening
Trajectory Data Analyses for Pedestrian Space-time Activity Study
Institutions: Kean University, University of Wisconsin-Madison.
It is well recognized that human movement in the spatial and temporal dimensions has direct influence on disease transmission1-3
. An infectious disease typically spreads via contact between infected and susceptible individuals in their overlapped activity spaces. Therefore, daily mobility-activity information can be used as an indicator to measure exposures to risk factors of infection. However, a major difficulty and thus the reason for paucity of studies of infectious disease transmission at the micro scale arise from the lack of detailed individual mobility data. Previously in transportation and tourism research detailed space-time activity data often relied on the time-space diary technique, which requires subjects to actively record their activities in time and space. This is highly demanding for the participants and collaboration from the participants greatly affects the quality of data4
Modern technologies such as GPS and mobile communications have made possible the automatic collection of trajectory data. The data collected, however, is not ideal for modeling human space-time activities, limited by the accuracies of existing devices. There is also no readily available tool for efficient processing of the data for human behavior study. We present here a suite of methods and an integrated ArcGIS desktop-based visual interface for the pre-processing and spatiotemporal analyses of trajectory data. We provide examples of how such processing may be used to model human space-time activities, especially with error-rich pedestrian trajectory data, that could be useful in public health studies such as infectious disease transmission modeling.
The procedure presented includes pre-processing, trajectory segmentation, activity space characterization, density estimation and visualization, and a few other exploratory analysis methods. Pre-processing is the cleaning of noisy raw trajectory data. We introduce an interactive visual pre-processing interface as well as an automatic module. Trajectory segmentation5
involves the identification of indoor and outdoor parts from pre-processed space-time tracks. Again, both interactive visual segmentation and automatic segmentation are supported. Segmented space-time tracks are then analyzed to derive characteristics of one's activity space such as activity radius etc.
Density estimation and visualization are used to examine large amount of trajectory data to model hot spots and interactions. We demonstrate both density surface mapping6
and density volume rendering7
. We also include a couple of other exploratory data analyses (EDA) and visualizations tools, such as Google Earth animation support and connection analysis. The suite of analytical as well as visual methods presented in this paper may be applied to any trajectory data for space-time activity studies.
Environmental Sciences, Issue 72, Computer Science, Behavior, Infectious Diseases, Geography, Cartography, Data Display, Disease Outbreaks, cartography, human behavior, Trajectory data, space-time activity, GPS, GIS, ArcGIS, spatiotemporal analysis, visualization, segmentation, density surface, density volume, exploratory data analysis, modelling
Using plusTipTracker Software to Measure Microtubule Dynamics in Xenopus laevis Growth Cones
Institutions: Boston College.
Microtubule (MT) plus-end-tracking proteins (+TIPs) localize to the growing plus-ends of MTs and regulate MT dynamics1,2
. One of the most well-known and widely-utilized +TIPs for analyzing MT dynamics is the End-Binding protein, EB1, which binds all growing MT plus-ends, and thus, is a marker for MT polymerization1
. Many studies of EB1 behavior within growth cones have used time-consuming and biased computer-assisted, hand-tracking methods to analyze individual MTs1-3
. Our approach is to quantify global parameters of MT dynamics using the software package, plusTipTracker4
, following the acquisition of high-resolution, live images of tagged EB1 in cultured embryonic growth cones5
. This software is a MATLAB-based, open-source, user-friendly package that combines automated detection, tracking, visualization, and analysis for movies of fluorescently-labeled +TIPs. Here, we present the protocol for using plusTipTracker for the analysis of fluorescently-labeled +TIP comets in cultured Xenopus laevis
growth cones. However, this software can also be used to characterize MT dynamics in various cell types6-8
Molecular Biology, Issue 91, plusTipTracker, microtubule plus-end-tracking proteins, EB1, growth cone, Xenopus laevis, live cell imaging analysis, microtubule dynamics
Creating Dynamic Images of Short-lived Dopamine Fluctuations with lp-ntPET: Dopamine Movies of Cigarette Smoking
Institutions: Yale University, Yale University, Yale University, Yale University, Massachusetts General Hospital, University of California, Irvine.
We describe experimental and statistical steps for creating dopamine movies of the brain from dynamic PET data. The movies represent minute-to-minute fluctuations of dopamine induced by smoking a cigarette. The smoker is imaged during a natural smoking experience while other possible confounding effects (such as head motion, expectation, novelty, or aversion to smoking repeatedly) are minimized.
We present the details of our unique analysis. Conventional methods for PET analysis estimate time-invariant kinetic model parameters which cannot capture short-term fluctuations in neurotransmitter release. Our analysis - yielding a dopamine movie - is based on our work with kinetic models and other decomposition techniques that allow for time-varying parameters 1-7
. This aspect of the analysis - temporal-variation - is key to our work. Because our model is also linear in parameters, it is practical, computationally, to apply at the voxel level. The analysis technique is comprised of five main steps: pre-processing, modeling, statistical comparison, masking and visualization. Preprocessing is applied to the PET data with a unique 'HYPR' spatial filter 8
that reduces spatial noise but preserves critical temporal information. Modeling identifies the time-varying function that best describes the dopamine effect on 11
C-raclopride uptake. The statistical step compares the fit of our (lp-ntPET) model 7
to a conventional model 9
. Masking restricts treatment to those voxels best described by the new model. Visualization maps the dopamine function at each voxel to a color scale and produces a dopamine movie. Interim results and sample dopamine movies of cigarette smoking are presented.
Behavior, Issue 78, Neuroscience, Neurobiology, Molecular Biology, Biomedical Engineering, Medicine, Anatomy, Physiology, Image Processing, Computer-Assisted, Receptors, Dopamine, Dopamine, Functional Neuroimaging, Binding, Competitive, mathematical modeling (systems analysis), Neurotransmission, transient, dopamine release, PET, modeling, linear, time-invariant, smoking, F-test, ventral-striatum, clinical techniques
A Comparative Approach to Characterize the Landscape of Host-Pathogen Protein-Protein Interactions
Institutions: Institut Pasteur , Université Sorbonne Paris Cité, Dana Farber Cancer Institute.
Significant efforts were gathered to generate large-scale comprehensive protein-protein interaction network maps. This is instrumental to understand the pathogen-host relationships and was essentially performed by genetic screenings in yeast two-hybrid systems. The recent improvement of protein-protein interaction detection by a Gaussia
luciferase-based fragment complementation assay now offers the opportunity to develop integrative comparative interactomic approaches necessary to rigorously compare interaction profiles of proteins from different pathogen strain variants against a common set of cellular factors.
This paper specifically focuses on the utility of combining two orthogonal methods to generate protein-protein interaction datasets: yeast two-hybrid (Y2H) and a new assay, high-throughput Gaussia princeps
protein complementation assay (HT-GPCA) performed in mammalian cells.
A large-scale identification of cellular partners of a pathogen protein is performed by mating-based yeast two-hybrid screenings of cDNA libraries using multiple pathogen strain variants. A subset of interacting partners selected on a high-confidence statistical scoring is further validated in mammalian cells for pair-wise interactions with the whole set of pathogen variants proteins using HT-GPCA. This combination of two complementary methods improves the robustness of the interaction dataset, and allows the performance of a stringent comparative interaction analysis. Such comparative interactomics constitute a reliable and powerful strategy to decipher any pathogen-host interplays.
Immunology, Issue 77, Genetics, Microbiology, Biochemistry, Molecular Biology, Cellular Biology, Biomedical Engineering, Infection, Cancer Biology, Virology, Medicine, Host-Pathogen Interactions, Host-Pathogen Interactions, Protein-protein interaction, High-throughput screening, Luminescence, Yeast two-hybrid, HT-GPCA, Network, protein, yeast, cell, culture
Patient-specific Modeling of the Heart: Estimation of Ventricular Fiber Orientations
Institutions: Johns Hopkins University.
Patient-specific simulations of heart (dys)function aimed at personalizing cardiac therapy are hampered by the absence of in vivo
imaging technology for clinically acquiring myocardial fiber orientations. The objective of this project was to develop a methodology to estimate cardiac fiber orientations from in vivo
images of patient heart geometries. An accurate representation of ventricular geometry and fiber orientations was reconstructed, respectively, from high-resolution ex vivo structural magnetic resonance (MR) and diffusion tensor (DT) MR images of a normal human heart, referred to as the atlas. Ventricular geometry of a patient heart was extracted, via
semiautomatic segmentation, from an in vivo
computed tomography (CT) image. Using image transformation algorithms, the atlas ventricular geometry was deformed to match that of the patient. Finally, the deformation field was applied to the atlas fiber orientations to obtain an estimate of patient fiber orientations. The accuracy of the fiber estimates was assessed using six normal and three failing canine hearts. The mean absolute difference between inclination angles of acquired and estimated fiber orientations was 15.4 °. Computational simulations of ventricular activation maps and pseudo-ECGs in sinus rhythm and ventricular tachycardia indicated that there are no significant differences between estimated and acquired fiber orientations at a clinically observable level.The new insights obtained from the project will pave the way for the development of patient-specific models of the heart that can aid physicians in personalized diagnosis and decisions regarding electrophysiological interventions.
Bioengineering, Issue 71, Biomedical Engineering, Medicine, Anatomy, Physiology, Cardiology, Myocytes, Cardiac, Image Processing, Computer-Assisted, Magnetic Resonance Imaging, MRI, Diffusion Magnetic Resonance Imaging, Cardiac Electrophysiology, computerized simulation (general), mathematical modeling (systems analysis), Cardiomyocyte, biomedical image processing, patient-specific modeling, Electrophysiology, simulation
Recording Human Electrocorticographic (ECoG) Signals for Neuroscientific Research and Real-time Functional Cortical Mapping
Institutions: New York State Department of Health, Albany Medical College, Albany Medical College, Washington University, Rensselaer Polytechnic Institute, State University of New York at Albany, University of Texas at El Paso .
Neuroimaging studies of human cognitive, sensory, and motor processes are usually based on noninvasive techniques such as electroencephalography (EEG), magnetoencephalography or functional magnetic-resonance imaging. These techniques have either inherently low temporal or low spatial resolution, and suffer from low signal-to-noise ratio and/or poor high-frequency sensitivity. Thus, they are suboptimal for exploring the short-lived spatio-temporal dynamics of many of the underlying brain processes. In contrast, the invasive technique of electrocorticography (ECoG) provides brain signals that have an exceptionally high signal-to-noise ratio, less susceptibility to artifacts than EEG, and a high spatial and temporal resolution (i.e., <1 cm/<1 millisecond, respectively). ECoG involves measurement of electrical brain signals using electrodes that are implanted subdurally on the surface of the brain. Recent studies have shown that ECoG amplitudes in certain frequency bands carry substantial information about task-related activity, such as motor execution and planning1
, auditory processing2
and visual-spatial attention3
. Most of this information is captured in the high gamma range (around 70-110 Hz). Thus, gamma activity has been proposed as a robust and general indicator of local cortical function1-5
. ECoG can also reveal functional connectivity and resolve finer task-related spatial-temporal dynamics, thereby advancing our understanding of large-scale cortical processes. It has especially proven useful for advancing brain-computer interfacing (BCI) technology for decoding a user's intentions to enhance or improve communication6
. Nevertheless, human ECoG data are often hard to obtain because of the risks and limitations of the invasive procedures involved, and the need to record within the constraints of clinical settings. Still, clinical monitoring to localize epileptic foci offers a unique and valuable opportunity to collect human ECoG data. We describe our methods for collecting recording ECoG, and demonstrate how to use these signals for important real-time applications such as clinical mapping and brain-computer interfacing. Our example uses the BCI2000 software platform8,9
and the SIGFRIED10
method, an application for real-time mapping of brain functions. This procedure yields information that clinicians can subsequently use to guide the complex and laborious process of functional mapping by electrical stimulation.
Prerequisites and Planning:
Patients with drug-resistant partial epilepsy may be candidates for resective surgery of an epileptic focus to minimize the frequency of seizures. Prior to resection, the patients undergo monitoring using subdural electrodes for two purposes: first, to localize the epileptic focus, and second, to identify nearby critical brain areas (i.e., eloquent cortex) where resection could result in long-term functional deficits. To implant electrodes, a craniotomy is performed to open the skull. Then, electrode grids and/or strips are placed on the cortex, usually beneath the dura. A typical grid has a set of 8 x 8 platinum-iridium electrodes of 4 mm diameter (2.3 mm exposed surface) embedded in silicon with an inter-electrode distance of 1cm. A strip typically contains 4 or 6 such electrodes in a single line. The locations for these grids/strips are planned by a team of neurologists and neurosurgeons, and are based on previous EEG monitoring, on a structural MRI of the patient's brain, and on relevant factors of the patient's history. Continuous recording over a period of 5-12 days serves to localize epileptic foci, and electrical stimulation via the implanted electrodes allows clinicians to map eloquent cortex. At the end of the monitoring period, explantation of the electrodes and therapeutic resection are performed together in one procedure.
In addition to its primary clinical purpose, invasive monitoring also provides a unique opportunity to acquire human ECoG data for neuroscientific research. The decision to include a prospective patient in the research is based on the planned location of their electrodes, on the patient's performance scores on neuropsychological assessments, and on their informed consent, which is predicated on their understanding that participation in research is optional and is not related to their treatment. As with all research involving human subjects, the research protocol must be approved by the hospital's institutional review board. The decision to perform individual experimental tasks is made day-by-day, and is contingent on the patient's endurance and willingness to participate. Some or all of the experiments may be prevented by problems with the clinical state of the patient, such as post-operative facial swelling, temporary aphasia, frequent seizures, post-ictal fatigue and confusion, and more general pain or discomfort.
At the Epilepsy Monitoring Unit at Albany Medical Center in Albany, New York, clinical monitoring is implemented around the clock using a 192-channel Nihon-Kohden Neurofax monitoring system. Research recordings are made in collaboration with the Wadsworth Center of the New York State Department of Health in Albany. Signals from the ECoG electrodes are fed simultaneously to the research and the clinical systems via splitter connectors. To ensure that the clinical and research systems do not interfere with each other, the two systems typically use separate grounds. In fact, an epidural strip of electrodes is sometimes implanted to provide a ground for the clinical system. Whether research or clinical recording system, the grounding electrode is chosen to be distant from the predicted epileptic focus and from cortical areas of interest for the research. Our research system consists of eight synchronized 16-channel g.USBamp amplifier/digitizer units (g.tec, Graz, Austria). These were chosen because they are safety-rated and FDA-approved for invasive recordings, they have a very low noise-floor in the high-frequency range in which the signals of interest are found, and they come with an SDK that allows them to be integrated with custom-written research software. In order to capture the high-gamma signal accurately, we acquire signals at 1200Hz sampling rate-considerably higher than that of the typical EEG experiment or that of many clinical monitoring systems. A built-in low-pass filter automatically prevents aliasing of signals higher than the digitizer can capture. The patient's eye gaze is tracked using a monitor with a built-in Tobii T-60 eye-tracking system (Tobii Tech., Stockholm, Sweden). Additional accessories such as joystick, bluetooth Wiimote (Nintendo Co.), data-glove (5th
Dimension Technologies), keyboard, microphone, headphones, or video camera are connected depending on the requirements of the particular experiment.
Data collection, stimulus presentation, synchronization with the different input/output accessories, and real-time analysis and visualization are accomplished using our BCI2000 software8,9
. BCI2000 is a freely available general-purpose software system for real-time biosignal data acquisition, processing and feedback. It includes an array of pre-built modules that can be flexibly configured for many different purposes, and that can be extended by researchers' own code in C++, MATLAB or Python. BCI2000 consists of four modules that communicate with each other via a network-capable protocol: a Source module that handles the acquisition of brain signals from one of 19 different hardware systems from different manufacturers; a Signal Processing module that extracts relevant ECoG features and translates them into output signals; an Application module that delivers stimuli and feedback to the subject; and the Operator module that provides a graphical interface to the investigator.
A number of different experiments may be conducted with any given patient. The priority of experiments will be determined by the location of the particular patient's electrodes. However, we usually begin our experimentation using the SIGFRIED (SIGnal modeling For Realtime Identification and Event Detection) mapping method, which detects and displays significant task-related activity in real time. The resulting functional map allows us to further tailor subsequent experimental protocols and may also prove as a useful starting point for traditional mapping by electrocortical stimulation (ECS).
Although ECS mapping remains the gold standard for predicting the clinical outcome of resection, the process of ECS mapping is time consuming and also has other problems, such as after-discharges or seizures. Thus, a passive functional mapping technique may prove valuable in providing an initial estimate of the locus of eloquent cortex, which may then be confirmed and refined by ECS. The results from our passive SIGFRIED mapping technique have been shown to exhibit substantial concurrence with the results derived using ECS mapping10
The protocol described in this paper establishes a general methodology for gathering human ECoG data, before proceeding to illustrate how experiments can be initiated using the BCI2000 software platform. Finally, as a specific example, we describe how to perform passive functional mapping using the BCI2000-based SIGFRIED system.
Neuroscience, Issue 64, electrocorticography, brain-computer interfacing, functional brain mapping, SIGFRIED, BCI2000, epilepsy monitoring, magnetic resonance imaging, MRI
RNA Secondary Structure Prediction Using High-throughput SHAPE
Institutions: Frederick National Laboratory for Cancer Research.
Understanding the function of RNA involved in biological processes requires a thorough knowledge of RNA structure. Toward this end, the methodology dubbed "high-throughput selective 2' hydroxyl acylation analyzed by primer extension", or SHAPE, allows prediction of RNA secondary structure with single nucleotide resolution. This approach utilizes chemical probing agents that preferentially acylate single stranded or flexible regions of RNA in aqueous solution. Sites of chemical modification are detected by reverse transcription of the modified RNA, and the products of this reaction are fractionated by automated capillary electrophoresis (CE). Since reverse transcriptase pauses at those RNA nucleotides modified by the SHAPE reagents, the resulting cDNA library indirectly maps those ribonucleotides that are single stranded in the context of the folded RNA. Using ShapeFinder software, the electropherograms produced by automated CE are processed and converted into nucleotide reactivity tables that are themselves converted into pseudo-energy constraints used in the RNAStructure (v5.3) prediction algorithm. The two-dimensional RNA structures obtained by combining SHAPE probing with in silico
RNA secondary structure prediction have been found to be far more accurate than structures obtained using either method alone.
Genetics, Issue 75, Molecular Biology, Biochemistry, Virology, Cancer Biology, Medicine, Genomics, Nucleic Acid Probes, RNA Probes, RNA, High-throughput SHAPE, Capillary electrophoresis, RNA structure, RNA probing, RNA folding, secondary structure, DNA, nucleic acids, electropherogram, synthesis, transcription, high throughput, sequencing
High-throughput Physical Mapping of Chromosomes using Automated in situ Hybridization
Institutions: Virginia Tech.
Projects to obtain whole-genome sequences for 10,000 vertebrate species1
and for 5,000 insect and related arthropod species2
are expected to take place over the next 5 years. For example, the sequencing of the genomes for 15 malaria mosquitospecies is currently being done using an Illumina platform3,4
. This Anopheles
species cluster includes both vectors and non-vectors of malaria. When the genome assemblies become available, researchers will have the unique opportunity to perform comparative analysis for inferring evolutionary changes relevant to vector ability. However, it has proven difficult to use next-generation sequencing reads to generate high-quality de novo
. Moreover, the existing genome assemblies for Anopheles gambiae
, although obtained using the Sanger method, are gapped or fragmented4,6
Success of comparative genomic analyses will be limited if researchers deal with numerous sequencing contigs, rather than with chromosome-based genome assemblies. Fragmented, unmapped sequences create problems for genomic analyses because: (i) unidentified gaps cause incorrect or incomplete annotation of genomic sequences; (ii) unmapped sequences lead to confusion between paralogous genes and genes from different haplotypes; and (iii) the lack of chromosome assignment and orientation of the sequencing contigs does not allow for reconstructing rearrangement phylogeny and studying chromosome evolution. Developing high-resolution physical maps for species with newly sequenced genomes is a timely and cost-effective investment that will facilitate genome annotation, evolutionary analysis, and re-sequencing of individual genomes from natural populations7,8
Here, we present innovative approaches to chromosome preparation, fluorescent in situ
hybridization (FISH), and imaging that facilitate rapid development of physical maps. Using An. gambiae
as an example, we demonstrate that the development of physical chromosome maps can potentially improve genome assemblies and, thus, the quality of genomic analyses. First, we use a high-pressure method to prepare polytene chromosome spreads. This method, originally developed for Drosophila9
, allows the user to visualize more details on chromosomes than the regular squashing technique10
. Second, a fully automated, front-end system for FISH is used for high-throughput physical genome mapping. The automated slide staining system runs multiple assays simultaneously and dramatically reduces hands-on time11
. Third, an automatic fluorescent imaging system, which includes a motorized slide stage, automatically scans and photographs labeled chromosomes after FISH12
. This system is especially useful for identifying and visualizing multiple chromosomal plates on the same slide. In addition, the scanning process captures a more uniform FISH result. Overall, the automated high-throughput physical mapping protocol is more efficient than a standard manual protocol.
Genetics, Issue 64, Entomology, Molecular Biology, Genomics, automation, chromosome, genome, hybridization, labeling, mapping, mosquito
The ITS2 Database
Institutions: University of Würzburg, University of Würzburg.
The internal transcribed spacer 2 (ITS2) has been used as a phylogenetic marker for more than two decades. As ITS2 research mainly focused on the very variable ITS2 sequence, it confined this marker to low-level phylogenetics only. However, the combination of the ITS2 sequence and its highly conserved secondary structure improves the phylogenetic resolution1
and allows phylogenetic inference at multiple taxonomic ranks, including species delimitation2-8
The ITS2 Database9
presents an exhaustive dataset of internal transcribed spacer 2 sequences from NCBI GenBank11
. Following an annotation by profile Hidden Markov Models (HMMs), the secondary structure of each sequence is predicted. First, it is tested whether a minimum energy based fold12
(direct fold) results in a correct, four helix conformation. If this is not the case, the structure is predicted by homology modeling13
. In homology modeling, an already known secondary structure is transferred to another ITS2 sequence, whose secondary structure was not able to fold correctly in a direct fold.
The ITS2 Database is not only a database for storage and retrieval of ITS2 sequence-structures. It also provides several tools to process your own ITS2 sequences, including annotation, structural prediction, motif detection and BLAST14
search on the combined sequence-structure information. Moreover, it integrates trimmed versions of 4SALE15,16
for multiple sequence-structure alignment calculation and Neighbor Joining18
tree reconstruction. Together they form a coherent analysis pipeline from an initial set of sequences to a phylogeny based on sequence and secondary structure.
In a nutshell, this workbench simplifies first phylogenetic analyses to only a few mouse-clicks, while additionally providing tools and data for comprehensive large-scale analyses.
Genetics, Issue 61, alignment, internal transcribed spacer 2, molecular systematics, secondary structure, ribosomal RNA, phylogenetic tree, homology modeling, phylogeny
Genomic MRI - a Public Resource for Studying Sequence Patterns within Genomic DNA
Institutions: University of Toledo Health Science Campus.
Non-coding genomic regions in complex eukaryotes, including intergenic areas, introns, and untranslated segments of exons, are profoundly non-random in their nucleotide composition and consist of a complex mosaic of sequence patterns. These patterns include so-called Mid-Range Inhomogeneity (MRI) regions -- sequences 30-10000 nucleotides in length that are enriched by a particular base or combination of bases (e.g. (G+T)-rich, purine-rich, etc.). MRI regions are associated with unusual (non-B-form) DNA structures that are often involved in regulation of gene expression, recombination, and other genetic processes (Fedorova & Fedorov 2010). The existence of a strong fixation bias within MRI regions against mutations that tend to reduce their sequence inhomogeneity additionally supports the functionality and importance of these genomic sequences (Prakash et al.
Here we demonstrate a freely available Internet resource -- the Genomic MRI
program package -- designed for computational analysis of genomic sequences in order to find and characterize various MRI patterns within them (Bechtel et al.
2008). This package also allows generation of randomized sequences with various properties and level of correspondence to the natural input DNA sequences. The main goal of this resource is to facilitate examination of vast regions of non-coding DNA that are still scarcely investigated and await thorough exploration and recognition.
Genetics, Issue 51, bioinformatics, computational biology, genomics, non-randomness, signals, gene regulation, DNA conformation
Dual-mode Imaging of Cutaneous Tissue Oxygenation and Vascular Function
Institutions: The Ohio State University, The Ohio State University, The Ohio State University, The Ohio State University.
Accurate assessment of cutaneous tissue oxygenation and vascular function is important for appropriate detection, staging, and treatment of many health disorders such as chronic wounds. We report the development of a dual-mode imaging system for non-invasive and non-contact imaging of cutaneous tissue oxygenation and vascular function. The imaging system integrated an infrared camera, a CCD camera, a liquid crystal tunable filter and a high intensity fiber light source. A Labview interface was programmed for equipment control, synchronization, image acquisition, processing, and visualization. Multispectral images captured by the CCD camera were used to reconstruct the tissue oxygenation map. Dynamic thermographic images captured by the infrared camera were used to reconstruct the vascular function map. Cutaneous tissue oxygenation and vascular function images were co-registered through fiduciary markers. The performance characteristics of the dual-mode image system were tested in humans.
Medicine, Issue 46, Dual-mode, multispectral imaging, infrared imaging, cutaneous tissue oxygenation, vascular function, co-registration, wound healing
A Practical Guide to Phylogenetics for Nonexperts
Institutions: The George Washington University.
Many researchers, across incredibly diverse foci, are applying phylogenetics to their research question(s). However, many researchers are new to this topic and so it presents inherent problems. Here we compile a practical introduction to phylogenetics for nonexperts. We outline in a step-by-step manner, a pipeline for generating reliable phylogenies from gene sequence datasets. We begin with a user-guide for similarity search tools via online interfaces as well as local executables. Next, we explore programs for generating multiple sequence alignments followed by protocols for using software to determine best-fit models of evolution. We then outline protocols for reconstructing phylogenetic relationships via maximum likelihood and Bayesian criteria and finally describe tools for visualizing phylogenetic trees. While this is not by any means an exhaustive description of phylogenetic approaches, it does provide the reader with practical starting information on key software applications commonly utilized by phylogeneticists. The vision for this article would be that it could serve as a practical training tool for researchers embarking on phylogenetic studies and also serve as an educational resource that could be incorporated into a classroom or teaching-lab.
Basic Protocol, Issue 84, phylogenetics, multiple sequence alignments, phylogenetic tree, BLAST executables, basic local alignment search tool, Bayesian models
Diffusion Tensor Magnetic Resonance Imaging in the Analysis of Neurodegenerative Diseases
Institutions: University of Ulm.
Diffusion tensor imaging (DTI) techniques provide information on the microstructural processes of the cerebral white matter (WM) in vivo
. The present applications are designed to investigate differences of WM involvement patterns in different brain diseases, especially neurodegenerative disorders, by use of different DTI analyses in comparison with matched controls.
DTI data analysis is performed in a variate fashion, i.e.
voxelwise comparison of regional diffusion direction-based metrics such as fractional anisotropy (FA), together with fiber tracking (FT) accompanied by tractwise fractional anisotropy statistics (TFAS) at the group level in order to identify differences in FA along WM structures, aiming at the definition of regional patterns of WM alterations at the group level. Transformation into a stereotaxic standard space is a prerequisite for group studies and requires thorough data processing to preserve directional inter-dependencies. The present applications show optimized technical approaches for this preservation of quantitative and directional information during spatial normalization in data analyses at the group level. On this basis, FT techniques can be applied to group averaged data in order to quantify metrics information as defined by FT. Additionally, application of DTI methods, i.e.
differences in FA-maps after stereotaxic alignment, in a longitudinal analysis at an individual subject basis reveal information about the progression of neurological disorders. Further quality improvement of DTI based results can be obtained during preprocessing by application of a controlled elimination of gradient directions with high noise levels.
In summary, DTI is used to define a distinct WM pathoanatomy of different brain diseases by the combination of whole brain-based and tract-based DTI analysis.
Medicine, Issue 77, Neuroscience, Neurobiology, Molecular Biology, Biomedical Engineering, Anatomy, Physiology, Neurodegenerative Diseases, nuclear magnetic resonance, NMR, MR, MRI, diffusion tensor imaging, fiber tracking, group level comparison, neurodegenerative diseases, brain, imaging, clinical techniques
Mapping Bacterial Functional Networks and Pathways in Escherichia Coli using Synthetic Genetic Arrays
Institutions: University of Toronto, University of Toronto, University of Regina.
Phenotypes are determined by a complex series of physical (e.g.
protein-protein) and functional (e.g.
gene-gene or genetic) interactions (GI)1
. While physical interactions can indicate which bacterial proteins are associated as complexes, they do not necessarily reveal pathway-level functional relationships1. GI screens, in which the growth of double mutants bearing two deleted or inactivated genes is measured and compared to the corresponding single mutants, can illuminate epistatic dependencies between loci and hence provide a means to query and discover novel functional relationships2
. Large-scale GI maps have been reported for eukaryotic organisms like yeast3-7
, but GI information remains sparse for prokaryotes8
, which hinders the functional annotation of bacterial genomes. To this end, we and others have developed high-throughput quantitative bacterial GI screening methods9, 10
Here, we present the key steps required to perform quantitative E. coli
Synthetic Genetic Array (eSGA) screening procedure on a genome-scale9
, using natural bacterial conjugation and homologous recombination to systemically generate and measure the fitness of large numbers of double mutants in a colony array format.
Briefly, a robot is used to transfer, through conjugation, chloramphenicol (Cm) - marked mutant alleles from engineered Hfr (High frequency of recombination) 'donor strains' into an ordered array of kanamycin (Kan) - marked F- recipient strains. Typically, we use loss-of-function single mutants bearing non-essential gene deletions (e.g.
the 'Keio' collection11
) and essential gene hypomorphic mutations (i.e.
alleles conferring reduced protein expression, stability, or activity9, 12, 13
) to query the functional associations of non-essential and essential genes, respectively. After conjugation and ensuing genetic exchange mediated by homologous recombination, the resulting double mutants are selected on solid medium containing both antibiotics. After outgrowth, the plates are digitally imaged and colony sizes are quantitatively scored using an in-house automated image processing system14
. GIs are revealed when the growth rate of a double mutant is either significantly better or worse than expected9
. Aggravating (or negative) GIs often result between loss-of-function mutations in pairs of genes from compensatory pathways that impinge on the same essential process2
. Here, the loss of a single gene is buffered, such that either single mutant is viable. However, the loss of both pathways is deleterious and results in synthetic lethality or sickness (i.e.
slow growth). Conversely, alleviating (or positive) interactions can occur between genes in the same pathway or protein complex2
as the deletion of either gene alone is often sufficient to perturb the normal function of the pathway or complex such that additional perturbations do not reduce activity, and hence growth, further. Overall, systematically identifying and analyzing GI networks can provide unbiased, global maps of the functional relationships between large numbers of genes, from which pathway-level information missed by other approaches can be inferred9
Genetics, Issue 69, Molecular Biology, Medicine, Biochemistry, Microbiology, Aggravating, alleviating, conjugation, double mutant, Escherichia coli, genetic interaction, Gram-negative bacteria, homologous recombination, network, synthetic lethality or sickness, suppression
Cortical Source Analysis of High-Density EEG Recordings in Children
Institutions: UCL Institute of Child Health, University College London.
EEG is traditionally described as a neuroimaging technique with high temporal and low spatial resolution. Recent advances in biophysical modelling and signal processing make it possible to exploit information from other imaging modalities like structural MRI that provide high spatial resolution to overcome this constraint1
. This is especially useful for investigations that require high resolution in the temporal as well as spatial domain. In addition, due to the easy application and low cost of EEG recordings, EEG is often the method of choice when working with populations, such as young children, that do not tolerate functional MRI scans well. However, in order to investigate which neural substrates are involved, anatomical information from structural MRI is still needed. Most EEG analysis packages work with standard head models that are based on adult anatomy. The accuracy of these models when used for children is limited2
, because the composition and spatial configuration of head tissues changes dramatically over development3
In the present paper, we provide an overview of our recent work in utilizing head models based on individual structural MRI scans or age specific head models to reconstruct the cortical generators of high density EEG. This article describes how EEG recordings are acquired, processed, and analyzed with pediatric populations at the London Baby Lab, including laboratory setup, task design, EEG preprocessing, MRI processing, and EEG channel level and source analysis.
Behavior, Issue 88, EEG, electroencephalogram, development, source analysis, pediatric, minimum-norm estimation, cognitive neuroscience, event-related potentials
From Voxels to Knowledge: A Practical Guide to the Segmentation of Complex Electron Microscopy 3D-Data
Institutions: Lawrence Berkeley National Laboratory, Lawrence Berkeley National Laboratory, Lawrence Berkeley National Laboratory.
Modern 3D electron microscopy approaches have recently allowed unprecedented insight into the 3D ultrastructural organization of cells and tissues, enabling the visualization of large macromolecular machines, such as adhesion complexes, as well as higher-order structures, such as the cytoskeleton and cellular organelles in their respective cell and tissue context. Given the inherent complexity of cellular volumes, it is essential to first extract the features of interest in order to allow visualization, quantification, and therefore comprehension of their 3D organization. Each data set is defined by distinct characteristics, e.g.
, signal-to-noise ratio, crispness (sharpness) of the data, heterogeneity of its features, crowdedness of features, presence or absence of characteristic shapes that allow for easy identification, and the percentage of the entire volume that a specific region of interest occupies. All these characteristics need to be considered when deciding on which approach to take for segmentation.
The six different 3D ultrastructural data sets presented were obtained by three different imaging approaches: resin embedded stained electron tomography, focused ion beam- and serial block face- scanning electron microscopy (FIB-SEM, SBF-SEM) of mildly stained and heavily stained samples, respectively. For these data sets, four different segmentation approaches have been applied: (1) fully manual model building followed solely by visualization of the model, (2) manual tracing segmentation of the data followed by surface rendering, (3) semi-automated approaches followed by surface rendering, or (4) automated custom-designed segmentation algorithms followed by surface rendering and quantitative analysis. Depending on the combination of data set characteristics, it was found that typically one of these four categorical approaches outperforms the others, but depending on the exact sequence of criteria, more than one approach may be successful. Based on these data, we propose a triage scheme that categorizes both objective data set characteristics and subjective personal criteria for the analysis of the different data sets.
Bioengineering, Issue 90, 3D electron microscopy, feature extraction, segmentation, image analysis, reconstruction, manual tracing, thresholding
Automated, Quantitative Cognitive/Behavioral Screening of Mice: For Genetics, Pharmacology, Animal Cognition and Undergraduate Instruction
Institutions: Rutgers University, Koç University, New York University, Fairfield University.
We describe a high-throughput, high-volume, fully automated, live-in 24/7 behavioral testing system for assessing the effects of genetic and pharmacological manipulations on basic mechanisms of cognition and learning in mice. A standard polypropylene mouse housing tub is connected through an acrylic tube to a standard commercial mouse test box. The test box has 3 hoppers, 2 of which are connected to pellet feeders. All are internally illuminable with an LED and monitored for head entries by infrared (IR) beams. Mice live in the environment, which eliminates handling during screening. They obtain their food during two or more daily feeding periods by performing in operant (instrumental) and Pavlovian (classical) protocols, for which we have written protocol-control software and quasi-real-time data analysis and graphing software. The data analysis and graphing routines are written in a MATLAB-based language created to simplify greatly the analysis of large time-stamped behavioral and physiological event records and to preserve a full data trail from raw data through all intermediate analyses to the published graphs and statistics within a single data structure. The data-analysis code harvests the data several times a day and subjects it to statistical and graphical analyses, which are automatically stored in the "cloud" and on in-lab computers. Thus, the progress of individual mice is visualized and quantified daily. The data-analysis code talks to the protocol-control code, permitting the automated advance from protocol to protocol of individual subjects. The behavioral protocols implemented are matching, autoshaping, timed hopper-switching, risk assessment in timed hopper-switching, impulsivity measurement, and the circadian anticipation of food availability. Open-source protocol-control and data-analysis code makes the addition of new protocols simple. Eight test environments fit in a 48 in x 24 in x 78 in cabinet; two such cabinets (16 environments) may be controlled by one computer.
Behavior, Issue 84, genetics, cognitive mechanisms, behavioral screening, learning, memory, timing
Protein WISDOM: A Workbench for In silico De novo Design of BioMolecules
Institutions: Princeton University.
The aim of de novo
protein design is to find the amino acid sequences that will fold into a desired 3-dimensional structure with improvements in specific properties, such as binding affinity, agonist or antagonist behavior, or stability, relative to the native sequence. Protein design lies at the center of current advances drug design and discovery. Not only does protein design provide predictions for potentially useful drug targets, but it also enhances our understanding of the protein folding process and protein-protein interactions. Experimental methods such as directed evolution have shown success in protein design. However, such methods are restricted by the limited sequence space that can be searched tractably. In contrast, computational design strategies allow for the screening of a much larger set of sequences covering a wide variety of properties and functionality. We have developed a range of computational de novo
protein design methods capable of tackling several important areas of protein design. These include the design of monomeric proteins for increased stability and complexes for increased binding affinity.
To disseminate these methods for broader use we present Protein WISDOM (https://www.proteinwisdom.org), a tool that provides automated methods for a variety of protein design problems. Structural templates are submitted to initialize the design process. The first stage of design is an optimization sequence selection stage that aims at improving stability through minimization of potential energy in the sequence space. Selected sequences are then run through a fold specificity stage and a binding affinity stage. A rank-ordered list of the sequences for each step of the process, along with relevant designed structures, provides the user with a comprehensive quantitative assessment of the design. Here we provide the details of each design method, as well as several notable experimental successes attained through the use of the methods.
Genetics, Issue 77, Molecular Biology, Bioengineering, Biochemistry, Biomedical Engineering, Chemical Engineering, Computational Biology, Genomics, Proteomics, Protein, Protein Binding, Computational Biology, Drug Design, optimization (mathematics), Amino Acids, Peptides, and Proteins, De novo protein and peptide design, Drug design, In silico sequence selection, Optimization, Fold specificity, Binding affinity, sequencing
Identification of Key Factors Regulating Self-renewal and Differentiation in EML Hematopoietic Precursor Cells by RNA-sequencing Analysis
Institutions: The University of Texas Graduate School of Biomedical Sciences at Houston.
Hematopoietic stem cells (HSCs) are used clinically for transplantation treatment to rebuild a patient's hematopoietic system in many diseases such as leukemia and lymphoma. Elucidating the mechanisms controlling HSCs self-renewal and differentiation is important for application of HSCs for research and clinical uses. However, it is not possible to obtain large quantity of HSCs due to their inability to proliferate in vitro
. To overcome this hurdle, we used a mouse bone marrow derived cell line, the EML (Erythroid, Myeloid, and Lymphocytic) cell line, as a model system for this study.
RNA-sequencing (RNA-Seq) has been increasingly used to replace microarray for gene expression studies. We report here a detailed method of using RNA-Seq technology to investigate the potential key factors in regulation of EML cell self-renewal and differentiation. The protocol provided in this paper is divided into three parts. The first part explains how to culture EML cells and separate Lin-CD34+ and Lin-CD34- cells. The second part of the protocol offers detailed procedures for total RNA preparation and the subsequent library construction for high-throughput sequencing. The last part describes the method for RNA-Seq data analysis and explains how to use the data to identify differentially expressed transcription factors between Lin-CD34+ and Lin-CD34- cells. The most significantly differentially expressed transcription factors were identified to be the potential key regulators controlling EML cell self-renewal and differentiation. In the discussion section of this paper, we highlight the key steps for successful performance of this experiment.
In summary, this paper offers a method of using RNA-Seq technology to identify potential regulators of self-renewal and differentiation in EML cells. The key factors identified are subjected to downstream functional analysis in vitro
and in vivo
Genetics, Issue 93, EML Cells, Self-renewal, Differentiation, Hematopoietic precursor cell, RNA-Sequencing, Data analysis
Analysis of Nephron Composition and Function in the Adult Zebrafish Kidney
Institutions: University of Notre Dame.
The zebrafish model has emerged as a relevant system to study kidney development, regeneration and disease. Both the embryonic and adult zebrafish kidneys are composed of functional units known as nephrons, which are highly conserved with other vertebrates, including mammals. Research in zebrafish has recently demonstrated that two distinctive phenomena transpire after adult nephrons incur damage: first, there is robust regeneration within existing nephrons that replaces the destroyed tubule epithelial cells; second, entirely new nephrons are produced from renal progenitors in a process known as neonephrogenesis. In contrast, humans and other mammals seem to have only a limited ability for nephron epithelial regeneration. To date, the mechanisms responsible for these kidney regeneration phenomena remain poorly understood. Since adult zebrafish kidneys undergo both nephron epithelial regeneration and neonephrogenesis, they provide an outstanding experimental paradigm to study these events. Further, there is a wide range of genetic and pharmacological tools available in the zebrafish model that can be used to delineate the cellular and molecular mechanisms that regulate renal regeneration. One essential aspect of such research is the evaluation of nephron structure and function. This protocol describes a set of labeling techniques that can be used to gauge renal composition and test nephron functionality in the adult zebrafish kidney. Thus, these methods are widely applicable to the future phenotypic characterization of adult zebrafish kidney injury paradigms, which include but are not limited to, nephrotoxicant exposure regimes or genetic methods of targeted cell death such as the nitroreductase mediated cell ablation technique. Further, these methods could be used to study genetic perturbations in adult kidney formation and could also be applied to assess renal status during chronic disease modeling.
Cellular Biology, Issue 90,
zebrafish; kidney; nephron; nephrology; renal; regeneration; proximal tubule; distal tubule; segment; mesonephros; physiology; acute kidney injury (AKI)
Spatial Multiobjective Optimization of Agricultural Conservation Practices using a SWAT Model and an Evolutionary Algorithm
Institutions: University of Washington, Iowa State University, North Carolina A&T University, Iowa Geological and Water Survey.
Finding the cost-efficient (i.e.
, lowest-cost) ways of targeting conservation practice investments for the achievement of specific water quality goals across the landscape is of primary importance in watershed management. Traditional economics methods of finding the lowest-cost solution in the watershed context (e.g.
) assume that off-site impacts can be accurately described as a proportion of on-site pollution generated. Such approaches are unlikely to be representative of the actual pollution process in a watershed, where the impacts of polluting sources are often determined by complex biophysical processes. The use of modern physically-based, spatially distributed hydrologic simulation models allows for a greater degree of realism in terms of process representation but requires a development of a simulation-optimization framework where the model becomes an integral part of optimization.
Evolutionary algorithms appear to be a particularly useful optimization tool, able to deal with the combinatorial nature of a watershed simulation-optimization problem and allowing the use of the full water quality model. Evolutionary algorithms treat a particular spatial allocation of conservation practices in a watershed as a candidate solution and utilize sets (populations) of candidate solutions iteratively applying stochastic operators of selection, recombination, and mutation to find improvements with respect to the optimization objectives. The optimization objectives in this case are to minimize nonpoint-source pollution in the watershed, simultaneously minimizing the cost of conservation practices. A recent and expanding set of research is attempting to use similar methods and integrates water quality models with broadly defined evolutionary optimization methods3,4,9,10,13-15,17-19,22,23,25
. In this application, we demonstrate a program which follows Rabotyagov et al.'s approach and integrates a modern and commonly used SWAT water quality model7
with a multiobjective evolutionary algorithm SPEA226
, and user-specified set of conservation practices and their costs to search for the complete tradeoff frontiers between costs of conservation practices and user-specified water quality objectives. The frontiers quantify the tradeoffs faced by the watershed managers by presenting the full range of costs associated with various water quality improvement goals. The program allows for a selection of watershed configurations achieving specified water quality improvement goals and a production of maps of optimized placement of conservation practices.
Environmental Sciences, Issue 70, Plant Biology, Civil Engineering, Forest Sciences, Water quality, multiobjective optimization, evolutionary algorithms, cost efficiency, agriculture, development
Analyzing and Building Nucleic Acid Structures with 3DNA
Institutions: Rutgers - The State University of New Jersey, Columbia University .
The 3DNA software package is a popular and versatile bioinformatics tool with capabilities to analyze, construct, and visualize three-dimensional nucleic acid structures. This article presents detailed protocols for a subset of new and popular features available in 3DNA, applicable to both individual structures and ensembles of related structures. Protocol 1 lists the set of instructions needed to download and install the software. This is followed, in Protocol 2, by the analysis of a nucleic acid structure, including the assignment of base pairs and the determination of rigid-body parameters that describe the structure and, in Protocol 3, by a description of the reconstruction of an atomic model of a structure from its rigid-body parameters. The most recent version of 3DNA, version 2.1, has new features for the analysis and manipulation of ensembles of structures, such as those deduced from nuclear magnetic resonance (NMR) measurements and molecular dynamic (MD) simulations; these features are presented in Protocols 4 and 5. In addition to the 3DNA stand-alone software package, the w3DNA web server, located at https://w3dna.rutgers.edu, provides a user-friendly interface to selected features of the software. Protocol 6 demonstrates a novel feature of the site for building models of long DNA molecules decorated with bound proteins at user-specified locations.
Genetics, Issue 74, Molecular Biology, Biochemistry, Bioengineering, Biophysics, Genomics, Chemical Biology, Quantitative Biology, conformational analysis, DNA, high-resolution structures, model building, molecular dynamics, nucleic acid structure, RNA, visualization, bioinformatics, three-dimensional, 3DNA, software
Using SCOPE to Identify Potential Regulatory Motifs in Coregulated Genes
Institutions: Dartmouth College.
SCOPE is an ensemble motif finder that uses three component algorithms in parallel to identify potential regulatory motifs by over-representation and motif position preference1
. Each component algorithm is optimized to find a different kind of motif. By taking the best of these three approaches, SCOPE performs better than any single algorithm, even in the presence of noisy data1
. In this article, we utilize a web version of SCOPE2
to examine genes that are involved in telomere maintenance. SCOPE has been incorporated into at least two other motif finding programs3,4
and has been used in other studies5-8
The three algorithms that comprise SCOPE are BEAM9
, which finds non-degenerate motifs (ACCGGT), PRISM10
, which finds degenerate motifs (ASCGWT), and SPACER11
, which finds longer bipartite motifs (ACCnnnnnnnnGGT). These three algorithms have been optimized to find their corresponding type of motif. Together, they allow SCOPE to perform extremely well.
Once a gene set has been analyzed and candidate motifs identified, SCOPE can look for other genes that contain the motif which, when added to the original set, will improve the motif score. This can occur through over-representation or motif position preference. Working with partial gene sets that have biologically verified transcription factor binding sites, SCOPE was able to identify most of the rest of the genes also regulated by the given transcription factor.
Output from SCOPE shows candidate motifs, their significance, and other information both as a table and as a graphical motif map. FAQs and video tutorials are available at the SCOPE web site which also includes a "Sample Search" button that allows the user to perform a trial run.
Scope has a very friendly user interface that enables novice users to access the algorithm's full power without having to become an expert in the bioinformatics of motif finding. As input, SCOPE can take a list of genes, or FASTA sequences. These can be entered in browser text fields, or read from a file. The output from SCOPE contains a list of all identified motifs with their scores, number of occurrences, fraction of genes containing the motif, and the algorithm used to identify the motif. For each motif, result details include a consensus representation of the motif, a sequence logo, a position weight matrix, and a list of instances for every motif occurrence (with exact positions and "strand" indicated). Results are returned in a browser window and also optionally by email. Previous papers describe the SCOPE algorithms in detail1,2,9-11
Genetics, Issue 51, gene regulation, computational biology, algorithm, promoter sequence motif