Find video protocols related to scientific articles indexed in PubMed.
HAMAP in 2015: updates to the protein family classification and annotation system.
Nucleic Acids Res.
PUBLISHED: 10-29-2014
HAMAP (High-quality Automated and Manual Annotation of Proteins, available at http://hamap.expasy.org/) is a system for the automatic classification and annotation of protein sequences. HAMAP provides annotation of the same quality and detail as UniProtKB/Swiss-Prot, using manually curated profiles for protein sequence family classification and expert curated rules for functional annotation of family members. HAMAP data and tools are made available through our website and as part of the UniRule pipeline of UniProt, providing annotation for millions of unreviewed sequences of UniProtKB/TrEMBL. Here we report on the growth of HAMAP and updates to the HAMAP system since our last report in the NAR Database Issue of 2013. We continue to augment HAMAP with new family profiles and annotation rules as new protein families are characterized and annotated in UniProtKB/Swiss-Prot; the latest version of HAMAP (as of 3 September 2014) contains 1983 family classification profiles and 1998 annotation rules (up from 1780 and 1720). We demonstrate how the complex logic of HAMAP rules allows for precise annotation of individual functional variants within large homologous protein families. We also describe improvements to our web-based tool HAMAP-Scan, which simplify the classification and annotation of sequences, and the incorporation of an improved sequence-profile search algorithm.
Updates in Rhea: a manually curated resource of biochemical reactions.
Nucleic Acids Res.
PUBLISHED: 10-22-2014
Rhea (http://www.ebi.ac.uk/rhea) is a comprehensive and non-redundant resource of expert-curated biochemical reactions described using species from the ChEBI (Chemical Entities of Biological Interest) ontology of small molecules. Rhea has been designed for the functional annotation of enzymes and the description of genome-scale metabolic networks, providing stoichiometrically balanced enzyme-catalyzed reactions (covering the IUBMB Enzyme Nomenclature list and additional reactions), transport reactions and spontaneously occurring reactions. Rhea reactions are extensively curated with links to source literature and are mapped to other publicly available enzyme and pathway databases such as Reactome, BioCyc, KEGG and UniPathway, through manual curation and computational methods. Here we describe developments in Rhea since our last report in the 2012 database issue of Nucleic Acids Research. These include significant growth in the number of Rhea reactions and the inclusion of reactions involving complex macromolecules such as proteins, nucleic acids and other polymers that lie outside the scope of ChEBI. Together these developments will significantly increase the utility of Rhea as a tool for the description, analysis and reconciliation of genome-scale metabolic models.
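The notion of a stoichiometrically balanced reaction can be illustrated with a minimal sketch. It is not Rhea's own tooling: the formula parser, the function names and the example reaction are assumptions made for illustration, and charges, parentheses and polymer notation are deliberately ignored.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6' (no parentheses, no charges)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def side_totals(side):
    """Sum atom counts over one side of a reaction, given (coefficient, formula) pairs."""
    totals = Counter()
    for coefficient, formula in side:
        for element, number in parse_formula(formula).items():
            totals[element] += coefficient * number
    return totals


def is_balanced(left, right):
    """A reaction is balanced when both sides carry the same atoms in the same numbers."""
    return side_totals(left) == side_totals(right)


# Illustrative reaction (not taken from Rhea): glucose + 6 O2 -> 6 CO2 + 6 H2O
left = [(1, "C6H12O6"), (6, "O2")]
right = [(6, "CO2"), (6, "H2O")]
print(is_balanced(left, right))
```

A curated resource would also need to balance charge and handle generic or polymeric participants, which this sketch leaves out.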
Differentially Phased Leaf Growth and Movements in Arabidopsis Depend on Coordinated Circadian and Light Regulation.
Plant Cell
PUBLISHED: 10-03-2014
In contrast to the extensively studied hypocotyl growth, little is known about diel regulation of leaf growth and its coordination with movements such as changes in leaf elevation angle (hyponasty). We developed a 3D live-leaf growth analysis system enabling simultaneous monitoring of growth and movements. Leaf growth is maximal several hours after dawn, requires light, and is regulated by daylength, suggesting coupling between growth and metabolism. We identify both blade and petiole positioning as important components of leaf movements in Arabidopsis thaliana and reveal a temporal delay between growth and movements. In hypocotyls, the combination of circadian expression of PHYTOCHROME INTERACTING FACTOR4 (PIF4) and PIF5 and their light-regulated protein stability drives rhythmic hypocotyl elongation with peak growth at dawn. We find that PIF4 and PIF5 are not essential to sustain rhythmic leaf growth but influence its amplitude. Furthermore, EARLY FLOWERING3, a member of the evening complex (EC), is required to maintain the correct phase between growth and movement. Our study shows that the mechanisms underlying rhythmic hypocotyl and leaf growth differ. Moreover, we reveal the temporal relationship between leaf elongation and movements and demonstrate the importance of the EC for the coordination of these phenotypic traits.
Transcriptional response to cardiac injury in the zebrafish: systematic identification of genes with highly concordant activity across in vivo models.
BMC Genomics
PUBLISHED: 09-15-2014
Zebrafish is a clinically-relevant model of heart regeneration. Unlike mammals, it has a remarkable heart repair capacity after injury, and promises novel translational applications. Amputation and cryoinjury models are key research tools for understanding injury response and regeneration in vivo. An understanding of the transcriptional responses following injury is needed to identify key players of heart tissue repair, as well as potential targets for boosting this property in humans.
Extensive remodeling of DC function by rapid maturation-induced transcriptional silencing.
Nucleic Acids Res.
PUBLISHED: 08-07-2014
The activation, or maturation, of dendritic cells (DCs) is crucial for the initiation of adaptive T-cell mediated immune responses. Research on the molecular mechanisms implicated in DC maturation has focused primarily on inducible gene-expression events promoting the acquisition of new functions, such as cytokine production and enhanced T-cell-stimulatory capacity. In contrast, mechanisms that modulate DC function by inducing widespread gene-silencing remain poorly understood. Yet the termination of key functions is known to be critical for the function of activated DCs. Genome-wide analysis of activation-induced histone deacetylation, combined with genome-wide quantification of activation-induced silencing of nascent transcription, led us to identify a novel inducible transcriptional-repression pathway that makes major contributions to the DC-maturation process. This silencing response is a rapid primary event distinct from repression mechanisms known to operate at later stages of DC maturation. The repressed genes function in pivotal processes--including antigen-presentation, extracellular signal detection, intracellular signal transduction and lipid-mediator biosynthesis--underscoring the central contribution of the silencing mechanism to rapid reshaping of DC function. Interestingly, promoters of the repressed genes exhibit a surprisingly high frequency of PU.1-occupied sites, suggesting a novel role for this lineage-specific transcription factor in marking genes poised for inducible repression.
Analysis of stop-gain and frameshift variants in human innate immunity genes.
PLoS Comput. Biol.
PUBLISHED: 07-01-2014
Loss-of-function variants in innate immunity genes are associated with Mendelian disorders in the form of primary immunodeficiencies. Recent resequencing projects report that stop-gains and frameshifts are collectively prevalent in humans and could be responsible for some of the inter-individual variability in innate immune response. Current computational approaches evaluating loss-of-function in genes carrying these variants rely on gene-level characteristics such as evolutionary conservation and functional redundancy across the genome. However, innate immunity genes represent a particular case because they are more likely to be under positive selection and duplicated. To create a ranking of severity that would be applicable to innate immunity genes we evaluated 17,764 stop-gain and 13,915 frameshift variants from the NHLBI Exome Sequencing Project and 1,000 Genomes Project. Sequence-based features such as loss of functional domains, isoform-specific truncation and nonsense-mediated decay were found to correlate with variant allele frequency and validated with gene expression data. We integrated these features in a Bayesian classification scheme and benchmarked its use in predicting pathogenic variants against Online Mendelian Inheritance in Man (OMIM) disease stop-gains and frameshifts. The classification scheme was applied in the assessment of 335 stop-gains and 236 frameshifts affecting 227 interferon-stimulated genes. The sequence-based score ranks variants in innate immunity genes according to their potential to cause disease, and complements existing gene-based pathogenicity scores. Specifically, the sequence-based score improves measurement of functional gene impairment, discriminates across different variants in a given gene and appears particularly useful for analysis of less conserved genes.
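How sequence-based features might be combined in a Bayesian classifier can be sketched as follows. The abstract does not specify the scheme's details, so this uses a plain naive Bayes model as a stand-in; the feature names, the likelihood values and the prior are all invented for illustration and carry no biological meaning.

```python
import math

# Hypothetical P(feature present | class); the numbers are invented for illustration.
LIKELIHOODS = {
    "domain_lost": {"damaging": 0.8, "benign": 0.3},
    "nmd_predicted": {"damaging": 0.7, "benign": 0.4},
    "all_isoforms_truncated": {"damaging": 0.9, "benign": 0.5},
}
PRIOR_DAMAGING = 0.5  # assumed prior, also hypothetical


def posterior_damaging(features):
    """Return P(damaging | features) under a naive Bayes independence assumption."""
    log_odds = math.log(PRIOR_DAMAGING / (1 - PRIOR_DAMAGING))
    for name, present in features.items():
        p_damaging = LIKELIHOODS[name]["damaging"]
        p_benign = LIKELIHOODS[name]["benign"]
        if not present:
            # Use the complementary probabilities when the feature is absent.
            p_damaging, p_benign = 1 - p_damaging, 1 - p_benign
        log_odds += math.log(p_damaging / p_benign)
    return 1 / (1 + math.exp(-log_odds))


variant = {"domain_lost": True, "nmd_predicted": True, "all_isoforms_truncated": False}
print(round(posterior_damaging(variant), 3))
```

Working in log-odds keeps the combination additive, so each feature's contribution to the final score can be inspected separately.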
The mzTab data exchange format: communicating mass-spectrometry-based proteomics and metabolomics experimental results to a wider audience.
Mol. Cell Proteomics
PUBLISHED: 06-30-2014
The HUPO Proteomics Standards Initiative has developed several standardized data formats to facilitate data sharing in mass spectrometry (MS)-based proteomics. These allow researchers to report their complete results in a unified way. However, at present, there is no format to describe the final qualitative and quantitative results for proteomics and metabolomics experiments in a simple tabular format. Many downstream analysis use cases are only concerned with the final results of an experiment and require an easily accessible format, compatible with tools such as Microsoft Excel or R. We developed the mzTab file format for MS-based proteomics and metabolomics results to meet this need. mzTab is intended as a lightweight supplement to the existing standard XML-based file formats (mzML, mzIdentML, mzQuantML), providing a comprehensive summary, similar in concept to the supplemental material of a scientific publication. mzTab files can contain protein, peptide, and small molecule identifications together with experimental metadata and basic quantitative information. The format is not intended to store the complete experimental evidence but provides mechanisms to report results at different levels of detail. These range from a simple summary of the final results to a representation of the results including the experimental design. This format is ideally suited to make MS-based proteomics and metabolomics results available to a wider biological community outside the field of MS. Several software tools for proteomics and metabolomics have already adopted the format as an output format. The comprehensive mzTab specification document and extensive additional documentation can be found online.
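Because mzTab is tab-separated text, it can be read with a few lines of code. The sketch below assumes the line-prefix convention of the mzTab specification (`MTD` for metadata, `PRH` for the protein header, `PRT` for protein rows). The sample content is invented and far from a complete, valid file, and `read_mztab` is a hypothetical helper rather than part of any mzTab library.

```python
# Invented, heavily simplified sample; a real mzTab file carries many more fields.
SAMPLE = "\n".join([
    "MTD\tmzTab-version\t1.0.0",
    "MTD\tdescription\tHypothetical example",
    "PRH\taccession\tdescription\tbest_search_engine_score[1]",
    "PRT\tP12345\tExample protein A\t0.95",
    "PRT\tQ67890\tExample protein B\t0.80",
])


def read_mztab(text):
    """Split mzTab-like text into a metadata dict and a list of protein rows."""
    metadata = {}
    proteins = []
    header = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        prefix, *fields = line.split("\t")
        if prefix == "MTD" and len(fields) >= 2:
            metadata[fields[0]] = fields[1]
        elif prefix == "PRH":
            header = fields  # column names for the protein section
        elif prefix == "PRT" and header is not None:
            proteins.append(dict(zip(header, fields)))
    return metadata, proteins


metadata, proteins = read_mztab(SAMPLE)
print(metadata["mzTab-version"], [p["accession"] for p in proteins])
```

The same prefix-dispatch pattern would extend to the peptide and small molecule sections, which this sketch ignores.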
Functional and Evolutionary Analysis of the CASPARIAN STRIP MEMBRANE DOMAIN PROTEIN Family.
Plant Physiol.
PUBLISHED: 06-11-2014
CASPARIAN STRIP MEMBRANE DOMAIN PROTEINS (CASPs) are four-membrane-span proteins that mediate the deposition of Casparian strips in the endodermis by recruiting the lignin polymerization machinery. CASPs show high stability in their membrane domain, which presents all the hallmarks of a membrane scaffold. Here, we characterized the large family of CASP-like (CASPL) proteins. CASPLs were found in all major divisions of land plants as well as in green algae; homologs outside of the plant kingdom were identified as members of the MARVEL protein family. When ectopically expressed in the endodermis, most CASPLs were able to integrate the CASP membrane domain, which suggests that CASPLs share with CASPs the propensity to form transmembrane scaffolds. Extracellular loops are not necessary for generating the scaffold, since CASP1 was still able to localize correctly when either one of the extracellular loops was deleted. The CASP first extracellular loop was found conserved in euphyllophytes but absent in plants lacking Casparian strips, an observation that may contribute to the study of Casparian strip and root evolution. In Arabidopsis (Arabidopsis thaliana), CASPL showed specific expression in a variety of cell types, such as trichomes, abscission zone cells, peripheral root cap cells, and xylem pole pericycle cells.
Fifteen years SIB Swiss Institute of Bioinformatics: life science databases, tools and support.
Nucleic Acids Res.
PUBLISHED: 05-03-2014
The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) was created in 1998 as an institution to foster excellence in bioinformatics. It is renowned worldwide for its databases and software tools, such as UniProtKB/Swiss-Prot, PROSITE, SWISS-MODEL, STRING, etc, that are all accessible on ExPASy.org, SIB's Bioinformatics Resource Portal. This article provides an overview of the scientific and training resources SIB has consistently been offering to the life science community for more than 15 years.
Genome-wide profiling of the cardiac transcriptome after myocardial infarction identifies novel heart-specific long non-coding RNAs.
Eur. Heart J.
PUBLISHED: 05-03-2014
Heart disease is recognized as a consequence of dysregulation of cardiac gene regulatory networks. Previously unappreciated components of such networks are the long non-coding RNAs (lncRNAs). Their roles in the heart remain to be elucidated. Thus, this study aimed to systematically characterize the cardiac long non-coding transcriptome post-myocardial infarction and to elucidate its potential roles in cardiac homoeostasis.
Type I interferons protect T cells against NK cell attack mediated by the activating receptor NCR1.
Immunity
PUBLISHED: 04-18-2014
Direct type I interferon (IFN) signaling on T cells is necessary for the proper expansion, differentiation, and survival of responding T cells following infection with viruses prominently inducing type I IFN. The reasons for the abortive response of T cells lacking the type I IFN receptor (Ifnar1(-/-)) remain unclear. We report here that Ifnar1(-/-) T cells were highly susceptible to natural killer (NK) cell-mediated killing in a perforin-dependent manner. Depletion of NK cells prior to lymphocytic choriomeningitis virus (LCMV) infection completely restored the early expansion of Ifnar1(-/-) T cells. Ifnar1(-/-) T cells had elevated expression of natural cytotoxicity triggering receptor 1 (NCR1) ligands upon infection, rendering them targets for NCR1 mediated NK cell attack. Thus, direct sensing of type I IFNs by T cells protects them from NK cell killing by regulating the expression of NCR1 ligands, thereby revealing a mechanism by which T cells can evade the potent cytotoxic activity of NK cells.
Integrative knowledge management to enhance pharmaceutical R&D.
Nat Rev Drug Discov
PUBLISHED: 04-02-2014
Information technologies already have a key role in pharmaceutical research and development (R&D), but achieving substantial advances in their use and effectiveness will depend on overcoming current challenges in sharing, integrating and jointly analysing the range of data generated at different stages of the R&D process.
Automated quantitative histology reveals vascular morphodynamics during Arabidopsis hypocotyl secondary growth.
Elife
PUBLISHED: 02-13-2014
Among various advantages, their small size makes model organisms preferred subjects of investigation. Yet, even in model systems detailed analysis of numerous developmental processes at cellular level is severely hampered by their scale. For instance, secondary growth of Arabidopsis hypocotyls creates a radial pattern of highly specialized tissues that comprises several thousand cells starting from a few dozen. This dynamic process is difficult to follow because of its scale and because it can only be investigated invasively, precluding comprehensive understanding of the cell proliferation, differentiation, and patterning events involved. To overcome such limitation, we established an automated quantitative histology approach. We acquired hypocotyl cross-sections from tiled high-resolution images and extracted their information content using custom high-throughput image processing and segmentation. Coupled with automated cell type recognition through machine learning, we could establish a cellular resolution atlas that reveals vascular morphodynamics during secondary growth, for example equidistant phloem pole formation. DOI: http://dx.doi.org/10.7554/eLife.01567.001.
TBC1D7 mutations are associated with intellectual disability, macrocrania, patellar dislocation, and celiac disease.
Hum. Mutat.
PUBLISHED: 02-03-2014
TBC1D7 forms a complex with TSC1 and TSC2 that inhibits mTORC1 signaling and limits cell growth. Mutations in TBC1D7 were reported in a family with intellectual disability (ID) and macrocrania. Using exome sequencing, we identified two sisters homozygous for the novel c.17_20delAGAG, p.R7TfsX21 TBC1D7 truncating mutation. In addition to the already described macrocephaly and mild ID, they share osteoarticular defects, patella dislocation, behavioral abnormalities, psychosis, learning difficulties, celiac disease, prognathism, myopia, and astigmatism. Consistent with a loss-of-function of TBC1D7, the patients' cell lines show an increase in the phosphorylation of 4EBP1, a direct downstream target of mTORC1, and a delay in the initiation of the autophagy process. This second family enlarges the phenotypic spectrum associated with TBC1D7 mutations and helps define a TBC1D7 syndrome. Our work reinforces the involvement of TBC1D7 in the regulation of mTORC1 pathways and suggests an altered control of autophagy as a possible cause of this disease.
Genetic variations and diseases in UniProtKB/Swiss-Prot: the ins and outs of expert manual curation.
Hum. Mutat.
PUBLISHED: 01-31-2014
During the last few years, next-generation sequencing (NGS) technologies have accelerated the detection of genetic variants resulting in the rapid discovery of new disease-associated genes. However, the wealth of variation data made available by NGS alone is not sufficient to understand the mechanisms underlying disease pathogenesis and manifestation. Multidisciplinary approaches combining sequence and clinical data with prior biological knowledge are needed to unravel the role of genetic variants in human health and disease. In this context, it is crucial that these data are linked, organized, and made readily available through reliable online resources. The Swiss-Prot section of the Universal Protein Knowledgebase (UniProtKB/Swiss-Prot) provides the scientific community with a collection of information on protein functions, interactions, biological pathways, as well as human genetic diseases and variants, all manually reviewed by experts. In this article, we present an overview of the information content of UniProtKB/Swiss-Prot to show how this knowledgebase can support researchers in the elucidation of the mechanisms leading from a molecular defect to a disease phenotype.
An integrated ontology resource to explore and study host-virus relationships.
PLoS ONE
PUBLISHED: 01-01-2014
Our growing knowledge of viruses reveals how these pathogens manage to evade innate host defenses. A global scheme emerges in which many viruses usurp key cellular defense mechanisms and often inhibit the same components of antiviral signaling. To accurately describe these processes, we have generated a comprehensive dictionary for eukaryotic host-virus interactions. This controlled vocabulary has been detailed in 57 ViralZone resource web pages which contain a global description of all molecular processes. In order to annotate viral gene products with this vocabulary, an ontology has been built in a hierarchy of UniProt Knowledgebase (UniProtKB) keyword terms, and corresponding Gene Ontology (GO) terms have been developed in parallel. The result is 65 UniProtKB keywords related to 57 GO terms, which have been used in 14,390 manual annotations and 908,723 automatic annotations, and propagated to an estimated 922,941 GO annotations. ViralZone pages, UniProtKB keywords and GO terms provide complementary tools to users, and the three resources have been linked to each other through host-virus vocabulary.
The EMPRES-i genetic module: a novel tool linking epidemiological outbreak information and genetic characteristics of influenza viruses.
Database (Oxford)
PUBLISHED: 01-01-2014
Combining epidemiological information, genetic characterization and geomapping in the analysis of influenza can contribute to a better understanding and description of influenza epidemiology and ecology, including possible virus reassortment events. Furthermore, integration of information such as agroecological farming system characteristics can provide new knowledge on risk factors of influenza emergence and spread. Integrating viral characteristics into an animal disease information system is therefore expected to provide a unique tool to trace-and-track particular virus strains; generate clade distributions and spatiotemporal clusters; screen for distribution of viruses with specific molecular markers; identify potential risk factors; and analyze or map viral characteristics related to vaccines used for control and/or prevention. For this purpose, a genetic module was developed within EMPRES-i (FAO's global animal disease information system) linking epidemiological information from influenza events with virus characteristics and enabling combined analysis. An algorithm was developed to act as the interface between EMPRES-i disease event data and publicly available influenza virus sequences in OpenfluDB. This algorithm automatically computes potential links between outbreak events and sequences, which are subsequently manually validated by experts. Other virus characteristics, such as antiviral resistance, can then be associated with outbreak data. To visualize such characteristics on a geographic map, shape files with virus characteristics to overlay on other EMPRES-i map layers (e.g. animal densities) can be generated. The genetic module allows export of associated epidemiological and sequence data for further analysis. FAO has made this tool available for scientists and policy makers. Contributions are expected from users to improve and validate the number of linked influenza events and isolate information as well as the quality of information.
Possibilities to interconnect with other influenza sequence databases or to expand the genetic module to other viral diseases (e.g. foot and mouth disease) are being explored. Database OpenfluDB URL: http://openflu.vital-it.ch Database EMPRES-i URL: http://EMPRES-i.fao.org/.
SBML qualitative models: a model representation format and infrastructure to foster interactions between qualitative modelling formalisms and tools.
BMC Syst Biol
PUBLISHED: 08-15-2013
Qualitative frameworks, especially those based on the logical discrete formalism, are increasingly used to model regulatory and signalling networks. A major advantage of these frameworks is that they do not require precise quantitative data, and that they are well-suited for studies of large networks. While numerous groups have developed specific computational tools that provide original methods to analyse qualitative models, a standard format to exchange qualitative models has been missing.
Hard-wired heterogeneity in blood stem cells revealed using a dynamic regulatory network model.
Bioinformatics
PUBLISHED: 07-02-2013
Combinatorial interactions of transcription factors with cis-regulatory elements control the dynamic progression through successive cellular states and thus underpin all metazoan development. The construction of network models of cis-regulatory elements, therefore, has the potential to generate fundamental insights into cellular fate and differentiation. Haematopoiesis has long served as a model system to study mammalian differentiation, yet modelling based on experimentally informed cis-regulatory interactions has so far been restricted to pairs of interacting factors. Here, we have generated a Boolean network model based on detailed cis-regulatory functional data connecting 11 haematopoietic stem/progenitor cell (HSPC) regulator genes.
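A Boolean network of this kind can be sketched in a few lines. The three genes and their update rules below are hypothetical placeholders, not the 11 HSPC regulators or the rules derived in the study, and a synchronous update scheme is assumed purely for simplicity.

```python
# Hypothetical rules: A and B repress each other, C is activated by either.
RULES = {
    "A": lambda s: s["A"] and not s["B"],
    "B": lambda s: s["B"] and not s["A"],
    "C": lambda s: s["A"] or s["B"],
}


def step(state, rules):
    """Synchronous update: every gene's next value is computed from the current state."""
    return {gene: bool(rule(state)) for gene, rule in rules.items()}


def find_attractor(state, rules, max_steps=100):
    """Iterate until a state repeats; return the cycle of states (length 1 = steady state)."""
    seen = []
    for _ in range(max_steps):
        key = tuple(sorted(state.items()))
        if key in seen:
            return [dict(k) for k in seen[seen.index(key):]]
        seen.append(key)
        state = step(state, rules)
    return None  # no repeat found within max_steps


start = {"A": True, "B": False, "C": False}
print(find_attractor(start, RULES))
```

Starting from different initial states and comparing the attractors reached is one simple way to explore which stable expression patterns a given wiring permits.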
TIE-2 and VEGFR kinase activities drive immunosuppressive function of TIE-2-expressing monocytes in human breast tumors.
Clin. Cancer Res.
PUBLISHED: 05-06-2013
Tumor-associated TIE-2-expressing monocytes (TEM) are highly proangiogenic cells critical for tumor vascularization. We previously showed that, in human breast cancer, TIE-2 and VEGFR pathways control proangiogenic activity of TEMs. Here, we examine the contribution of these pathways to immunosuppressive activity of TEMs.
Density-based hierarchical clustering of pyro-sequences on a large scale--the case of fungal ITS1.
Bioinformatics
PUBLISHED: 03-28-2013
Analysis of millions of pyro-sequences is currently playing a crucial role in the advance of environmental microbiology. Taxonomy-independent, i.e. unsupervised, clustering of these sequences is essential for the definition of Operational Taxonomic Units. For this application, reproducibility and robustness should be the most sought-after qualities, but they have thus far largely been overlooked.
pfsearchV3: a code acceleration and heuristic to search PROSITE profiles.
Bioinformatics
PUBLISHED: 03-16-2013
The PROSITE resource provides a rich and well annotated source of signatures in the form of generalized profiles that allow protein domain detection and functional annotation. One of the major limiting factors in the application of PROSITE in genome and metagenome annotation pipelines is the time required to search protein sequence databases for putative matches. We describe an improved and optimized implementation of the PROSITE search tool pfsearch that, combined with a newly developed heuristic, addresses this limitation. On a modern x86_64 hyper-threaded quad-core desktop computer, the new pfsearchV3 is two orders of magnitude faster than the original algorithm.
Application of text-mining for updating protein post-translational modification annotation in UniProtKB.
BMC Bioinformatics
PUBLISHED: 03-08-2013
The annotation of protein post-translational modifications (PTMs) is an important task of UniProtKB curators and, with continuing improvements in experimental methodology, an ever greater number of articles are being published on this topic. To help curators cope with this growing body of information we have developed a system which extracts information from the scientific literature for the most frequently annotated PTMs in UniProtKB.
Evolution of the ferric reductase domain (FRD) superfamily: modularity, functional diversification, and signature motifs.
PLoS ONE
PUBLISHED: 01-30-2013
A heme-containing transmembrane ferric reductase domain (FRD) is found in bacterial and eukaryotic protein families, including ferric reductases (FRE) and NADPH oxidases (NOX). The aim of this study was to understand the phylogeny of the FRD superfamily. Bacteria contain FRD proteins consisting only of the ferric reductase domain, such as YedZ and short bFRE proteins. Full length FRE and NOX enzymes are mostly found in eukaryotic cells and all possess a dehydrogenase domain, allowing them to catalyze electron transfer from cytosolic NADPH to extracellular metal ions (FRE) or oxygen (NOX). Metazoa possess YedZ-related STEAP proteins, possibly derived from bacteria through horizontal gene transfer. Phylogenetic analyses suggest that FRE enzymes appeared early in evolution, followed by a transition towards EF-hand containing NOX enzymes (NOX5- and DUOX-like). An ancestral gene of the NOX(1-4) family probably lost the EF-hands and new regulatory mechanisms of increasing complexity evolved in this clade. Two signature motifs were identified: NOX enzymes are distinguished from FRE enzymes through a four amino acid motif spanning from transmembrane domain 3 (TM3) to TM4, and YedZ/STEAP proteins are identified by the replacement of the first canonical heme-spanning histidine by a highly conserved arginine. The FRD superfamily most likely originated in bacteria.
Database resources for the tuberculosis community.
Tuberculosis (Edinb)
PUBLISHED: 01-17-2013
Access to online repositories for genomic and associated "-omics" datasets is now an essential part of everyday research activity. It is therefore important that the tuberculosis community is aware of the databases and tools available to them online, and that the database hosts know the needs of the research community. One of the goals of the Tuberculosis Annotation Jamboree, held in Washington DC on March 7th-8th 2012, was therefore to provide an overview of the current status of three key tuberculosis resources: TubercuList (tuberculist.epfl.ch), TB Database (www.tbdb.org), and Pathosystems Resource Integration Center (PATRIC, www.patricbrc.org). Here we summarize some key updates and upcoming features in TubercuList, and provide an overview of the PATRIC site and its online tools for pathogen RNA-Seq analysis.
Plant species distributions along environmental gradients: do belowground interactions with fungi matter?
Front Plant Sci
PUBLISHED: 01-01-2013
The distribution of plants along environmental gradients is constrained by abiotic and biotic factors. Cumulative evidence attests to the impact of biotic factors on plant distributions, but only a few studies discuss the role of belowground communities. Soil fungi, in particular, are thought to play an important role in how plant species assemble locally into communities. We first review existing evidence, and then test the effect of the number of soil fungal operational taxonomic units (OTUs) on plant species distributions using a recently collected dataset of plant and metagenomic information on soil fungi in the Western Swiss Alps. Using species distribution models (SDMs), we investigated whether the distribution of individual plant species is correlated with the number of OTUs of two important soil fungal classes known to interact with plants: the Glomeromycetes, which are obligatory symbionts of plants, and the Agaricomycetes, which may be facultative plant symbionts, pathogens, or wood decayers. We show that including the fungal richness information in the models of plant species distributions improves predictive accuracy. The number of fungal OTUs is especially correlated with the distribution of high elevation plant species. We suggest that high elevation soils show greater variation in fungal assemblages that may in turn impact plant turnover among communities. We finally discuss how to move beyond correlative analyses, through the design of field experiments manipulating plant and fungal communities along environmental gradients.
Qualitative modeling identifies IL-11 as a novel regulator in maintaining self-renewal in human pluripotent stem cells.
Front Physiol
PUBLISHED: 01-01-2013
Pluripotency in human embryonic stem cells (hESCs) and induced pluripotent stem cells (iPSCs) is regulated by three transcription factors: OCT3/4, SOX2, and NANOG. To fully exploit the therapeutic potential of these cells it is essential to have a good mechanistic understanding of the maintenance of self-renewal and pluripotency. In this study, we demonstrate a powerful systems biology approach in which we first expand a literature-based network encompassing the core regulators of pluripotency by assessing the behavior of genes targeted by perturbation experiments. We focused our attention on highly regulated genes encoding cell surface and secreted proteins, as these can be more easily manipulated by the use of inhibitors or recombinant proteins. Qualitative modeling based on combining Boolean networks and in silico perturbation experiments was employed to identify novel pluripotency-regulating genes. We validated Interleukin-11 (IL-11) and demonstrate that this cytokine is a novel pluripotency-associated factor capable of supporting self-renewal in the absence of exogenously added bFGF in culture. To date, the various protocols for hESC maintenance require supplementation with bFGF to activate the Activin/Nodal branch of the TGF-β signaling pathway. Additional evidence supporting our findings is that IL-11 belongs to the same protein family as LIF, which is known to be necessary for maintaining pluripotency in mouse but not in human ESCs. These cytokines operate through the same gp130 receptor, which interacts with Janus kinases. Our finding might explain why mESCs are in a more naïve cell state compared to hESCs and how to convert primed hESCs back to the naïve state. Taken together, our integrative modeling approach has identified novel genes as putative candidates to be incorporated into the expansion of the current gene regulatory network responsible for inducing and maintaining pluripotency.
Related JoVE Video
A 2D/3D image analysis system to track fluorescently labeled structures in rod-shaped cells: application to measure spindle pole asymmetry during mitosis.
Cell Div
PUBLISHED: 01-01-2013
BACKGROUND: The yeast Schizosaccharomyces pombe is frequently used as a model for studying the cell cycle. The cells are rod-shaped and divide by medial fission. The process of cell division, or cytokinesis, is controlled by a network of signaling proteins called the Septation Initiation Network (SIN); SIN proteins associate with the spindle pole bodies (SPBs) during nuclear division (mitosis). Some SIN proteins associate with both SPBs early in mitosis, and then display strongly asymmetric signal intensity at the SPBs in late mitosis, just before cytokinesis. This asymmetry is thought to be important for correct regulation of SIN signaling, and coordination of cytokinesis and mitosis. In order to study the dynamics of organelles or large protein complexes such as the SPB, which have been labeled with a fluorescent protein tag in living cells, a number of image analysis problems must be solved: the cell outline must be detected automatically, and the position and signal intensity associated with the structures of interest within the cell must be determined. RESULTS: We present a new 2D and 3D image analysis system that permits versatile and robust analysis of motile, fluorescently labeled structures in rod-shaped cells. We have implemented it as a user-friendly software package allowing the fast and robust image analysis of large numbers of rod-shaped cells. We have developed new robust algorithms, which we combined with existing methodologies to facilitate fast and accurate analysis. Our software permits the detection and segmentation of rod-shaped cells in either static or dynamic (i.e. time lapse) multi-channel images. It enables tracking of two structures (for example SPBs) in two different image channels. For 2D or 3D static images, the locations of the structures are identified, and then intensity values are extracted together with several quantitative parameters, such as length, width, cell orientation, background fluorescence and the distance between the structures of interest. Furthermore, two kinds of kymographs of the tracked structures can be established, one representing the migration with respect to their relative position, the other representing their individual trajectories inside the cell. This software package, called "RodCellJ", allowed us to analyze a large number of S. pombe cells to understand the rules that govern SIN protein asymmetry. CONCLUSIONS: "RodCellJ" is freely available to the community as a package of several ImageJ plugins to simultaneously analyze the behavior of a large number of rod-shaped cells in an extensive manner. The integration of different image-processing techniques in a single package, together with the development of novel algorithms, not only speeds up the analysis compared with existing tools, but also yields higher accuracy. Its utility was demonstrated on both 2D and 3D static and dynamic images to study the septation initiation network of the yeast Schizosaccharomyces pombe. More generally, it can be used in any kind of biological context where fluorescent-protein labeled structures need to be analyzed in rod-shaped cells. AVAILABILITY: RodCellJ is freely available under http://bigwww.epfl.ch/algorithms.html (after acceptance of the publication).
Related JoVE Video
Rhea--a manually curated resource of biochemical reactions.
Nucleic Acids Res.
PUBLISHED: 12-01-2011
Rhea (http://www.ebi.ac.uk/rhea) is a comprehensive resource of expert-curated biochemical reactions. Rhea provides a non-redundant set of chemical transformations for use in a broad spectrum of applications, including metabolic network reconstruction and pathway inference. Rhea includes enzyme-catalyzed reactions (covering the IUBMB Enzyme Nomenclature list), transport reactions and spontaneously occurring reactions. Rhea reactions are described using chemical species from the Chemical Entities of Biological Interest ontology (ChEBI) and are stoichiometrically balanced for mass and charge. They are extensively manually curated with links to source literature and other public resources on metabolism including enzyme and pathway databases. This cross-referencing facilitates the mapping and reconciliation of common reactions and compounds between distinct resources, which is a common first step in the reconstruction of genome scale metabolic networks and models.
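As an illustration of what "stoichiometrically balanced for mass and charge" means in practice, the following minimal Python sketch checks a single reaction. It is not Rhea's own validation code: the function name `is_balanced`, the data layout, and the hand-typed formulas for the hydration of carbon dioxide are assumptions made for this example only.

```python
from collections import Counter

def is_balanced(left, right):
    """Check mass and charge balance of a reaction.

    Each side is a list of (coefficient, element_counts, charge) tuples.
    This layout is an assumption for illustration, not a Rhea or ChEBI format.
    """
    def totals(side):
        atoms, charge = Counter(), 0
        for coeff, elements, q in side:
            for element, n in elements.items():
                atoms[element] += coeff * n
            charge += coeff * q
        return atoms, charge

    return totals(left) == totals(right)

# CO2 + H2O -> HCO3(-) + H(+), typed by hand for this example.
co2 = (1, {"C": 1, "O": 2}, 0)
h2o = (1, {"H": 2, "O": 1}, 0)
hco3 = (1, {"H": 1, "C": 1, "O": 3}, -1)
proton = (1, {"H": 1}, +1)

balanced = is_balanced([co2, h2o], [hco3, proton])
unbalanced = is_balanced([co2, h2o], [hco3])  # proton omitted on purpose
```

Dropping the proton from the right-hand side breaks both the hydrogen count and the charge, which is the kind of inconsistency a balanced reaction set rules out.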
Related JoVE Video
The UniProt-GO Annotation database in 2011.
Nucleic Acids Res.
PUBLISHED: 11-28-2011
The GO annotation dataset provided by the UniProt Consortium (GOA: http://www.ebi.ac.uk/GOA) is a comprehensive set of evidence-based associations between terms from the Gene Ontology resource and UniProtKB proteins. Currently supplying over 100 million annotations to 11 million proteins in more than 360,000 taxa, this resource has increased 2-fold over the last 2 years. It has benefited from a wealth of checks to improve annotation correctness and consistency, and now supplies greater information content, enabled by GO Consortium annotation format developments. Detailed, manual GO annotations obtained from the curation of peer-reviewed papers are directly contributed by all UniProt curators and supplemented with manual and electronic annotations from 36 model organism and domain-focused scientific resources. The inclusion of high-quality, automatic annotation predictions ensures the UniProt GO annotation dataset supplies functional information to a wide range of proteins, including those from poorly characterized, non-model organism species. UniProt GO annotations are freely available in a range of formats accessible by both file downloads and web-based views. In addition, the introduction of a new, normalized file format in 2010 has made for easier handling of the complete UniProt-GOA data set.
Related JoVE Video
UniPathway: a resource for the exploration and annotation of metabolic pathways.
Nucleic Acids Res.
PUBLISHED: 11-18-2011
UniPathway (http://www.unipathway.org) is a fully manually curated resource for the representation and annotation of metabolic pathways. UniPathway provides explicit representations of enzyme-catalyzed and spontaneous chemical reactions, as well as a hierarchical representation of metabolic pathways. This hierarchy uses linear subpathways as the basic building block for the assembly of larger and more complex pathways, including species-specific pathway variants. All of the pathway data in UniPathway has been extensively cross-linked to existing pathway resources such as KEGG and MetaCyc, as well as sequence resources such as the UniProt KnowledgeBase (UniProtKB), for which UniPathway provides a controlled vocabulary for pathway annotation. We introduce here the basic concepts underlying the UniPathway resource, with the aim of allowing users to fully exploit the information provided by UniPathway.
Related JoVE Video
Visualization and quality assessment of de novo genome assemblies.
Bioinformatics
PUBLISHED: 10-12-2011
Recent technological progress has greatly facilitated de novo genome sequencing. However, de novo assemblies consist of many pieces of contiguous sequence (contigs) arranged in thousands of scaffolds instead of small numbers of chromosomes. Confirming and improving the quality of such assemblies is critical for subsequent analysis. We present a method to evaluate genome scaffolding by aligning independently obtained transcriptome sequences to the genome and visually summarizing the alignments using the Cytoscape software. Applying this method to the genome of the red fire ant Solenopsis invicta allowed us to identify inconsistencies in 7%, confirm contig order in 20% and extend 16% of scaffolds.
Related JoVE Video
Conceptual framework and pilot study to benchmark phylogenomic databases based on reference gene trees.
Brief. Bioinformatics
PUBLISHED: 07-07-2011
Phylogenomic databases provide orthology predictions for species with fully sequenced genomes. Although the goal seems well-defined, the content of these databases differs greatly. Seven ortholog databases (Ensembl Compara, eggNOG, HOGENOM, InParanoid, OMA, OrthoDB, Panther) were compared on the basis of reference trees. For three well-conserved protein families, we observed a generally high specificity of orthology assignments for these databases. We show that differences in the completeness of predicted gene relationships and in the phylogenetic information are, for the great majority, not due to the methods used, but to differences in the underlying database concepts. According to our metrics, none of the databases provides a fully correct and comprehensive protein classification. Our results provide a framework for meaningful and systematic comparisons of phylogenomic databases. In the future, a sustainable set of Gold standard phylogenetic trees could provide a robust method for phylogenomic databases to assess their current quality status, measure changes following new database releases and diagnose improvements subsequent to an upgrade of the analysis procedure.
Related JoVE Video
Comparison of strategies to detect epistasis from eQTL data.
PLoS ONE
PUBLISHED: 06-09-2011
Genome-wide association studies have been instrumental in identifying genetic variants associated with complex traits such as human disease or gene expression phenotypes. It has been proposed that extending existing analysis methods by considering interactions between pairs of loci may uncover additional genetic effects. However, the large number of possible two-marker tests presents significant computational and statistical challenges. Although several strategies to detect epistasis effects have been proposed and tested for specific phenotypes, so far there has been no systematic attempt to compare their performance using real data. We made use of thousands of gene expression traits from linkage and eQTL studies, to compare the performance of different strategies. We found that using information from marginal associations between markers and phenotypes to detect epistatic effects yielded a lower false discovery rate (FDR) than a strategy solely using biological annotation in yeast, whereas results from human data were inconclusive. For future studies whose aim is to discover epistatic effects, we recommend incorporating information about marginal associations between SNPs and phenotypes instead of relying solely on biological annotation. Improved methods to discover epistatic effects will result in a more complete understanding of complex genetic effects.
Related JoVE Video
Exome sequencing identifies recurrent somatic MAP2K1 and MAP2K2 mutations in melanoma.
Nat. Genet.
PUBLISHED: 05-17-2011
We performed exome sequencing to detect somatic mutations in protein-coding regions in seven melanoma cell lines and donor-matched germline cells. All melanoma samples had high numbers of somatic mutations, which showed the hallmark of UV-induced DNA repair. Such a hallmark was absent in tumor sample-specific mutations in two metastases derived from the same individual. Two melanomas with non-canonical BRAF mutations harbored gain-of-function MAP2K1 and MAP2K2 (MEK1 and MEK2, respectively) mutations, resulting in constitutive ERK phosphorylation and higher resistance to MEK inhibitors. Screening a larger cohort of individuals with melanoma revealed the presence of recurring somatic MAP2K1 and MAP2K2 mutations, which occurred at an overall frequency of 8%. Furthermore, missense and nonsense somatic mutations were frequently found in three candidate melanoma genes, FAT4, LRP1B and DSC1.
Related JoVE Video
T-Coffee: a web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension.
Nucleic Acids Res.
PUBLISHED: 05-09-2011
This article introduces a new interface for T-Coffee, a consistency-based multiple sequence alignment program. This interface provides easy and intuitive access to the most popular functions of the package. These include the default T-Coffee mode for protein and nucleic acid sequences, the M-Coffee mode that allows combining the output of any other aligners, and template-based modes of T-Coffee that deliver high accuracy alignments while using structural or homology derived templates. The three available template modes are Expresso for the alignment of proteins with a known 3D structure, R-Coffee to align RNA sequences with conserved secondary structures and PSI-Coffee to accurately align distantly related sequences using homology extension. The new server benefits from recent improvements of the T-Coffee algorithm, can align up to 150 sequences as long as 10,000 residues, and is available from both http://www.tcoffee.org and its main mirror http://tcoffee.crg.cat.
Related JoVE Video
A qualitative continuous model of cellular auxin and brassinosteroid signaling and their crosstalk.
Bioinformatics
PUBLISHED: 03-30-2011
Hormone pathway interactions are crucial in shaping plant development, such as synergism between the auxin and brassinosteroid pathways in cell elongation. Both hormone pathways have been characterized in detail, revealing several feedback loops. The complexity of this network, combined with a shortage of kinetic data, renders its quantitative analysis virtually impossible at present.
Related JoVE Video
Network-guided analysis of genes with altered somatic copy number and gene expression reveals pathways commonly perturbed in metastatic melanoma.
PLoS ONE
PUBLISHED: 02-28-2011
Cancer genomes frequently contain somatic copy number alterations (SCNA) that can significantly perturb the expression level of affected genes and thus disrupt pathways controlling normal growth. In melanoma, many studies have focussed on the copy number and gene expression levels of the BRAF, PTEN and MITF genes, but little has been done to identify new genes using these parameters at the genome-wide scale. Using karyotyping, SNP and CGH arrays, and RNA-seq, we have identified SCNA affecting gene expression (SCNA-genes) in seven human metastatic melanoma cell lines. We showed that the combination of these techniques is useful to identify candidate genes potentially involved in tumorigenesis. Since few of these alterations were recurrent across our samples, we used a protein network-guided approach to determine whether any pathways were enriched in SCNA-genes in one or more samples. From this unbiased genome-wide analysis, we identified 28 significantly enriched pathway modules. Comparison with two large, independent melanoma SCNA datasets showed less than 10% overlap at the individual gene level, but network-guided analysis revealed 66% shared pathways, including all but three of the pathways identified in our data. Frequently altered pathways included WNT, cadherin signalling, angiogenesis and melanogenesis. Additionally, our results emphasize the potential of the EPHA3 and FRS2 gene products, involved in angiogenesis and migration, as possible therapeutic targets in melanoma. Our study demonstrates the utility of network-guided approaches, for both large and small datasets, to identify pathways recurrently perturbed in cancer.
Related JoVE Video
The genome of the fire ant Solenopsis invicta.
Proc. Natl. Acad. Sci. U.S.A.
PUBLISHED: 01-31-2011
Ants have evolved very complex societies and are key ecosystem members. Some ants, such as the fire ant Solenopsis invicta, are also major pests. Here, we present a draft genome of S. invicta, assembled from Roche 454 and Illumina sequencing reads obtained from a focal haploid male and his brothers. We used comparative genomic methods to obtain insight into the unique features of the S. invicta genome. For example, we found that this genome harbors four adjacent copies of vitellogenin. A phylogenetic analysis revealed that an ancestral vitellogenin gene first underwent a duplication that was followed by possibly independent duplications of each of the daughter vitellogenins. The vitellogenin genes have undergone subfunctionalization with queen- and worker-specific expression, possibly reflecting differential selection acting on the queen and worker castes. Additionally, we identified more than 400 putative olfactory receptors of which at least 297 are intact. This represents the largest repertoire reported so far in insects. S. invicta also harbors an expansion of a specific family of lipid-processing genes, two putative orthologs to the transformer/feminizer sex differentiation gene, a functional DNA methylation system, and a single putative telomerase ortholog. EST data indicate that this S. invicta telomerase ortholog has at least four spliceforms that differ in their use of two sets of mutually exclusive exons. Some of these and other unique aspects of the fire ant genome are likely linked to the complex social behavior of this species.
Related JoVE Video
CDK9 regulates AR promoter selectivity and cell growth through serine 81 phosphorylation.
Mol. Endocrinol.
PUBLISHED: 10-27-2010
Previously we determined that S81 is the highest stoichiometric phosphorylation on the androgen receptor (AR) in response to hormone. To explore the role of this phosphorylation on growth, we stably expressed wild-type and S81A mutant AR in LHS and LAPC4 cells. The cells with increased wild-type AR expression grow faster compared with parental cells and S81A mutant-expressing cells, indicating that loss of S81 phosphorylation limits cell growth. To explore how S81 regulates cell growth, we tested whether S81 phosphorylation regulates AR transcriptional activity. LHS cells stably expressing wild-type and S81A mutant AR showed differences in the regulation of endogenous AR target genes, suggesting that S81 phosphorylation regulates promoter selectivity. We next sought to identify the S81 kinase using ion trap mass spectrometry to analyze AR-associated proteins in immunoprecipitates from cells. We observed cyclin-dependent kinase (CDK)9 association with the AR. CDK9 phosphorylates the AR on S81 in vitro. Phosphorylation is specific to S81 because CDK9 did not phosphorylate the AR on other serine phosphorylation sites. Overexpression of CDK9 with its cognate cyclin, Cyclin T, increased S81 phosphorylation levels in cells. Small interfering RNA knockdown of CDK9 protein levels decreased hormone-induced S81 phosphorylation. Additionally, treatment of LNCaP cells with the CDK9 inhibitors, 5,6-dichloro-1-β-D-ribofuranosylbenzimidazole and Flavopiridol, reduced S81 phosphorylation further, suggesting that CDK9 regulates S81 phosphorylation. Pharmacological inhibition of CDK9 also resulted in decreased AR transcription in LNCaP cells. Collectively these results suggest that CDK9 phosphorylation of AR S81 is an important step in regulating AR transcriptional activity and prostate cancer cell growth.
Related JoVE Video
ViralZone: a knowledge resource to understand virus diversity.
Nucleic Acids Res.
PUBLISHED: 10-14-2010
The molecular diversity of viruses complicates the interpretation of viral genomic and proteomic data. To make sense of viral gene functions, investigators must be familiar with the virus host range, replication cycle and virion structure. Our aim is to provide a comprehensive resource bridging textbook knowledge with genomic and proteomic sequences. The ViralZone web resource (www.expasy.org/viralzone/) provides fact sheets on all known virus families/genera with easy access to sequence data. A selection of reference strains (RefStrain) provides annotated standards to circumvent the exponential increase of virus sequences. Moreover, ViralZone offers a complete set of detailed and accurate virion pictures.
Related JoVE Video
EuroDia: a beta-cell gene expression resource.
Database (Oxford)
PUBLISHED: 10-14-2010
Type 2 diabetes mellitus (T2DM) is a major disease affecting nearly 280 million people worldwide. Whilst the pathophysiological mechanisms leading to disease are poorly understood, dysfunction of the insulin-producing pancreatic beta-cells is a key event for disease development. Monitoring the gene expression profiles of pancreatic beta-cells under several genetic or chemical perturbations has shed light on genes and pathways involved in T2DM. The EuroDia database has been established to build a unique collection of gene expression measurements performed on beta-cells of three organisms, namely human, mouse and rat. The Gene Expression Data Analysis Interface (GEDAI) has been developed to support this database. The quality of each dataset is assessed by a series of quality control procedures to detect putative hybridization outliers. The system integrates a web interface to several standard analysis functions from R/Bioconductor to identify differentially expressed genes and pathways. It also allows the combination of multiple experiments performed on different array platforms of the same technology. The design of this system enables each user to rapidly design a custom analysis pipeline and thus produce their own list of genes and pathways. Raw and normalized data can be downloaded for each experiment. The flexible engine of this database (GEDAI) is currently used to handle gene expression data from several laboratory-run projects dealing with different organisms and platforms. Database URL: http://eurodia.vital-it.ch.
Related JoVE Video
OpenFluDB, a database for human and animal influenza virus.
Database (Oxford)
PUBLISHED: 07-14-2010
Although research on influenza has lasted for more than 100 years, it is still one of the most prominent diseases, causing half a million human deaths every year. With the recent observation of new highly pathogenic H5N1 and H7N7 strains, and the appearance of the influenza pandemic caused by the H1N1 swine-like lineage, a collaborative effort to share observations on the evolution of this virus in both animals and humans has been established. The OpenFlu database (OpenFluDB) is a part of this collaborative effort. It contains genomic and protein sequences, as well as epidemiological data from more than 27,000 isolates. The isolate annotations include virus type, host, geographical location and experimentally tested antiviral resistance. Putative enhanced pathogenicity as well as human adaptation propensity are computed from protein sequences. Each virus isolate can be associated with the laboratories that collected, sequenced and submitted it. Several analysis tools including multiple sequence alignment, phylogenetic analysis and sequence similarity maps enable rapid and efficient mining. The contents of OpenFluDB are supplied by direct user submission, as well as by a daily automatic procedure importing data from public repositories. Additionally, a simple mechanism facilitates the export of OpenFluDB records to GenBank. This resource has been successfully used to rapidly and widely distribute the sequences collected during the recent human swine flu outbreak and also as an exchange platform during the vaccine selection procedure. Database URL: http://openflu.vital-it.ch.
Related JoVE Video
FastEpistasis: a high performance computing solution for quantitative trait epistasis.
Bioinformatics
PUBLISHED: 04-07-2010
Genome-wide association studies have become widely used tools to study effects of genetic variants on complex diseases. While it is of great interest to extend existing analysis methods by considering interaction effects between pairs of loci, the large number of possible tests presents a significant computational challenge. The number of computations is further multiplied in the study of gene expression quantitative trait mapping, in which tests are performed for thousands of gene phenotypes simultaneously.
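To make the scale of the computational challenge concrete, the short Python sketch below counts the two-locus tests for a given number of markers and phenotypes. The function name and the study sizes are hypothetical, chosen only to show how the count grows; they are not taken from the FastEpistasis paper.

```python
from math import comb

def count_epistasis_tests(n_markers: int, n_phenotypes: int = 1) -> int:
    """Number of two-locus tests: every unordered marker pair, once per phenotype."""
    return comb(n_markers, 2) * n_phenotypes

# Hypothetical study sizes, for illustration only.
pairs_single_trait = count_epistasis_tests(500_000)
pairs_expression_traits = count_epistasis_tests(500_000, n_phenotypes=10_000)

print(f"{pairs_single_trait:,} tests for one phenotype")
print(f"{pairs_expression_traits:,} tests for 10,000 expression phenotypes")
```

The count grows quadratically with the number of markers and linearly with the number of phenotypes, which is why expression quantitative trait mapping multiplies the workload.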
Related JoVE Video
Animal Toxins: How is Complexity Represented in Databases?
Toxins (Basel)
PUBLISHED: 01-22-2010
Peptide toxins synthesized by venomous animals have been extensively studied in the last decades. To be useful to the scientific community, this knowledge has been stored, annotated and made easy to retrieve by several databases. The aim of this article is to present what type of information users can access from each database. ArachnoServer and ConoServer focus on spider toxins and cone snail toxins, respectively. UniProtKB, a generalist protein knowledgebase, has an animal toxin-dedicated annotation program that includes toxins from all venomous animals. Finally, the ATDB metadatabase compiles data and annotations from other databases and provides toxin ontology.
Related JoVE Video
Multiple imputations applied to the DREAM3 phosphoproteomics challenge: a winning strategy.
PLoS ONE
PUBLISHED: 01-18-2010
DREAM is an initiative that allows researchers to assess how well their methods or approaches can describe and predict networks of interacting molecules [1]. Each year, recently acquired datasets are released to predictors ahead of publication. Researchers typically have about three months to predict the masked data or network of interactions, using any predictive method. Predictions are assessed prior to an annual conference where the best predictions are unveiled and discussed. Here we present the strategy we used to make a winning prediction for the DREAM3 phosphoproteomics challenge. We used Amelia II, a multiple imputation software method developed by Gary King, James Honaker and Matthew Blackwell [2] in the context of social sciences, to predict the 476 out of 4624 measurements that had been masked for the challenge. To choose the best possible multiple imputation parameters to apply for the challenge, we evaluated how transforming the data and varying the imputation parameters affected the ability to predict additionally masked data. We discuss the accuracy of our findings and show that multiple imputation applied to this dataset is a powerful method to accurately estimate the missing data. We postulate that multiple imputation methods might become an integral part of experimental design as a means to achieve cost savings or to increase the quantity of samples that could be handled for a given cost.
Related JoVE Video
Substantial deletion overlap among divergent Arabidopsis genomes revealed by intersection of short reads and tiling arrays.
Genome Biol.
PUBLISHED: 01-05-2010
Identification of small polymorphisms from next generation sequencing short read data is relatively easy, but detection of larger deletions is less straightforward. Here, we analyzed four divergent Arabidopsis accessions and found that intersection of absent short read coverage with weak tiling array hybridization signal reliably flags deletions. Interestingly, individual deletions were frequently observed in two or more of the accessions examined, suggesting that variation in gene content partly reflects a common history of deletion events.
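The intersection idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' pipeline: it assumes both data types have already been reduced to one value per genomic position, and the function name `flag_deletions` and all thresholds (`max_depth`, `max_signal`, `min_len`) are hypothetical.

```python
def flag_deletions(read_depth, array_signal, max_depth=0, max_signal=0.5, min_len=3):
    """Return half-open (start, end) runs where read coverage is absent
    AND tiling array signal is weak. All thresholds are illustrative."""
    n = min(len(read_depth), len(array_signal))
    runs, start = [], None
    for i in range(n):
        both = read_depth[i] <= max_depth and array_signal[i] < max_signal
        if both and start is None:
            start = i  # a candidate run opens here
        elif not both and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the data.
    if start is not None and n - start >= min_len:
        runs.append((start, n))
    return runs

# Toy per-position values, invented for this example.
depth = [5, 0, 0, 0, 0, 4, 0, 0, 3]
signal = [1.0, 0.1, 0.2, 0.1, 0.9, 1.0, 0.1, 0.1, 1.0]
candidates = flag_deletions(depth, signal)
```

Position 4 has no read coverage but a strong array signal, so it is not flagged. Requiring both lines of evidence is what distinguishes this approach from relying on either data type alone.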
Related JoVE Video
ENFIN--A European network for integrative systems biology.
C. R. Biol.
PUBLISHED: 11-14-2009
Integration of biological data of various types and the development of adapted bioinformatics tools represent critical objectives to enable research at the systems level. The European Network of Excellence ENFIN is engaged in developing an adapted infrastructure to connect databases, and platforms to enable both the generation of new bioinformatics tools and the experimental validation of computational predictions. With the aim of bridging the gap existing between standard wet laboratories and bioinformatics, the ENFIN Network runs integrative research projects to bring the latest computational techniques to bear directly on questions dedicated to systems biology in the wet laboratory environment. The Network maintains internally close collaboration between experimental and computational research, enabling a permanent cycling of experimental validation and improvement of computational prediction methods. The computational work includes the development of a database infrastructure (EnCORE), bioinformatics analysis methods and a novel platform for protein function analysis FuncNet.
Related JoVE Video
Evolutionary trajectories of primate genes involved in HIV pathogenesis.
Mol. Biol. Evol.
PUBLISHED: 09-02-2009
The current availability of five complete genomes of different primate species allows the analysis of genetic divergence over the last 40 million years of evolution. We hypothesized that the interspecies differences observed in susceptibility to HIV-1 would be influenced by the long-range selective pressures on host genes associated with HIV-1 pathogenesis. We established a list of human genes (n = 140) proposed to be involved in HIV-1 biology and pathogenesis and a control set of 100 random genes. We retrieved the orthologous genes from the genome of humans and of four nonhuman primates (Pan troglodytes, Pongo pygmaeus abelii, Macaca mulatta, and Callithrix jacchus) and analyzed the nucleotide substitution patterns of this data set using codon-based maximum likelihood procedures. In addition, we evaluated whether the candidate genes have been targets of recent positive selection in humans by analyzing HapMap Phase 2 single-nucleotide polymorphisms genotyped in a region centered on each candidate gene. A total of 1,064 sequences were used for the analyses. Similar median K(A)/K(S) values were estimated for the set of genes involved in HIV-1 pathogenesis and for control genes, 0.19 and 0.15, respectively. However, genes of the innate immunity had median values of 0.37 (P value = 0.0001, compared with control genes), and genes of intrinsic cellular defense had K(A)/K(S) values around or greater than 1.0 (P value = 0.0002). Detailed assessment allowed the identification of residues under positive selection in 13 proteins: AKT1, APOBEC3G, APOBEC3H, CD4, DEFB1, GML, IL4, IL8RA, L-SIGN/CLEC4M, PTPRC/CD45, Tetherin/BST2, TLR7, and TRIM5alpha. A number of those residues are relevant for HIV-1 biology. The set of 140 genes involved in HIV-1 pathogenesis did not show a significant enrichment in signals of recent positive selection in humans (intraspecies selection). However, we identified within or near these genes 24 polymorphisms showing strong signatures of recent positive selection. Interestingly, the DEFB1 gene presented signatures of both interspecies positive selection in primates and intraspecies recent positive selection in humans. The systematic assessment of long-acting selective pressures on primate genomes is a useful tool to extend our understanding of genetic variation influencing contemporary susceptibility to HIV-1.
Related JoVE Video
The direct effects of tacrolimus and cyclosporin A on isolated human islets: A functional, survival and gene expression study.
Islets
PUBLISHED: 09-01-2009
The use of immunosuppressive drugs in transplanted patients is associated with the development of diabetes, possibly due to β-cell toxicity. To better understand the mechanisms leading to post-transplant diabetes, we investigated the actions of prolonged exposure of isolated human islets to therapeutic levels of tacrolimus (Tac) or cyclosporin A (CsA). Islets were isolated from the pancreas of multiorgan donors by enzymatic digestion and density gradient centrifugation. Functional, survival and molecular studies were then performed after 4 days of incubation with therapeutic concentrations of Tac or CsA. Glucose-induced insulin secretion was significantly decreased in Tac-exposed, but not in CsA-exposed, islets, which was associated with a reduction in the number of insulin granules as shown by electron microscopy. The percentage of apoptotic β-cells was higher in Tac-exposed than in CsA-exposed islets. Microarray experiments followed by Gene Set Enrichment Analysis revealed that gene expression was more markedly affected upon Tac treatment. In conclusion, Tac and CsA affect β-cell features differently, with several changes occurring at the molecular level.
Related JoVE Video
Modeling stochasticity and robustness in gene regulatory networks.
Bioinformatics
PUBLISHED: 05-30-2009
Understanding gene regulation in biological processes and modeling the robustness of underlying regulatory networks is an important problem that is currently being addressed by computational systems biologists. Lately, there has been a renewed interest in Boolean modeling techniques for gene regulatory networks (GRNs). However, due to their deterministic nature, it is often difficult to identify whether these modeling approaches are robust to the addition of stochastic noise that is widespread in gene regulatory processes. Stochasticity in Boolean models of GRNs has been addressed relatively sparingly in the past, mainly by flipping the expression of genes between different expression levels with a predefined probability. This stochasticity in nodes (SIN) model leads to overrepresentation of noise in GRNs and hence non-correspondence with biological observations.
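The node-flipping scheme described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three-gene network, the rule set RULES and the function names step, sin_step and simulate are hypothetical, and a synchronous update order is assumed. Each gene is first updated by its Boolean rule, then its value is flipped with a fixed probability.

```python
import random

# Hypothetical three-gene network (an assumption for illustration only):
# each rule maps the current state to the gene's next Boolean value.
RULES = {
    "A": lambda s: not s["C"],        # C represses A
    "B": lambda s: s["A"],            # A activates B
    "C": lambda s: s["A"] and s["B"], # A and B jointly activate C
}

def step(state, rules):
    """Deterministic synchronous update of every gene."""
    return {gene: rule(state) for gene, rule in rules.items()}

def sin_step(state, rules, flip_prob, rng):
    """Stochasticity-in-nodes update: apply the rules, then flip each
    gene's value independently with probability flip_prob."""
    updated = step(state, rules)
    return {
        gene: (not value) if rng.random() < flip_prob else value
        for gene, value in updated.items()
    }

def simulate(state, rules, steps, flip_prob=0.0, seed=0):
    """Return the list of states visited, starting with the initial state."""
    rng = random.Random(seed)
    trajectory = [dict(state)]
    for _ in range(steps):
        state = sin_step(state, rules, flip_prob, rng)
        trajectory.append(dict(state))
    return trajectory

if __name__ == "__main__":
    start = {"A": True, "B": False, "C": False}
    for s in simulate(start, RULES, steps=5, flip_prob=0.1, seed=42):
        print(s)
```

Because every node is perturbed independently at every step, the amount of noise grows with network size, which is one way to see why the abstract describes this scheme as overrepresenting noise.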
Related JoVE Video
The Microbe browser for comparative genomics.
Nucleic Acids Res.
PUBLISHED: 04-30-2009
The Microbe browser is a web server providing comparative microbial genomics data. It offers comprehensive, integrated data from GenBank, RefSeq, UniProt, InterPro, Gene Ontology and the Orthologs Matrix Project (OMA) database, displayed along with gene predictions from five software packages. The Microbe browser is updated daily from the source databases and includes all completely sequenced bacterial and archaeal genomes. The data are displayed in an easy-to-use, interactive website based on Ensembl software. The Microbe browser is available at http://microbe.vital-it.ch/. Programmatic access is available through the OMA application programming interface (API) at http://microbe.vital-it.ch/api.
Related JoVE Video
AssociationViewer: a scalable and integrated software tool for visualization of large-scale variation data in genomic context.
Bioinformatics
PUBLISHED: 01-25-2009
We present a tool designed for visualization of large-scale genetic and genomic data exemplified by results from genome-wide association studies. This software provides an integrated framework to facilitate the interpretation of SNP association studies in genomic context. Gene annotations can be retrieved from Ensembl, linkage disequilibrium data downloaded from HapMap and custom data imported in BED or WIG format. AssociationViewer integrates functionalities that enable the aggregation or intersection of data tracks. It implements an efficient cache system and allows the display of several, very large-scale genomic datasets.
Related JoVE Video
MIMAS 3.0 is a Multiomics Information Management and Annotation System.
BMC Bioinformatics
PUBLISHED: 01-09-2009
DNA sequence integrity, mRNA concentrations and protein-DNA interactions have been subject to genome-wide analyses based on microarrays with ever increasing efficiency and reliability over the past fifteen years. Very recently, however, novel technologies for Ultra High-Throughput DNA Sequencing (UHTS) have been harnessed to study these phenomena with unprecedented precision. As a consequence, the extensive bioinformatics environment available for array data management, analysis, interpretation and publication must be extended to include these novel sequencing data types.
Related JoVE Video
Microarray analysis of isolated human islet transcriptome in type 2 diabetes and the role of the ubiquitin-proteasome system in pancreatic beta cell dysfunction.
Mol. Cell. Endocrinol.
To shed light on islet cell molecular phenotype in human type 2 diabetes (T2D), we studied the transcriptome of non-diabetic (ND) and T2D islets to then focus on the ubiquitin-proteasome system (UPS), the major protein degradation pathway. We assessed gene expression, amount of ubiquitinated proteins, proteasome activity, and the effects of proteasome inhibition and prolonged exposure to palmitate. Microarray analysis identified more than one thousand genes differently expressed in T2D islets, involved in many structures and functions, with consistent alterations of the UPS. Quantitative RT-PCR demonstrated downregulation of selected UPS genes in T2D islets and beta cell fractions, with greater ubiquitin accumulation and reduced proteasome activity. Chemically induced reduction of proteasome activity was associated with lower glucose-stimulated insulin secretion, which was partly reproduced by palmitate exposure. These results show the presence of many changes in islet transcriptome in T2D islets and underline the importance of the association between UPS alterations and beta cell dysfunction in human T2D.
Related JoVE Video
ViralZone: recent updates to the virus knowledge resource.
Nucleic Acids Res.
ViralZone (http://viralzone.expasy.org) is a knowledge repository that allows users to learn about viruses including their virion structure, replication cycle and host-virus interactions. The information is divided into viral fact sheets that describe virion shape, molecular biology and epidemiology for each viral genus, with links to the corresponding annotated proteomes of UniProtKB. Each viral genus page contains detailed illustrations, text and PubMed references. This new update provides a linked view of viral molecular biology through 133 new viral ontology pages that describe common steps of viral replication cycles shared by several viral genera. This viral life-cycle ontology is also represented in UniProtKB in the form of annotated keywords. In this way, users can navigate from the description of a replication-cycle event, to the viral genus concerned, and the associated UniProtKB protein records.
Related JoVE Video
HAMAP in 2013, new developments in the protein family classification and annotation system.
Nucleic Acids Res.
HAMAP (High-quality Automated and Manual Annotation of Proteins-available at http://hamap.expasy.org/) is a system for the classification and annotation of protein sequences. It consists of a collection of manually curated family profiles for protein classification, and associated annotation rules that specify annotations that apply to family members. HAMAP was originally developed to support the manual curation of UniProtKB/Swiss-Prot records describing microbial proteins. Here we describe new developments in HAMAP, including the extension of HAMAP to eukaryotic proteins, the use of HAMAP in the automated annotation of UniProtKB/TrEMBL, providing high-quality annotation for millions of protein sequences, and the future integration of HAMAP into a unified system for UniProtKB annotation, UniRule. HAMAP is continuously updated by expert curators with new family profiles and annotation rules as new protein families are characterized. The collection of HAMAP family classification profiles and annotation rules can be browsed and viewed on the HAMAP website, which also provides an interface to scan user sequences against HAMAP profiles.
Related JoVE Video
Reconciliation of metabolites and biochemical reactions for metabolic networks.
Brief. Bioinformatics
Genome-scale metabolic network reconstructions are now routinely used in the study of metabolic pathways, their evolution and design. The development of such reconstructions involves the integration of information on reactions and metabolites from the scientific literature as well as public databases and existing genome-scale metabolic models. The reconciliation of discrepancies between data from these sources generally requires significant manual curation, which constitutes a major obstacle in efforts to develop and apply genome-scale metabolic network reconstructions. In this work, we discuss some of the major difficulties encountered in the mapping and reconciliation of metabolic resources and review three recent initiatives that aim to accelerate this process, namely BKM-react, MetRxn and MNXref (presented in this article). Each of these resources provides a pre-compiled reconciliation of many of the most commonly used metabolic resources. By reducing the time required for manual curation of metabolite and reaction discrepancies, these resources aim to accelerate the development and application of high-quality genome-scale metabolic network reconstructions and models.
Related JoVE Video
New and continuing developments at PROSITE.
Nucleic Acids Res.
PROSITE (http://prosite.expasy.org/) consists of documentation entries describing protein domains, families and functional sites, as well as associated patterns and profiles to identify them. It is complemented by ProRule, a collection of rules that increases the discriminatory power of these profiles and patterns by providing additional information about functionally and/or structurally critical amino acids. PROSITE signatures, together with ProRule, are used for the annotation of domains and features of UniProtKB/Swiss-Prot entries. Here, we describe recent developments that allow users to perform whole-proteome annotation as well as a number of filtering options that can be combined to perform powerful targeted searches for biological discovery. The latest version of PROSITE (release 20.85, of 30 August 2012) contains 1308 patterns, 1039 profiles and 1041 ProRules.
Related JoVE Video
ExPASy: SIB bioinformatics resource portal.
Nucleic Acids Res.
ExPASy (http://www.expasy.org) has a worldwide reputation as one of the main bioinformatics resources for proteomics. It has now evolved, becoming an extensible and integrative portal accessing many scientific resources, databases and software tools in different areas of life sciences. Scientists can henceforth seamlessly access a wide range of resources in many different domains, such as proteomics, genomics, phylogeny/evolution, systems biology, population genetics, transcriptomics, etc. The individual resources (databases, web-based and downloadable software tools) are hosted in a decentralized way by different groups of the SIB Swiss Institute of Bioinformatics and partner institutions. Specifically, a single web portal provides a common entry point to a wide range of resources developed and operated by different SIB groups and external institutions. The portal features a search function across selected resources. Additionally, the availability and usage of resources are monitored. The portal is aimed at both expert users and people who are not familiar with a specific domain in life sciences. The new web interface provides, in particular, visual guidance for newcomers to ExPASy.
Related JoVE Video
Phytochrome interacting factors 4 and 5 control seedling growth in changing light conditions by directly controlling auxin signaling.
Plant J.
Plant growth is strongly influenced by the presence of neighbors that compete for light resources. In response to vegetational shading, shade-intolerant plants such as Arabidopsis display a suite of developmental responses known as the shade-avoidance syndrome (SAS). The phytochrome B (phyB) photoreceptor is the major light sensor mediating this adaptive response. Control of the SAS occurs in part through phyB, which directly controls the protein abundance of phytochrome-interacting factors 4 and 5 (PIF4 and PIF5). The shade-avoidance response also requires rapid biosynthesis of auxin and its transport to promote elongation growth. The identification of genome-wide PIF5-binding sites during shade avoidance revealed that this bHLH transcription factor regulates the expression of a subset of previously identified SAS genes. Moreover, our study suggests that PIF4 and PIF5 regulate elongation growth by directly controlling the expression of genes that code for auxin biosynthesis and auxin signaling components.
Related JoVE Video
The UniProtKB/Swiss-Prot Tox-Prot program: A central hub of integrated venom protein data.
Toxicon
Animal toxins are of interest to a wide range of scientists, due to their numerous applications in pharmacology, neurology, hematology, medicine, and drug research. This, and to a lesser extent the development of new high-performance tools in transcriptomics and proteomics, has led to an increase in toxin discovery. In this context, providing publicly available data on animal toxins has become essential. The UniProtKB/Swiss-Prot Tox-Prot program (http://www.uniprot.org/program/Toxins) plays a crucial role by providing such access to venom protein sequences and functions from all venomous species. This program has up to now curated more than 5000 venom proteins to the high-quality standards of UniProtKB/Swiss-Prot (release 2012_02). Proteins targeted by these toxins are also available in the knowledgebase. This paper describes in detail the type of information provided by UniProtKB/Swiss-Prot for toxins, as well as the structured format of the knowledgebase.
Related JoVE Video
Protein interaction data curation: the International Molecular Exchange (IMEx) consortium.
Nat. Methods
The International Molecular Exchange (IMEx) consortium is an international collaboration between major public interaction data providers to share literature-curation efforts and make a nonredundant set of protein interactions available in a single search interface on a common website (http://www.imexconsortium.org/). Common curation rules have been developed, and a central registry is used to manage the selection of articles to enter into the dataset. We discuss the advantages of such a service to the user, our quality-control measures and our data-distribution practices.
Related JoVE Video
Toward interoperable bioscience data.
Nat. Genet.
To make full use of research data, the bioscience community needs to adopt technologies and reward mechanisms that support interoperability and promote the growth of an open data commoning culture. Here we describe the prerequisites for data commoning and present an established and growing ecosystem of solutions using the shared Investigation-Study-Assay framework to support that vision.
Related JoVE Video

What is Visualize?

JoVE Visualize is a tool created to match the last 5 years of PubMed publications to methods in JoVE's video library.

How does it work?

We use abstracts found on PubMed and match them to JoVE videos to create a list of 10 to 30 related methods videos.

Video X seems to be unrelated to Abstract Y...

In developing our video relationships, we compare around 5 million PubMed articles to our library of over 4,500 methods videos. In some cases the language used in the PubMed abstracts makes matching that content to a JoVE video difficult. In other cases, there happens not to be any content in our video library that is relevant to the topic of a given abstract. In these cases, our algorithms are trying their best to display videos with relevant content, which can sometimes result in matched videos with only a slight relation.