Schistosoma japonicum is a major causative agent of schistosomiasis. A large number of proteins excreted and secreted by S. japonicum during its life cycle have been reported to be important for its infection of, and survival in, definitive hosts. These proteins are promising candidates for vaccines or drug targets. In this work, we analyzed the protein sequences of S. japonicum and found that, compared with other S. japonicum proteins, excretory/secretory (ES) proteins are generally longer, more likely to be stable, more likely to be enzymes, more likely to contain immune-related binding peptides, and more likely to be involved in regulatory and metabolic processes. Based on the sequence differences between ES and non-ES proteins, we trained a support vector machine (SVM) that achieves much higher accuracy than existing approaches. Using this SVM, we identified 191 new ES proteins in S. japonicum and further predicted 7 potential interactions between these ES proteins and human immune proteins. Our results help to clarify the pathogenesis of schistosomiasis and can serve as a new resource for the discovery of anti-schistosome vaccine or drug targets.
Native structures of proteins are formed essentially through the combined effects of local and distant (in the sense of sequence) interactions among residues. This interaction information is, explicitly or implicitly, encoded into the scoring functions of protein structure prediction approaches: threading approaches usually score an alignment by how well a sequence fits an existing structure, while the energy functions in ab initio methods are designed to measure how likely a conformation is to be near-native. Encouraging progress has been observed in structure refinement, where knowledge-based or physics-based potentials are designed to capture distant interactions. Thus, it is interesting to investigate whether the distant interaction information captured by an ab initio energy function can be used to improve threading, especially for weakly or distantly homologous templates.
Exon expression profiling technologies, including exon arrays and RNA-Seq, measure the abundance of every exon in a gene. Compared with gene expression profiling technologies such as 3' arrays, exon expression profiling technologies can detect alterations in both transcription and alternative splicing, and are therefore expected to be more sensitive in diagnosis. However, exon expression profiling also brings higher dimensionality, more redundancy, and significant correlation among features. Because it ignores the correlation structure among the exons of a gene, a popular classification method such as L1-SVM selects exons individually from each gene and is thus vulnerable to noise. To overcome this limitation, we present a new variant of SVM, named Lex-SVM, that incorporates the correlation structure among exons and known splicing patterns to improve classification performance. Specifically, we construct a new norm, the ex-norm, that encodes our prior knowledge of exon correlation structure, and use it to regularize the coefficients of a linear SVM. Lex-SVM can be solved efficiently using standard linear programming techniques. The advantage of Lex-SVM is that it can select features group-wise, force the features in a subgroup to take equal weights, and exclude features that contradict the majority in the subgroup. Experimental results suggest that on exon expression profiles, Lex-SVM is more accurate than existing methods. Lex-SVM also generates a more compact model and selects genes more consistently in cross-validation. Unlike L1-SVM, which selects only one exon in a gene, Lex-SVM assigns equal weights to as many exons in a gene as possible, which makes the resulting model easier to interpret.
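For orientation, the L1-SVM baseline mentioned in this abstract is commonly written as the optimization problem below, which becomes a linear program once the weight vector is split into two non-negative parts. This is a sketch of the standard textbook formulation only. The abstract does not define the ex-norm, so it is not reproduced here, and the symbols (samples x_i, labels y_i in {-1, +1}, weights w, bias b, slacks xi_i, trade-off constant C, auxiliary vectors u and v) are notation assumed for illustration.

```latex
% Standard L1-regularized linear SVM (assumed notation, not taken from the abstract)
\min_{w,\,b,\,\xi}\; \|w\|_1 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad y_i\,(w^\top x_i + b) \ge 1 - \xi_i,\;\; \xi_i \ge 0,\;\; i = 1,\dots,n.

% Equivalent linear program, writing w = u - v with u, v >= 0
\min_{u,\,v,\,b,\,\xi}\; \sum_{j=1}^{d} (u_j + v_j) + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad y_i\,\bigl((u - v)^\top x_i + b\bigr) \ge 1 - \xi_i,\;\; u \ge 0,\; v \ge 0,\; \xi \ge 0.
```

Because the L1 penalty acts on each coefficient separately, nothing in this baseline ties together the exons of one gene, which is the limitation the abstract attributes to L1-SVM.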
Although accumulating evidence has provided insight into the various functions of long non-coding RNAs (lncRNAs), the exact functions of the majority of such transcripts are still unknown. Here, we report the first computational annotation of lncRNA functions based on public microarray expression profiles. A coding-non-coding gene co-expression (CNC) network was constructed from re-annotated Affymetrix Mouse Genome Array data. Probable functions for a total of 340 lncRNAs were predicted based on topological or other network characteristics, such as module sharing, association with network hubs, and combinations of co-expression and genomic adjacency. The functions assigned to the lncRNAs mainly involve organ or tissue development (e.g. neuron, eye and muscle development), cellular transport (e.g. neuronal transport and sodium ion, acid or lipid transport) or metabolic processes (e.g. involving macromolecules, phosphocreatine and tyrosine).
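The general idea of annotating a transcript through a co-expression network can be illustrated with a minimal sketch. It is not the authors' pipeline: the Pearson correlation measure, the 0.9 threshold, the majority-vote rule, and all gene names, expression values and annotation terms below are assumptions made up for illustration.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def build_network(profiles, threshold=0.9):
    """Link every pair of genes whose correlation reaches the threshold."""
    genes = sorted(profiles)
    edges = {g: set() for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if pearson(profiles[g], profiles[h]) >= threshold:
                edges[g].add(h)
                edges[h].add(g)
    return edges


def predict_function(gene, edges, annotations):
    """Return the most frequent annotation among the gene's neighbours."""
    counts = Counter(
        term for nb in edges[gene] for term in annotations.get(nb, ())
    )
    return counts.most_common(1)[0][0] if counts else None


# Hypothetical toy data: one lncRNA and three annotated coding genes.
profiles = {
    "lnc1": [1.0, 2.0, 3.0, 4.0],
    "geneA": [2.0, 4.0, 6.0, 8.0],
    "geneB": [1.1, 2.0, 3.2, 3.9],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
annotations = {
    "geneA": ["neuron development"],
    "geneB": ["neuron development", "ion transport"],
    "geneC": ["lipid transport"],
}

network = build_network(profiles)
prediction = predict_function("lnc1", network, annotations)
```

In this toy example `lnc1` is linked to `geneA` and `geneB` but not to `geneC`, so the vote assigns it the term shared by both neighbours. The abstract also lists module sharing, hub association and genomic adjacency as evidence, none of which this sketch models.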
MicroRNAs (miRNAs), a growing class of small RNAs with crucial regulatory roles at the post-transcriptional level, are usually found to be clustered on chromosomes. However, with the exception of a few individual cases, little is known so far about the functional consequences of this conserved clustering of miRNA loci. In animal genomes, such clusters often contain non-homologous miRNA genes. One hypothesis to explain this heterogeneity is that clustered miRNAs are functionally related by virtue of co-targeting downstream pathways.