The development of high-throughput sequencing technologies has advanced our understanding of cancer. However, characterizing somatic structural variants in tumor genomes is still challenging because current strategies depend on the initial alignment of reads to a reference genome. Here, we describe SMUFIN (somatic mutation finder), a single program that directly compares sequence reads from normal and tumor genomes to accurately identify and characterize a range of somatic sequence variation, from single-nucleotide variants (SNVs) to large structural variants, at base-pair resolution. Performance tests on modeled tumor genomes showed average sensitivities of 92% and 74% for SNVs and structural variants, with specificities of 95% and 91%, respectively. Analyses of aggressive forms of solid and hematological tumors revealed that SMUFIN identifies breakpoints associated with chromothripsis and chromoplexy with high specificity. SMUFIN provides an integrated solution for the accurate, fast and comprehensive characterization of somatic sequence variation in cancer.
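The sensitivity and specificity figures quoted for SMUFIN can be made concrete with a small sketch. The abstract does not say how the two metrics were defined, so the code below assumes the textbook definitions based on true/false positives and negatives; the function names and the counts in the usage note are hypothetical and are not taken from the paper.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real variants that the caller recovers: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-variant sites correctly left uncalled: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical benchmark on a simulated tumor genome (illustrative counts only).
    print(f"sensitivity = {sensitivity(true_pos=920, false_neg=80):.2f}")
    print(f"specificity = {specificity(true_neg=950, false_pos=50):.2f}")
```

With simulated genomes the set of implanted variants is known in advance, which is what makes both quantities computable; on real tumors only the called variants can be checked.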
The genetic analysis of ulcerative colitis (UC) has provided new insights into the etiology of this prevalent inflammatory bowel disease. However, most of the heritability of UC (>70%) has still not been characterized. To identify new risk loci for UC, we performed the first genome-wide association study (GWAS) in a Southern European population and undertook a meta-analysis combining the newly genotyped 825 UC patients and 1525 healthy controls from Spain with the six previously published GWAS comprising 6687 cases and 19718 controls of Northern European ancestry. We identified a novel locus with genome-wide significance at 6q22.1 [rs2858829, P = 8.97 × 10⁻⁹, odds ratio (OR) (95% confidence interval, CI) = 1.12 (1.08-1.16)] that was validated with genotype data from a replication cohort of the same Southern European ancestry consisting of 1073 cases and 1279 controls [combined P = 7.59 × 10⁻¹⁰, OR (95% CI) = 1.12 (1.08-1.16)]. Furthermore, we confirmed 33 reported associations with UC and nominally validated the GWAS results of nine new risk loci (P < 0.05, same direction of effect). SNP rs2858829 lies in an intergenic region and is a strong cis-eQTL for FAM26F, a gene that is shown to be selectively upregulated in UC colonic mucosa with active inflammation. Our results provide new insight into the genetic risk background of UC, confirming that there is a genetic risk component that differentiates it from Crohn's disease, the other major form of inflammatory bowel disease.
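The abstract above does not state which meta-analysis model was used. As an illustration only, the following sketch shows one common way of combining per-study odds ratios, the fixed-effect inverse-variance method on the log-OR scale. The function name and the input values in the usage note are hypothetical and are not data from the study.

```python
import math


def fixed_effect_meta(log_odds_ratios, standard_errors):
    """Inverse-variance fixed-effect pooling of per-study log odds ratios.

    Returns (pooled OR, lower 95% CI bound, upper 95% CI bound, two-sided P).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, log_odds_ratios)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    ci_low = math.exp(pooled_beta - 1.96 * pooled_se)
    ci_high = math.exp(pooled_beta + 1.96 * pooled_se)
    return math.exp(pooled_beta), ci_low, ci_high, p_value


if __name__ == "__main__":
    # Two hypothetical cohorts reporting OR = 1.10 and OR = 1.14 for the same allele.
    betas = [math.log(1.10), math.log(1.14)]
    ses = [0.03, 0.05]
    pooled_or, lo, hi, p = fixed_effect_meta(betas, ses)
    print(f"OR = {pooled_or:.3f} ({lo:.3f}-{hi:.3f}), P = {p:.2e}")
```

Each study is weighted by the inverse of its variance, so larger cohorts with tighter standard errors dominate the pooled estimate.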
Despite the variety of available Web services registries specially aimed at Life Sciences, their scope is usually restricted to a limited set of well-defined types of services. While dedicated registries are generally tied to a particular format, general-purpose ones adhere more closely to standards and usually rely on Web Service Definition Language (WSDL). Although WSDL is flexible enough to support common Web services types, its lack of semantic expressiveness led to various initiatives to describe Web services via ontology languages. Nevertheless, WSDL 2.0 descriptions gained a standard representation based on Web Ontology Language (OWL). BioSWR is a novel Web services registry that provides standard Resource Description Framework (RDF) based Web services descriptions along with the traditional WSDL based ones. The registry provides a Web-based interface for Web services registration, querying and annotation, and is also accessible programmatically via a Representational State Transfer (REST) API or using the SPARQL Protocol and RDF Query Language. The BioSWR server is located at http://inb.bsc.es/BioSWR/ and its code is available at https://sourceforge.net/projects/bioswr/ under the LGPL license.
The effects of pre-incubation with mercury (Hg²⁺) and cadmium (Cd²⁺) on the activities of individual glycolytic enzymes, on the flux and on internal metabolite concentrations of the upper part of glycolysis were investigated in mouse muscle extracts. In the range of metal concentrations analysed we found that only hexokinase and phosphofructokinase, the enzymes that shared the control of the flux, were inhibited by Hg²⁺ and Cd²⁺. The concentrations of the internal metabolites glucose-6-phosphate and fructose-6-phosphate did not change significantly when Hg²⁺ and Cd²⁺ were added. A mathematical model was constructed to explore the mechanisms of inhibition of Hg²⁺ and Cd²⁺ on hexokinase and phosphofructokinase. Equations derived from detailed mechanistic models for each inhibition were fitted to the experimental data. In a concentration-dependent manner these equations describe the observed inhibition of enzyme activity. Under the conditions analysed, the integral model showed that the simultaneous inhibition of hexokinase and phosphofructokinase explains the observation that the concentrations of glucose-6-phosphate and fructose-6-phosphate did not change as the heavy metals decreased the glycolytic flux.
After decades of using urea as a denaturant, the kinetic role of this molecule in the unfolding process is still undefined: does urea actively induce protein unfolding or passively stabilize the unfolded state? By analyzing a set of 30 proteins (representative of all native folds) through extensive molecular dynamics simulations in denaturant (using a range of force-fields), we derived robust rules for urea unfolding that are valid at the proteome level. Irrespective of the protein fold, presence or absence of disulphide bridges, and secondary structure composition, urea concentrates in the first solvation shell of quasi-native proteins, but with a density lower than that of the fully unfolded state. The presence of urea does not alter the spontaneous vibration pattern of proteins. In fact, it reduces the magnitude of such vibrations, leading to a counterintuitive slowdown of the atomic motions that opposes unfolding. Urea stickiness and slow diffusion are, however, crucial for unfolding. Long-residence urea molecules placed around the hydrophobic core are crucial to stabilize partially open structures generated by thermal fluctuations. Our simulations indicate that although urea does not favor the formation of partially open microstates, it is not a mere spectator of unfolding that simply displaces to the right of the folded↔unfolded equilibrium. On the contrary, urea actively favors unfolding: it selects and stabilizes partially unfolded microstates, slowly driving the protein conformational ensemble far from the native one and also from the conformations sampled during thermal unfolding.
We present NAFlex, a new web tool to study the flexibility of nucleic acids, either isolated or bound to other molecules. The server allows the user to incorporate structures from protein data banks, completing gaps and removing structural inconsistencies. It is also possible to define canonical (average or sequence-adapted) nucleic acid structures using a variety of predefined internal libraries, as well as to create specific nucleic acid conformations from the sequence. The server offers a variety of methods to explore nucleic acid flexibility, such as a colorless wormlike-chain model, a base-pair resolution mesoscopic model and atomistic molecular dynamics simulations with a wide variety of protocols and force fields. The trajectories obtained by simulations, or imported externally, can be visualized and analyzed using a large number of tools, including standard Cartesian analysis, essential dynamics, helical analysis, local and global stiffness, energy decomposition, principal components and in silico NMR spectra. The server is accessible free of charge from the mmb.irbbarcelona.org/NAFlex webpage.
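To give a sense of what a sequence-independent wormlike-chain (WLC) model predicts, the sketch below evaluates the standard WLC expression for the mean-square end-to-end distance, <R²> = 2PL[1 - (P/L)(1 - e^(-L/P))], where L is the contour length and P the persistence length. This is the textbook formula, not NAFlex's implementation; the function name and the DNA parameters in the usage note (a rise of about 0.34 nm per base pair and a persistence length of about 50 nm) are commonly quoted values used here as assumptions.

```python
import math


def wlc_mean_square_end_to_end(contour_length: float, persistence_length: float) -> float:
    """Mean-square end-to-end distance of a wormlike chain.

    Both lengths must be positive and share the same unit; the result is in
    that unit squared.
    """
    if contour_length <= 0 or persistence_length <= 0:
        raise ValueError("lengths must be positive")
    ratio = contour_length / persistence_length
    # -expm1(-x) equals 1 - exp(-x) but is more accurate for small x.
    return 2.0 * persistence_length * contour_length * (1.0 - (-math.expm1(-ratio)) / ratio)


if __name__ == "__main__":
    rise_nm = 0.34          # assumed rise per base pair for B-DNA
    persistence_nm = 50.0   # assumed persistence length of double-stranded DNA
    for n_bp in (50, 500, 5000):
        length = n_bp * rise_nm
        rms = math.sqrt(wlc_mean_square_end_to_end(length, persistence_nm))
        print(f"{n_bp:5d} bp: contour {length:7.1f} nm, rms end-to-end {rms:6.1f} nm")
```

The two limits are a useful sanity check: for chains much shorter than P the result approaches L² (a rigid rod), and for chains much longer than P it approaches 2PL (a random coil).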
Here we perform whole-exome sequencing of samples from 105 individuals with chronic lymphocytic leukemia (CLL), the most frequent leukemia in adults in Western countries. We found 1,246 somatic mutations potentially affecting gene function and identified 78 genes with predicted functional alterations in more than one tumor sample. Among these genes, SF3B1, encoding a subunit of the spliceosomal U2 small nuclear ribonucleoprotein (snRNP), is somatically mutated in 9.7% of affected individuals. Further analysis in 279 individuals with CLL showed that SF3B1 mutations were associated with faster disease progression and poor overall survival. This work provides the first comprehensive catalog of somatic mutations in CLL with relevant clinical correlates and defines a large set of new genes that may drive the development of this common form of leukemia. The results reinforce the idea that targeting several well-known genetic pathways, including mRNA splicing, could be useful in the treatment of CLL and other malignancies.
Chronic lymphocytic leukaemia (CLL), the most frequent leukaemia in adults in Western countries, is a heterogeneous disease with variable clinical presentation and evolution. Two major molecular subtypes can be distinguished, characterized respectively by a high or low number of somatic hypermutations in the variable region of immunoglobulin genes. The molecular changes leading to the pathogenesis of the disease are still poorly understood. Here we performed whole-genome sequencing of four cases of CLL and identified 46 somatic mutations that potentially affect gene function. Further analysis of these mutations in 363 patients with CLL identified four genes that are recurrently mutated: notch 1 (NOTCH1), exportin 1 (XPO1), myeloid differentiation primary response gene 88 (MYD88) and kelch-like 6 (KLHL6). Mutations in MYD88 and KLHL6 are predominant in cases of CLL with mutated immunoglobulin genes, whereas NOTCH1 and XPO1 mutations are mainly detected in patients with unmutated immunoglobulins. The patterns of somatic mutation, supported by functional and clinical analyses, strongly indicate that the recurrent NOTCH1, MYD88 and XPO1 mutations are oncogenic changes that contribute to the clinical evolution of the disease. To our knowledge, this is the first comprehensive analysis of CLL combining whole-genome sequencing with clinical characteristics and clinical outcomes. It highlights the usefulness of this approach for the identification of clinically relevant mutations in cancer.
More than 1700 trajectories of proteins representative of monomeric soluble structures in the protein data bank (PDB) have been obtained by means of state-of-the-art atomistic molecular dynamics simulations in near-physiological conditions. The trajectories and analyses are stored in a large data warehouse, which can be queried for dynamic information on proteins, including interactions. Here, we describe the project and the structure and contents of our database, and provide examples of how it can be used to describe the global flexibility properties of proteins. Basic analyses and trajectories stripped of solvent molecules at a reduced resolution level are available from our web server.
FlexServ is a web-based tool for the analysis of protein flexibility. The server incorporates powerful protocols for the coarse-grained determination of protein dynamics using different versions of Normal Mode Analysis (NMA), Brownian dynamics (BD) and Discrete Dynamics (DMD). It can also analyze user-provided trajectories. The server allows a complete analysis of flexibility using a large variety of metrics, including basic geometrical analysis, B-factors, essential dynamics, stiffness analysis, collectivity measures, Lindemann's indexes, residue correlation, chain correlations, dynamic domain determination and hinge point detection. Data are presented through a web interface as plain text and as 2D and 3D graphics.
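One of the metrics listed above, the B-factor, can be derived from a trajectory with the standard relation B = (8π²/3)<Δr²>, where <Δr²> is the mean-square positional fluctuation of an atom. The sketch below is a minimal pure-Python illustration and not FlexServ's code; the function names are hypothetical, and the frames are assumed to be already superimposed on a common reference so that overall rotation and translation do not contribute.

```python
import math


def mean_square_fluctuations(frames):
    """Per-atom mean-square fluctuation about the average position.

    `frames` is a list of snapshots; each snapshot is a list of (x, y, z)
    tuples, one per atom, with the same atom order in every snapshot.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for atom in range(n_atoms):
        mean = [sum(frame[atom][k] for frame in frames) / n_frames for k in range(3)]
        msf = sum(
            sum((frame[atom][k] - mean[k]) ** 2 for k in range(3))
            for frame in frames
        ) / n_frames
        result.append(msf)
    return result


def b_factors(frames):
    """Convert mean-square fluctuations (in Å²) to isotropic B-factors (in Å²)."""
    scale = 8.0 * math.pi ** 2 / 3.0
    return [scale * msf for msf in mean_square_fluctuations(frames)]


if __name__ == "__main__":
    # Toy two-atom trajectory: atom 0 oscillates along x, atom 1 is static.
    toy = [
        [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
        [(2.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    ]
    print(b_factors(toy))
```

Comparing such computed B-factors with the crystallographic values stored in a PDB file is a common first check on whether a simulation reproduces the experimentally observed flexibility pattern.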
We have extensively characterized the DNA methylomes of 139 patients with chronic lymphocytic leukemia (CLL) with mutated or unmutated IGHV and of several mature B-cell subpopulations through the use of whole-genome bisulfite sequencing and high-density microarrays. The two molecular subtypes of CLL have differing DNA methylomes that seem to represent epigenetic imprints from distinct normal B-cell subpopulations. DNA hypomethylation in the gene body, targeting mostly enhancer sites, was the most frequent difference between naive and memory B cells and between the two molecular subtypes of CLL and normal B cells. Although DNA methylation and gene expression were poorly correlated, we identified gene-body CpG dinucleotides whose methylation was positively or negatively associated with expression. We have also recognized a DNA methylation signature that distinguishes new clinico-biological subtypes of CLL. We propose an epigenomic scenario in which differential methylation in the gene body may have functional and clinical implications in leukemogenesis.
Genome-wide association studies (GWAS) have identified multiple risk loci for Crohn's disease (CD). However, the cumulative risk exerted by these loci is low, and the likelihood that additional, as-yet undiscovered loci contribute to the risk of CD is very high. We performed a GWAS on a southern European population to identify new CD risk loci.
The detection of gene-gene interactions (i.e., epistasis) in the human genome is becoming decisive for the complete characterization of the genetic factors associated with complex binary traits. Although many methods have been developed to address this challenging issue, their performance remains insufficient. We show how case and control groups store complementary information regarding interactions, and how this fundamental property can be used in the design of a new, rapid, and highly powerful epistasis analysis method. Unlike previous approaches, where statistical methods are tested over a very limited range of situations, we have performed an exhaustive evaluation of the power of our new method. To this end, we also propose a more comprehensive interpretation of epistasis in which genotype interactions may confer risk, be protective, or be neutral. In this extended view of genetic interactions, we demonstrate that our method outperforms existing approaches, thus providing a highly powerful tool for the identification of gene-gene interactions associated with binary traits.
Recent genome-wide association studies (GWASs) have identified >20 new loci associated with susceptibility to psoriasis vulgaris (PsV). We investigated the association of PsV and its main clinical subphenotypes with 32 loci having previous genome-wide evidence of association with PsV (P < 5e-8) or strong GWAS evidence (P < 5e-5 in discovery and P < 0.05 in replication sample) in a large cohort of PsV patients (n = 2005) and controls (n = 1497). We provide the first independent replication for the COG6 (P = 0.00079) and SERPINB8 (P = 0.048) loci with PsV. In those patients having developed psoriatic arthritis (n = 955), we found, for the first time, a strong association with IFIH1 (P = 0.013). Analyses of clinically relevant PsV subtypes yielded a significant association of severity of cutaneous disease with variation at the LCE3D locus (P = 0.0005) in PsV and of nail involvement with IL1RN in purely cutaneous psoriasis (PsC, P = 0.007). In an exploratory analysis of epistasis, we replicated the previously described HLA-C-ERAP1 interaction with PsC. Our findings show that common genetic variants associated with a complex phenotype like PsV influence different subphenotypes of high clinical relevance.
MDWeb and MDMoby constitute a web-based platform to facilitate access to molecular dynamics (MD) in the standard and high-throughput regimes. The platform provides tools to prepare systems from PDB structures, mimicking the procedures followed by human experts. It provides inputs and can send simulations for three of the most popular MD packages (Amber, NAMD and Gromacs). Tools for the analysis of trajectories, either provided by the user or retrieved from our MoDEL database (http://mmb.pcb.ub.es/MoDEL), are also incorporated. The platform offers two access routes: a set of programmatically accessible web services based on the BioMoby framework (MDMoby) and a web portal (MDWeb).
The classic organization of a gene structure has followed the Jacob and Monod bacterial gene model proposed more than 50 years ago. Since then, empirical determinations of the complexity of the transcriptomes found in yeast to human have blurred the definition and physical boundaries of genes. Using multiple analysis approaches we have characterized individual gene boundaries mapping on human chromosomes 21 and 22. Analyses of the locations of the 5′ and 3′ transcriptional termini of 492 protein coding genes revealed that for 85% of these genes the boundaries extend beyond the current annotated termini, most often connecting with exons of transcripts from other well annotated genes. The biological and evolutionary importance of these chimeric transcripts is underscored by (1) the non-random interconnections of genes involved, (2) the greater phylogenetic depth of the genes involved in many chimeric interactions, (3) the coordination of the expression of connected genes and (4) the close in vivo and three dimensional proximity of the genomic regions being transcribed and contributing to parts of the chimeric RNAs. The non-random nature of the connection of the genes involved suggests that chimeric transcripts should not be studied in isolation, but together, as an RNA network.