High-Throughput Transcriptome Analysis for Investigating Host-Pathogen Interactions
JoVE Journal
Immunology and Infection
Author Produced

14:58 min
March 5, 2022

DOI: 10.3791/62324-v

André Nicolau Aquime Gonçalves1,2, Vanessa Escolano Maso3, Ícaro Maia Santos de Castro2,3, Amanda Pereira Vasconcelos3, Rodrigo Luiz Tomio Ogava2,3, Helder I Nakaya2,3,4

1 Laboratory of Pathology of Infectious Diseases, Department of Pathology, Medical School, University of São Paulo; 2 Scientific Platform Pasteur USP; 3 Department of Clinical and Toxicological Analyses, School of Pharmaceutical Sciences, University of São Paulo; 4 Hospital Israelita Albert Einstein

Summary

The protocol presented here describes a complete pipeline to analyze RNA-sequencing transcriptome data from raw reads to functional analysis, including quality control and preprocessing steps to advanced statistical analytical approaches.

Transcript

Welcome to the protocol for high-throughput transcriptome analysis for investigating host-pathogen interactions. This protocol is divided into the following steps. Quality control, to filter low-quality reads and to remove adapter sequences. Mapping and annotation, where you map the reads to a reference genome and assign the mapped reads to genes.

Statistical and co-expression analysis, which defines the differentially expressed genes and also finds the co-expression modules. Molecular degree of perturbation analysis to find potential outlier samples. And finally, the functional analysis to determine the biological functions of differentially expressed genes.

All the tools used in this pipeline were pre-installed in a Linux system and encapsulated in a Docker container. The samples used in this protocol derive from a paper published by our group in PLOS Pathogens. The samples comprise 20 healthy people and 39 patients infected with Chikungunya virus.

Blood samples were collected, and RNA sequencing was performed. To install Docker on a Windows system, follow these steps. Go to the official Docker webpage and click Get Started.

Find the installer for Docker Desktop for Windows. Download the file. Install it locally on your machine.

Make sure that these two options are checked. After installing the program, download the Docker image for this protocol. Go to the Windows terminal.

Execute the command to download the image. After downloading the image, you can see it in Docker Desktop, and from this image, you can start the container. After you click the Run button, expand the optional settings to define the name of the container and to associate a folder on your local computer with a folder inside Docker.

After this, click Run to start the container. You can then access the terminal, which runs in the Linux system inside Docker. Type the bash command, and then you can execute all the commands of this protocol.
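The same pull-and-run sequence can also be issued from a terminal instead of the Docker Desktop interface. The sketch below only assembles the command lines and does not execute them. The image name, container name, and folder paths are hypothetical placeholders, since the transcript does not state them; `docker pull` and `docker run` with `-it`, `--name`, and `-v` are standard Docker CLI options.

```python
import shlex


def docker_pull_command(image):
    """Build the command that downloads an image from a registry."""
    return ["docker", "pull", image]


def docker_run_command(image, container_name, host_dir, container_dir):
    """Build the command that starts an interactive container.

    --name sets the container name, and -v associates a local folder
    with a folder inside the container, as described in the transcript.
    """
    return [
        "docker", "run", "-it",
        "--name", container_name,
        "-v", f"{host_dir}:{container_dir}",
        image,
        "bash",
    ]


# All four values below are hypothetical placeholders, not the protocol's real ones.
IMAGE = "example/transcriptome-pipeline:latest"
pull = docker_pull_command(IMAGE)
run = docker_run_command(IMAGE, "transcriptome", "C:/Users/me/jove_data", "/home/data")

print(shlex.join(pull))
print(shlex.join(run))
```

To actually run the commands, each list could be passed to `subprocess.run(..., check=True)` once the placeholders are replaced with the values given in the written protocol.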

First, we have to run the source command to make all the tools of this protocol available. Then go to the scripts directory. To perform a transcriptomic analysis, you first have to download the reference genome.

For this, execute the following commands. After the genome is downloaded, you have to download the gene annotation. To do this, type the following commands.

Next, you have to configure fastq-dump. This allows you to download the sequencing files of the example. After typing the following command, use the Tab key to go to the Tools option and select the current directory option.

Use the Tab key to select Save, and then OK. Then exit the fastq-dump configuration tool. Now we can start downloading the reads by typing the following commands.

Quality control consists of graphically evaluating the probability of errors in the sequencing reads. In this step, you also have to remove technical sequences, such as adapters. To generate the quality control graphs, run the FastQC program.

To remove the adapter sequences and the low-quality sequences, type the following commands. With the good-quality reads, we now have to map the reads to the reference genome. After the mapping, we are going to annotate the reads according to the human genes and then count the number of reads that match each human gene.
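To make the adapter and low-quality filtering concrete, here is a minimal sketch of the idea. It is not the trimming tool used in the protocol, which the transcript does not name; real trimmers also handle partial and mismatched adapters. The adapter string, the minimum length, and the minimum mean quality are illustrative assumptions.

```python
def trim_adapter(sequence, quality, adapter):
    """Cut the read at the first exact occurrence of the adapter."""
    index = sequence.find(adapter)
    if index == -1:
        return sequence, quality
    return sequence[:index], quality[:index]


def mean_phred(quality, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    if not quality:
        return 0.0
    return sum(ord(char) - offset for char in quality) / len(quality)


def keep_read(sequence, quality, min_length=20, min_mean_quality=20):
    """Keep reads that are long enough and of good average quality.

    Both thresholds are assumptions chosen for illustration.
    """
    return len(sequence) >= min_length and mean_phred(quality) >= min_mean_quality


# Example adapter prefix (assumption); the read and its qualities are made up.
ADAPTER = "AGATCGGAAGAGC"
read = "ACGT" * 6 + ADAPTER + "TTT"
qual = "I" * len(read)

trimmed_seq, trimmed_qual = trim_adapter(read, qual, ADAPTER)
print(trimmed_seq, keep_read(trimmed_seq, trimmed_qual))
```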

The first step is to index the reference genome by typing the following command. Then type these commands to map the reads to the human genome. Next, run the scripts that annotate the reads.
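The counting step that follows mapping can be pictured with a toy example. The sketch below assumes each mapped read is given as a (chromosome, start, end) tuple and each gene as a named interval. It assigns a read to a gene only when the read overlaps exactly one gene. The gene names and coordinates are hypothetical, and the protocol's own annotation scripts may apply different rules.

```python
from collections import Counter


def count_reads_per_gene(alignments, genes):
    """Count aligned reads per gene.

    alignments: iterable of (chromosome, start, end), one per mapped read.
    genes: dict mapping gene name -> (chromosome, start, end).
    Reads overlapping zero genes or more than one gene are skipped.
    """
    counts = Counter({name: 0 for name in genes})
    for chrom, start, end in alignments:
        hits = [
            name
            for name, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and start <= g_end and end >= g_start
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
    return dict(counts)


# Hypothetical genes and read positions, for illustration only.
genes = {"GENE_A": ("chr1", 100, 500), "GENE_B": ("chr1", 450, 900)}
alignments = [
    ("chr1", 120, 170),  # inside GENE_A only
    ("chr1", 460, 510),  # overlaps both genes, skipped as ambiguous
    ("chr1", 600, 650),  # inside GENE_B only
    ("chr2", 120, 170),  # no gene on this chromosome
]
print(count_reads_per_gene(alignments, genes))
```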

After mapping and annotation, you can perform the differential expression analysis, which consists of finding the genes whose expression is higher or lower in one group compared to another. To identify the differentially expressed genes, or DEGs, run the following commands. After this, you can transfer the results from the Docker container to your local computer.
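As a rough illustration of what "higher or lower in one group compared to another" means in practice, the sketch below computes a log2 fold change from group means and labels a gene using a fold-change cutoff and an adjusted p-value cutoff. The pseudocount and both cutoffs are assumptions. The adjusted p-value is taken as an input, because the statistical test itself is performed by the protocol's scripts and is not reproduced here.

```python
import math


def log2_fold_change(mean_case, mean_control, pseudocount=1.0):
    """Log2 ratio of mean expression (case over control).

    The pseudocount avoids division by zero for unexpressed genes.
    """
    return math.log2((mean_case + pseudocount) / (mean_control + pseudocount))


def classify_gene(log2fc, adjusted_p, lfc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene as up, down, or not significant.

    The cutoffs are illustrative assumptions, not the protocol's settings.
    """
    if adjusted_p < p_cutoff and log2fc >= lfc_cutoff:
        return "up"
    if adjusted_p < p_cutoff and log2fc <= -lfc_cutoff:
        return "down"
    return "not significant"


# Made-up means for one gene: infected patients versus healthy controls.
lfc = log2_fold_change(mean_case=7.0, mean_control=1.0)
print(lfc, classify_gene(lfc, adjusted_p=0.01))
```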

For this, go to the terminal and type the following commands to save all the results to a local folder. To perform the remaining analyses, you also have to copy all the files in the data directory to a directory on your local computer. On your local computer, you will be able to see the directories where you saved the data from Docker.
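One common way to copy files out of a container is the standard `docker cp` command, sketched below as a command builder that does not execute anything. Whether the protocol uses `docker cp` or the shared folder is not stated in the transcript, and the container name and paths here are hypothetical.

```python
import shlex


def docker_cp_command(container_name, container_path, host_path):
    """Build a command that copies a path from a container to the host."""
    return ["docker", "cp", f"{container_name}:{container_path}", host_path]


# Hypothetical container name and paths, for illustration only.
command = docker_cp_command("transcriptome", "/home/results", "C:/Users/me/results")
print(shlex.join(command))
```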

As you can see, you can access all the libraries. You can also open the HTML file containing the quality control reports. You can also access a directory containing the differentially expressed genes.

And inside this directory, you will find the volcano plots, where you can see the genes that are up- or downregulated in one group versus another, in this case, patients infected with Chikungunya virus versus healthy controls. All the remaining steps of this protocol are going to be executed in web tools using your browser. Let's start with CEMiTool.

Go to the browser and type the following address. CEMiTool identifies co-expression modules from expression data sets provided by the users. On the main page, go to the menu and click the Run button.

This will open a new page where you can upload the expression file. This file is in the data directory on your local computer. You will see that there are three expression files, and the one that we are going to use for CEMiTool is the one normalized with the method called TMM.

Then select the phenodata file, do the same for the file containing the protein-protein interactions, and finally, upload the file containing the gene sets or pathways. The gene sets file enables CEMiTool to perform enrichment analysis for each one of the co-expression modules. Next, expand the parameter section and click Apply VST.

After that, you can just click Run CEMiTool. After you run CEMiTool, you will see that 12 co-expression modules were identified. By clicking here, you can download all the results of this analysis.

Another tool that we are going to use in this protocol is MDP, or Molecular Degree of Perturbation. Just type mdp.sysbio.tools in your browser. MDP calculates the molecular distance of each sample compared to a reference group of samples, in this case, the healthy controls, in order to find not only potential outliers but also how perturbed each sample is compared to this group.
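The idea of a distance from a reference group can be sketched as follows. This is a simplified illustration, not the exact algorithm of the MDP tool: for each gene, every sample is expressed as a z-score relative to the control samples, and the sample score is the mean absolute z-score over genes. The sample names, gene names, and values are made up.

```python
from statistics import mean, stdev


def perturbation_scores(expression, control_samples):
    """Score each sample by its average distance from the controls.

    expression: dict mapping gene -> dict mapping sample -> value.
    control_samples: names of the reference samples (at least two).
    Genes with zero variance among the controls are ignored.
    """
    samples = list(next(iter(expression.values())))
    totals = {sample: 0.0 for sample in samples}
    genes_used = 0
    for values in expression.values():
        control_values = [values[sample] for sample in control_samples]
        center, spread = mean(control_values), stdev(control_values)
        if spread == 0:
            continue
        genes_used += 1
        for sample in samples:
            totals[sample] += abs((values[sample] - center) / spread)
    if genes_used == 0:
        return totals
    return {sample: total / genes_used for sample, total in totals.items()}


# Made-up expression values: two controls and one infected sample.
expression = {
    "gene1": {"control1": 1.0, "control2": 3.0, "patient1": 10.0},
    "gene2": {"control1": 2.0, "control2": 4.0, "patient1": 3.0},
}
scores = perturbation_scores(expression, ["control1", "control2"])
print(scores)
```

A sample whose score is far above those of the controls would be a candidate for closer inspection, which mirrors how the tool's bar graph is read.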

On the Run page, you can upload the expression file by clicking the button and selecting the file. Then upload the phenodata file. Then define which column contains the information about the group or class, and then which class or group corresponds to the control group.

After this, you can just run MDP. The bar graph shows the molecular degree of perturbation score of each sample as a bar, and the colors represent the different groups. The box plot is another way of visualizing the same results, where each dot is a different sample, separated into the two groups.

To perform the functional analysis, we are going to use the Enrichr tool. For this, select the list of genes that were differentially expressed, either up- or downregulated, and use it as an input gene list in Enrichr. You will see that there are different tabs.

All the results can also be downloaded to your local computer. The computing environment for transcriptome analysis has been placed on the Docker platform. This approach allows users with no prior experience with Linux systems to use a terminal.

In this container, there is a predefined folder structure for the datasets and scripts, which are necessary for all the analyses. In the pipeline, users will use blood transcriptome data from 20 healthy individuals and 39 patients acutely infected with Chikungunya virus. The sequencing platform returns a set of FASTQ files containing the DNA sequences, i.e. the reads, and the associated quality for each nucleotide base.

The Phred quality scale indicates the probability of an incorrect reading for each base. Tools identify and remove low-quality reads from the samples to increase the probability of mapping the reads.
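The relationship between a Phred score Q and the error probability P is P = 10^(-Q/10), so Q = 20 corresponds to a 1 in 100 chance of an incorrect base. The sketch below decodes the quality line of a FASTQ record and converts it to error probabilities. It assumes the common Phred+33 ASCII encoding, and the record shown is made up.

```python
def phred_to_error_probability(score):
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-score / 10)


def decode_quality(quality_string, offset=33):
    """Decode an ASCII quality string into Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality_string]


def parse_fastq(lines):
    """Yield (read_id, sequence, quality) from FASTQ text, four lines per read."""
    records = [line.rstrip("\n") for line in lines if line.strip()]
    for i in range(0, len(records) - 3, 4):
        header, sequence, _plus, quality = records[i:i + 4]
        yield header[1:], sequence, quality


# A made-up FASTQ record, for illustration only.
fastq_text = ["@read1\n", "ACG\n", "+\n", "I5#\n"]
for read_id, sequence, quality in parse_fastq(fastq_text):
    for base, score in zip(sequence, decode_quality(quality)):
        print(read_id, base, score, phred_to_error_probability(score))
```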

In this step, the mapping module, the recovered high-quality reads are used as input and aligned against the human reference genome. CEMiTool identifies and analyzes co-expression modules. Genes within the same module are co-expressed, which means that they exhibit similar patterns of expression across the samples of the data set.
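One simple way to quantify "similar patterns of expression across the samples" is the Pearson correlation between two genes' expression profiles. The sketch below illustrates only that notion of similarity. CEMiTool's module detection is considerably more elaborate and is not reproduced here. The gene names, the values, and the 0.8 threshold are assumptions.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no pattern to correlate
    return covariance / math.sqrt(var_x * var_y)


def coexpressed_pairs(expression, threshold=0.8):
    """Return gene pairs whose profiles correlate at or above the threshold."""
    return [
        (gene_a, gene_b)
        for gene_a, gene_b in combinations(sorted(expression), 2)
        if pearson(expression[gene_a], expression[gene_b]) >= threshold
    ]


# Made-up profiles across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
print(coexpressed_pairs(expression))
```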

The network analysis provides information about the most connected genes, i.e. the hubs. The names of those genes are shown in the network.

The size of each node is proportional to its degree of connectivity. The results obtained from the DEG analysis were summarized in volcano plots. The analysis of the molecular degree of perturbation permits the identification of perturbed samples from healthy and infected individuals.
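The degree of connectivity behind the hubs and node sizes is simply the number of edges touching each gene. A minimal sketch, assuming the network is available as a list of gene pairs (the gene names are hypothetical):

```python
from collections import Counter


def node_degrees(edges):
    """Count how many edges touch each node of an undirected network."""
    degrees = Counter()
    for gene_a, gene_b in edges:
        degrees[gene_a] += 1
        degrees[gene_b] += 1
    return degrees


def top_hubs(edges, n=3):
    """Return the n most connected genes, i.e. the hubs."""
    return [gene for gene, _ in node_degrees(edges).most_common(n)]


# Hypothetical interaction network.
edges = [("GENE_A", "GENE_B"), ("GENE_A", "GENE_C"), ("GENE_A", "GENE_D"), ("GENE_B", "GENE_C")]
print(node_degrees(edges))
print(top_hubs(edges, n=1))
```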

MDP suggests which samples are potential biological outliers. Removing those samples will impact the downstream results. A functional enrichment analysis using over-representation analysis (ORA) can be performed with the Enrichr tool.

This step helps to interpret the results by revealing common functional roles of several genes that were differentially expressed. The biological processes shown in the bar graphs are the top 10 enriched gene sets based on their p-value ranking. In conclusion, this protocol covers all steps of RNA-Seq analysis.
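The p-value ranking of gene sets mentioned for the enrichment bar graphs can be illustrated with a hypergeometric over-representation test. This is a sketch of the general technique, not Enrichr's implementation. The gene set names, the gene identifiers, and the background size are made up.

```python
from math import comb


def overrepresentation_pvalue(background_size, set_size, list_size, overlap):
    """Probability of seeing at least `overlap` set members in the gene list.

    Hypergeometric upper tail: the gene list is treated as a random draw
    of `list_size` genes from `background_size` background genes.
    """
    total = comb(background_size, list_size)
    upper = min(set_size, list_size)
    favorable = sum(
        comb(set_size, i) * comb(background_size - set_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return favorable / total


def top_enriched(gene_list, gene_sets, background_size, top_n=10):
    """Rank gene sets by p-value and return the top_n as (name, overlap, p)."""
    genes = set(gene_list)
    results = []
    for name, members in gene_sets.items():
        members = set(members)
        overlap = len(genes & members)
        p_value = overrepresentation_pvalue(
            background_size, len(members), len(genes), overlap
        )
        results.append((name, overlap, p_value))
    results.sort(key=lambda item: item[2])
    return results[:top_n]


# Made-up differentially expressed genes and gene sets.
degs = ["g1", "g2", "g3"]
gene_sets = {"set_one": ["g1", "g2", "g3"], "set_two": ["g7", "g8", "g9"]}
for name, overlap, p_value in top_enriched(degs, gene_sets, background_size=20):
    print(name, overlap, p_value)
```

In a real analysis the p-values would also be corrected for multiple testing before interpretation.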

The pipeline was developed and encapsulated, using the non-commercial system named Docker, in an image that was made available to the scientific community. Thanks to the container system, all scripts and tools are kept at the same specific versions to guarantee reproducibility.

Furthermore, parts of the bioinformatics analysis were performed via free, user-friendly web tools.
