This protocol describes a workflow from ex vivo or in vitro cell cultures to transcriptomic data pre-processing for cost-effective transcriptome-based drug screening.
Transcriptomics provides comprehensive insights into cellular programs and their responses to perturbations. Despite a significant decrease in the costs of library production and sequencing in the last decade, applying these technologies at the scale necessary for drug screening remains prohibitively expensive, obstructing the immense potential of these methods. Our study presents a cost-effective system for transcriptome-based drug screening, combining miniaturized perturbation cultures with mini-bulk transcriptomics. The optimized mini-bulk protocol provides informative biological signals at cost-effective sequencing depth, enabling extensive screening of known drugs and new molecules. Depending on the chosen treatment and incubation time, this protocol yields sequencing libraries within approximately 2 days. Because the protocol contains several stopping points, library preparation and sequencing can be performed at different times. A high number of samples can be processed simultaneously; up to 384 samples were measured without loss of data quality. There are also no known limitations to the number of conditions and/or drugs, although variability in optimal drug incubation times should be considered.
The development of new drugs is a complex and time-consuming process that involves identifying potential drugs and their targets, optimizing and synthesizing drug candidates, and testing their efficacy and safety in preclinical and clinical trials1. Traditional methods for drug screening, i.e., the systematic assessment of libraries of candidate compounds for therapeutic purposes, involve the use of animal models or cell-based assays to test the effects on specific targets or pathways. While these methods have been successful in identifying drug candidates, they often do not provide sufficient insights into the complex molecular mechanisms underlying drug efficacy, toxicity, and potential side effects.
Assessing genome-wide transcriptional states presents a powerful approach to overcome current limitations in drug screening, as it enables comprehensive assessments of gene expression in response to drug treatments2. By measuring, in a genome-wide fashion, the RNA transcripts expressed at a given time, transcriptomics aims to provide a holistic view of the transcriptional changes that occur in response to drugs, including changes in gene expression patterns, alternative splicing, and non-coding RNA expression3. This information can be used to determine drug targets, predict drug efficacy and toxicity, and optimize drug dosing and treatment regimens.
One of the key benefits of combining transcriptomics with unbiased drug screening is the potential to identify new drug targets that have not been previously considered. Conventional drug screening approaches often focus on established target molecules or pathways, hindering the identification of new targets and potentially resulting in drugs with unforeseen side effects and restricted effectiveness. Transcriptomics can overcome these limitations by providing insights into the molecular changes that occur in response to drug treatment, uncovering potential targets or pathways that may not have been previously considered2.
In addition to the identification of new drug targets, transcriptomics can also be used to predict drug efficacy and toxicity. By analyzing the gene expression patterns associated with drug responses, biomarkers can be developed to predict a patient’s response to a particular drug or treatment regimen. This can also help to optimize drug dosing and reduce the risk of adverse side effects4.
Despite its potential benefits, the cost of transcriptomics remains a significant barrier to its widespread application in drug screening. Transcriptomic analysis requires specialized equipment, technical expertise, and data analysis, which can make it challenging for smaller research teams or organizations with limited funding to utilize transcriptomics in drug screening. However, the cost of transcriptomics has been steadily decreasing, making it more accessible to the research communities. Additionally, advancements in technology and data analysis methods have made transcriptomics more efficient and cost-effective, further increasing its accessibility2.
In this protocol, we describe a high-dimensional and explorative system for transcriptome-based drug screening, combining miniaturized perturbation cultures with mini-bulk transcriptomics analysis5,6. With this protocol, it is possible to reduce the cost per sample to 1/6th of the current cost of commercial solutions for full-length mRNA sequencing. The protocol requires only standard laboratory equipment, the only exception being the use of short-read sequencing technologies, which can be outsourced if sequencing instruments are not available in-house. The optimized mini-bulk protocol provides information-rich biological signals at cost-effective sequencing depth, enabling extensive screening of known drugs and new molecules.
The aim of the experiment is to screen for drug activity on PBMCs in different biological contexts. This protocol can be applied to any biological question where several drugs should be tested with a transcriptomic readout, giving a transcriptome-wide view of the cellular effect of the treatment.
This protocol follows the guidelines of the local ethics committees of the University of Bonn.
1. Preparation of buffers, solutions, and equipment
2. Cell handling
NOTE: A detailed protocol for the cryopreservation of peripheral blood mononuclear cells (PBMC) from human blood can be found in7.
3. Library preparation for sequencing
4. Sequencing and data pre-processing
Following the reported protocol, human PBMCs were seeded, treated with different immunomodulatory drugs and, after different incubation times, harvested for bulk transcriptomic analysis using the sequencing protocol (Figure 1).
Ideal drug concentrations and incubation times for test compounds should be identified upstream of this protocol with the help of complementary experimental strategies and based on the specific scientific question. In most cases, incubation for 2 – 4 h and for 24 h should provide a representation of early and late transcriptional responses to treatment.
The most important results for evaluating correct execution of the protocol are the cDNA and library QCs (Figure 2 and Figure 3). The cDNA profile should have a broad distribution with an average size of > 1000 bp (Figure 2); a lower average size or an accumulation of molecules at low molecular weight (Figure 4) might indicate low RNA input or RNA degradation.
Preparing a good sequencing library is also an important step in the protocol. Post-tagmentation libraries should have a rather narrow distribution centered around 250 bp (Figure 3); longer fragments will perform poorly during sequencing (Figure 5).
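The two QC criteria above can be written as a simple check on the average fragment sizes reported by the electrophoresis instrument. The following is a minimal sketch, not part of the published protocol: the function names are hypothetical, and the ±100 bp tolerance around the 250 bp library target is an assumption chosen for illustration only.

```python
def classify_cdna(mean_size_bp, min_mean_bp=1000.0):
    """Flag cDNA whose average size does not exceed 1000 bp (criterion from the text)."""
    if mean_size_bp > min_mean_bp:
        return "pass"
    return "check RNA input and RNA integrity"


def classify_library(mean_size_bp, target_bp=250.0, tolerance_bp=100.0):
    """Flag libraries far from the ~250 bp target.

    tolerance_bp is an ASSUMED window, not a value given in the protocol.
    """
    if abs(mean_size_bp - target_bp) <= tolerance_bp:
        return "pass"
    return "check tagmentation (fragment size off target)"


# Hypothetical average sizes (bp) read from electrophoresis traces.
for sample, cdna_bp, library_bp in [("A1", 1450, 260), ("A2", 180, 640)]:
    print(sample, classify_cdna(cdna_bp), "|", classify_library(library_bp))
```

The tolerance should be adjusted to the sequencing platform and to the size distribution observed in well-performing runs.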
Following sequencing, raw FASTQ files are aligned against the appropriate reference genome (e.g., human or mouse), and transcript abundance is quantified for each sample (see the nf-core RNA-seq pipeline, https://nf-co.re/rnaseq). Exploratory data analysis is then performed to check the overall quality of the data. Aligned data should capture a high number of protein-coding genes, as only polyadenylated RNA is captured with this protocol (Figure 6A). In human samples, we expect to capture between 15,000 and 20,000 transcripts (this value is based on the GENCODE 27 human reference genome annotation11). Further exploratory data analysis may include principal component analysis (PCA), which visualizes the underlying structure of the data in a two-dimensional plot. Figure 6B shows an exemplary PCA plot in which dots are colored by treatment, revealing three clusters of experimental conditions that lead to similar transcriptomic profiles. Biological replicates (dots with the same color) are also transcriptionally similar, indicating good robustness of the protocol. The results shown in this figure were generated with a pipeline available on GitHub that is based on the DESeq2 workflow (https://github.com/jsschrepping/RNA-DESeq2).
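To illustrate the two exploratory checks described above (number of detected genes and PCA), the following minimal sketch uses NumPy on a simulated count matrix. It is not the DESeq2-based pipeline used for Figure 6: the log2(CPM + 1) transformation, the detection threshold of one read, the function names, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def detected_genes(counts, min_count=1):
    """Count genes per sample with at least `min_count` reads.

    counts: array of shape (n_genes, n_samples) holding raw read counts.
    """
    return (np.asarray(counts) >= min_count).sum(axis=0)


def pca_scores(counts, n_components=2):
    """PCA on log2(CPM + 1) values; returns sample scores and explained variance."""
    counts = np.asarray(counts, dtype=float)
    # Normalize each sample to counts per million, then log-transform.
    cpm = counts / counts.sum(axis=0, keepdims=True) * 1e6
    log_cpm = np.log2(cpm + 1.0)
    # Arrange as samples x genes and center each gene.
    x = log_cpm.T - log_cpm.T.mean(axis=0, keepdims=True)
    # The singular value decomposition yields the principal components.
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return u[:, :n_components] * s[:n_components], explained[:n_components]


# Simulated example: 200 genes, 3 control and 3 treated replicates.
rng = np.random.default_rng(0)
means = np.full((200, 6), 50.0)
means[:50, 3:] = 500.0  # hypothetical treatment effect on the first 50 genes
counts = rng.poisson(means)

print("Detected genes per sample:", detected_genes(counts))
scores, explained = pca_scores(counts)
print("PC1/PC2 scores shape:", scores.shape)
print("Explained variance ratio:", explained)
```

With real data, the count matrix would come from the quantification step, and replicates of the same treatment are expected to lie close together in the PC1/PC2 plane.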
Figure 1: Time estimations and workflow. (A) Time estimations of this protocol for a run with 96 samples. (B) Diagram of each step within the protocol from thawing the cells until data pre-processing. The diagram should be read from top to bottom following the arrows. Circled arrows describe a repetition of the step. Red dots represent stopping points further described in the protocol.
Figure 2: Exemplary results for cDNA libraries. Results of a miniaturized electrophoresis showing the size distribution of an exemplary cDNA library. Upper and lower signals represent markers used to align the sample. Blue lines indicate the average fragment sizes (blue brackets). The cDNA profile shows a broad distribution with an average size of > 1000 bp.
Figure 3: Exemplary results for sequencing libraries. Results of a miniaturized electrophoresis showing the size distribution of successfully prepared sequencing libraries with a narrow distribution and an average size of around 250 bp. Upper and lower signals represent markers used to align the sample. Blue lines indicate the average fragment sizes (blue brackets).
Figure 4: Exemplary suboptimal results for cDNA libraries. Results of a miniaturized electrophoresis showing the size distribution of a suboptimal cDNA library with an average size of < 200 bp. Upper and lower signals represent markers used to align the sample. Blue lines indicate the average fragment sizes (blue brackets).
Figure 5: Exemplary suboptimal results for sequencing libraries. Results of a miniaturized electrophoresis showing the size distribution of a suboptimal sequencing library containing longer fragments of 200 – 1000 bp. Upper and lower signals represent markers used to align the sample. Blue lines indicate the average fragment sizes (blue brackets).
Figure 6: Exploratory data analysis. Representative results of exploratory data analysis showing the effect of selected immunomodulatory drugs on PBMCs. (A) Bar graph of the number of detected genes (Y-axes) separated by gene type (X-axes). (B) PCA of all genes in the dataset colored by the different drug treatments. Dots with the same color are biological replicates.
Reagent | Concentration | Volume [µL] /rxn |
Guanidine hydrochloride | 80 mM | 7.50 |
Deoxynucleotide triphosphates (dNTPs) | 10 mM each | 6.52 |
SMART dT30VN primer | 100 µM | 0.33 |
Nuclease-free water | | 0.65 |
Total volume | | 15.00 |
Table 1: Lysis buffer.
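When many samples are processed in parallel, the per-reaction volumes in Table 1 are usually scaled to a master mix. The sketch below shows that arithmetic; the 10% pipetting overage and the function name are assumptions and are not specified in the protocol.

```python
# Per-reaction volumes (µL) copied from Table 1.
LYSIS_BUFFER_UL = {
    "Guanidine hydrochloride (80 mM)": 7.50,
    "dNTPs (10 mM each)": 6.52,
    "SMART dT30VN primer (100 µM)": 0.33,
    "Nuclease-free water": 0.65,
}


def master_mix(per_rxn_ul, n_reactions, overage=0.10):
    """Scale per-reaction volumes to n_reactions plus a fractional overage.

    overage=0.10 (10% extra) is an ASSUMED allowance for pipetting loss.
    """
    factor = n_reactions * (1.0 + overage)
    return {reagent: round(vol * factor, 2) for reagent, vol in per_rxn_ul.items()}


# Sanity check: the components add up to the stated total of 15.00 µL per reaction.
assert abs(sum(LYSIS_BUFFER_UL.values()) - 15.00) < 1e-9

# Example: one 96-well plate.
for reagent, volume_ul in master_mix(LYSIS_BUFFER_UL, n_reactions=96).items():
    print(f"{reagent}: {volume_ul} µL")
```

The same function can be applied to the other reaction mixes by passing their per-reaction volumes.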
Reagent | Volume [µL] /rxn |
SSRT II buffer (5x) | 2.00 |
DTT (100 mM) | 0.50 |
Betaine (5 M) | 2.00 |
MgCl2 (1 M) | 0.14 |
SSRT II (200 U/µL) | 0.25 |
RNAse inhibitor (40 U/µL) | 0.25 |
TSO-LNA (100 µM) | 0.20 |
Nuclease-free water | 0.66 |
Total volume | 6.00 |
Table 2: Reverse transcriptase (RT) reaction mix.
mRNA denaturation | | |
Step | Temperature | Duration |
mRNA denaturation | 95 °C | 2 min |
Cooling | on ice | 2 min |
Reverse transcription | | |
Step | Temperature | Duration |
Reverse transcription | 42 °C | 90 min |
Enzyme inactivation | 70 °C | 15 min |
Hold | 4 °C | hold |
Table 3: Thermocycler program for mRNA denaturation and reverse transcriptase (RT).
Reagent | Volume [µL] /rxn |
High-fidelity DNA polymerase | 12.50 |
ISPCR primer (10 µM) | 0.15 |
Nuclease-free water | 2.35 |
Total volume | 15.00 |
Table 4: Pre-amplification mix.
Step | Temperature | Duration | Cycles |
Initial denaturation | 98 °C | 3 min | |
Denaturation | 98 °C | 20 s | 16 – 18 |
Annealing | 67 °C | 20 s | 16 – 18 |
Extension | 72 °C | 6 min | 16 – 18 |
Hold | 4 °C | hold | |
Table 5: Pre-amplification thermocycler program.
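For planning purposes, the block time of the pre-amplification program in Table 5 can be estimated from the step durations. The sketch below sums the programmed hold times only; ramping between temperatures is not included, so the real run time will be somewhat longer. The function name is hypothetical.

```python
def program_minutes(initial_s, cycle_steps_s, n_cycles, final_s=0):
    """Sum of programmed step durations in minutes (ramp times are NOT included)."""
    total_s = initial_s + n_cycles * sum(cycle_steps_s) + final_s
    return total_s / 60


# Table 5: 3 min initial denaturation, then cycles of 20 s / 20 s / 6 min.
for n_cycles in (16, 18):
    minutes = program_minutes(initial_s=180, cycle_steps_s=[20, 20, 360], n_cycles=n_cycles)
    print(f"{n_cycles} cycles: about {minutes:.0f} min plus ramping")
```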
Steps | Temperature | Duration |
Tagmentation | 55 °C | 8 min |
Hold | 4 °C | hold |
Table 6: Tagmentation thermocycler program.
Tagmentation mix | |
Reagent | Volume [µL] /rxn |
Amplicon Tagment Mix (ATM) | 1.0 |
Tagment DNA Buffer (TD) | 2.0 |
Total volume | 3.0 |
Enrichment PCR mix | |
Reagent | Volume [µL] /rxn |
High-fidelity DNA polymerase | 7.0 |
Nextera-compatible indexing primer | 2.0 |
Total volume | 9.0 |
Table 7: Tagmentation mix and enrichment PCR mix.
Step | Temperature | Duration | Cycles |
Hot Start | 72 °C | 5 min | |
Initial denaturation | 98 °C | 30 s | |
Denaturation | 98 °C | 10 s | 16 |
Annealing | 60 °C | 30 s | 16 |
Extension | 72 °C | 1 min | 16 |
Final Extension | 72 °C | 5 min | |
Hold | 4 °C | hold | |
Table 8: Enrichment PCR thermocycler program.
Instrument | Loading concentration |
MiSeq v2 | 10 pM |
NextSeq 500/550 | 1.4 pM |
NovaSeq 6000 | 1250 pM |
Table 9: Examples for loading concentrations of common sequencing instruments.
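To reach the loading concentrations in Table 9, the mass concentration of the library pool first has to be converted to molarity. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the input values are hypothetical, and instrument-specific denaturation and dilution procedures are not covered.

```python
# Loading concentrations (pM) copied from Table 9.
LOADING_PM = {"MiSeq v2": 10.0, "NextSeq 500/550": 1.4, "NovaSeq 6000": 1250.0}


def library_molarity_nm(conc_ng_per_ul, mean_size_bp):
    """Convert ng/µL to nM, assuming 660 g/mol per bp of double-stranded DNA."""
    return conc_ng_per_ul / (660.0 * mean_size_bp) * 1e6


def dilution_factor(stock_nm, target_pm):
    """Fold dilution needed to go from a stock in nM to a target in pM."""
    return stock_nm * 1000.0 / target_pm


# Hypothetical pool: 2.0 ng/µL (fluorometer) with a 250 bp average size (electrophoresis).
stock_nm = library_molarity_nm(2.0, 250)
print(f"Pool molarity: {stock_nm:.2f} nM")
for instrument, target_pm in LOADING_PM.items():
    print(f"{instrument}: dilute about {dilution_factor(stock_nm, target_pm):.0f}-fold")
```

In practice, the dilution is usually carried out in several steps according to the instrument manufacturer's instructions.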
Drug discovery and drug development can greatly benefit from the holistic view of cellular processes that bulk transcriptomics can provide. Nevertheless, this approach is often limited by the high cost of experiments with standard bulk RNA-seq protocols, which hinders its application in academic settings and limits its potential for industrial scalability.
The most critical steps of the protocol are cell thawing and the initial steps of library preparation. Ensuring high viability of the cells after thawing is critical for successful treatments and transcriptomic analysis. The first steps of cell harvesting and library preparation until cDNA synthesis are critical for preserving the integrity of the RNA. It is crucial at this stage to keep the cell lysate on ice at all times and process the samples as quickly as possible. If excessive RNA degradation is observed, lab surfaces and equipment should be cleaned with specific products to inhibit any RNase activity.
The protocol described here is currently optimized for drug treatment on PBMCs from healthy donors for a maximum of 24 h. To perform the experiment with a different cell type or for longer incubation time, cell numbers for seeding and culturing conditions might need to be optimized accordingly.
With this protocol, we provide a workflow for transcriptomic analysis upon drug treatment of PBMCs that uses standard laboratory equipment and does not require commercial kits. With this approach, and by omitting the RNA purification step, we reduced the costs significantly, allowing for parallel analysis of a high number of compounds.
Because this protocol is based on an in vitro assay, its major limitation is that it cannot evaluate any metabolic processing of the drugs that could lead to different bioactivity. Additionally, the number of captured transcripts and sequencing depth will be lower than in standard bulk full-length transcriptomics methods, preventing the use of these data for applications that require higher information content, such as differential splicing or quantification of single nucleotide polymorphisms.
The authors have nothing to disclose.
J.L.S. is supported by the German Research Foundation (DFG) under Germany's Excellence Strategy (EXC2151-390873048), as well as under SCHU 950/8-1; GRK 2168, TP11; CRC SFB 1454 Metaflammation, IRTG GRK 2168, WGGC INST 216/981-1, CCU INST 217/988-1, the BMBF-funded excellence project Diet-Body-Brain (DietBB); and the EU project SYSCID under grant number 733100. M.B. is supported by DFG (IRTG2168-272482170, SFB1454-432325352). L.B. is supported by DFG (ImmuDiet BO 6228/2-1 – Project number 513977171) and Germany's Excellence Strategy (EXC2151-390873048). Images created with BioRender.com.
50 mL conical tube | Fisher Scientific | 10203001 | |
Adhesive PCR Plate Seals | Thermo Fisher Scientific | AB0558 | |
Amplicon Tagment Mix (ATM) | Illumina | FC-131-1096 | Nextera XT DNA Library Prep Kit (96 samples) |
AMPure XP beads | Beckman Coulter | A63881 | |
Betaine | Sigma-Aldrich | 61962 | |
Cell culture grade 96-well plates | Thermo Fisher Scientific | 260860 | |
Cell culture vacuum pump (VACUSAFE) | Integra Bioscience | 158300 | |
Deoxynucleotide triphosphates (dNTPs) mix 10 mM each | Fermentas | R0192 | |
DMSO | Sigma-Aldrich | 276855 | |
DTT (100 mM) | Invitrogen | 18064-014 | |
EDTA | Sigma-Aldrich | 798681 | for adherent cells |
Ethanol | Sigma-Aldrich | 51976 | |
Fetal Bovine Serum | Thermo Fisher Scientific | 26140079 | |
Filter tips (10 µL) | Gilson | F171203 | |
Filter tips (100 µL) | Gilson | F171403 | |
Filter tips (20 µL) | Gilson | F171303 | |
Filter tips (200 µL) | Gilson | F171503 | |
Guanidine Hydrochloride | Sigma-Aldrich | G3272 | |
ISPCR primer (10 µM) | Biomers.net GmbH | SP10006 | 5′-AAGCAGTGGTATCAACGCAGAGT-3′ |
KAPA HiFi HotStart ReadyMix (2X) | KAPA Biosystems | KK2601 | |
Magnesium chloride (MgCl2) | Sigma-Aldrich | M8266 | |
Magnetic stand 96 | Ambion | AM10027 | |
Neutralize Tagment (NT) Buffer | Illumina | FC-131-1096 | Nextera XT DNA Library Prep Kit (96 samples), alternatively 0.2 % SDS |
Nextera-compatible indexing primer | Illumina | ||
Nuclease-free water | Invitrogen | 10977049 | |
PBS | Thermo Fisher Scientific | AM9624 | |
PCR 96-well plates | Thermo Fisher Scientific | AB0600 | |
PCR plate sealer | Thermo Fisher Scientific | HSF0031 | |
Penicillin / Streptomycin | Thermo Fisher Scientific | 15070063 | |
Qubit 4 fluorometer | Invitrogen | 15723679 | |
Recombinant RNase inhibitor (40 U/µL) | TAKARA | 2313A | |
RPMI-1640 cell culture medium | Gibco | 61870036 | If not working with PBMCs, adjust to cell type |
SMART dT30VN primer | Sigma-Aldrich | | 5′ Bio-AAGCAGTGGTATCAACGCAGAGTACT30VN-3′ |
Standard lab equipment | various | various | e.g. centrifuge, ice machine, ice bucket, distilled water, water bath |
SuperScript II Reverse Transcriptase (SSRT II) | Thermo Fisher Scientific | 18064-014 | |
SuperScript II Reverse Transcriptase (SSRT II) buffer (5x) | Thermo Fisher Scientific | 18064-014 | |
Tagment DNA Buffer (TD) | Illumina | FC-131-1096 | Nextera XT DNA Library Prep Kit (96 samples) |
TapeStation system 4200 | Agilent | G2991BA | |
Thermocycler (S1000) | Bio-Rad | 1852148 | |
TSO-LNA (100 µM) | Eurogentec | | 5′ Biotin AAGCAGTGGTATCAACGCAGAGTACAT(G)(G){G |
Vortex-Genie 2 Mixer | Sigma-Aldrich | Z258415 |