Cancer Research

A Data Integration Workflow to Identify Drug Combinations Targeting Synthetic Lethal Interactions

Published: May 27, 2021 doi: 10.3791/60328

Summary

Large genetic screens in model organisms have led to the identification of negative genetic interactions. Here, we describe a data integration workflow using data from genetic screens in model organisms to delineate drug combinations targeting synthetic lethal interactions in cancer.

Abstract

A synthetic lethal interaction between two genes is given when knock-out of either one of the two genes does not affect cell viability but knock-out of both synthetic lethal interactors leads to loss of cell viability or cell death. The best studied synthetic lethal interaction is between BRCA1/2 and PARP1, with PARP1 inhibitors being used in clinical practice to treat patients with BRCA1/2 mutated tumors. Large genetic screens in model organisms but also in haploid human cell lines have led to the identification of numerous additional synthetic lethal interaction pairs, all being potential targets of interest in the development of novel tumor therapies. One approach is to therapeutically target genes with a synthetic lethal interactor that is mutated or significantly downregulated in the tumor of interest. A second approach is to formulate drug combinations addressing synthetic lethal interactions. In this article, we outline a data integration workflow to evaluate and identify drug combinations targeting synthetic lethal interactions. We make use of available datasets on synthetic lethal interaction pairs, homology mapping resources, drug-target links from dedicated databases, as well as information on drugs being investigated in clinical trials in the disease area of interest. We further highlight key findings of two recent studies of our group on drug combination assessment in the context of ovarian and breast cancer.

Introduction

Synthetic lethality defines an association of two genes, where loss of one gene does not affect viability, but loss of both genes leads to cell death. It was first described in 1946 by Dobzhansky while analyzing various phenotypes of Drosophila by breeding homozygous mutants1. Mutants that did not produce viable offspring, although viable themselves, exhibited lethal phenotypes when crossed with certain other mutants, laying the groundwork for the theory of synthetic lethality. Hartwell and colleagues suggested that this concept might be applicable to cancer therapy in humans2. Pharmacologically induced synthetic lethality could rely on just one mutation, provided that the mutated gene’s synthetic lethal partner is targetable by a pharmacological compound. The first gene pair to enable pharmacological induction of synthetic lethality was BRCA1/2 and PARP1. PARP1 functions as a sensor for DNA damage and binds to sites of double and single DNA strand breaks, supercoils and crossovers3. BRCA1 and BRCA2 play major roles in the repair of DNA double-strand breaks through homologous recombination4. Farmer and colleagues reported that cells deficient in BRCA1/2 were susceptible to PARP inhibition, while no cytotoxicity was observed in BRCA wild-type cells5. Ultimately, PARP inhibitors were approved for the treatment of BRCA-deficient breast and ovarian cancer6,7. Further synthetic lethal gene pairs leading to clinical approval of pharmacological compounds are much anticipated and a major focus of recent cancer research efforts8.

Synthetic lethal gene interactions were modelled in multiple organisms including fruit flies, C. elegans and yeast2. Using various approaches including RNA interference and CRISPR/Cas library knockouts, novel synthetic lethal gene pairs were discovered in recent years9,10,11. A protocol on the experimental procedures of RNAi in combination with CRISPR/Cas was recently published by Housden and colleagues12. Meanwhile, researchers also conducted large screens in haploid human cells to identify synthetic lethal interactions13,14. In silico methods such as biological network analysis and machine learning have also shown promise in the discovery of synthetic lethal interactions15,16.

Conceptually, one approach to exploiting synthetic lethal interactions in anti-tumor therapy is to identify mutated or non-functional proteins in tumor cells, which makes their synthetic lethal interaction partners promising drug targets for therapeutic intervention. Due to the heterogeneity of most tumor types, researchers have started the search for so-called synthetic lethal hub proteins. These synthetic lethal hubs have a number of synthetic lethal interaction partners that are either mutated and therefore non-functional or significantly downregulated in tumor samples. Addressing such synthetic lethal hubs holds promise for increasing drug efficacy or overcoming drug resistance, as has been shown, for instance, in vincristine-resistant neuroblastoma17. A second approach to enhancing drug treatment based on the concept of synthetic lethal interactions is to identify drug combinations targeting synthetic lethal interactions. This could lead to new combinations of already approved single anti-tumor therapies and to the repositioning of drugs from other disease areas to the field of oncology.

In this article, we present a step-by-step procedure to yield a list of drug combinations that target synthetic lethal interaction pairs. In this workflow, we (i) use data on synthetic lethal interactions from BioGRID and (ii) information on homologous genes from Ensembl, (iii) retrieve drug-target pairs from DrugBank, (iv) build disease-drug associations from ClinicalTrials.gov, and (v) hence generate a set of drug combinations addressing synthetic lethal interactions. Lastly, we provide drug combinations in the context of ovarian and breast cancer in the representative results section.


Protocol

1. Retrieving synthetic lethal gene pairs

  1. Data retrieval from BioGRID.
    1. Download the latest BioGRID interaction file in tab2 format from https://downloads.thebiogrid.org/Download/BioGRID/Latest-Release/BIOGRID-ALL-LATEST.tab2.zip either using a web browser or directly from the Linux command line using curl or wget18.

      ##download and unpack the latest BioGRID interaction file
      #download latest BioGRID interaction file using curl
      curl -o biogrid_latest.zip https://downloads.thebiogrid.org/Download/BioGRID/Latest-Release/BIOGRID-ALL-LATEST.tab2.zip
      #unpack the downloaded data file
      unzip biogrid_latest.zip

    2. After the zip archive has been downloaded, unpack the archive and note the name of the actual dataset file (BIOGRID-ALL-X.X.X.tab2.txt) for subsequent steps. The BioGRID data file holds interactions of different types that will be filtered in the next step.
      NOTE: Other sources (e.g. DRYGIN, SynlethDB) holding synthetic lethal interactions exist, as outlined in the discussion.
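Because the following steps address BioGRID columns by number, it can help to list the header fields together with their positions before filtering. The helper below is a minimal sketch; the function name list_columns is introduced here for illustration and is not part of the published workflow.

```shell
# Hypothetical helper: print each header field of a tab-separated file
# together with its column number (one field per line).
list_columns() (
  head -n 1 "$1" | tr '\t' '\n' | awk '{ print NR "\t" $0 }'
)

# Example call (adapt the file name to the unpacked release):
# list_columns BIOGRID-ALL-X.X.X.tab2.txt
```

If the numbers printed for a given release differ from those used in the snippets of this protocol, the column lists passed to cut need to be adapted accordingly.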
  2. Filter for synthetic lethality and negative genetic interactions (Experimental System).
    1. Use information in the column “Experimental System” (column number 12) that indicates the nature of supporting evidence for an interaction to identify synthetic lethal interactions.
    2. Restrict the dataset to entries with a value of either Negative Genetic or Synthetic Lethality. In the same step, retain only the columns relevant for subsequent analysis steps, as listed in Table 1 below.

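      ##restrict the BioGRID interaction file to relevant columns and only retain interactions classified as negative genetic and synthetic lethality
      #name of the unpacked BioGRID dataset file; replace X.X.X with the downloaded release
      BG="BIOGRID-ALL-X.X.X.tab2.txt"

      #cut splits on tabs by default; keep the header line and all matching interactions
      cut -f 1,8,9,12,16,17 "${BG}" \
      | awk -F "\t" '{
        if(NR == 1){
          print $0
        }else if($4 == "Negative Genetic" || $4 == "Synthetic Lethality"){
          print $0
        }
      }' > bg_synlet.txt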

      NOTE: Where ^I appears in a code snippet, it represents a horizontal tab. Additional BioGRID categories such as Synthetic Growth Defect may be included. The columns of relevance for this workflow are listed in Table 1. BioGRID also retains the scores for individual interactions. Cutoffs may be used to identify strong/high-confidence interactions.

| Column number | Column header name |
| --- | --- |
| 1 | #BioGRID Interaction ID |
| 8 | Official Symbol Interactor A |
| 9 | Official Symbol Interactor B |
| 12 | Experimental System |
| 16 | Organism Interactor A |
| 17 | Organism Interactor B |

Table 1: Relevant columns of the BioGRID datafile.
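The score cutoff mentioned in the note above could be applied along the following lines. This is a sketch under stated assumptions, not the authors' procedure: the function name filter_by_score is hypothetical, the score is assumed to sit in column 19 of the tab2 file, more negative scores are assumed to indicate stronger negative genetic interactions, and entries of type Synthetic Lethality are kept regardless of score because they frequently carry no score ("-").

```shell
# Hypothetical helper: like the filter above, but additionally carries the
# score column (assumed: column 19) and applies a cutoff to Negative Genetic
# entries. Usage: filter_by_score <biogrid tab2 file> <cutoff>
filter_by_score() (
  file="$1"; cutoff="$2"
  cut -f 1,8,9,12,16,17,19 "$file" \
  | awk -F '\t' -v cutoff="$cutoff" '
      NR == 1 { print; next }                      # keep the header line
      $4 == "Synthetic Lethality" { print; next }  # kept regardless of score
      # keep scored negative genetic interactions at or below the cutoff
      $4 == "Negative Genetic" && $7 != "-" && ($7 + 0) <= (cutoff + 0) { print }
    '
)

# Example call with an arbitrary illustrative cutoff:
# filter_by_score BIOGRID-ALL-X.X.X.tab2.txt -0.2 > bg_synlet_scored.txt
```

Score scales differ between screens, so a suitable cutoff has to be chosen per dataset; the output has one column more than bg_synlet.txt, which shifts the column numbers used in later steps.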

  3. Identify species for which synthetic lethal interactions were reported.
    1. Determine the number of synthetic lethal interaction partner tax-IDs to get an estimate on the number of synthetic lethal interactions being available per organism.

      ##count the number of appearances of each tax id in the previously extracted synthetic lethal interactions
      #cut splits on tabs by default
      cut -f 5,6 bg_synlet.txt | tail -n +2 | tr "\t" "\n" \
      | sort | uniq -c | sort -r -g

      NOTE: Step 1 results in a list of synthetic lethal interactions with gene symbols from the organisms in which the interactions were determined. The majority of synthetic lethal interactions have been determined in model organisms. When loading files into a spreadsheet program (e.g., Excel), take care that gene symbols are not altered by automatic format conversion19,20.

2. Translating synthetic lethal gene pairs to human orthologs

  1. Retrieve human orthologs for relevant model organisms identified in step 1.3.
    1. Retrieve human orthologs from Ensembl BioMart21 by linking the respective model organism gene dataset with the human gene dataset. Use the gene symbols denoting the gene in the model organism and orthologous human genes for this task. Use the Ensembl BioMart web service to automate the retrieval process and send the query directly to BioMart RESTful access for retrieving the orthologous gene pairs (see example below and Ensembl BioMart Help & Documentation for further details).

      ##retrieve human orthologs for Saccharomyces cerevisiae from Ensembl BioMart by using curl to send the BioMart query directly to the BioMart RESTful access service
      curl -o s_cerevisiae.txt --data-urlencode 'query=<?xml version="1.0" encoding="UTF-8"?>
      <!DOCTYPE Query>
      <Query virtualSchemaName = "default" formatter = "TSV" header = "0" uniqueRows = "1" count = "" datasetConfigVersion = "0.6" >
      <Dataset name = "scerevisiae_gene_ensembl" interface = "default" >
      <Attribute name = "external_gene_name" />
      </Dataset>
      <Dataset name = "hsapiens_gene_ensembl" interface = "default" >
      <Attribute name = "external_gene_name" />
      </Dataset>
      </Query>
      ' "http://www.ensembl.org/biomart/martservice"

      In order to retrieve the orthologous human genes for other model organisms, replace the value of the name attribute of the first Dataset element with the name of the respective Ensembl dataset and re-execute the query.

      NOTE: The process of ortholog mapping is well-documented in Ensembl BioMart Help & Documentation (http://www.ensembl.org/info/data/biomart/biomart_combining_species_datasets.html).
    2. Access an example BioMart query for human orthologs for Saccharomyces cerevisiae, the top species identified in step 1.3, via the URL http://www.ensembl.org/biomart/martview/9b71da1415aba480a52b8dc7dd554d63?VIRTUALSCHEMANAME=default&ATTRIBUTES=scerevisiae_gene_ensembl.default.feature_page.external_gene_name|hsapiens_gene_ensembl.default.feature_page.external_gene_name&FILTERS=&VISIBLEPANEL=linkattributepanel.
      NOTE: Other sources (e.g. Roundup, OMA Browser, HomoloGene, InParanoid) for homology mapping exist, as outlined in the discussion section of this manuscript.
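When orthologs are needed for several model organisms, the BioMart query can be generated per dataset instead of being edited by hand. The following is a minimal sketch: the function names biomart_query and fetch_orthologs are hypothetical, and the dataset names in the commented example other than scerevisiae_gene_ensembl are assumptions that should be checked against the datasets listed in BioMart.

```shell
# Hypothetical helper: print the BioMart query XML that links the given model
# organism gene dataset with the human gene dataset.
biomart_query() {
  cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query virtualSchemaName = "default" formatter = "TSV" header = "0" uniqueRows = "1" count = "" datasetConfigVersion = "0.6" >
<Dataset name = "$1" interface = "default" >
<Attribute name = "external_gene_name" />
</Dataset>
<Dataset name = "hsapiens_gene_ensembl" interface = "default" >
<Attribute name = "external_gene_name" />
</Dataset>
</Query>
EOF
}

# Hypothetical helper: send the query for one dataset and store the result.
# Usage: fetch_orthologs <dataset name> <output file>
fetch_orthologs() {
  curl -s -o "$2" --data-urlencode "query=$(biomart_query "$1")" \
    "http://www.ensembl.org/biomart/martservice"
}

# Example (requires network access; dataset names are assumptions):
# for ds in scerevisiae_gene_ensembl dmelanogaster_gene_ensembl celegans_gene_ensembl; do
#   fetch_orthologs "$ds" "${ds%%_*}.txt"
# done
```

Each resulting file has the same two-column layout as s_cerevisiae.txt and can be processed with the organism's tax-ID in step 2.2.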
  2. Add human orthologs to extracted synthetic lethal interactions.
    1. Join synthetic lethal interactions based on organism tax-ID and gene symbol with the orthologous pairs retrieved in step 2.1. For human synthetic lethal interaction pairs either create artificial orthologous pairs for each human gene present in the dataset or make sure that human synthetic lethal interactions are not discarded while joining and transfer the human gene symbols into the newly added columns.

      ##collect ortholog mappings in a single file and join with synthetic lethal interaction file
      #create a target file with headers for collecting ortholog mappings
      printf "tax_id/gene_symbol\thuman_gene_symbol\n" > mapping.txt

      #repeat this step for each model organism, take care to adapt input file name and tax-ID
      #adds for each ortholog pair in s_cerevisiae.txt a new entry in mapping.txt: The Gene Symbol is prefixed with the tax id to ease subsequent joining with the synthetic lethal interactions file
      #559292 is the tax id assumed for S. cerevisiae; verify it against the output of step 1.3
      awk -F "\t" 'BEGIN{
        OFS = "\t"
      }
      {
        if($1 != "" && $2 != ""){
          print org_tax_id"/"$1, $2
        }
      }' org_tax_id="559292" s_cerevisiae.txt >> mapping.txt

      #create artificial mapping entries for human genes (tax id 9606)
      awk -F "\t" 'BEGIN{
        OFS = "\t"
      }
      {
        if($5 == human_tax_id){
          print $5"/"$2, $2
        }
        if($6 == human_tax_id){
          print $6"/"$3, $3
        }
      }' human_tax_id="9606" bg_synlet.txt | sort -u >> mapping.txt

      #add required join keys (tax id/Gene Symbol) to synthetic lethal interactions
      awk -F "\t" 'BEGIN{
        OFS = "\t"
      }
      {
        if(NR == 1){
          print $0, "Key Interactor A", "Key Interactor B"
        }else{
          print $0, $5"/"$2, $6"/"$3
        }
      }' bg_synlet.txt > tmp_bg_synlet_w_keys.txt

      #join synthetic lethal interactions with orthologous pairs
      merge tmp_bg_synlet_w_keys.txt mapping.txt 7 1 > tmp.txt
      merge tmp.txt mapping.txt 8 1 > bg_synlet_mapped.txt

      NOTE: The merge command used in this example is not a standard Unix command. However, its implementation with the help of the GNU Core Utilities sort and join is straightforward. The command has been introduced to hide the complexity of sorting the files before they can be joined with the command join. An implementation of merge can be found at https://github.com/aheinzel/merge-sh.
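To illustrate what such a merge does, the following sketch joins two tab-separated files with header lines on one key column each, using only sort and join. It is a simplified stand-in written for this illustration, not the implementation from the repository above; the function name merge_sketch and its behavior (inner join, columns of the second file appended without its key column) are assumptions.

```shell
# Hypothetical stand-in for merge.
# Usage: merge_sketch <file A> <file B> <key column in A> <key column in B>
merge_sketch() (
  a="$1"; b="$2"; ka="$3"; kb="$4"
  tab=$(printf '\t')
  tmp=$(mktemp -d) || exit 1
  trap 'rm -rf "$tmp"' EXIT
  na=$(head -n 1 "$a" | awk -F '\t' '{ print NF }')
  nb=$(head -n 1 "$b" | awk -F '\t' '{ print NF }')
  # join output format: every column of A, then every column of B except its key
  fmt=$(awk -v na="$na" -v nb="$nb" -v kb="$kb" 'BEGIN{
    for(i = 1; i <= na; i++) s = s (i > 1 ? "," : "") "1." i
    for(i = 1; i <= nb; i++) if(i != kb) s = s ",2." i
    print s
  }')
  # header line: header of A followed by the non-key header fields of B
  head -n 1 "$b" | awk -F '\t' -v kb="$kb" -v ha="$(head -n 1 "$a")" '{
    out = ha
    for(i = 1; i <= NF; i++) if(i != kb) out = out "\t" $i
    print out
  }'
  # join requires both inputs to be sorted on the key in the same collation
  tail -n +2 "$a" | LC_ALL=C sort -t "$tab" -k "$ka,$ka" > "$tmp/a"
  tail -n +2 "$b" | LC_ALL=C sort -t "$tab" -k "$kb,$kb" > "$tmp/b"
  LC_ALL=C join -t "$tab" -1 "$ka" -2 "$kb" -o "$fmt" "$tmp/a" "$tmp/b"
)
```

Because this is an inner join, rows of the first file without a matching key in the second file are dropped, which is why human interaction partners need their own entries in mapping.txt.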
    2. For best results, use a gene identifier that uniquely identifies the gene within its namespace.
      NOTE: Step 2 results in a list of synthetic lethal interactions from multiple organisms mapped to human genes.

3. Mapping synthetic lethal interaction partners to drugs

  1. Retrieve drug-target pairs from DrugBank.
    1. Create a DrugBank account if you do not already have one, then download the DrugBank data from the downloads section of DrugBank22. Use the CSV file with drug target identifiers (protein identifiers section: https://www.drugbank.ca/releases/latest#protein-identifiers) and the DrugBank vocabulary (open data section: https://www.drugbank.ca/releases/latest#open-data) with DrugBank identifiers and names. Alternatively, extract the required information from the XML database dump.

      ##restrict the DrugBank drug target file to relevant columns and only retain entries for human molecular entities
      #adapt both paths to the downloaded DrugBank files
      DB_TARGETS="XXX"
      DB_NAMES="drugbank vocabulary.csv"

      #extract relevant columns and reformat to use tab as column separator
      csvtool -u TAB col 3,12,13 "${DB_TARGETS}" > target_to_drugs_agg.txt

      #keep the header line and human targets; output gene name and drug IDs
      awk -F "\t" 'BEGIN{
        OFS = "\t"
      }
      {
        if(NR == 1 || $2 == "Humans"){
          print $1, $3
        }
      }' target_to_drugs_agg.txt > human_target_to_drugs_agg.txt

      NOTE: DrugBank data is provided in two main formats. The complete database is available as XML file. In addition, the majority of data is made available in a series of comma-separated value (CSV) files.
    2. Be aware that DrugBank also records non-human drug targets. The species column (column number 12) can be used to extract human drug targets.
      NOTE: For better readability, the names of the extracted columns are provided in Table 2. Other sources (e.g. the Therapeutic Target Database or ChEMBL) holding drug-target links exist, as outlined in the discussion section.

| Column number | Column header name |
| --- | --- |
| 3 | Gene Name |
| 12 | Species |
| 13 | Drug IDs |

Table 2: Relevant columns of the DrugBank drug target file.

  2. Add drug names to drug-targets.
    1. Since drug name and drug-target information is provided in two separate CSV files, merge the information from the two files to subsequently add names of drugs targeting a synthetic lethal interaction partner to synthetic lethal interactions. Join the two datasets using the common DrugBank drug ID column. First normalize the drug-target dataset so that it contains only a single DrugBank drug ID per row, as the initial file may hold multiple DrugBank drug IDs in a row if a protein is targeted by multiple drugs.

      ##generate a single file holding drug target gene symbol, DrugBank drug ID and drug name
      #normalize drug-target dataset: one line per gene symbol and DrugBank drug ID
      awk -F "\t" 'BEGIN{
        OFS = "\t"
      }
      {
        if(NR == 1){
          print $0
        }else if($1 != "" && $2 != ""){
          split($2, drug_targets, ";")
          for(i in drug_targets){
            drug_target = drug_targets[i]
            gsub(/ /, "", drug_target)
            print $1, drug_target | "sort -u"
          }
        }
      }' human_target_to_drugs_agg.txt > human_target_to_drug.txt

      #extract relevant columns and reformat to use tab as column separator
      csvtool -u TAB col 1,3 "${DB_NAMES}" > drugbank_id_to_name.txt

      merge human_target_to_drug.txt \
      drugbank_id_to_name.txt 2 1 > db_human_drug_targets.txt

      NOTE: Column one and three in the drugbank vocabulary.csv file hold the DrugBank drug ID and the respective name.
  3. Add drugs targeting synthetic lethal interaction partners to synthetic lethal interaction dataset.
    1. Join the synthetic lethal interaction dataset with the drug-target drug name file generated in the previous step using the gene symbol columns to add drugs to synthetic lethal interactions. Take care to add drug names for both partners of each synthetic lethal interaction.
      ##enhance the synthetic lethal interaction file by adding drugs targeting the partners of each synthetic lethal interaction
      merge bg_synlet_mapped.txt db_human_drug_targets.txt 9 1 > tmp.txt
      merge tmp.txt db_human_drug_targets.txt 10 1 > bg_synlet_mapped_drugs.txt

      NOTE: Step 3 results in synthetic lethal interactions from multiple organisms with their orthologous human genes and drugs targeting these genes.

4. Establishing the set of currently tested drug combinations in clinical trials

  1. Get access to ClinicalTrials.gov data.
    1. Retrieve information on clinical trials in XML format from ClinicalTrials.gov on either (i) individual trials, (ii) trials resulting from a search query, or (iii) all trials in the database. Alternatively, use the resources provided by the Clinical Trials Transformation Initiative (CTTI), which also hosts all data from ClinicalTrials.gov in a relational database. See step 4.4 for further details.
      NOTE: A free account is required to access the cloud-hosted database instance provided by the Clinical Trials Transformation Initiative. In addition, a PostgreSQL client such as psql is required.
  2. Focus on interventional trials.
  3. Filter for trials specific for the indication of interest.
    NOTE: ClinicalTrials.gov provides disease names from the NCBI Medical Subject Headings (MeSH) controlled vocabulary. In contrast to submitter-provided disease names, the controlled vocabulary makes it possible to efficiently identify trials for the indication of interest. Nevertheless, one must keep in mind that the NCBI MeSH controlled vocabulary is a hierarchically organized thesaurus. Therefore, check the MeSH Browser (https://meshb.nlm.nih.gov) to see whether the general indication of interest has any child/narrower terms and include them if appropriate.
  4. Retrieve the identified trials together with the drugs tested in these trials. A query for trials in the general indication of ovarian cancer is provided below.

    ##retrieve interventional trials for the general indication ovarian cancer from the clinical trials transformation initiative hosted relational database containing ClinicalTrials.gov data
    #extend the list of MeSH terms as appropriate for the indication of interest (see step 4.3)
    #when executing the snippet, the closing EOF must start at the beginning of its line
    cat <<EOF |
    \pset footer off
    SELECT DISTINCT s.nct_id, s.brief_title, i.intervention_type, i.name
    FROM studies s
    INNER JOIN browse_conditions c ON(s.nct_id = c.nct_id)
    INNER JOIN interventions i ON(s.nct_id = i.nct_id)
    WHERE s.study_type = 'Interventional'
    AND c.mesh_term IN (
    'Ovarian Neoplasms',
    'Carcinoma, Ovarian Epithelial',
    'Granulosa Cell Tumor',
    'Hereditary Breast and Ovarian Cancer Syndrome',
    'Meigs Syndrome',
    'Sertoli-Leydig Cell Tumor'
    )
    ORDER BY s.nct_id, i.intervention_type;
    EOF
    psql --host="aact-db.ctti-clinicaltrials.org" --username="XXX" --password --no-align --field-separator=$'\t' --output="clinical_trials.txt" aact
  5. Extract drug names and map to DrugBank names.
    NOTE: While it is tempting to directly use the drug names retrieved from clinical trials of interest one must be aware that intervention names in ClinicalTrials.gov are entered by the submitter as free text. As a consequence, the names are not standardized, brand names may be used instead of the common compound name and there is no guarantee for proper data normalization (e.g. multiple drug names in one entry). In addition, it is common that drugs are submitted with a different intervention type, differing from drug. Therefore, mapping of the retrieved intervention names to DrugBank drug names is best carried out manually.

    ##obtain a list of interventions used in the previously retrieved set of clinical trials
    #cut splits on tabs by default
    cut -f 3,4 clinical_trials.txt | tail -n +2 | sort -u

    NOTE: Columns three and four hold type of intervention and intervention name, respectively.
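Although the mapping is best curated manually, a first pass of exact, case-insensitive matches against the DrugBank names can reduce the number of entries that need to be inspected by hand. The sketch below assumes the file layouts produced earlier in this protocol (clinical_trials.txt with the intervention name in column four, drugbank_id_to_name.txt with the drug name in column two, both with a header line); the function name match_interventions is hypothetical.

```shell
# Hypothetical helper: flag each distinct intervention name as "matched"
# (exact case-insensitive hit on a DrugBank name) or "manual" (needs curation).
# Usage: match_interventions <clinical_trials.txt> <drugbank_id_to_name.txt>
match_interventions() (
  trials="$1"; names="$2"
  awk -F '\t' 'BEGIN{ OFS = "\t" }
    # first file: DrugBank ID and name; index the names in lower case
    NR == FNR { if(FNR > 1) known[tolower($2)] = $2; next }
    # second file: trial records; column 4 holds the free-text intervention name
    FNR > 1 {
      key = tolower($4)
      if(seen[key]++) next            # report each distinct name only once
      if(key in known) print "matched", $4, known[key]
      else print "manual", $4, ""
    }
  ' "$names" "$trials"
)

# Example call:
# match_interventions clinical_trials.txt drugbank_id_to_name.txt > intervention_mapping.txt
```

Entries flagged as manual typically include brand names, combined entries, and misspellings, which still have to be resolved by hand as described in the note above.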

  6. Complement with drugs already in clinical use according to guidelines.
    NOTE: Step 4 results in a list of drugs under evaluation/in use for the indication of interest.

5. Identification of drug combinations targeting synthetic lethal interactions

  1. Search for synthetic lethal interactions being targeted by two drugs of interest. Restrict the dataset from step 3 to the drugs of interest by retaining only lines that hold both drug A and drug B.

    ##only retain entries for synthetic lethal interactions and drugs triggering them where both partners are targeted by the two drugs of interest (drug_a and drug_b)
    awk -F "\t" '{
      if( ($12 == drug_a && $14 == drug_b) || ($12 == drug_b && $14 == drug_a) ){
        print $0
      }
    }' drug_a="XXX" drug_b="YYY" bg_synlet_mapped_drugs.txt
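To evaluate all pairwise combinations from a pool of drugs (for example the list resulting from step 4) in one pass, the two drug names can be checked against a set instead of a single pair. This is a sketch under assumptions: the function name screen_drug_pool and the pool file (one DrugBank drug name per line) are hypothetical, and the drug names are assumed to sit in columns 12 and 14 as in the snippet above.

```shell
# Hypothetical helper: keep the header plus all interactions whose two partners
# are targeted by two different drugs from the pool.
# Usage: screen_drug_pool <pool file> <bg_synlet_mapped_drugs.txt>
screen_drug_pool() (
  pool="$1"; interactions="$2"
  awk -F '\t' '
    # first file: one drug name per line
    NR == FNR { if($0 != "") pool[$0] = 1; next }
    # second file: header line, then interactions with drug names in $12 and $14
    FNR == 1 { print; next }
    ($12 in pool) && ($14 in pool) && $12 != $14 { print }
  ' "$pool" "$interactions"
)

# Example call:
# screen_drug_pool drugs_of_interest.txt bg_synlet_mapped_drugs.txt > candidate_combinations.txt
```

The pool file must not be empty, since the NR == FNR idiom would otherwise treat the interaction file as the pool.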
  2. Ensure that neither of the two drugs alone is targeting both synthetic lethal interaction partners. Check the drug targets of each identified drug in the dataset from step 3.2 and evaluate whether both identified synthetic lethal partners are targets of the specific drug.

    ##find all drug target entries for a given drug name
    awk -F "\t" '{
      if($3 == drug){
        print $0
      }
    }' drug="XXX" db_human_drug_targets.txt

    NOTE: A drug that targets both partners of a synthetic lethal interaction would be expected to be toxic to any cell, so it is theoretically not a valuable multi-target agent. For this reason, this possibility is excluded in this step of the workflow.
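The check in step 5.2 can also be applied to the whole candidate list at once. The sketch below assumes the column layout that results from the merges in step 3 (human gene symbols in columns 9 and 10, drug names in columns 12 and 14) and the three-column layout of db_human_drug_targets.txt (gene symbol, DrugBank drug ID, drug name); the function name drop_single_drug_hits is hypothetical.

```shell
# Hypothetical helper: remove interactions for which one drug of the pair
# already targets both synthetic lethal partners.
# Usage: drop_single_drug_hits <db_human_drug_targets.txt> <interaction file>
drop_single_drug_hits() (
  targets="$1"; interactions="$2"
  awk -F '\t' '
    # first file: remember every (drug name, gene symbol) target link
    NR == FNR { hits[$3, $1] = 1; next }
    FNR == 1 { print; next }
    # drug A ($12) targets partner A ($9) by construction; drop the row if it
    # also targets partner B ($10), and likewise for drug B ($14) and partner A
    (($12, $10) in hits) || (($14, $9) in hits) { next }
    { print }
  ' "$targets" "$interactions"
)

# Example call:
# drop_single_drug_hits db_human_drug_targets.txt candidate_combinations.txt
```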

6. Testing selected new drug combinations in vitro

  1. Treat human breast cancer cell lines and human benign mammary epithelial cells, cultured under standard in vitro conditions in a humidified atmosphere at 37 °C with 5% CO2, with various drug combinations.
  2. Use media supplemented with fetal bovine serum as well as penicillin and streptomycin sulfate to prevent bacterial infection.
  3. Dilute drugs in solvents such as DMSO or phosphate-buffered saline to at least four different concentrations based on their previously established half-maximal inhibitory concentration (IC50), and use them in combination or alone for treatment of cells.
  4. Perform cell viability assays and apoptosis assays such as Annexin V/7-AAD staining to determine cytotoxic effects caused by treatments.
  5. Monitor pharmacological inhibition of suspected molecular targets using western blots.
  6. Distinguish synthetic lethality from purely additive effects by calculating the combination index (CI) as described by Chou and others23.
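
For orientation, the combination index for two drugs is commonly written as shown below. This is the general form of the Chou-Talalay index, given here as a reminder rather than as a formula taken from this protocol; the symbols are introduced for illustration.

```latex
\mathrm{CI} = \frac{D_1}{(D_x)_1} + \frac{D_2}{(D_x)_2}
```

Here, D_1 and D_2 are the doses of the two drugs that produce a given effect x in combination, and (D_x)_1 and (D_x)_2 are the doses of each drug alone that produce the same effect. A CI below 1 indicates synergism, a CI of 1 an additive effect, and a CI above 1 antagonism.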


Representative Results

Our group has recently published two studies applying the workflow depicted in this manuscript to identify drug combinations targeting synthetic lethal interactions in the context of ovarian and breast cancer24,25. In the first study, we evaluated drug combinations that are currently tested in late stage clinical trials (phase III and IV) or already being used in clinical practice to treat ovarian cancer patients regarding their impact on synthetic lethal interactions. In addition, we identified drug combinations that are currently not being tested in clinical trials but provide a rationale from the perspective of targeting synthetic lethal interactions. We therefore evaluated all possible drug combinations choosing drugs from the pool of all compounds in late stage ovarian cancer trials. We identified a unique set of 61 drug combinations that had been investigated in 68 late stage ovarian cancer trials. Twelve out of these 61 drug combinations addressed at least one synthetic lethal interaction. 84 additional drug combinations were proposed to address synthetic lethal interactions without being investigated in clinical trials to this date. 21 unique drugs contributed to the 84 identified drug combinations targeting a set of 39 synthetic lethal interactors as given in Figure 1.

Figure 1: Network of proposed novel drug combinations in the context of ovarian cancer. The network displays synthetic lethal interactions where interactors are addressed by two drugs currently not being tested in clinical trials. Synthetic lethal interactions are displayed in red, whereas drug-target links are indicated by grey edges. Dotted lines represent synthetic lethal interactions being addressed by other drug combinations in late stage ovarian cancer clinical trials. These investigated drug combinations are indicated with an asterisk (*), each in combination with paclitaxel, with the additional investigated combination of cediranib and olaparib being indicated by a circle (o) [adapted from 25].

Using the same workflow in a second study, we identified 243 promising drug combinations targeting 166 synthetic lethal gene pairs in the context of breast cancer. We experimentally tested selected drug combinations regarding their impact on cell viability and apoptosis in two breast cancer cell lines. In particular, the proposed low-toxicity drug combination of celecoxib and zoledronic acid showed cytotoxicity beyond additive effects in breast cancer cell lines as determined by the combination index. Results of viability and apoptosis assays for this drug combination are displayed in Figure 2.

Figure 2: Impact of celecoxib and zoledronic acid on viability and apoptosis in SKBR-3 cells. (A) Viability assay results for celecoxib (CEL), zoledronic acid (ZOL) and the combination of zoledronic acid and celecoxib (ZOL + CEL) in SKBR-3 breast cancer cells. Low and high CEL concentrations used were 50 µM and 75 µM. Low and high ZOL concentrations used were 500 µM and 750 µM. The drug combination had a significant synergistic effect on cell viability (** p < 0.001). (B, C) Annexin V (ANXA5) and 7-AAD stainings of SKBR-3 cells treated with CEL, ZOL, and the drug combination ZOL + CEL. The percentage of 7-AADpos/ANXA5pos cells was increased after treatment with the drug combination ZOL + CEL [adapted from 24].


Discussion

We have outlined a workflow to identify drug combinations impacting synthetic lethal interactions. This workflow makes use of (i) data on synthetic lethal interactions from model organisms, (ii) information on human orthologs, (iii) information on drug-target associations, (iv) drug information on clinical trials in the context of cancer, as well as (v) information on drug-disease and gene-disease associations extracted from scientific literature. The consolidated information can be used to evaluate the impact of a given drug combination under investigation on synthetic lethal gene pairs. In addition, the consolidated data can be used to evaluate a set of drugs currently being investigated or tested in clinical trials in the context of cancer to find combinations targeting the most relevant synthetic lethal interactions, therefore having a higher chance of impacting tumor cell survival. Lastly, the data generated can be used to screen for drug combinations consisting of drugs not initially developed for tumor treatment, thus providing a route to computationally driven drug repositioning.

For each step in the data integration workflow, we present key data sources needed to complete the full workflow, but point out that the workflow can be further enhanced at various stages by making use of additional data sources. In our workflow, we extracted synthetic lethal interaction pairs from the BioGRID database18. We specifically focused on interactions of the experiment types “synthetic lethality” and “negative genetic”. Information in BioGRID on synthetic lethal interactions contains datasets from large genetic screens, for example a dataset published by Costanzo and colleagues26, which is also available in the DRYGIN database27, as well as data on single synthetic lethal interactions described in individual experiments in the scientific literature. There are additional data sources collecting and storing synthetic lethal interactions, for example SynLethDB28.

On the level of orthology mapping, a large number of different tools and databases exist. We present a way to make use of Ensembl BioMart to map synthetic lethal interaction partners identified in model organisms to their corresponding human orthologs. Other orthology databases and services include NCBI’s HomoloGene database29, the OMA orthology database from the Swiss Institute of Bioinformatics30, or the InParanoid ortholog groups database maintained by the Stockholm Bioinformatics Center31. In our workflow, we focused on synthetic lethal interactions from multiple model organisms, with the largest number of synthetic lethal interactions coming from yeast. One might consider restricting the input set for the orthology mapping to data from mouse and rat only, which are evolutionarily closer to humans. An additional way of defining the input set is to focus only on synthetic lethal interactions that are conserved in multiple species, thereby increasing the chances that the synthetic lethal interaction is a true positive. This, on the other hand, might reduce the set of synthetic lethal interactions dramatically, as there is already a large difference in the identified synthetic lethal interactions between S. cerevisiae and S. pombe. Another approach is to be less stringent at the beginning and even to extend the set of experimental synthetic lethal interactions by machine learning algorithms, as we did in the two studies listed in the representative results section. In brief, a random forest model was used to predict synthetic lethal interactions for human genes for which no orthologous genes existed in yeast. The random forest model was trained on the set of synthetic lethal interaction pairs from yeast and their orthologous human genes using data on pathway associations, Gene Ontology assignment as well as disease and drug associations as described previously24,25. This allowed us to consider human genes for which no ortholog mapping information was available in our integration workflow.

A widely used database storing information on drug-target associations is DrugBank, which is also the primary source of drug-target interactions in the workflow. Other databases holding to some extent complementary information on drug targets are the Therapeutic Target Database (TTD)32 and ChEMBL33. Major components of the workflow are also incorporated in the e.valuation platform from emergentec and in SynLethDB, which has been developed by researchers from Nanyang Technological University. The last update of SynLethDB, however, dates from 2015, based on the datasets stored in the download section of the respective webpage28.
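
The option mentioned above of retaining only interactions that are conserved in multiple species can be sketched as follows. The sketch assumes the layout of bg_synlet_mapped.txt produced in step 2 (tax-IDs in columns 5 and 6, human gene symbols in columns 9 and 10) and only considers interactions whose two partners were reported in the same organism; the function name conserved_pairs is hypothetical.

```shell
# Hypothetical helper: list human gene pairs supported by synthetic lethal
# interactions from at least <min species> different organisms.
# Usage: conserved_pairs <bg_synlet_mapped.txt> [min species, default 2]
conserved_pairs() (
  mapped="$1"; min_species="${2:-2}"
  tail -n +2 "$mapped" \
  | awk -F '\t' 'BEGIN{ OFS = "\t" }
      $5 == $6 {
        # order the two human symbols so that A/B and B/A count as one pair
        a = $9; b = $10
        if((a "") > (b "")){ t = a; a = b; b = t }
        print a, b, $5
      }' \
  | sort -u \
  | cut -f 1,2 \
  | uniq -c \
  | awk -v min="$min_species" '$1 >= min { print $2 "\t" $3 "\t" $1 }'
)

# Example call:
# conserved_pairs bg_synlet_mapped.txt 2 > conserved_human_pairs.txt
```

The output holds the two human gene symbols followed by the number of organisms supporting the pair.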

One way to rank identified drug combinations and the synthetic lethal interaction pairs they target is to use literature mining to assess the association of the synthetic lethal partners and/or drugs with the disease of interest. In our work on the evaluation of drug combinations in the context of ovarian cancer, we ranked novel proposed drug combinations by the number of publications on ovarian cancer mentioning either of the two synthetic lethal interactors targeted by the respective drug combination. MeSH annotation in PubMed can be used to identify publications on a specific disease using the exact disease terms given in MeSH branch C (Diseases). Information on genes in the identified publications can be extracted using NCBI’s gene2pubmed mapping file, as described elsewhere34. Further, there are dedicated databases holding gene-disease and/or drug-disease links, such as the Comparative Toxicogenomics Database35, DisGeNET36, or the e.valuation software platform. Ranking drug combinations by disease association is one way of supporting the final selection of drug combinations for experimental testing. Additional aspects need to be considered when selecting drug combinations for further testing, for example the individual toxicity profiles of the drugs or the expression status of the synthetic lethal interactors in the respective target organ.
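
The publication-based ranking can be illustrated as follows. This is a sketch under assumptions rather than our production code: it assumes a tab-separated gene2pubmed-style input with the columns tax_id, GeneID, and PubMed_ID, and a set of PubMed IDs already retrieved for the disease of interest. All gene IDs, PubMed IDs, and drug names below are invented placeholders, and the function names are hypothetical.

```python
def parse_gene2pubmed(lines, tax_id="9606"):
    """Collect PubMed IDs per gene ID for one taxon (9606 = human)."""
    gene_pmids = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip the header and empty lines
        tax, gene_id, pmid = line.rstrip("\n").split("\t")
        if tax == tax_id:
            gene_pmids.setdefault(gene_id, set()).add(pmid)
    return gene_pmids


def rank_combinations(combos, gene_pmids, disease_pmids):
    """Score each combination by disease publications mentioning either interactor."""
    scored = []
    for drug_a, drug_b, gene_a, gene_b in combos:
        mentioned = gene_pmids.get(gene_a, set()) | gene_pmids.get(gene_b, set())
        scored.append((drug_a, drug_b, len(mentioned & disease_pmids)))
    # Highest publication count first; drug names break ties deterministically.
    return sorted(scored, key=lambda item: (-item[2], item[0], item[1]))


# Placeholder data: gene IDs, PubMed IDs, and drug names are invented.
gene2pubmed_lines = [
    "#tax_id\tGeneID\tPubMed_ID",
    "9606\t11\t1001",
    "9606\t11\t1002",
    "9606\t22\t1002",
    "9606\t33\t1003",
    "10090\t44\t1001",  # mouse record, ignored for tax_id 9606
    "9606\t55\t1004",
]
disease_pmids = {"1001", "1002", "1003"}  # e.g. PMIDs annotated with the disease MeSH term
combos = [
    ("drug_p", "drug_q", "33", "55"),
    ("drug_x", "drug_y", "11", "22"),
]

gene_pmids = parse_gene2pubmed(gene2pubmed_lines)
ranking = rank_combinations(combos, gene_pmids, disease_pmids)
```

Taking the union of the two interactors' publication sets before intersecting with the disease set counts a paper that mentions both genes only once, which matches the "either of the two interactors" criterion described above.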

In the representative results section, we present data for the combination of celecoxib and zoledronic acid, which was identified by applying the workflow in the context of breast cancer. This particular drug combination was selected for experimental testing because of the low toxicity profiles of both compounds. We used various concentrations in in vitro experiments to evaluate the impact of the drug combination on cell viability and apoptosis. Ideally, combining two drugs allows the concentrations of the individual drugs to be lowered significantly, minimizing side effects while maximizing efficacy. Seeing an impact on viability at lower doses is all the more meaningful because drug concentrations used for in vitro testing can be criticized as supratherapeutic, that is, as concentrations not reached in in vivo models. The concentrations used here, however, were chosen based on published cell culture experiments with these drugs. Drug dosing may further influence which targets are primarily affected, as most compounds have more than one drug target and may also affect a large set of known and unknown downstream molecules. Drug combinations showing synergistic effects on cell viability in in vitro cell culture systems should therefore be further investigated in 3D or in vivo models.
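
One established way to quantify such synergy and the associated dose reduction is the combination index (CI) of the Chou-Talalay method (reference 23). The following sketch assumes that the median-effect parameters of each single drug, Dm (the dose producing a 50% effect) and m (the slope of the median-effect plot), have already been fitted from single-agent dose-response data. The function names are hypothetical, and all numerical values are invented for illustration; they are not measurements from the celecoxib and zoledronic acid experiments.

```python
def dose_for_effect(fa, dm, m):
    """Median-effect equation solved for dose: D = Dm * (fa / (1 - fa)) ** (1 / m)."""
    if not 0.0 < fa < 1.0:
        raise ValueError("fraction affected must lie strictly between 0 and 1")
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)


def combination_index(fa, dose_a, dose_b, params_a, params_b):
    """CI = d_a / Dx_a + d_b / Dx_b for a combination reaching effect level fa.

    Dx_a and Dx_b are the single-agent doses that would be needed to reach
    the same effect alone. CI < 1 indicates synergy, CI = 1 additivity,
    and CI > 1 antagonism.
    """
    dx_a = dose_for_effect(fa, *params_a)
    dx_b = dose_for_effect(fa, *params_b)
    return dose_a / dx_a + dose_b / dx_b


# Hypothetical (Dm, m) parameters for two drugs, in arbitrary dose units.
drug_a = (10.0, 1.0)
drug_b = (20.0, 1.0)

# Hypothetical observation: 2.5 units of A plus 5.0 units of B affect 50% of cells.
ci = combination_index(fa=0.5, dose_a=2.5, dose_b=5.0, params_a=drug_a, params_b=drug_b)
print(ci)  # 0.5
```

In this invented example, each drug is given at a quarter of the dose it would need alone to reach the same effect, so the index is 0.5 and the combination would be classified as synergistic.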

In summary, we present a workflow that integrates information from different data sources to evaluate and propose drug combinations targeting synthetic lethal interactions. To date, most information on synthetic lethal interactions still comes from model organisms, which makes an orthology mapping step to the human genome mandatory. First screens in human haploid cells have led to the identification of synthetic lethal interactions in human cells. Additionally, CRISPR/Cas technology has opened new ways of studying synthetic lethal interactions at the cellular level. As more high-quality biological data on synthetic lethal interactions become available, we propose that data integration efforts such as ours will transform clinical cancer treatment in the future by discovering novel and clinically meaningful synthetic lethal gene pairs beyond BRCA(1/2)/PARP1.

AH and PP were employees of emergentec biodevelopment GmbH at the time of performing the analyses leading to the results presented in the representative results section. MM and MK have nothing to disclose.


Funding for developing the data integration workflow was obtained from the European Community’s Seventh Framework Programme under grant agreement no. 279113 (OCTIPS). Adaptation of data within this publication was kindly approved by Public Library of Science Publications and Impact Journals, LLC.


Name | Company | Catalog Number | Comments
BioGRID | n/a | n/a | thebiogrid.org
ClinicalTrials.gov | n/a | n/a | ClinicalTrials.gov
DrugBank | n/a | n/a | drugbank.ca
Ensembl BioMart | n/a | n/a | ensembl.org
For alternative computational databases, please refer to the manuscript.
7-AAD | eBioscience | 00-6993-50 |
AnnexinV-APC | BD Biosciences | 550474 |
celecoxib | Sigma-Aldrich | PZ0008-25MG |
CellTiter-Blue Viability Assay | Promega | G8080 |
FACS Canto II | BD Biosciences | n/a |
fetal bovine serum | Fisher Scientific/Gibco | 16000044 |
FlowJo Software | FlowJo LLC | V10 |
McCoy's 5a Medium Modified | Fisher Scientific/Gibco | 16600082 |
penicillin G/streptomycin sulfate | Fisher Scientific/Gibco | 15140122 |
SKBR-3 cells | American Type Culture Collection (ATCC) | ATCC HTB-30 |
zoledronic acid | Sigma-Aldrich | SML0223-50MG |
Further materials or equipment will be made available upon request.



  1. Dobzhansky, T. Genetics of natural populations; recombination and variability in populations of Drosophila pseudoobscura. Genetics. 31, 269-290 (1946).
  2. Hartwell, L. H., Szankasi, P., Roberts, C. J., Murray, A. W., Friend, S. H. Integrating genetic approaches into the discovery of anticancer drugs. Science. 278 (5340), 1064-1068 (1997).
  3. D'Amours, D., Desnoyers, S., D'Silva, I., Poirier, G. G. Poly(ADP-ribosyl)ation reactions in the regulation of nuclear functions. The Biochemical Journal. 342 (2), 249-268 (1999).
  4. Gudmundsdottir, K., Ashworth, A. The roles of BRCA1 and BRCA2 and associated proteins in the maintenance of genomic stability. Oncogene. 25 (43), 5864-5874 (2006).
  5. Farmer, H., et al. Targeting the DNA repair defect in BRCA mutant cells as a therapeutic strategy. Nature. 434 (7035), 917-921 (2005).
  6. McCann, K. E., Hurvitz, S. A. Advances in the use of PARP inhibitor therapy for breast cancer. Drugs in Context. 7, 212540 (2018).
  7. Franzese, E., et al. PARP inhibitors in ovarian cancer. Cancer Treatment Reviews. 73, 1-9 (2019).
  8. Ashworth, A., Lord, C. J. Synthetic lethal therapies for cancer: what’s next after PARP inhibitors. Nature Reviews. Clinical Oncology. 15 (9), 564-576 (2018).
  9. Yu, B., Luo, J. Synthetic lethal genetic screens in Ras mutant cancers. The Enzymes. 34 (Pt B), 201-219 (2013).
  10. Thompson, J. M., Nguyen, Q. H., Singh, M., Razorenova, O. V. Approaches to identifying synthetic lethal interactions in cancer. The Yale Journal of Biology and Medicine. 88 (2), 145-155 (2015).
  11. Ruiz, S., et al. A Genome-wide CRISPR Screen Identifies CDC25A as a Determinant of Sensitivity to ATR Inhibitors. Molecular Cell. 62 (2), 307-313 (2016).
  12. Housden, B. E., Nicholson, H. E., Perrimon, N. Synthetic Lethality Screens Using RNAi in Combination with CRISPR-based Knockout in Drosophila Cells. Bio-Protocol. 7 (3), (2017).
  13. Blomen, V. A., et al. Gene essentiality and synthetic lethality in haploid human cells. Science. 350 (6264), 1092-1096 (2015).
  14. Forment, J. V., et al. Genome-wide genetic screening with chemically mutagenized haploid embryonic stem cells. Nature Chemical Biology. 13 (1), 12-14 (2017).
  15. Wildenhain, J., et al. Prediction of Synergism from Chemical-Genetic Interactions by Machine Learning. Cell Systems. 1 (6), 383-395 (2015).
  16. Madhukar, N. S., Elemento, O., Pandey, G. Prediction of Genetic Interactions Using Machine Learning and Network Properties. Frontiers in Bioengineering and Biotechnology. 3, 172 (2015).
  17. Fechete, R., et al. Synthetic lethal hubs associated with vincristine resistant neuroblastoma. Molecular BioSystems. 7 (1), 200-214 (2011).
  18. Oughtred, R., et al. The BioGRID interaction database: 2019 update. Nucleic Acids Research. 47 (D1), D529-D541 (2019).
  19. Zeeberg, B. R., et al. Mistaken identifiers: gene name errors can be introduced inadvertently when using Excel in bioinformatics. BMC Bioinformatics. 5, 80 (2004).
  20. Ziemann, M., Eren, Y., El-Osta, A. Gene name errors are widespread in the scientific literature. Genome Biology. 17 (1), 177 (2016).
  21. Kersey, P. J., et al. Ensembl Genomes 2018: an integrated omics infrastructure for non-vertebrate species. Nucleic Acids Research. 46 (D1), D802-D808 (2018).
  22. Wishart, D. S., et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Research. 46 (D1), D1074-D1082 (2018).
  23. Chou, T. C. Drug combination studies and their synergy quantification using the Chou-Talalay method. Cancer Research. 70 (2), 440-446 (2010).
  24. Marhold, M., et al. Synthetic lethal combinations of low-toxicity drugs for breast cancer identified in silico by genetic screens in yeast. Oncotarget. 9 (91), 36379-36391 (2018).
  25. Heinzel, A., et al. Synthetic lethality guiding selection of drug combinations in ovarian cancer. PLoS One. 14 (1), e0210859 (2019).
  26. Costanzo, M., et al. The genetic landscape of a cell. Science. 327 (5964), 425-431 (2010).
  27. Koh, J. L. Y., et al. DRYGIN: a database of quantitative genetic interaction networks in yeast. Nucleic Acids Research. 38, 502-507 (2010).
  28. Guo, J., Liu, H., Zheng, J. SynLethDB: synthetic lethality database toward discovery of selective and sensitive anticancer drug targets. Nucleic Acids Research. 44, 1011-1017 (2016).
  29. NCBI Resource Coordinators. Database resources of the National Center for Biotechnology Information. Nucleic Acids Research. 44 (1), 7-19 (2016).
  30. Altenhoff, A. M., et al. The OMA orthology database in 2018: retrieving evolutionary relationships among all domains of life through richer web and programmatic interfaces. Nucleic Acids Research. 46 (1), 477-485 (2018).
  31. Sonnhammer, E. L. L., Östlund, G. InParanoid 8: orthology analysis between 273 proteomes, mostly eukaryotic. Nucleic Acids Research. 43 (Database issue), D234-D239 (2015).
  32. Li, Y. H., et al. Therapeutic target database update 2018: enriched resource for facilitating bench-to-clinic research of targeted therapeutics. Nucleic Acids Research. 46 (1), 1121 (2018).
  33. Gaulton, A., et al. The ChEMBL database in 2017. Nucleic Acids Research. 45 (1), 945-954 (2017).
  34. Heinzel, A., Mühlberger, I., Fechete, R., Mayer, B., Perco, P. Functional molecular units for guiding biomarker panel design. Methods in Molecular Biology. 1159 (12), 109-133 (2014).
  35. Davis, A. P., et al. The Comparative Toxicogenomics Database: update 2019. Nucleic Acids Research. 47 (1), 948-954 (2019).
  36. Piñero, J., et al. DisGeNET: a comprehensive platform integrating information on human disease-associated genes and variants. Nucleic Acids Research. 45 (1), 833-839 (2017).
Cite this Article

Marhold, M., Heinzel, A., Merchant, A., Perco, P., Krainer, M. A Data Integration Workflow to Identify Drug Combinations Targeting Synthetic Lethal Interactions. J. Vis. Exp. (171), e60328, doi:10.3791/60328 (2021).