1Banting and Best Department of Medical Research and Department of Molecular Genetics, University of Toronto, 2Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, 3Donnelly Sequencing Centre, University of Toronto, 4Genetics and Molecular Biology Branch, National Human Genome Research Institute, NIH, 5Stanford Genome Technology Center, Stanford School of Medicine, Stanford University, 6Department of Pharmaceutical Sciences, University of Toronto
Smith, A. M., Durbic, T., Oh, J., Urbanus, M., Proctor, M., Heisler, L. E., et al. Competitive Genomic Screens of Barcoded Yeast Libraries. J. Vis. Exp. (54), e2864, doi:10.3791/2864 (2011).
By virtue of advances in next generation sequencing technologies, we have access to new genome sequences almost daily. The tempo of these advances is accelerating, promising greater depth and breadth. In light of these extraordinary advances, the need for fast, parallel methods to define gene function becomes ever more important. Collections of genome-wide deletion mutants in yeasts and E. coli have served as workhorses for functional characterization of genes, but this approach is not scalable: current gene-deletion approaches require each of the thousands of genes that comprise a genome to be deleted and verified. Only after this work is complete can we pursue high-throughput phenotyping. Over the past decade, our laboratory has refined a portfolio of competitive, miniaturized, high-throughput genome-wide assays that can be performed in parallel. This parallelization is possible because of the inclusion of DNA 'tags', or 'barcodes', into each mutant. The barcode serves as a proxy for the mutation, so barcode abundance can be measured to assess mutant fitness. In this study, we seek to fill the gap between DNA sequence and barcoded mutant collections. To accomplish this, we introduce a combined transposon disruption-barcoding approach that opens up parallel barcode assays to newly sequenced but poorly characterized microbes. To illustrate this approach, we present a new Candida albicans barcoded disruption collection and describe how both microarray-based and next generation sequencing-based platforms can be used to collect 10,000 - 1,000,000 gene-gene and drug-gene interactions in a single experiment.
1. Background information
There are several ways to generate mutants that carry barcode tags. The current gold standard is the Yeast KnockOut (YKO) collection created by a consortium of labs and completed in 2002 1. Since the original YKO was introduced, other collections have been generated in different strain backgrounds, using over-expression constructs, and in other microbes such as E. coli 2. In parallel, the effort to create barcoded shRNA libraries is proceeding rapidly, and in fact, many of the design principles for these mammalian collections have been adopted from yeast. To demonstrate how barcoded transposons can be a rapid, widely applicable strategy for creating systematic mutant collections, we focus on one collection we recently created in the human fungal pathogen Candida albicans. Our work on Candida was based on the success of barcode screens in S. cerevisiae, and Candida is used here as an example organism. The sample protocol can, with minor modifications, be used to screen any organism that can be grown in suspension culture. Because few organisms have the high rates of transformation and efficient mitotic recombination needed to create perfect deletion mutants, we developed a protocol that uses transposon mutagenesis in vitro to mutagenize a genomic DNA library, and then transformed these barcoded genomic fragments into Candida albicans 3, 4. Inspired by the success of the original YKO collection and its role in fundamental discoveries on the nature of gene networks 5-8, genome-wide haploinsufficiency 9, drug target and mechanism of action 10,11, and the essentiality of all genes in the genome 12, we anticipate that broadening this approach to other microbes will be extremely fruitful.
The protocol below assumes that the desired mutant collection has been created (e.g. YKO or Candida albicans disruption collection) and is available as individually archived strains. For a detailed description of strain construction see 1,13,14.
2. Combine individual mutants into a single pool
3. Experimental pool growth
This procedure is outlined in Figure 1.
Note: Always collect a starting cell sample (i.e. a "T0 time point") to assess initial strain representation in any newly created pool by adding 1-2 OD600 of pool directly from the freezer aliquots to a 1.5 ml tube and processing as described below.
4. Genomic DNA extraction, PCR and microarray hybridization or sequencing
5. Array analysis (see Figure 2 for an example obtained using the Candida albicans disruption collection)
6. Assessing fitness of barcoded yeast strains by sequencing
Note: Considering Bar-seq as an alternative to array hybridization. With the costs of high-throughput sequencing decreasing, using high-throughput sequencing as a readout of tag abundance is becoming feasible and, in many cases, is more cost effective 18. In this approach, amplified PCR product is measured directly as 'counts' rather than as signal intensity on a hybridized array. This eliminates false negatives and positives that arise from tag cross-contamination, saturation, or very high or very low signal intensities. Furthermore, multiple experiments can be combined prior to sequencing by the addition of a 4-8 base DNA index 19. Because the yeast barcodes are 20 bp, a single 2-step read of 26-28 bases captures both the multiplex index and the unique barcode, allowing multiplexing of 100 or more samples. At the time of this writing, Bar-seq offers a cost advantage over barcode microarrays. Bar-seq is also inherently flexible: as the number of reads per run increases, the level of multiplexing can increase to further decrease costs. "Mid-capacity" sequencers from all of the major platform manufacturers will further democratize Bar-seq, with sequencing likely to become the readout of choice.
This protocol has also been validated on the Illumina HiSeq2000.
An excellent demonstration of the use of Bar-seq to address a fundamental biological question in Saccharomyces cerevisiae growth control is presented in a recent study by Gresham et al. 20, who outline several important experimental design and interpretation guidelines.
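The counting step described in the note above can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It assumes that each read begins with the multiplex index and is followed immediately by the 20 bp barcode, and that only exact matches are counted. The function name, the 6-base index length and all sequences are hypothetical.

```python
from collections import Counter

INDEX_LEN = 6      # assumed index length (the text allows 4-8 bases)
BARCODE_LEN = 20   # yeast barcodes are 20 bp


def count_barcodes(reads, index_to_sample, barcode_to_strain):
    """Tally reads per (sample, strain) using exact index and barcode matches.

    Reads that are too short, or whose index or barcode is not in the
    lookup tables, are skipped.
    """
    counts = Counter()
    for read in reads:
        if len(read) < INDEX_LEN + BARCODE_LEN:
            continue
        sample = index_to_sample.get(read[:INDEX_LEN])
        strain = barcode_to_strain.get(read[INDEX_LEN:INDEX_LEN + BARCODE_LEN])
        if sample is None or strain is None:
            continue
        counts[(sample, strain)] += 1
    return counts


# Hypothetical lookup tables and reads, for illustration only.
index_to_sample = {"ACGTAC": "control", "TGCATG": "drug"}
barcode_to_strain = {
    "AAAAACCCCCGGGGGTTTTT": "strain_a",
    "ACGTACGTACGTACGTACGT": "strain_b",
}
reads = [
    "ACGTAC" + "AAAAACCCCCGGGGGTTTTT",
    "ACGTAC" + "AAAAACCCCCGGGGGTTTTT",
    "ACGTAC" + "ACGTACGTACGTACGTACGT",
    "TGCATG" + "ACGTACGTACGTACGTACGT",
    "GGGGGG" + "ACGTACGTACGTACGTACGT",  # unknown index, skipped
    "ACGT",                             # too short, skipped
]
counts = count_barcodes(reads, index_to_sample, barcode_to_strain)
for (sample, strain), n in sorted(counts.items()):
    print(sample, strain, n)
```

Exact matching keeps the sketch short. A real analysis would probably also need to tolerate sequencing errors in the index and barcode, which is not attempted here.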
7. Validation of pooled screening data
The results from any functional genomics screen should be verified using the individual strains in isolated culture. Because each experiment will differ in terms of the number of sensitive strains, selecting the number of candidate strains to confirm is somewhat arbitrary. As a guide, ranking the most sensitive strains by the log2 ratio or z-score and testing the top 25-50% of the candidates (which typically translates to 2-3 standard deviations from the mean of all strains in the pool) is a good balance between costs and benefits. Individual confirmations can be performed in any flask, but we perform these tests for 5 generations of growth in 96 well plates using a starting inoculum of 0.06 OD600 in 100 μl of media in a shaking spectrophotometer, taking measurements every 15 minutes (see Figure 4).
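One way to apply the z-score guideline above is sketched here. It is an illustration under stated assumptions, not the authors' analysis code. It assumes the log2 ratios are control/treatment, so that sensitive strains have positive values. It uses the population standard deviation across all strains in the pool and a default cutoff of 2 standard deviations. The function name and the example values are hypothetical.

```python
import statistics


def select_candidates(log2_ratios, z_cutoff=2.0):
    """Return (strain, z-score) pairs at or above z_cutoff, most sensitive first.

    log2_ratios maps strain name -> log2(control / treatment); this sign
    convention is an assumption of the sketch.
    """
    values = list(log2_ratios.values())
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # spread across all strains in the pool
    if sd == 0:
        return []                   # no variation, so nothing stands out
    hits = []
    for strain, value in log2_ratios.items():
        z = (value - mean) / sd
        if z >= z_cutoff:
            hits.append((strain, z))
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits


# Hypothetical pool: nine unaffected strains and one depleted strain.
example = {f"strain_{i}": 0.0 for i in range(9)}
example["strain_hit"] = 3.0
candidates = select_candidates(example)
print(candidates)
```

The cutoff is a parameter because, as noted above, the number of strains worth confirming differs from screen to screen.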
8. Representative Results
Once a genome-wide screen is complete, the arrays have been normalized, and the behavior of each strain has been compared to a control treatment (e.g. by comparing microarray intensities or sequencing counts per strain), the data are most easily manipulated in an Excel file with genes ranked by the log2 ratio of control/experiment. In this manner, the greater the positive log2 ratio, the more sensitive that particular strain is to the test condition, because a sensitive strain is depleted from the treated pool relative to the control. These Excel files can be plotted in a variety of graphing software packages. We find it simplest to plot the log2 ratios on the Y axis and the gene or ORF names on the X axis. Figure 2a shows such a plot for clotrimazole treatment (a known antifungal agent). All strains that are significantly sensitive to treatment, with a log2 ratio of 2, are highlighted in red, and we would typically verify many of these strains in individual growth assays of each mutant in the presence of the same concentration of drug. In this example, 4 strains are highlighted: NCP1, ERG2 and 2 independent alleles of ERG11, the known protein target of clotrimazole. Each of these genes is directly involved in the biosynthesis of ergosterol, the yeast equivalent of cholesterol. For example, NCP1 encodes a NADP-cytochrome P450 reductase that is involved in ergosterol biosynthesis and which is associated and coordinately regulated with Erg11. This example highlights the fact that the known drug target (Erg11) is identified in this unbiased screen, as well as several other key components of the target pathway. Finally, several of the genes highlighted in red may be involved in ergosterol biosynthesis or in distinct biological processes. As mentioned above, each strain detected as sensitive in a pooled screen should be verified as to its sensitivity in an individual growth assay.
In the example shown in Figure 2b, four strains are confirmed to be sensitive to clotrimazole based on their decreased growth relative to the wild-type parent strain, BWP17. These individual growth curves highlight an important feature of such pooled gene-drug screens: the absolute rank of a particular strain does not necessarily reflect its exact level of sensitivity. Figure 2b also shows the value of having multiple alleles for each gene; in this case, the two erg11 disruption mutants have slightly differing sensitivities. Correlating the nature of these disruptions with the degree of sensitivity can provide additional insight into the drug's mechanism of action.
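The ranking step can also be done outside a spreadsheet. The sketch below is illustrative rather than the authors' method. It uses the control/treatment convention from the Figure 2 legend, under which a strain depleted by treatment has a positive log2 ratio. It assumes the input values are already normalized. The pseudocount that guards against division by zero is an assumption of this sketch, not part of the protocol. Strain names and values are hypothetical.

```python
import math


def log2_ratios(control, treatment, pseudocount=1.0):
    """Compute log2(control / treatment) for strains present in both inputs.

    control and treatment map strain -> normalized intensity or read count.
    The pseudocount (an assumption of this sketch) avoids division by zero
    when a strain is absent from the treated pool.
    """
    ratios = {}
    for strain, c in control.items():
        if strain not in treatment:
            continue
        t = treatment[strain]
        ratios[strain] = math.log2((c + pseudocount) / (t + pseudocount))
    return ratios


def rank_by_sensitivity(ratios):
    """Sort strains from most to least sensitive (largest log2 ratio first)."""
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Hypothetical counts, chosen so that the ratios come out to round numbers.
control = {"mutant_a": 799, "mutant_b": 399, "mutant_c": 499}
treatment = {"mutant_a": 99, "mutant_b": 99, "mutant_c": 499}
ranked = rank_by_sensitivity(log2_ratios(control, treatment))
for strain, ratio in ranked:
    print(f"{strain}\t{ratio:.2f}")
```

As the individual growth curves show, the rank from a pooled screen is only a guide, so the output is best read as an ordered list of candidates for confirmation.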
Figure 1. Workflow for pooled growth assay and barcode detection. Cultures are inoculated with thawed aliquots of pooled cells (step 1), and then grown for the desired number of generations (step 2) either robotically (Option A) or manually (Option B). Cells are harvested by centrifugation (step 3), genomic DNA is isolated from the harvested cells (step 4), and uptags and downtags are independently amplified (step 5) and then either hybridized to an array (step 6a) or sequenced directly (step 6b).
Figure 2. Sample data collected at certain points of the protocol. (A) Sample data from screening results (adapted from 13). The pool of tagged mutants was grown for 20 generations in the presence of clotrimazole and DMSO (control). Log2 ratio (control intensity/treatment intensity) was calculated and plotted as a function of gene. Highly sensitive strains (red) included the known target of clotrimazole, Erg11p. Note that this assay frequently uncovers other sensitive mutants in addition to the compound's actual target. Generally, these mutants are those that act synthetically with the target, those that are part of a general stress/treatment response, or false positives that fail to confirm. (B) Example of confirmation data (adapted from 13). Results from the pooled growth assays can be validated by growing the strain in individual culture and comparing its growth against wild-type growth (black).
Figure 3. Structure of the amplicon produced from pooled barcode assays for microarray hybridization or barcode sequencing. The amplicon produced for each mutant in the collection contains homology to the genome for integration (blue regions labeled ATG and TAA) and unique barcodes (labeled AG and indicated by a black dash). For microarray hybridization, the blue common primers are used to amplify a 60 bp probe. For barcode sequencing, extended primers are used in the PCR reaction. The upstream primer comprises sequences encoding the Illumina adapter (red bar), an index of 6 bases (cross-hatches), and the blue common primer; the second primer is the same composite primer minus the 6 base index.
Figure 4. Individual growth assays for 1) prescreening compounds against wild-type yeast to determine an appropriate dose for genome-wide screening and 2) confirming results from genome-wide screens. (A) A 96 well flat bottom plate is filled with 100 μl of cell suspension at an OD of 0.062. Each well can contain the same strain (for dose determination) or different strain and drug combinations (for confirmation assays). 2 μl of compound (typically dissolved in DMSO) is added and cells are grown with constant shaking for 16-20 h at 30°C. The final concentration of DMSO should not exceed 2%. In this example, the growth curve in each well of the plate is plotted in black against the control growth curve in red. (B) Higher resolution image of several prescreens obtained with an example drug, overlaid on top of one another. In this titration series, an IC10-15 is obtained with the purple dose and would be appropriate for deletion profiling (HIP and HOP). Due to the non-linearity at higher optical densities, Tecan (or any similar plate reader) ODs must be calibrated using those obtained with a "traditional" 1 mm path-length cuvette.
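The two numerical checks in the Figure 4 legend, the final DMSO concentration and the choice of a dose in the IC10-15 range, can be sketched as below. This is a rough illustration only. It assumes the compound stock is neat DMSO, and it assumes inhibition is scored from a single calibrated end-point OD relative to the control well, which may differ from how the authors score their growth curves. All function names and values are hypothetical.

```python
def final_solvent_percent(compound_ul, culture_ul):
    """Percent (v/v) of solvent after adding compound_ul to culture_ul."""
    return 100.0 * compound_ul / (compound_ul + culture_ul)


def percent_inhibition(control_od, treated_od):
    """Growth inhibition relative to control, from end-point OD (assumed metric)."""
    return 100.0 * (1.0 - treated_od / control_od)


def doses_in_window(control_od, od_by_dose, low=10.0, high=15.0):
    """Return the doses whose inhibition falls within [low, high] percent."""
    return [
        dose
        for dose, od in sorted(od_by_dose.items())
        if low <= percent_inhibition(control_od, od) <= high
    ]


# 2 ul of compound in DMSO added to 100 ul of culture, as in the legend.
dmso = final_solvent_percent(2.0, 100.0)
print(f"final DMSO: {dmso:.2f}% (limit 2%)")

# Hypothetical titration: dose (arbitrary units) -> end-point OD.
titration = {1.0: 0.97, 2.0: 0.88, 4.0: 0.60}
print(doses_in_window(control_od=1.0, od_by_dose=titration))
```

With 2 μl added to 100 μl, the final volume is 102 μl, so the DMSO concentration lands just under the 2% limit.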
Here, we outlined a protocol that, with modest modification, can easily be adapted to a wide range of existing barcoded mutant collections and to the creation of tagged mutant collections in different microorganisms. We emphasize that while we have reported a protocol for tagged transposon mutagenesis of the pathogenic yeast C. albicans, a very similar protocol could be adapted to a wide variety of unicellular fungi. With modifications, this protocol also works well in bacteria 13, and collections for a number of additional fungal and bacterial genomes are currently under construction. At present, this assay provides the only comprehensive, genome-wide unbiased screen for gene-small molecule interactions. One particularly compelling feature of the assay is that no prior knowledge of the gene or small molecule is required. Despite the scope and power of these assays, their transferability to other labs has been hindered somewhat by the initial capital investment and by the informatics tools required to analyze the results. With the adoption of a next generation sequencing readout combined with robust tools for analysis, we expect their adoption to increase.
No conflicts of interest declared.
We thank Ron Davis, Adam Deutschbauer, and the entire HIP HOP laboratory at the University of Toronto for discussions and advice. C.N. is supported by grants from the National Human Genome Research Institute (Grant Number HG000205), RO1 HG003317, CIHR MOP-84305, and Canadian Cancer Society (#020380). J.O. was supported by the Stanford Genome Training Program (Grant Number T32 HG00044 from the National Human Genome Research Institute) and the National Institutes of Health (Grant Number P01 GH000205). G.G. is supported by the NHGRI RO1 HG003317 and the Canadian Cancer Society, Grant # 020380. T.D. and the Donnelly Sequencing Centre are supported in part by grants from the Canadian Foundation for Innovation to Drs. Brenda Andrews and Jack Greenblatt. A.M.S. is supported by a University of Toronto Open Fellowship.