This protocol describes an experimental method and data analysis workflow for cleavage under targets and release using nuclease (CUT&RUN) in the human fungal pathogen Candida albicans.
Regulatory transcription factors control many important biological processes, including cellular differentiation, responses to environmental perturbations and stresses, and host-pathogen interactions. Determining the genome-wide binding of regulatory transcription factors to DNA is essential to understanding the function of transcription factors in these often complex biological processes. Cleavage under targets and release using nuclease (CUT&RUN) is a modern method for genome-wide mapping of in vivo protein-DNA binding interactions that is an attractive alternative to the traditional and widely used chromatin immunoprecipitation followed by sequencing (ChIP-seq) method. CUT&RUN is amenable to a higher-throughput experimental setup and has a substantially higher dynamic range with lower per-sample sequencing costs than ChIP-seq. Here, a comprehensive CUT&RUN protocol and accompanying data analysis workflow tailored for genome-wide analysis of transcription factor-DNA binding interactions in the human fungal pathogen Candida albicans are described. This detailed protocol includes all necessary experimental procedures, from epitope tagging of transcription factor-coding genes to library preparation for sequencing; additionally, it includes a customized computational workflow for CUT&RUN data analysis.
Candida albicans is a clinically relevant, polymorphic human fungal pathogen that exists in a variety of different modes of growth, such as the planktonic (free-floating) mode of growth and as communities of tightly adhered cells protected by an extracellular matrix, known as the biofilm mode of growth1,2,3. Similar to other developmental and cellular processes, biofilm development is an important C. albicans virulence trait that is known to be controlled at the transcriptional level by regulatory transcription factors (TFs) that bind to DNA in a sequence-specific manner4. Recently, chromatin regulators and histone modifiers have also emerged as important regulators of C. albicans biofilm formation5 and morphogenesis6 by mediating DNA accessibility. To understand the complex biology of this important fungal pathogen, effective methods to determine the genome-wide localization of specific TFs during distinct developmental and cellular processes are valuable.
Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is a widely used method to investigate protein-DNA interactions in C. albicans5,6 and has largely replaced the more classical chromatin immunoprecipitation followed by microarray (ChIP-chip)9 method. Both ChIP-seq and ChIP-chip methods, however, require a large number of input cells10, which can be a complicating factor when investigating TFs in the context of specific samples and modes of growth, such as biofilms collected from patients or animal models of infection. In addition, the chromatin immunoprecipitation (ChIP) assay often yields a significant amount of background signal throughout the genome, requiring a high level of enrichment for the target of interest to sufficiently separate signal from noise. While the ChIP-chip assay is largely outdated today, the sequencing depths necessary for ChIP-seq make this assay prohibitively expensive for many researchers, particularly those studying multiple TFs and/or chromatin-associated proteins.
Cleavage under targets and release using nuclease (CUT&RUN) is an attractive alternative to ChIP-seq. It was developed by the Henikoff lab in 2017 to circumvent the limitations of ChIP-seq and chromatin endogenous cleavage followed by sequencing (ChEC-seq)11,12, another method to identify protein-DNA interactions on a genome-wide level, while providing high-resolution, genome-wide mapping of TFs and chromatin-associated proteins13. CUT&RUN relies on the targeted digestion of chromatin within permeabilized nuclei using tethered micrococcal nucleases, followed by sequencing of the digested DNA fragments9,10. As DNA fragments are specifically generated at the loci that are bound by a protein of interest, rather than being generated throughout the genome via random fragmentation as in ChIP assays, the CUT&RUN approach results in greatly reduced background signals and, thus, requires approximately one-tenth of the sequencing depth needed for ChIP-seq11,13,14. These improvements ultimately lead to significant reductions in sequencing costs and in the total number of input cells needed as starting material for each sample.
Here, a robust CUT&RUN protocol is described that has been adapted and optimized for determining the genome-wide localization of TFs in C. albicans cells isolated from biofilms and planktonic cultures. A thorough data analysis pipeline is also presented, which enables the processing and analysis of the resulting sequence data and requires users to have minimal expertise in coding or bioinformatics. Briefly, this protocol describes epitope tagging of TF-coding genes, harvesting of biofilm and planktonic cells, isolation of intact permeabilized nuclei, incubation with primary antibodies against the specific protein or epitope-tagged protein of interest, tethering of the chimeric protein A/G-micrococcal nuclease (pAG-MNase) fusion proteins to the primary antibodies, genomic DNA recovery after chromatin digestion, and preparation of genomic DNA libraries for sequencing.
The experimental CUT&RUN protocol is followed by a purpose-built data analysis pipeline, which takes raw DNA sequencing reads in FASTQ format and implements all required processing steps to provide a complete list of significantly enriched loci bound by the TF of interest (targeted by the primary antibody). Note that multiple steps of the described library preparation protocol have been specifically adapted and optimized for CUT&RUN analysis of TFs (as opposed to nucleosomes). While the data presented in this manuscript were generated using TF-specific adaptations of a commercial CUT&RUN kit, these protocols have also been validated using individually sourced components (i.e., pAG-MNase enzyme and magnetic DNA purification beads) and in-house-prepared buffers, which can significantly reduce experimental cost. The comprehensive experimental and data analysis protocols are described in detail below in a step-by-step format. All reagents and critical equipment, as well as buffer and media recipes, are listed in the Table of Materials and Supplementary File 1, respectively.
1. Epitope tagging of C. albicans strains
2. Sample preparation of biofilm cultures
3. Sample preparation of planktonic cultures
4. Isolation of nuclei
NOTE: On the day of the experiment, prepare fresh Ficoll Buffer, add 2-mercaptoethanol and protease inhibitor to aliquot(s) of the Resuspension Buffer, and add protease inhibitor to aliquot(s) of the SPC Buffer (see Supplementary File 1). To resuspend the pellets, gently pipette using either 200 µL or 1 mL pipette tips to avoid damaging the cells or nuclei. Before beginning the nuclei isolation, turn on the heat block to preheat it to 30 °C. All pipette tips and tubes for the remainder of this protocol should be certified DNA/RNA and DNase/RNase-free, and the use of filter tips is recommended for all subsequent pipetting steps.
5. Concanavalin A bead activation
NOTE: This is a critical step. From this point forward, users have the option to continue with the protocol using a commercially available CUT&RUN kit or source key components individually and prepare buffers in-house. If using the commercial kit, all buffers and reagents used below are included in the kit unless otherwise noted. Individual catalog numbers for sourcing reagents independently are also provided in the Table of Materials. Chill all buffers on ice before use. Once step 5 is completed, it is recommended to proceed to step 6 immediately. Avoid repeated freeze-thaw cycles of isolated nuclei, as they are known to increase DNA damage and could lead to poor-quality results.
6. Binding nuclei to activated beads
NOTE: Chill all buffers on ice before use. All buffers supplemented with protease inhibitors should be prepared fresh on the day of the experiment. It is recommended to use 0.2 mL strip tubes in the subsequent steps.
7. Primary antibody binding
NOTE: The pAG-MNase fusion protein binds well to rabbit, goat, donkey, guinea pig, and mouse IgG antibodies17. Generally, most commercial ChIP-seq-certified antibodies are compatible with CUT&RUN procedures. The amount of primary antibody used depends on the efficiency of the antibody, and titration of the antibody (e.g., 1:50, 1:100, 1:200, and 1:400 final dilution) may be necessary if the antibody of interest has not been previously tested in ChIP or CUT&RUN experiments. Chill all buffers on ice prior to use. All buffers used for antibody binding steps should be prepared fresh on the day of the experiment.
8. Binding of pAG-MNase to antibody
9. Targeted chromatin digestion and release
10. Cleanup of collected DNA samples
NOTE: Incubate DNA Purification Beads at room temperature for 30 min before use. Prechill 100% isopropanol on ice. When mixing the samples, pipette up and down 10 times.
11. Library preparation for sequencing
NOTE: The following steps use a commercially available library prep kit. When performing steps using the Ligation Master Mix, minimize touching the tubes and always keep them on ice.
12. CUT&RUN sequence analysis
NOTE: This section presents the computational protocol used to analyze the CUT&RUN sequence data. The protocol begins with setting up the computational virtual environment and walks users through executing the commands on their local machine. This protocol will work on all computational resources, such as local machines, virtual cloud servers, and high-performance computing clusters. All CUT&RUN data presented in this paper can be accessed at NCBI GEO under accession number GSE193803.
13. Generation of the genome file for alignment
14. Downloading C. albicans genome assembly 21
15. Generate a Bowtie 2 index database (database name: ca21)
16. Run the CUT&RUN analysis pipeline
17. Execute the cut_n_run_pipeline.sh file with relevant parameters
18. Organize output files
19. Remove matches to blocklisted genomic regions using the BedTools subtract function
20. Merge BigWig files from replicates using the UCSC bigWigMerge function22
This robust CUT&RUN protocol was adapted and optimized for investigating the genome-wide localization of specific TFs in C. albicans biofilms and planktonic cultures (see Figure 2 for an overview of the experimental approach). A thorough data analysis pipeline is also included to facilitate analysis of the resulting CUT&RUN sequencing data and requires users to have minimal expertise in coding or bioinformatics (see Figure 3 for an overview of the analysis pipeline). Contrary to the ChIP-chip and ChIP-seq methods, CUT&RUN is carried out using intact, permeabilized nuclei prepared from a significantly reduced number of input cells, without formaldehyde crosslinking. Isolating intact nuclei from C. albicans spheroplasts is a critical step in the protocol. Efficient spheroplasting via the digestion of the C. albicans cell wall using lyticase can be challenging as the enzymatic digestion reaction conditions must be optimized for each cell type. Thus, to ensure a successful CUT&RUN experiment with high-quality sequencing results, an early quality control step is included, and the presence of intact nuclei is verified using a standard fluorescence microscope.
Cell wall digestion and nuclear integrity are regularly assessed by visualizing both control intact cells and isolated nuclei stained with fluorescent cell wall and nucleic acid stains. In contrast to the isolated intact nuclei, where cell wall staining is not observed, both nuclei and the cell walls are fluorescently labeled in the intact control cells (Figure 4A). Lastly, prior to sequencing, fragment size distribution of CUT&RUN libraries is evaluated using a capillary electrophoresis instrument. This quality control step is a reliable measure in assessing the quality of CUT&RUN libraries. As seen in the electrophoresis trace on the top panel of Figure 4B, successful libraries generated for experiments investigating TFs show high enrichment for fragments smaller than 280 bp. The electrophoresis trace in the bottom panel of Figure 4B represents results from an unsuccessful CUT&RUN experiment.
Here, the peak at ~2,000 bp arose mainly from undigested DNA released from nuclei extensively damaged or broken during the CUT&RUN experiment. Assessing the fragment size distribution of the final pooled libraries is also recommended to confirm the complete removal of contaminating adapter dimers (Figure 4C). Typically, 5-10 million paired-end reads (~40 base read length) per library provide sufficient sequencing depth for most TF CUT&RUN experiments in C. albicans. Alternative sequencing read lengths, either paired-end or single-end, can be used to sequence CUT&RUN libraries. For example, paired-end read lengths between 25 and 150 bp should not affect the quality of the results. Single-end reads greater than or equal to 150 bp should also work for TF CUT&RUN experiments. However, the accompanying data analysis pipeline must be modified to accommodate single-end reads.
This CUT&RUN protocol and accompanying data analysis pipeline were validated by investigating two TFs, Ndt80 and Efg1, which control C. albicans biofilm formation. As shown in Figure 4D, Ndt80 is bound at intergenic regions (highlighted by the black bars indicating significantly enriched Ndt80 ChIP-chip loci) upstream of the EFG1 ORF. This intergenic region upstream of EFG1 was previously shown to be highly enriched for Ndt80 binding during biofilm formation by ChIP-chip4. However, CUT&RUN experiments identified significantly more peaks within this region than ChIP-chip experiments (10 peaks vs 4 peaks). Ndt80 DNA binding motifs are enriched across all the Ndt80-bound loci identified by CUT&RUN (Supplementary Figure 1), indicating that the additional peaks identified by this methodology are likely bona fide Ndt80-bound sites. A systematic comparative analysis indicated that this presented CUT&RUN protocol successfully identified the majority of the previously known binding events for Ndt80 and Efg1 during biofilm formation4 (Figure 5).
Furthermore, many new TF binding events that were not captured in the previously published ChIP-chip experiments were identified using CUT&RUN (Figure 5). Overall, both Ndt80 and Efg1 bound to loci overlapping with previously published ChIP-chip data and to loci identified only using the CUT&RUN method (overlaps between these CUT&RUN data and previously published ChIP-chip data for Ndt80 and Efg1 are summarized in the Venn diagrams in Figure 5). In addition, the fraction of reads within significantly called peaks (FRiP score) was consistently higher in GFP-tagged samples than in their control IgG samples (see Supplementary Figure 2). In summary, these results show that the CUT&RUN protocol described here is a robust method optimized to investigate C. albicans TF-DNA binding interactions from low-abundance samples.
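The FRiP score referred to above is defined in Supplementary Figure 2 as the number of mapped reads within a peak divided by the total number of mapped reads in a sample. The following is a minimal sketch of that ratio, not the code used by the accompanying pipeline. It assumes reads and peaks are given as half-open (chromosome, start, end) tuples and that a read counts as "within a peak" if it overlaps a peak by at least 1 bp; the function names and the overlap rule are illustrative assumptions.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def frip(reads, peaks):
    """Fraction of reads overlapping at least one peak.

    reads, peaks: lists of (chrom, start, end), half-open coordinates.
    The >= 1 bp overlap rule is an assumption made for this sketch.
    """
    peaks_by_chrom = {}
    for chrom, start, end in peaks:
        peaks_by_chrom.setdefault(chrom, []).append((start, end))
    merged = {c: merge_intervals(iv) for c, iv in peaks_by_chrom.items()}
    starts = {c: [s for s, _ in iv] for c, iv in merged.items()}

    in_peak = 0
    for chrom, start, end in reads:
        intervals = merged.get(chrom)
        if not intervals:
            continue
        # Index of the last merged peak that starts before the read ends.
        i = bisect_right(starts[chrom], end - 1) - 1
        if i >= 0 and intervals[i][1] > start:
            in_peak += 1
    return in_peak / len(reads) if reads else 0.0


# Hypothetical coordinates, for illustration only.
example_reads = [("chr1", 100, 150), ("chr1", 500, 540),
                 ("chr2", 10, 50), ("chr1", 190, 230)]
example_peaks = [("chr1", 120, 200), ("chr1", 180, 260)]
example_score = frip(example_reads, example_peaks)
```

A score computed this way can then be compared against the >1% threshold cited in Supplementary Figure 2.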
Figure 1: Epitope tagging of C. albicans TFs. (A) Identify a TF gene of interest (NDT80 in this case) and select a gRNA to target the cutting of Cas9 (represented by the orange shape) at the 3' end of the ORF. (B) Candida clade-optimized eGFP with >50 bp homology upstream and downstream from the cut site will provide the dDNA to repair the DSB. (C) Use colony PCR to screen individual colonies to confirm the intended integration. Primers are indicated by red arrows, and amplicons are denoted by dashed boxes. This figure was created using BioRender.com. Abbreviations: TF = transcription factor; gRNA = guide RNA; ORF = open reading frame; eGFP = enhanced green fluorescent protein; DSB = double-stranded break; US = upstream; DS = downstream.
Figure 2: Schematic overview of the CUT&RUN experimental protocol. C. albicans cells collected from biofilm or planktonic culture conditions are permeabilized to isolate intact nuclei. ConA beads are activated, and the intact nuclei are then bound to the activated ConA beads. The antibody of interest is added to the bead-bound nuclei and incubated at 4 °C. Next, pAG-MNase is added and allowed to bind to the target antibody. After the addition of CaCl2, pAG-MNase is activated, and targeted chromatin digestion proceeds until the addition of the chelating reagent to inactivate pAG-MNase. The pAG-MNase-bound antibody complex is allowed to diffuse out of the permeabilized nuclei, and the resulting DNA is extracted and cleaned. Sequencing libraries are prepared from the CUT&RUN-enriched DNA fragments, and the resulting libraries are then run on a 10% PAGE gel to separate and remove contaminating adapter dimers prior to sequencing. This figure was created using BioRender.com. Abbreviations: CUT&RUN = cleavage under targets and release using nuclease; ConA = concanavalin A; PAGE = polyacrylamide gel electrophoresis.
Figure 3: Schematic overview of the CUT&RUN data analysis pipeline. The workflow starts by performing quality checks on the raw FASTQ files using FastQC, followed by trimming to remove sequencing adapters. The trimmed reads are then aligned to the reference genome, and the aligned reads are filtered based on their size to enrich for TF-sized binding signals (20 bp ≤ aligned read ≤ 120 bp). Size-selected reads are then calibrated against spike-in Escherichia coli reads, and calibrated reads are used to call peaks using MACS2. The <> symbol is a visual representation to indicate that C. albicans read counts were calibrated using E. coli read counts for each sample. Abbreviations: CUT&RUN = cleavage under targets and release using nuclease; TF = transcription factor.
Figure 4: Quality control steps critical for successful CUT&RUN experiments. (A) Cells are stained with fluorescent cell wall (blue) and nucleic acid (green) stains before and after nuclei isolation and visualized using a fluorescence microscope. (B) CUT&RUN TF libraries are analyzed using a capillary electrophoresis instrument. Successful CUT&RUN TF libraries (indicated by the green checkmark) are enriched for short fragments smaller than 280 bp. Suboptimal CUT&RUN TF libraries (indicated by the red "X") show enrichment for large DNA fragments. (C) 48 CUT&RUN libraries are pooled together and analyzed using a capillary electrophoresis instrument. High-quality, pooled libraries (indicated by the green checkmark) are free of adapter dimers (lane 1), while low-quality pooled libraries (indicated by the red "X") retain small amounts of adapter dimers (lane 2). (D) Representative IGV tracks from CUT&RUN datasets (including the Ndt80 IgG control, bottom track) showing significant enrichment for Ndt80 binding (top track) at the intergenic region upstream of the EFG1 ORF (Orf19.610). The black bars represent significantly enriched Ndt80 binding peaks identified by previously published ChIP-chip experiments4. The blue bars represent significantly enriched Ndt80 binding peaks identified in the CUT&RUN experiments presented here. Scale bars = 50 µm (A). Abbreviations: CUT&RUN = cleavage under targets and release using nuclease; TF = transcription factor; GFP = green fluorescent protein; ORF = open reading frame.
Figure 5: Evaluation of Ndt80 and Efg1 enriched peaks identified using the presented CUT&RUN protocol and data analysis pipeline on Candida albicans biofilms. The Venn diagrams in the top row illustrate the degree of overlap between Ndt80 and Efg1 binding sites identified via CUT&RUN with previously published ChIP-chip data4. The CUT&RUN signals for all binding events for Ndt80 and Efg1 are shown in the bottom row as colored heatmaps (red = high peak signal; blue = low/no peak signal; colored bar indicates IgG signal subtracted from GFP signal); 1,000 bp regions upstream (-1.0 kb) and downstream (+1.0 kb) are displayed in the heatmaps. The signal intensity (i.e., enrichment) as a profile plot is shown above the heatmaps. Three biological replicates for Ndt80 and Efg1 were evaluated for visualizing the CUT&RUN enrichment. Abbreviations: ChIP = chromatin immunoprecipitation; CUT&RUN = cleavage under targets and release using nuclease; GFP = green fluorescent protein.
Table 1: PCR reaction mix and PCR cycling conditions to amplify universal A and unique B fragments.
Table 2: PCR reaction mix and PCR cycling conditions to stitch A and B fragments to create the full-length C fragment.
Table 3: PCR cycling conditions to PCR amplify the full-length C fragment.
Table 4: PCR reaction mix and PCR cycling conditions to amplify the donor DNA GFP tag.
Table 5: Plasmid digestion reaction mix and reaction conditions for the plasmid digestion reactions.
Table 6: PCR reaction mix and PCR cycling conditions for colony PCR.
Table 7: PCR cycling conditions for the end repair step of the library in preparation for sequencing.
Table 8: Adapter dilution recommendations for the input DNA to prepare sequencing libraries.
Table 9: PCR cycling conditions to PCR amplify the adapter-ligated DNA.
Supplementary Figure 1: Ndt80 DNA binding sequence motif enrichment. Cumulative Ndt80 motif enrichment scores for Ndt80-bound loci identified in biofilm ChIP-chip data (dark blue dotted line) or CUT&RUN data (light blue dotted line) compared against random intergenic loci (red dotted line). The lefthand Y-axis indicates the percentage of total loci for each dataset that contains a corresponding cumulative motif score on the X-axis. Dashed lines indicate the significance of the motif score enrichment (-log10 P-value, righthand Y-axis) at each point on the X-axis, relative to the random intergenic control loci. Cumulative motif scores and P-values were determined using MochiView23 (see Table of Materials). All Ndt80-bound loci were sampled as 500 bp windows centered under each peak of Ndt80 binding within the ChIP-chip and CUT&RUN datasets and compared against randomly selected 500 bp intergenic loci to control for differences in length. Abbreviations: ChIP = chromatin immunoprecipitation; ChIP-chip = chromatin immunoprecipitation followed by microarray; CUT&RUN = cleavage under targets and release using nuclease.
Supplementary Figure 2: Fraction of reads within a peak score per sample. Bar plot depicting the FRiP score for each replicate. FRiP scores are calculated as the number of mapped reads within a peak divided by the total number of mapped reads in a CUT&RUN sample. The different bar colors represent individual biological replicate samples, as indicated along the X-axis. As expected, FRiP scores of positive GFP samples are consistently higher than those for negative IgG samples. All samples show acceptable FRiP thresholds (>1%), as per the recommendations set by Landt et al.20. Abbreviations: FRiP = fraction of reads within a peak; CUT&RUN = cleavage under targets and release using nuclease.
Supplementary File 1: Recipes for all buffers and solutions used.
Supplementary File 2: Bioinformatic tools required for the data analysis pipeline.
Supplementary File 3: Recommended blocklisted regions in the C. albicans genome. This list of blocklisted loci primarily contains highly repetitive sequence elements and regions such as telomeric repeats and centromeres that have historically yielded false-positive results in previous genome-wide binding experiments.
This protocol presents a comprehensive experimental and computational pipeline for genome-wide localization of regulatory TFs in C. albicans. It is designed to be highly accessible to anyone with standard microbiology and molecular biology training. By leveraging the high dynamic range and low sample input requirements of the CUT&RUN assay and including optimizations for the localization of TF-DNA binding interactions in C. albicans biofilm and planktonic cultures, this protocol presents a powerful and low-cost alternative to traditional ChIP-seq approaches. Compared to ChIP-seq, CUT&RUN yields significantly higher sensitivity, with a higher proportion of sequencing reads mapped to bound peaks, is more amenable to high-throughput, requires substantially lower input cell numbers, does not require the use of toxic crosslinking agents, and requires tenfold fewer sequencing reads per sample to produce high-quality results13,14,17,23,24. To further reduce the per-sample cost of this protocol, buffer and media recipes are included along with a detailed reagent list to facilitate the in-house preparation of all necessary buffers and media, as well as the bulk sourcing of essential reagents. As C. albicans biofilm formation, phenotypic switching, and commensalism are all regulated by complex interwoven transcriptional networks26, this robust, facile, and cost-effective CUT&RUN protocol provides a powerful new tool for understanding these and many other cellular processes in this important human fungal pathogen.
TFs are not as abundant as histones or other chromatin-associated proteins, creating a unique challenge for investigating TF-DNA binding interactions via CUT&RUN. To address this challenge, critical adjustments and optimizations were made to the standard CUT&RUN experimental protocol13. As most successful CUT&RUN experiments targeting TFs yield a small amount of DNA that is too dilute to quantify and is often enriched for fragments smaller than 150 bp13,23, the library preparation reaction conditions were also optimized to favor these smaller fragments27. Even with this optimization step, the resulting PCR-amplified libraries contain a significant proportion of adapter dimers, which are not completely removed using magnetic bead-based DNA size selection methods. To address this issue, a PAGE gel size-selection step was included to generate final sequencing-ready libraries that are largely devoid of adapter dimers. This is a critical step of the protocol, as removing adapter dimers while retaining the smaller TF-derived CUT&RUN fragments is essential for obtaining high-quality results.
Furthermore, the detailed computational pipeline filters the sequencing data to focus on the smaller reads derived from TF-DNA binding interactions in the CUT&RUN assay. Due to these TF-specific adjustments, this protocol is not recommended for profiling large chromatin-associated complexes such as nucleosomes. While it is theoretically possible to adapt the protocol for this purpose by following the standard library preparation protocol included with the commercially available library prep kit, the user would need to adjust the post sequencing size selection included in the computational pipeline. Specifically, in the size-filtering section in the code file cut_n_run_pipeline.sh, users would need to replace the current value of “14400” (120 bp * 120 bp) with the square of the desired fragment length to enable the analysis pipeline to analyze the sequencing results generated for other types of chromatin-DNA binding interactions.
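As a worked illustration of the threshold arithmetic described above, the sketch below computes the squared cutoff for a chosen maximum fragment length and applies it to signed template lengths. It is not the code in cut_n_run_pipeline.sh, whose exact filtering expression is not reproduced here. The function names are hypothetical, the 20 bp lower bound is taken from the Figure 3 legend, and the rationale for squaring (the template length field in SAM records is signed, so squaring makes the comparison independent of sign) is an assumption about the pipeline's design.

```python
def squared_size_cutoff(max_fragment_bp):
    """Return the value to compare against the squared template length."""
    return max_fragment_bp * max_fragment_bp


def keep_fragment(tlen, max_fragment_bp=120, min_fragment_bp=20):
    """Decide whether a fragment passes the size filter.

    tlen: signed template length (negative for the reverse mate in SAM).
    Squaring removes the sign, so both mates of a pair are treated alike.
    """
    squared = tlen * tlen
    lower = min_fragment_bp * min_fragment_bp
    return lower <= squared <= squared_size_cutoff(max_fragment_bp)


# Default TF-sized cutoff: 120 bp * 120 bp = 14400.
tf_cutoff = squared_size_cutoff(120)

# Hypothetical alternative for larger complexes, e.g., a 500 bp maximum.
large_cutoff = squared_size_cutoff(500)
```

With the defaults, a template length of -95 passes the filter and one of 150 does not; raising the maximum to 500 bp (cutoff 250000) would retain the 150 bp fragment.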
Another key step in a successful CUT&RUN experiment includes choosing optimal post sequencing data analysis parameters. While most of the computational pipeline is designed to be standardized and applicable to the study of any regulatory TF of interest in C. albicans, there are two important considerations that the user should evaluate while running the pipeline. The first consideration is whether to include or remove duplicate reads from the sequencing data prior to the identification of bound target sites. As low-abundance TFs will typically yield sequencing data containing a significant percentage of reads that are derived from PCR duplication during the library amplification step, removing PCR duplicates can have a significantly negative impact on the results. However, with highly abundant TFs or chromatin-associated proteins, PCR duplicates typically represent a smaller portion of the total number of reads and are often removed to suppress background noise in the data. Ultimately, this decision to keep or remove PCR duplicates is dependent on the TF of interest and the depth of the sequencing data obtained. Thus, the pipeline automatically generates independent output files for data derived with or without PCR duplicate reads, so the user can decide which output files yield the best results for each experiment.
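To make the trade-off above concrete, the following sketch shows one common way to define PCR duplicates: fragments that share the same chromosome, start, end, and strand. This definition, the function names, and the example coordinates are assumptions for illustration; the pipeline's own duplicate-handling tool and criteria may differ. The duplication rate gives a quick sense of how much data would be lost by deduplication for a given sample.

```python
def remove_duplicates(fragments):
    """Keep the first occurrence of each (chrom, start, end, strand) tuple."""
    seen = set()
    unique = []
    for fragment in fragments:
        if fragment not in seen:
            seen.add(fragment)
            unique.append(fragment)
    return unique


def duplication_rate(fragments):
    """Fraction of fragments that would be discarded by deduplication."""
    fragments = list(fragments)
    if not fragments:
        return 0.0
    n_removed = len(fragments) - len(remove_duplicates(fragments))
    return n_removed / len(fragments)


# Hypothetical fragments: three copies of one fragment plus two unique ones.
example_fragments = [
    ("chr1", 100, 180, "+"),
    ("chr1", 100, 180, "+"),
    ("chr1", 100, 185, "+"),
    ("chr2", 50, 120, "-"),
    ("chr1", 100, 180, "+"),
]
deduplicated = remove_duplicates(example_fragments)
rate = duplication_rate(example_fragments)
```

Comparing peak calls from both the full and the deduplicated fragment sets mirrors the two sets of output files that the pipeline generates.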
The second consideration is whether to identify and remove problematic loci that yield significant, yet highly variable, enrichment in both experimental (antibody against the protein of interest) and negative control (IgG) samples. The peak-calling algorithm uses MACS2 to identify significantly enriched loci in both the experimental and control samples and excludes those that appear in both. While this approach typically eliminates most problematic loci, some occasionally appear as significant peaks in certain experiments, even though prior experience indicates that they are unlikely to be true positive sites of TF enrichment. Thus, an optional filtering step is provided to remove these problematic loci, referred to as “blocklisted” loci. The list of blocklisted loci primarily contains highly repetitive sequence elements and regions such as telomeric repeats and centromeres that have historically yielded false-positive results in previous genome-wide binding assays. This is a very conservative list of loci assigned as problematic with high confidence. However, each user should evaluate whether this filter is appropriate for their experiment(s) on a case-by-case basis. A potential alternative to the IgG negative control would be to perform CUT&RUN against a nuclear-localized protein that does not bind DNA. This approach has been shown to be an ideal control for ChIP experiments28, and a similar approach is worth considering for CUT&RUN.
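The optional blocklist filter described above can be sketched as a simple interval-overlap test. The protocol performs this step with the BedTools subtract function; the Python below is only a conceptual illustration that drops any peak overlapping a blocklisted region by at least 1 bp. Whether the pipeline removes whole peaks or only the overlapping portion is not stated here, so the whole-peak behavior, the function names, and the coordinates are assumptions.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def filter_blocklisted(peaks, blocklist):
    """Return peaks that do not overlap any blocklisted region.

    peaks, blocklist: lists of (chrom, start, end), half-open coordinates.
    """
    blocked_by_chrom = {}
    for chrom, start, end in blocklist:
        blocked_by_chrom.setdefault(chrom, []).append((start, end))

    kept = []
    for chrom, start, end in peaks:
        blocked = blocked_by_chrom.get(chrom, [])
        if not any(overlaps(start, end, b_start, b_end)
                   for b_start, b_end in blocked):
            kept.append((chrom, start, end))
    return kept


# Hypothetical peaks and blocklisted regions, for illustration only.
example_peaks = [("chr1", 1000, 1200), ("chr1", 5000, 5300), ("chrR", 10, 400)]
example_blocklist = [("chr1", 5250, 6000), ("chrR", 0, 500)]
kept_peaks = filter_blocklisted(example_peaks, example_blocklist)
```

Running the filter with and without the blocklist and comparing the resulting peak lists is one way to judge, case by case, whether the filter is appropriate for a given experiment.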
CUT&RUN has become a popular choice for investigating protein-DNA interactions in higher eukaryotes and the model yeast Saccharomyces cerevisiae. Here, it has been successfully adapted to investigate genome-wide TF-DNA binding interactions in the clinically relevant fungal pathogen C. albicans. This protocol provides detailed methods for all necessary experimental and computational procedures, from the engineering of strains that express epitope-tagged TFs through to the computational analysis of the resulting CUT&RUN sequencing data. Overall, this protocol and the accompanying data analysis pipeline produce robust TF-DNA binding profiles, even when using complex multimorphic populations of cells isolated from low-abundance biofilm samples, and provide superior data quality at a lower overall cost than ChIP-seq methodologies.
The authors have nothing to disclose.
We thank all past and present members of the Nobile and Hernday laboratories for feedback on the manuscript. This work was supported by the National Institutes of Health (NIH) National Institute of General Medical Sciences (NIGMS) award number R35GM124594 and by the Kamangar family in the form of an endowed chair to C.J.N. This work was also supported by the NIH National Institute of Allergy and Infectious Diseases (NIAID) award number R15AI137975 to A.D.H. C.L.E. was supported by the NIH National Institute of Dental and Craniofacial Research (NIDCR) fellowship number F31DE028488. The content is the sole responsibility of the authors and does not represent the views of the funders. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
0.22 μm filter | Millipore Sigma | SLGPM33RS | |
0.65 mL low-adhesion tubes | VWR | 490003-190 | |
1 M CaCl2 | Fisher Scientific | 50-152-341 | |
1 M PIPES | Fisher Scientific | AAJ61224AK | |
12-well untreated cell culture plates | Corning | 351143 | |
2-mercaptoethanol | Sigma-Aldrich | 60-24-2 | |
2% Digitonin | Fisher Scientific | CHR103MI | |
50 mL conical tubes | VWR | 89039-658 | |
5x Phusion HF buffer | Fisher Scientific | F530S | Item part of the Phusion high fidelity DNA polymerase; referred to in the text as "DNA polymerase buffer" |
Agar | Criterion | C5001 | |
Agencourt AMPure XP magnetic beads | Beckman Coulter | A63880 | |
Agilent Bioanalyzer | Agilent | G2939BA | Referred to in the text as "capillary electrophoresis instrument"; user-dependent |
Amplitube PCR reaction strips with attached caps, Simport Scientific | VWR | 89133-910 | |
Bacto peptone | BD Biosciences | 211677 | |
Benchling primer design tool | Benchling | https://www.benchling.com/molecular-biology/ | Referred to in the text as "the primer design tool" |
Betaine | Fisher Scientific | AAJ77507AB | |
Calcofluor white stain | Sigma-Aldrich | 18909-100ML-F | |
Candida Genome Database | http://www.candidagenome.org/ | ||
Concanavalin A (ConA) conjugated paramagnetic beads | Polysciences | 86057-3 | |
Conda software | https://docs.conda.io/en/latest/miniconda.html | ||
curl tool | http://www.candidagenome.org/download/sequence/C_albicans_SC5314/Assembly21/current/C_albicans_SC5314_A21_current_chromosomes.fasta.gz | ||
CUTANA ChIC/CUT&RUN kit | Epicypher | 14-1048 | Referred to in the text as "the CUT&RUN kit" |
Deoxynucleotide (dNTP) solution mix (10 mM) | New England Biolabs | N0447S | |
Dextrose (D-glucose) | Fisher Scientific | D163 | |
Difco D-mannitol | BD Biosciences | 217020 | |
Disposable cuvettes | Fisher Scientific | 14-955-127 | |
Disposable transfer pipets | Fisher Scientific | 13-711-20 | |
DNA Gel Loading Dye (6x) | Fisher Scientific | R0611 | |
DreamTaq green DNA polymerase | Fisher Scientific | EP0713 | Referred to in the text as "cPCR DNA polymerase" |
DreamTaq green DNA polymerase buffer | Fisher Scientific | EP0713 | Item part of the DreamTaq green DNA polymerase; referred to in the text as "cPCR DNA polymerase buffer" |
E. coli spike-in DNA | Epicypher | 18-1401 | |
ELMI Microplate incubator | ELMI | TRMS-04 | Referred to in the text as "microplate incubator" |
End Prep Enzyme Mix | Item part of the NEBNext Ultra II DNA Library Prep kit | ||
End Prep Reaction Buffer | Item part of the NEBNext Ultra II DNA Library Prep kit | ||
Ethanol 200 proof | VWR | 89125-170 | |
FastDigest MssI | Fisher Scientific | FD1344 | Referred to in the text as "restriction enzyme" |
FastDigest MssI Buffer | Fisher Scientific | FD1344 | Item part of the FastDigest MssI kit; referred to in the text as "restriction enzyme buffer" |
Ficoll 400 | Fisher BioReagents | BP525-25 | |
Fluorescence microscope | User-dependent | ||
Gel electrophoresis apparatus | User-dependent | ||
GeneRuler low range DNA ladder | Fisher Scientific | FERSM1192 | |
GitBash workflow | https://gitforwindows.org/ | ||
GitHub source code | https://github.com/akshayparopkari/cut_run_analysis | ||
HEPES-KOH pH 7.5 | Boston BioProducts | BBH-75-K | |
High-speed centrifuge | User-dependent | ||
Isopropanol | Sigma-Aldrich | PX1830-4 | |
Lens paper | VWR | 52846-001 | |
Ligation Enhancer | Item part of the NEBNext Ultra II DNA Library Prep kit | ||
Lithium acetate dihydrate | MP Biomedicals | 215525683 | |
Living Colors Full-Length GFP polyclonal antibody | Takara | 632592 | User-dependent |
MACS2 | https://pypi.org/project/MACS2/ | ||
Magnetic separation rack, 0.2 mL tubes | Epicypher | 10-0008 | |
Magnetic separation rack, 1.5 mL tubes | Fisher Scientific | MR02 | |
MgCl2 | Sigma-Aldrich | M8266 | |
Microcentrifuge tubes 1.5 mL | Fisher Scientific | 05-408-129 | |
Microplate and cuvette spectrophotometer | BioTek | EPOCH2TC | Referred to in the text as "spectrophotometer"; user-dependent |
MochiView | http://www.johnsonlab.ucsf.edu/mochiview-downloads | ||
MOPS | Sigma-Aldrich | M3183 | |
NaCl | VWR | 470302-522 | |
NaOH | Fisher Scientific | S318-500 | |
NCBI GEO | https://www.ncbi.nlm.nih.gov/geo/ | ||
NEBNext Adaptor for Illumina | Item part of the NEBNext Multiplex Oligos for Illumina (Index Primers Set 1); referred to in the text as "Adapter" | ||
NEBNext Index X Primer for Illumina | Item part of the NEBNext Multiplex Oligos for Illumina (Index Primers Set 1); referred to in the text as "Reverse Uniquely Indexed Library Amplification Primer" | ||
NEBNext Multiplex Oligos for Illumina (Index Primers Set 1) | New England Biolabs | E7335S | |
NEBNext Ultra II DNA Library Prep kit | New England Biolabs | E7645S | Referred to in the text as "library prep kit" |
NEBNext Universal PCR Primer for Illumina | Item part of the NEBNext Multiplex Oligos for Illumina (Index Primers Set 1); referred to in the text as "Universal Forward Library Amplification Primer" | ||
Nourseothricin sulfate (NAT) | Goldbio | N-500-2 | |
Novex TBE Gels, 10%, 15 well | Fisher Scientific | EC62755BOX | |
Nutating mixer | VWR | 82007-202 | |
Nutrient broth | Criterion | C6471 | |
pADH110 | Addgene | 90982 | Referred to in the text as "plasmid repository ID# 90982" |
pADH119 | Addgene | 90985 | Referred to in the text as "plasmid repository ID# 90985" |
pADH137 | Addgene | 90986 | Referred to in the text as "plasmid repository ID# 90986" |
pADH139 | Addgene | 90987 | Referred to in the text as "plasmid repository ID# 90987" |
pADH140 | Addgene | 90988 | Referred to in the text as "plasmid repository ID# 90988" |
pAG-MNase | Epicypher | 15-1016 or 15-1116 | 50 rxn or 250 rxn |
pCE1 | Addgene | 174434 | Referred to in the text as "plasmid repository ID# 174434" |
Petri dishes with clear lid | Fisher Scientific | FB0875712 | |
Phusion high fidelity DNA polymerase | Fisher Scientific | F530S | Referred to in the text as "DNA polymerase" |
Polyethylene glycol (PEG) 3350 | VWR | 10791-816 | |
Potassium phosphate monobasic | Fisher Scientific | P285-500 | |
Qubit 1x dsDNA HS assay kit | Invitrogen | Q33230 | |
Qubit fluorometer | Life Technologies | Q33216 | Referred to in the text as "fluorometer"; user-dependent |
Rabbit IgG negative control antibody | Epicypher | 13-0042 | |
RNase A | Sigma-Aldrich | 10109169001 | |
Roche Complete protease inhibitor (EDTA-free) tablets | Sigma-Aldrich | 5056489001 | |
RPMI-1640 | Sigma-Aldrich | R6504 | |
Shaking incubator | Eppendorf | M12820004 | User-dependent |
Sorbitol | Sigma-Aldrich | S1876-500G | |
Spin-X centrifuge tube filters | Fisher Scientific | 07-200-385 | |
Sterile inoculating loops | VWR | 30002-094 | |
SYBR Gold nucleic acid gel stain | Fisher Scientific | S11494 | |
SYTO 13 nucleic acid stain | Fisher Scientific | S7575 | Referred to in the text as "nucleic acid gel stain" |
Thermocycler | User-dependent | ||
ThermoMixer C | Eppendorf | 5382000023 | |
Tris (hydroxymethyl) aminomethane | Sigma-Aldrich | 252859-100G | |
Ultra II Ligation Master Mix | Item part of the NEBNext Ultra II DNA Library Prep kit; referred to in the text as "Ligation Master Mix" | ||
Ultra II Q5 Master Mix | Item part of the NEBNext Ultra II DNA Library Prep kit; referred to in the text as "High Fidelity DNA Polymerase Master Mix" | ||
UltraPure salmon sperm DNA solution | Invitrogen | 15632011 | |
USER Enzyme | Item part of the NEBNext Ultra II DNA Library Prep kit; referred to in the text as "Uracil Excision Enzyme" | ||
Vortex mixer | VWR | 10153-834 | |
wget tool | http://www.candidagenome.org/download/sequence/C_albicans_SC5314/Assembly21/current/C_albicans_SC5314_A21_current_chromosomes.fasta.gz | ||
Yeast extract | Criterion | C7341 | |
Zymolyase 100T (lyticase, yeast lytic enzyme) | Fisher Scientific | NC0439194 |