A Semiautomated ChIP-Seq Procedure for Large-scale Epigenetic Studies

Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) is a powerful and widely used approach to profile chromatin DNA associated with specific histone modifications, such as H3K27ac, to help identify cis-regulatory DNA elements. The manual ChIP-Seq process is labor intensive, technically challenging, and often requires large cell numbers (>100,000 cells). The method described here helps to overcome those challenges. A complete semiautomated, microscaled H3K27ac ChIP-Seq procedure, including cell fixation, chromatin shearing, immunoprecipitation, and sequencing library preparation, is described in detail for batches of 48 samples with inputs of fewer than 100,000 cells. The semiautomated platform reduces technical variability, improves signal-to-noise ratios, and drastically reduces labor. The system also reduces costs by allowing smaller reaction volumes, which limits the amount of expensive reagents such as enzymes, magnetic beads, and antibodies, as well as the hands-on time required. These improvements make the ChIP-Seq method well suited to large-scale, highly reproducible epigenetic studies of clinical samples with limited cell numbers.


Introduction
The wide use of ChIP-Seq assays for determining fragments of DNA associated with specific histone modifications is in part due to their ability to identify cis-regulatory DNA elements, including active enhancers, promoters, silencers, heterochromatin, and others 1,2,3,4. Identification of non-coding regulatory regions across the genome has provided valuable insight into gene regulation in health and disease 4. Previous work from the lab has used ChIP-Seq to show that cis-regulatory elements can play important roles in different cell types 5. Transcription factor (TF) ChIP assays have been used to identify disease-associated risk single-nucleotide polymorphisms 6.
The use of ChIP-Seq with human clinical samples is challenging, mainly due to the limited number of cells or the limited amount of the desired tissue sample. As a result, there has been a concerted effort in the field to improve and microscale these techniques, and several assays have emerged, such as CUT&TAG 5,7,8,9,10,11,12. This assay utilizes a transposase to tagment and isolate genomic regions bound by a specific antibody 9. The technique has reduced the required cell numbers to the thousands and, in some cases, to a single cell. However, its use in translational research and clinical settings has shown limitations because the method requires live cells 9,12. The live-cell requirement makes clinical samples logistically difficult to handle and can introduce batch effects if the samples are not processed at the same time. Others have optimized microscaled techniques for formaldehyde-fixed cells, including the development of ChIPmentation 11, which is adapted here in a high-throughput manner. The use of fixed cells allows samples to be stored until collection is complete and then processed together to minimize batch effects.
Here, a semiautomated microscaled ChIP-Seq assay is described which reduces the experimental hands-on time needed to profile histone modifications 10. The semiautomated method enables high-throughput ChIP-Seq assays, with up to 48 samples fully processed and ready for sequencing in as little as 5 days, for as few as 10,000 cells per sample, using a ChIP liquid-handler. The handler completes the immunoprecipitation (IP) and subsequent washes in an autonomous manner, which helps to reduce variability between samples. The semiautomated method lowers both the hands-on time (by over 15 h for 48 samples) and the technical variability, enabling large-scale epigenetic studies to be conducted in a reproducible and rapid manner for either primary or cultured cells. The protocol explains the process from start to finish for high-quality ChIP-Seq. If the specific machines are not available, the protocol will still be a useful resource for setting up and troubleshooting ChIP-Seq experiments manually.
The assay was performed with three different primary human immune cell types and one cultured cell line (HUT78; ATCC: TIB-161). For clarity, the protocol has been divided into seven sections: cell fixation, chromatin shearing via sonication, automated chromatin immunoprecipitation, library preparation by DNA fragment tagmentation, library amplification, library purification, followed by DNA quantification. For buffer recipes, please refer to Supplementary Table 1.

Volume of antibody = (number of samples × antibody per reaction) / antibody concentration

2.
Add the calculated amount of antibody into 500 μL of tBW1 buffer. Quickly vortex and pulse-spin.

3.
Pipette 70 μL of tBW1 into each of the two ChIP 8-tube strips and add 30 μL of the antibody + tBW1 to each of the tubes. This will bring the total volume in each of the tubes to 100 μL.

4.
Preparation of the magnetic beads

1.
Vortex the protein A bead solution thoroughly. For 0.5 μg of antibody, pipette 5 μL of beads into a new set of ChIP 8-tube strips and pulse-spin.

5.
Fill the last row of the ChIP liquid-handler with labeled, empty ChIP 8-tube strips.

6.
Follow the ChIP-16-IPure-200D program specifications for the placement of all the strips in the ChIP liquid-handler machine. Add the buffers in the correct position but use tW4 instead of tE1 buffer.
NOTE: Organize the day such that the ChIP liquid-handler will perform the ChIP overnight. The program will run for about 16 h for 16 samples. This marks the end of Day 1.
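The antibody volume formula given at the start of the immunoprecipitation setup can be sketched in a few lines of Python. This is only an illustrative helper, not part of the published protocol: the function name and the stock concentration of 1 μg/μL are assumptions, while the 0.5 μg of antibody per reaction and the 16 samples per run are taken from the steps above.

```python
def antibody_volume_ul(n_samples, ug_per_reaction, stock_ug_per_ul):
    """Volume of antibody stock (uL) needed for one run.

    Implements: (number of samples x antibody per reaction) / antibody concentration.
    """
    if stock_ug_per_ul <= 0:
        raise ValueError("stock concentration must be positive")
    return n_samples * ug_per_reaction / stock_ug_per_ul


# 16 samples at 0.5 ug per reaction; the 1 ug/uL stock is an assumed example value.
print(antibody_volume_ul(16, 0.5, 1.0))  # 8.0
```

The result is the volume of antibody stock to add to the tBW1 buffer; check the concentration stated on the antibody lot in use before applying the calculation.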

1.
Pre-set a thermomixer to 37 °C and 500 rpm. Cool down a magnet for 0.2 mL tube strips on ice.

2.
For 16 samples, prepare 440 μL of tagmentation buffer on ice. Pipette 53 μL into each tube of a single new 8-tube strip and keep on ice.

3.
In a new 0.2 mL 8-tube strip, add 220 μL of cold tC1 buffer to each tube and keep on ice. The tubes of the 8-tube strip can hold this volume and still be capped.

4.
Remove the "IP samples" strip tube from the ChIP liquid-handler machine (row 12) and cap the tubes prior to pulse-spinning. Capture the beads using the magnet for 8-tube strips for 2 min and carefully remove the supernatant.

5.
Transfer 25 μL of the tagmentation buffer to the beads with a multichannel pipette, remove the strip from the magnet, and mix gently until the beads are homogenous (about 5 times up and down with the pipette set to 20 μL).

6.
Cap the tubes and place into the pre-heated thermomixer and incubate for 3 min.
Increasing the time will decrease the efficiency of the library preparation.

7.
Transfer the tubes to a chilled metal rack and add 100 μL chilled tC1 buffer to each sample. Set a multi-channel pipette to 80 μL and mix the sample until the beads are homogenous, stopping the tagmentation reaction. Place the samples back into the ChIP liquid-handler and proceed with the wash procedure Washing_for_IP-reacts_16_Ipure. Ensure the washing is performed twice with tC1 buffer and twice with tW4. The elution should be completed as marked by the program layout, with buffer tE1.

9.
Decrosslinking of the DNA

1.
Remove the ChIP 8-tube strips in the last row of the ChIP liquid-handler and add 2 μL RNase A to each sample.

2.
Cap the tubes, pulse-spin, gently mix the beads with a multichannel pipette until the mixture is homogenous, and re-cap the tubes.

3.
Incubate the samples in a thermomixer for 30 min at 37 °C and 900 rpm.

4.
Remove the samples from the thermomixer, add 2 μL of Proteinase K. Follow the same procedure as 4.9.2 after the addition.

5.
Incubate the samples in a thermomixer for 4 h at 55 °C and 1,250 rpm, followed by 65 °C at 1,000 rpm overnight.
NOTE: This is the end of Day 2.

1.
Label sixteen 1.5 mL tubes with the appropriate sample number and add 400 μL of DNA binding buffer from the DNA clean-up kit to each.

2.
Remove the 8-tube strips from the thermomixer and pulse-spin the strips to ensure any evaporated product is retained. Place strips on an 8-strip magnet to capture the beads.

3.
Transfer 100 μL of decrosslinked DNA into each of the 1.5 mL tubes. Add 100 μL of the DNA binding buffer to the 8-tube strips to wash the beads and then transfer to the appropriate 1.5 mL tube.

4.
Vortex for about 10 s and pulse-spin the 1.5 mL tubes.

5.
Load the columns with the 600 μL containing the DNA binding buffer and ChIP sample.

6.
Spin samples for 20 s at 10,000 x g and reload the column with the flow-through.
Spin again with the same conditions and discard the flow-through.

7.
Wash the columns twice with 200 μL wash buffer (same centrifugation as the previous step) and discard the flow-through.

8.
Dry the columns by centrifuging for 2 min at 12,000 x g.

9.
Transfer the column to a new 1.5 mL collection tube and add 9 μL warm TE Buffer (pre-heated to 55 °C) directly to the column matrix. Allow the column to incubate for 1 min before centrifugation for 1 min at 10,000 x g.

10.
Transfer the 9 μL of eluate to an appropriate new set of 8-tube strips.

6.
Optional QC step: Remove 4 μL from each sample and transfer to a QC plate, then add 4 μL of water back to each sample. The QC aliquots are used to determine the percentage of large fragments.

8.
Place on magnet to capture the beads for 5 min and transfer the supernatants to columns 7-12 of the "sample" plate. Remove the plate from the magnet and add 30 μL of beads (final ratio of 1:1.3). Mix carefully and allow it to sit at RT for 2 min.

9.
Capture the beads for 5 min and then discard the supernatant.

11.
Once the pellets are dry, elute the DNA by adding 8 μL of pre-warmed TE buffer to each well while the plate is still on the magnet.

12.
Remove the plate from the magnet, seal, and vortex thoroughly. Allow the plate to incubate for 2 min at RT, pulse-spin, and place the plate back on the magnet for 2 min. Transfer the supernatant to a new plate (Plate 2).

13.
For maximum recovery, repeat elution with an extra 8 μL of prewarmed TE buffer. Place the samples into the appropriate wells such that each sample has 16 μL of final library.
NOTE: At the end of this step there should be two plates (one if no QC plate was completed). The QC plate will have the pre-size-selected fragments and the second plate should have 48 wells of final library (16 μL total).

1.
Complete DNA quantification using a fluorescence quantifying assay or a similar method.

2.
If QC quantification was completed, determine the percentage of sample loss for fragments < 1,000 bp. There should be no more than about 20% loss; a greater loss could indicate an issue with the applied bead ratios.

3.
Determine the size of the fragments of each sample, preferably using a capillary electrophoresis machine. To calculate the molar concentration, use the following equation:

Molar concentration (nM) = [Library concentration (ng/μL) × 10^6] / [660 × Median fragment size (bp)]
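As a worked example of the molar concentration equation in the quantification step, the short Python sketch below converts a fluorescence-based concentration and a median fragment size into nanomolar units. The function name and the example input values are illustrative assumptions; the factor of 660 is the average molar mass (g/mol) of one base pair of double-stranded DNA, as used in the equation.

```python
def library_molarity_nm(conc_ng_per_ul, median_size_bp):
    """Library molarity in nM from concentration (ng/uL) and median size (bp).

    Implements: [concentration (ng/uL) * 10^6] / [660 * median fragment size (bp)].
    """
    if median_size_bp <= 0:
        raise ValueError("median fragment size must be positive")
    return conc_ng_per_ul * 1e6 / (660.0 * median_size_bp)


# Hypothetical library: 3.3 ng/uL with a median fragment size of 500 bp.
print(round(library_molarity_nm(3.3, 500), 2))  # 10.0
```

Computing the molarity per sample in this way makes it straightforward to pool libraries at equal molar amounts before sequencing.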

Representative Results
As proof of concept, ChIP-Seq was completed for six human donors with three sets of immune cell types: naive CD4 T cells (CD4), classical monocytes (MO), and natural killer cells (NK), enriched by FACS sorting as described before 13. The overall procedure consists of nine distinct steps, as represented in Figure 1.
After cell isolation by flow cytometry 13, sorted cells were centrifuged, fixed, and stored as described above. Once all the samples were collected, they were lysed and prepared for chromatin shearing in batches of 12 as described above. For each sample, the number of cycles needed to reach optimal sonication was determined 10. Quantitative measurements, as well as measurements of sheared chromatin fragment size, showed high reproducibility of the method on the three sets of immune cells (Figure 2A). The different human immune cells were sonicated in separate batches and consistently yielded > 70% of the sample between 100 and 500 bp after 14 cycles (16 s ON, 32 s OFF per cycle). At this point, samples with large fragments after sonication (< 70% of the sample between 100 and 500 bp) were considered as failed. Such samples could either be sonicated for 1-2 additional cycles or be discarded and replaced later with cells from another pellet. None of the samples required more sonication or had to be eliminated, indicating the robustness of the procedure.
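The sonication pass/fail criterion described above (more than 70% of the sample between 100 and 500 bp) can be expressed as a simple calculation. The Python sketch below is a simplified illustration: it assumes the fragment-size profile is available as a list of sizes with matching signal values, for example exported from an electrophoresis trace, and it sums discrete points rather than integrating the trace. The function names and the example profile are hypothetical.

```python
def fraction_in_window(sizes_bp, signal, low_bp, high_bp):
    """Fraction of the total signal from fragments with low_bp <= size <= high_bp."""
    if len(sizes_bp) != len(signal):
        raise ValueError("sizes_bp and signal must have the same length")
    total = sum(signal)
    if total <= 0:
        raise ValueError("total signal must be positive")
    inside = sum(s for size, s in zip(sizes_bp, signal) if low_bp <= size <= high_bp)
    return inside / total


def sonication_passes(sizes_bp, signal, min_fraction=0.70):
    """True when more than min_fraction of the signal lies between 100 and 500 bp."""
    return fraction_in_window(sizes_bp, signal, 100, 500) > min_fraction


# Hypothetical profile: fragment sizes (bp) and the signal measured at each size.
sizes = [50, 150, 300, 450, 800, 1500]
signal = [5, 30, 35, 15, 10, 5]
print(fraction_in_window(sizes, signal, 100, 500))  # 0.8
print(sonication_passes(sizes, signal))  # True
```

The same helper could be called with a 200 to 1,000 bp window and a 0.85 threshold to check the final library size distribution.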
After quantification, the samples were run on a ChIP liquid-handler with H3K27ac antibodies, followed by tagmentation with Tn5 transposase enzyme. To determine the appropriate number of amplification cycles by qPCR, 10% of tagmented samples were used.
To determine the number of amplification cycles for each sample, the cycle at which the sample intensity reached half of the average maximum was identified (Figure 2B). Samples with Ct values of more than 18 did not perform well post sequencing, and such a Ct value was thus indicative of a failed ChIP sample. These samples generally also yielded a lower amount of DNA after amplification. Samples (100,000 cells input) with a Ct value less than or equal to 15 were ideal, and samples between 15 and 18 were acceptable but less consistent post sequencing. For samples with less than 100,000 input cells, the Ct values were usually between 15 and 18, and no more than 18 cycles were needed to yield enough product for sequencing.
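The cycle determination and the Ct thresholds described above can be illustrated with a small Python sketch. Several details are assumptions made for the example and are not specified in the text: the plateau ("average maximum") is approximated as the mean of the last three fluorescence readings of the sample, the first cycle at or above half of that value is reported without interpolation, and the fluorescence curve is made up. The thresholds of 15 and 18 are those reported for 100,000-cell inputs.

```python
def half_max_cycle(fluorescence, plateau_points=3):
    """First 1-based cycle whose signal reaches half of the plateau.

    Assumption of this sketch: the plateau is the mean of the last
    `plateau_points` readings of the curve.
    """
    if len(fluorescence) < plateau_points:
        raise ValueError("not enough readings to estimate the plateau")
    plateau = sum(fluorescence[-plateau_points:]) / plateau_points
    threshold = plateau / 2.0
    for cycle, value in enumerate(fluorescence, start=1):
        if value >= threshold:
            return cycle
    return None  # no reading reached the threshold


def classify_ct(ct):
    """Label a Ct value using the thresholds reported for 100,000-cell inputs."""
    if ct <= 15:
        return "ideal"
    if ct <= 18:
        return "acceptable"
    return "failed"


# Hypothetical qPCR curve with 20 cycles of fluorescence readings.
curve = [1] * 10 + [2, 5, 12, 30, 60, 85, 95, 99, 100, 100]
ct = half_max_cycle(curve)
print(ct, classify_ct(ct))  # 15 ideal
```

Qualification of a sample should still be confirmed against the sequencing results, since the thresholds were derived empirically.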
After amplification of the tagmented DNA, libraries were purified and size-selected to obtain an ideal size distribution, ranging from 200 to 1,000 bp, for next-generation sequencing. The size distribution of each library was assessed because the best sequencing data were obtained when more than 85% of the DNA fragments ranged between 200 and 1,000 bp (Figure 2C). Notably, although the same quantity of DNA (measured by fluorescence quantification) was loaded, samples with lower fluorescence intensity generally sequenced poorly (Figure 2C).
Post sequencing, standard quality controls based on the ENCODE ChIP-Seq guidelines were applied 5,14,15 .
For visual quality control, H3K27ac enrichment tracks were prepared for display in the UCSC genome browser. For four gene loci, individual tracks for each sample showed high mapping quality and signal-to-noise ratio, reflecting the high consistency and robustness of the assay (Figure 3A). The two loci to the left harbor well-expressed genes in these cell types, while the genes in the two loci to the right are not expressed and served as background controls 13 (Figure 3A). Further, the MEDIPS analysis package was used post sequencing to assess the correlation index between technical replicates (Figure 3B) 5,16,17, establishing the degree of correlation of read enrichment levels for 500 bp bins 16. For the majority of the pairwise comparisons, Pearson correlation indexes were above 90%, suggesting a high level of consistency between the biological replicates (Figure 3B). Replicates with acceptable correlation were merged to increase the signal-to-noise ratio. While cell type-specific loci showed high enrichment in the appropriate cells, a housekeeping gene (B2M) showed very consistent histone modification (Figure 3C). For the analysis, merging tracks from replicates increases the enrichment, reinforces the specific signal, including for important cell type-specific enhancers, and reduces the interindividual variability inherent to human samples 5.
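To make the replicate comparison concrete, the following Python sketch counts reads in 500 bp bins and computes a Pearson correlation between two replicates. It is a minimal illustration of the general idea only and does not reproduce the MEDIPS package, which is an R/Bioconductor tool; the function names, the use of read start positions on a single region, and the example data are all assumptions.

```python
import math


def bin_counts(read_starts, region_length, bin_size=500):
    """Count reads per fixed-size bin along a region of the given length."""
    n_bins = math.ceil(region_length / bin_size)
    counts = [0] * n_bins
    for pos in read_starts:
        if 0 <= pos < region_length:
            counts[pos // bin_size] += 1
    return counts


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical read start positions for two replicates on a 2,000 bp region.
rep1 = bin_counts([10, 40, 120, 510, 1600, 1650, 1700], 2000)
rep2 = bin_counts([15, 90, 300, 700, 1550, 1620, 1900], 2000)
print(rep1, rep2, round(pearson(rep1, rep2), 3))
```

In practice the comparison would be run genome-wide on aligned reads, and replicates with acceptable correlation would then be merged as described above.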
Although 100,000 cells were used for this study, there was high reproducibility for as few as 10,000 cells in a human cultured T-cell line (HUT78). Correlation analysis between ChIP-Seq datasets generated from samples with less than 100,000 cells showed high reproducibility and correlation down to 10,000 cells (Figure 4A).
Pearson correlation analysis showed a high correlation index (83% to 92%), suggesting maintenance of signal in low cell number samples. However, as the cell numbers were reduced, the background increased and the correlation coefficients dropped (Figure 4B). To maintain low background signals, technical duplicates were merged, and the correlation was tested between groups (Figure 4C).

Discussion
The method described here expands on the ChIPmentation procedure 11, which implements a tagmentation library preparation protocol prior to DNA purification, by automating and microscaling the protocol. Since the onset of ChIP-Seq, the required cell numbers have been reduced drastically, from about 20 million cells for histones down to hundreds and even single cells 1,7,10,12,18,19,20,21. These newly developed methods have allowed for a deeper understanding of how cis-regulatory mechanisms work in cells by increasing the sensitivity and allowing rare clinical cell populations to be tested 5,6,12,17. For instance, one of the more recent and popular procedures, called CUT&TAG, is a robust and sensitive ChIP-Seq alternative 9. It produces an excellent signal-to-noise ratio, as the Tn5 enzyme is covalently bound to protein A and recognizes the Fc chain of the ChIP antibody with high specificity 9. Background activity of the Tn5 enzyme is reduced because the enzyme is not functional before binding to the target antibody 9. However, the implementation of this method in a clinical context is limited since it requires non-fixed, live cells. Also, the removal of DNA fragments from the hypotonic nucleus could have negative effects on the chromatin as it is removed during the assay. The requirement to work with fresh, living cells is a source of issues for rare clinical samples and for large cohorts of samples, since large cohorts can take numerous years to collect 5. Another type of method, drop-ChIP, elegantly uses a microfluidics device to generate droplet-based tagmentation prior to processing the ChIP 19. However, it uses a highly specialized microfluidic device and, while it makes single-cell ChIP-Seq possible, it is also limited to the use of live cells 7,8,9,18,19. Newer methods relying on ChIP-Seq, such as PLAC-Seq or HiChIP, attempt to characterize three-dimensional (3D) interactions between the ChIP-Seq peaks 22,23. These 3D methods are exciting as they identify cis-regulatory or TF-mediated interactions across the genome and improve the understanding of the regulation of gene expression in cell types of interest, in healthy tissues and in the context of disease.
There are a few critical steps to consider for the protocol to be successful, such as the quality of the sonicated chromatin and the quality of the antibody. Shearing efficiency is critical: if the chromatin is not sonicated well, the efficiency of the assay decreases drastically 24. Sonication is a challenging aspect of ChIP-Seq due to the cell numbers required. On the sonicator used in the protocol, efficiency was drastically reduced below 300,000 cells. Sonicating below that level would often require enzymatic fragmentation, which is more biased. As a result, sonication is a major limiting factor for true microscaled ChIP-Seq. Other sonication platforms and commercially available kits were tested for sonicating chromatin, but the sonicator used here gave the most robust and reproducible results. Another advantage of the sonicator is that no specialized tubes need to be purchased to run the sonication, which reduces costs when dealing with large numbers of samples. For optimal sonication, first, it is important to pre-warm the sonicator as described above. Second, to lyse the pellet, it is recommended to have the pipette tip touching the bottom of the tube while lysing, to break up the cells with more physical constraint. Third, any bubble formation prior to sonication hinders the ability of the sample to be sonicated evenly. If any bubbles form during the lysis, it is important to remove them with a pipette. This can be challenging without removing a lot of sample, but if the tip is lightly pressed against the bubble, it can be slowly drawn up without much loss of sample. Lastly, when determining the number of cycles, complete a time-course in which, every three cycles, sample is removed, purified, and run on an agarose gel. Avoid over- or under-sonication of samples, as this decreases the ChIP efficiency. If the sample is under-sonicated, the large fragments can have a negative effect on the ChIP-Seq quality 24. On the other hand, if the sample is over-sonicated, there is a risk of the target epitope being lost in the process.
Another essential part of ChIP-Seq is the quality of the antibody. Prior to running any large-scale study, it is necessary to optimize the antibody that will be used. One goal is to obtain a high signal-to-noise ratio at known regions of the genome, and another is reproducibility. If the antibody is pulling down a lot of background signal, it may be advisable to use a larger input or to try a different lot or supplier. This will add time before starting a large-scale experiment, but it is an essential step. To test the signal-to-noise ratio, it is recommended to use qPCR with a region known to be a target of the antibody and another region where the target is known to be absent. Histone modifications have been found to be more robust and easier to optimize than TFs.
The protocol described above provides a robust method for high-throughput histone-modification ChIP-Seq in a semiautomated, microscaled manner. The method limits the amount of hands-on time and increases the reproducibility over manual ChIP-Seq. Previous studies completed in the lab used manual ChIP on technical replicates and obtained a Spearman correlation average of 0.50 5; with the semiautomated system, however, the Spearman correlation between different donors averaged 0.66 for NK cells (Supplementary Table 2). This was also completed with about 40% less hands-on time. The method described here has been optimized for histone modifications (H3K27ac is shown here, but the protocol should not need any modification for others) and would only require minor modifications to be implemented for TF ChIP-Seq. Aside from the quality of the antibody, the main modifications would be to the sonication time and potentially the buffers used during the IP. Usually, for TF ChIP assays, the method may work better with slightly longer fragments of chromatin (with a range of around 350-800 bp), as TF:DNA complexes are likely less able to be maintained through rigorous sonication 6. The buffers might also need to be changed to a custom mix or to other commercially available kits, as TFs can behave differently than histone modifications.
Although the automated ChIP liquid-handler has been tested for as few as 10,000 cells, there was a noticeable decrease in reproducibility at lower chromatin concentrations. For this reason, the protocol is not recommended for fewer than 10,000 cells, with 100,000 cells being the optimal condition. The protocol was also completed using industry ChIP buffers, which was an added expense but provided higher quality data. The protocol could be modified with regard to the sonication conditions (as long as the sheared chromatin is kept within the same size range), the buffers could be customized for the immunoprecipitation (IP; optimization may be required), or the ChIP liquid-handler could be omitted. A limitation of the protocol is the use of the ChIP liquid-handler, which can be an expensive investment and can only run 16 samples at once. The ChIP liquid-handler is limited to small-scale reactions, and cell numbers greater than one million are not recommended. However, the protocol could be completed without it, by completing the IP and wash steps manually. If the IP and washes are completed by hand, the time to complete the assay will increase and the reproducibility may decrease, but this guide will still be useful for running a high-quality ChIP-Seq experiment. Of note, other liquid handlers could be adapted to run semiautomated ChIP reactions.
To summarize, a major benefit of this system is its high-throughput nature, since the IP and washing steps are completed autonomously. As such, sequential rounds of ChIP experiments can be completed, allowing up to 48 samples to be fully processed and ready for sequencing in 5 days, with limited hands-on time compared to manual ChIP-Seq experiments. Another benefit is the increased reproducibility, since highly reproducible results can be difficult to obtain with ChIP-Seq. Other methods require live cells, complex micropipetting systems, or all of the work to be completed by hand. The system will still have to be optimized for low-input samples (<10,000 cells), which could ultimately allow single-cell ChIP reactions. The system can also be adapted for the newer ChIP methods, such as PLAC-Seq and HiChIP 22,23.