qKAT: Quantitative Semi-automated Typing of Killer-cell Immunoglobulin-like Receptor Genes

Killer cell immunoglobulin-like receptors (KIRs) are a set of inhibitory and activating immune receptors on natural killer (NK) and T cells, encoded by a polymorphic cluster of genes on chromosome 19. Their best-characterized ligands are the human leukocyte antigen (HLA) molecules, which are encoded within the major histocompatibility complex (MHC) locus on chromosome 6. There is substantial evidence that KIRs play a significant role in immunity, reproduction, and transplantation, making it crucial to have techniques that can genotype them accurately. However, high sequence homology, together with allelic and copy number variation, makes it difficult to design methods that can accurately and efficiently genotype all KIR genes. Traditional methods are usually limited in the resolution of the data obtained, throughput, cost-effectiveness, and the time taken to set up and run the experiments. We describe a method called quantitative KIR semi-automated typing (qKAT), a high-throughput multiplex real-time polymerase chain reaction method that can determine the copy number of every gene in the KIR locus. qKAT is simple to run and provides high-resolution KIR copy number data, which can then be used to infer variation in the structurally polymorphic haplotypes that encompass these genes. The copy number and haplotype data can benefit large-scale disease association studies, population genetics, and investigations of the expression of, and functional interactions between, KIR and HLA.

Introduction

sequence-specific oligonucleotide probe (SSOP) PCR 21, and matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) 22. The drawbacks of these techniques are that they provide only partial insight into the genotype of an individual and are laborious to perform. Recently, next-generation sequencing (NGS) has been applied to type the KIR locus specifically. While this method is very powerful, it can be expensive to run, and in-depth analysis and data checks are time-consuming.
qKAT is a high-throughput quantitative PCR method. While conventional methods are laborious and time-consuming, this method makes it possible to run nearly 1,000 genomic DNA (gDNA) samples in five days and gives both the KIR genotype and the gene copy number. qKAT consists of ten multiplex reactions, each of which targets two KIR loci and one reference gene of fixed copy number in the genome (STAT6), which is used for the relative quantification of the KIR gene copy number 23. The assay has been used successfully in studies of large population panels and disease cohorts, covering infectious diseases such as HCV, autoimmune conditions such as type 1 diabetes, and pregnancy disorders such as preeclampsia, and it has provided a genetic underpinning for studies aimed at understanding NK cell function 1,4,24,25,26.
reference gene. The probes that were published in Jiang et al. 27 were modified so that the oligonucleotides are now labeled with ATTO dyes since they offer improved photostability and long signal lifetimes. Pre-aliquoted primer combinations are commercially available (see Table of Materials).

1. Prepare primer combinations for each reaction as per the dilutions given in Table 1.

2. Prepare probe combinations for each reaction as per Table 1. Test each individual probe prior to making the combination.

Preparation of the Master Mix
NOTE: The volumes mentioned below are for performing one qKAT reaction on a set of ten 384-well plates.

1. Ensure that the gDNA samples plated on the 384-well plates are completely dry. Conduct all steps on ice and keep the reagents protected from light as much as possible, since the fluorescence-labeled probes are photo- and thermosensitive.

3. On ice, prepare a master mix for ten 384-well plates by adding 18.86 mL of ultrapure water, 20 mL of qPCR buffer, 1,000 μL of pre-prepared primer combination, and 180 μL of pre-prepared probe combination (Table 2).

4. Distribute the master mix evenly across a 96-deep-well plate using a multichannel pipette, pipetting 415 μL into each well. Keep this plate in an ice box, protected from light.

5. Using a liquid handling instrument, dispense 9.5 μL of the master mix into each well of the 384-well plate with dried gDNA. Seal the plate with foil and immediately place it at 4 °C. Repeat this process for the remaining plates, ensuring that the needles of the liquid handling system are washed with water between plates.

6. Centrifuge the 384-well plates at 450 x g for 3 min and incubate them at 4 °C overnight (or for 6-12 h) to resuspend the DNA and to dissipate any air bubbles.
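The volumes in the master-mix steps above can be cross-checked with a little arithmetic. The following is a minimal sketch that only uses the numbers stated in the protocol; the function names and the linear scaling to fewer plates are illustrative assumptions, not part of the published method.

```python
# Component volumes for one reaction across ten 384-well plates, in microlitres
# (values taken from the master-mix step of the protocol).
COMPONENTS_UL = {
    "ultrapure water": 18_860.0,
    "qPCR buffer": 20_000.0,
    "primer combination": 1_000.0,
    "probe combination": 180.0,
}
PLATES = 10
WELLS_PER_PLATE = 384
DISPENSE_UL = 9.5          # master mix dispensed into each 384-well reaction
DEEP_WELLS = 96
DEEP_WELL_FILL_UL = 415.0  # master mix staged in each deep well


def master_mix_summary(components=COMPONENTS_UL):
    """Return total prepared, staged, and dispensed volumes (microlitres)."""
    total = sum(components.values())
    dispensed = PLATES * WELLS_PER_PLATE * DISPENSE_UL
    staged = DEEP_WELLS * DEEP_WELL_FILL_UL
    return {
        "total_ul": total,
        "staged_ul": staged,
        "dispensed_ul": dispensed,
        "overage_ul": total - dispensed,
    }


def scale_components(n_plates, components=COMPONENTS_UL):
    """Assumption: volumes scale linearly with the number of plates."""
    return {name: vol * n_plates / PLATES for name, vol in components.items()}


if __name__ == "__main__":
    for key, value in master_mix_summary().items():
        print(f"{key}: {value:.0f}")
```

With the stated volumes, 40.04 mL is prepared, 39.84 mL is staged in the deep-well plate, and 36.48 mL is dispensed, which leaves roughly 3.6 mL of dead volume for the pipetting steps.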

1. Following the overnight incubation, centrifuge at 450 x g for 3 min to dissipate any remaining air bubbles.

2. For purposes of automation, connect the qPCR machine (e.g., LightCycler 480) to a microplate handler (see Table of Materials). Program the microplate handler to place the plates into the qPCR machine from a cooled storage dock that is protected from light.

NOTE: The assays should, in theory, work on other qPCR machines with compatible optic settings.

3. Use the following cycling conditions: 95 °C for 5 min followed by 40 cycles of 95 °C for 15 s and 66 °C for 50 s, with data collection at 66 °C.

4. Once the run is complete, have the robot collect the plate from the qPCR machine and place it in the discard dock.
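When scheduling plates on the microplate handler, it can help to know the minimum programmed time per run. The sketch below simply adds up the hold times from the cycling conditions above. It ignores temperature ramping and plate handling, which the protocol does not specify, so the result is a lower bound; the function name is hypothetical.

```python
def cycling_time_seconds(initial_denaturation_s=5 * 60, cycles=40,
                         denaturation_s=15, anneal_extend_s=50):
    """Sum of programmed hold times; ramp and handling times are excluded."""
    return initial_denaturation_s + cycles * (denaturation_s + anneal_extend_s)


if __name__ == "__main__":
    per_plate = cycling_time_seconds()
    print(f"per plate: {per_plate} s ({per_plate / 60:.1f} min)")
    print(f"ten plates: {10 * per_plate / 3600:.2f} h")
```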

1. After amplification, calculate the quantification cycle (Cq) values using either the second derivative maximum method or the Fit Points method with the software of the qPCR machine (see Table of Materials), following the steps below.

2. Open the qPCR software and, in the Navigator tab, open the saved reaction experiment file for one plate.

3. For the analysis using the second derivative maximum method, select the Analysis tab and create a new analysis using the Abs Quant/Second Derivative Max method.

1. In the Create new analysis window, select analysis type: Abs Quant/Second Derivative Max method, subset: All Samples, program: Amplification, name: Rx-DFO (where x is the reaction number).

2. Select Filter Comb and choose VIC/HEX/Yellow555 (533-580). This ensures that the data collected for STAT6 is selected.

1. In the Create new analysis window, select analysis type: Abs Quant/Fit Points method, subset: All Samples, program: Amplification, name: RxF-DFO (where x is the reaction number).

2. Select the correct filters and color compensations for STAT6 and each of the KIR genes (Fam/Cy5). In the Noiseband tab, set the noise band to exclude the background noise.

3. In the Analysis tab, set the fit points to 3 and select Show fit points. Click Calculate. Click Save file.

1. In the qPCR software, open the Navigator tab. Select Results Batch Export.

2. Open the folder in which the experiment files are saved and transfer the files into the right-hand section of the window. Click Next. Select the name and the location of the export file.

3. Select Analysis type Abs Quant/Second Derivative Max method or Abs Quant/Fit Points. Click Next. Check that the name of the file, the export folder, and the analysis type are correct, and click Next to start the export process.

4. Wait until the Export Status is Ok. The screen will automatically move to the next step. Check that all selected files have been exported successfully, so that the number of files failed = 0. Click Done.

5. Use the scripts split_file.pl and roche2sds.pl to split the exported plates into individual reactions for each plate.
NOTE: The scripts are available on request or from GitHub.

1. Open the copy number analysis software (e.g., CopyCaller). Select Import realtime PCR results file and load text files created by roche2sds.pl.

2. Select Analyze and conduct the analysis by either selecting calibrator sample with known copy number or by selecting most frequent copy number. See Table 5 for the most frequent copy number of KIR genes typically observed in European-origin populations.

8. Data-quality Checks

1. Use the R script KIR_CNVdata_analysis_for_Excel_ver020215.R to combine copy number data from all the plates into a spreadsheet.
NOTE: The scripts are available on request or from GitHub.

2. Recheck the raw data in the copy number analysis software for samples that do not conform to the known linkage disequilibrium (LD) for KIR genes (Table 6).
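The LD-based recheck can be automated once the combined copy number table is available. The following is a minimal sketch of such a rule checker. The single rule shown is a hypothetical placeholder used only to make the example runnable; it is not taken from Table 6, and the actual rules from that table should be substituted. The data layout (a dictionary of per-sample copy numbers) is also an assumption.

```python
from typing import Callable, Dict, List, Tuple

CopyNumbers = Dict[str, int]
Rule = Tuple[str, Callable[[CopyNumbers], bool]]

# HYPOTHETICAL placeholder rule for illustration only; replace with the
# LD rules listed in Table 6 of the protocol.
EXAMPLE_RULES: List[Rule] = [
    ("KIR3DL1 + KIR3DS1 == KIR2DL4 (placeholder)",
     lambda cn: cn["KIR3DL1"] + cn["KIR3DS1"] == cn["KIR2DL4"]),
]


def flag_for_recheck(samples: Dict[str, CopyNumbers],
                     rules: List[Rule]) -> Dict[str, List[str]]:
    """Return, per sample, the names of the rules it does not satisfy."""
    flagged: Dict[str, List[str]] = {}
    for sample_id, copy_numbers in samples.items():
        broken = [name for name, check in rules if not check(copy_numbers)]
        if broken:
            flagged[sample_id] = broken
    return flagged


if __name__ == "__main__":
    demo = {
        "S1": {"KIR2DL4": 2, "KIR3DL1": 1, "KIR3DS1": 1},
        "S2": {"KIR2DL4": 2, "KIR3DL1": 2, "KIR3DS1": 1},
    }
    print(flag_for_recheck(demo, EXAMPLE_RULES))
```

Flagged samples are candidates for a manual look at the raw data rather than automatic rejection, since a genuine structural variant would also break a rule.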

Representative Results
Copy number analysis can be carried out by exporting the files to the copy number analysis software, which provides the predicted and estimated copy number based on the ΔΔCq method.
The copy number can be predicted either based on the known copy number of control DNA samples on the plate or by inputting the most frequent gene copy number (Table 5). Figure 1 shows the results of a plate for a reaction that targets KIR2DL4 and KIR3DS1, as well as the reference gene STAT6. The most frequent copy number for KIR2DL4, a framework gene in the KIR locus, is two copies, whereas the most frequent copy number for KIR3DS1, an activating gene, is one copy. The results in the figure show the PCR amplification plots observed on the qPCR software and the copy number data generated from the qPCR data. As shown, the assay is able to distinguish between 0, 1, 2, 3, and 4 KIR gene copy numbers.
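For readers unfamiliar with the ΔΔCq calculation, the sketch below shows the standard textbook relation between Cq values and relative copy number, using STAT6 as the reference and a calibrator sample of known copy number. It assumes 100% amplification efficiency for both target and reference, and it is not a description of how the copy number analysis software computes its estimates internally; the function names are illustrative.

```python
from typing import Optional


def estimate_copy_number(cq_target: Optional[float], cq_reference: float,
                         cal_cq_target: float, cal_cq_reference: float,
                         cal_copies: int) -> float:
    """Estimated copy number via the standard delta-delta-Cq relation.

    Assumes a doubling of product per cycle (100% efficiency).
    A missing target Cq (no amplification) is treated as zero copies.
    """
    if cq_target is None:
        return 0.0
    delta_sample = cq_target - cq_reference
    delta_calibrator = cal_cq_target - cal_cq_reference
    delta_delta = delta_sample - delta_calibrator
    return cal_copies * 2.0 ** (-delta_delta)


def predict_copy_number(estimate: float) -> int:
    """Predicted copy number: the nearest whole number to the estimate."""
    return int(round(estimate))


if __name__ == "__main__":
    # Calibrator with 2 copies; the sample's target crosses one cycle later.
    est = estimate_copy_number(26.0, 25.0, 25.0, 25.0, cal_copies=2)
    print(est, predict_copy_number(est))
```

A one-cycle delay relative to the calibrator halves the estimate, which is why small Cq shifts matter more at higher copy numbers: the Cq difference between three and four copies is well under half a cycle.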
The copy number analysis software can also display the distribution of copy numbers across the plate as a pie chart or a bar graph. The copy number prediction is less reliable for samples with a higher copy number.
The quality of all the materials used in the reactions (gDNA, buffer, primers, and probes) can affect the accuracy of the results obtained. However, discordant results are most likely to be caused by variation in DNA concentration across a plate. The purity of the extracted gDNA, which can be measured using the 260/280 and 260/230 ratios, can also affect the quality. A 260/280 ratio of 1.8-2 and a 260/230 ratio of 2-2.2 are desirable. An uneven range of DNA concentrations across a plate can lead to high variability in the threshold cycle (Ct) between samples and to discordance in the range of the estimated copy number. The results in Figure 2 show the effect that disparity between the Ct values across a plate can have on the accuracy of the copy number prediction. The red line indicates the range of the estimated copy number for a sample; ideally, the estimate should be as close to an integer as possible.
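These quality criteria are easy to screen for before and after a run. The following is a minimal sketch: the purity ranges are the ones stated above, while the helper names and the idea of summarizing plate uniformity as the spread of reference-gene Ct values are illustrative assumptions. The protocol gives no numeric cut-off for Ct spread, so none is hard-coded.

```python
from typing import Sequence


def purity_ok(ratio_260_280: float, ratio_260_230: float) -> bool:
    """True if both absorbance ratios fall in the desirable ranges."""
    return 1.8 <= ratio_260_280 <= 2.0 and 2.0 <= ratio_260_230 <= 2.2


def reference_ct_spread(ct_values: Sequence[float]) -> float:
    """Range of reference-gene Ct values across a plate (max minus min)."""
    return max(ct_values) - min(ct_values)


def distance_from_integer(estimated_copy_number: float) -> float:
    """How far an estimated copy number lies from the nearest integer."""
    return abs(estimated_copy_number - round(estimated_copy_number))


if __name__ == "__main__":
    print(purity_ok(1.9, 2.1))
    print(reference_ct_spread([24.0, 25.5, 24.5]))
    print(distance_from_integer(1.75))
```

A large Ct spread for the reference gene points to uneven DNA input across the plate, and estimates far from an integer are the ones most worth rechecking.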
The copy number data, once analyzed, can be exported as a spreadsheet file in a 96-well format. We used an R script (available on request) to combine the copy number data of all 10 plates that are run as a set into one spreadsheet. Published data on KIRs, mostly from European-origin populations, make it possible to predict the LD rules that exist between various genes in the KIR complex 1. These predictions are used to conduct downstream checks on the copy number results obtained (Table 6). Samples that do not conform to the predicted LD between the genes might contain unusual polymorphisms or haplotypic structural variations. A flowchart describing the protocol is shown in Figure 3.
A tool called KIR Haplotype Identifier (http://www.bioinformatics.cimr.cam.ac.uk/haplotypes/) was developed to facilitate the imputation of haplotypes from the data set. The imputation is based on a list of reference haplotypes observed in a European-origin population 1. However, the tool also allows a custom set of reference haplotypes to be used instead. Three separate files are generated: the first lists all haplotype combinations for a sample, the second provides a trimmed list of the haplotype combinations that have the highest combined frequencies, and the third lists the samples that cannot be assigned haplotypes. Non-assignment of haplotypes could be used as an indicator of novel haplotypes.
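The general idea behind this kind of imputation can be sketched in a few lines: enumerate pairs of reference haplotypes whose gene copy numbers add up to the observed copy numbers, then rank the pairs by frequency. The code below is only a sketch of that idea, not the implementation of KIR Haplotype Identifier. The reference haplotypes and their frequencies are invented toy values, and ranking by the product of the two haplotype frequencies is an assumption about what "combined frequency" means.

```python
from itertools import combinations_with_replacement
from typing import Dict, List, Tuple

# TOY reference set (hypothetical names, gene content, and frequencies).
REFERENCE: Dict[str, Tuple[Dict[str, int], float]] = {
    "hapA": ({"KIR2DL4": 1, "KIR3DL1": 1, "KIR3DS1": 0}, 0.6),
    "hapB": ({"KIR2DL4": 1, "KIR3DL1": 0, "KIR3DS1": 1}, 0.3),
    "hapC": ({"KIR2DL4": 2, "KIR3DL1": 1, "KIR3DS1": 1}, 0.1),
}


def impute_haplotype_pairs(sample: Dict[str, int],
                           reference=REFERENCE) -> List[Tuple[Tuple[str, str], float]]:
    """List haplotype pairs consistent with the sample, most frequent first."""
    matches = []
    for h1, h2 in combinations_with_replacement(sorted(reference), 2):
        genes1, freq1 = reference[h1]
        genes2, freq2 = reference[h2]
        consistent = all(
            genes1.get(gene, 0) + genes2.get(gene, 0) == copies
            for gene, copies in sample.items()
        )
        if consistent:
            # Assumption: combined frequency is the product of the two.
            matches.append(((h1, h2), freq1 * freq2))
    matches.sort(key=lambda item: item[1], reverse=True)
    return matches


if __name__ == "__main__":
    print(impute_haplotype_pairs({"KIR2DL4": 2, "KIR3DL1": 1, "KIR3DS1": 1}))
    # An empty list means no pair of reference haplotypes explains the sample.
    print(impute_haplotype_pairs({"KIR2DL4": 2, "KIR3DL1": 0, "KIR3DS1": 0}))
```

In this toy setup, a sample that returns an empty list corresponds to the third output file described above: it cannot be explained by the reference set and may carry a novel haplotype.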

Discussion
We described a novel semi-automated high-throughput method, called qKAT, which facilitates copy number typing of KIR genes. The method is an improvement over conventional methods like SSP-PCR, which are low-throughput and can only indicate the presence or absence of these highly polymorphic genes.
The accuracy of the copy number data obtained depends on multiple factors, including the quality and concentration uniformity of the gDNA samples and the quality of the reagents. Consistent gDNA quality and concentration across a plate are extremely important, since variations in concentration across the plate can result in errors in the calculation of the copy number. Since the assays were validated using European-origin sample sets, data from cohorts from other parts of the world require more thorough checks. This is to ensure that instances of allele dropout or non-specific primer/probe binding are not misinterpreted as copy number variation.
While the assays were designed and optimized to run as high-throughput, they can be modified to run fewer samples. The confidence metric in the copy number analysis software is affected when analyzing fewer samples, but this can be improved if control genomic DNA samples with a known KIR gene copy number are included on the plate and additional sample replicates are included.
For laboratories without liquid/plate-handling robots, master mix can be dispensed using multi-channel pipettes and plates can be manually loaded into the qPCR instrument.
The main aim behind the development of qKAT was to create a simple, high-throughput, high-resolution, and cost-effective method to genotype KIRs for disease association studies.
This was successfully achieved since qKAT has been employed in investigating the role of KIR in several large disease association studies, including a range of infectious diseases, autoimmune conditions, and pregnancy disorders 4,24,25,26.

Table 1: Combination and concentration of primers and probes used in each qKAT reaction 27.

Table 3: List of probes used in qKAT 1,27. The fluorescent dyes used at the 5' end of the oligo probes P5b, P5b-2DL4, P9, and PSTAT6 were modified to ATTO dyes.