The micronucleus (MN) assay is a well-established test for quantifying DNA damage. However, scoring the assay using conventional techniques such as manual microscopy or feature-based image analysis is laborious and challenging. This paper describes the methodology to develop an artificial intelligence model to score the MN assay using imaging flow cytometry data.
The micronucleus (MN) assay is used worldwide by regulatory bodies to evaluate chemicals for genetic toxicity. The assay can be performed in two ways: by scoring MN in once-divided, cytokinesis-blocked binucleated cells or fully divided mononucleated cells. Historically, light microscopy has been the gold standard method to score the assay, but it is laborious and subjective. Flow cytometry has been used in recent years to score the assay, but is limited by the inability to visually confirm key aspects of cellular imagery. Imaging flow cytometry (IFC) combines high-throughput image capture and automated image analysis, and has been successfully applied to rapidly acquire imagery of and score all key events in the MN assay. Recently, it has been demonstrated that artificial intelligence (AI) methods based on convolutional neural networks can be used to score MN assay data acquired by IFC. This paper describes all steps to use AI software to create a deep learning model to score all key events and to apply this model to automatically score additional data. Results from the AI deep learning model compare well to manual microscopy, therefore enabling fully automated scoring of the MN assay by combining IFC and AI.
The micronucleus (MN) assay is fundamental in genetic toxicology to evaluate DNA damage in the development of cosmetics, pharmaceuticals, and chemicals for human use1,2,3,4. Micronuclei are formed from whole chromosomes or chromosome fragments that do not incorporate into the nucleus following division and condense into small, circular bodies separate from the nucleus. Thus, MN can be used as an endpoint to quantify DNA damage in genotoxicity testing1.
The preferred method for quantifying MN is within once-divided binucleated cells (BNCs) by blocking division using Cytochalasin-B (Cyt-B). In this version of the assay, cytotoxicity is also assessed by scoring mononucleated (MONO) and polynucleated (POLY) cells. The assay can also be performed by scoring MN in unblocked MONO cells, which is faster and easier to score; in this version, cytotoxicity is assessed from proliferation, measured using pre- and post-exposure cell counts5,6.
Physical scoring of the assay has historically been performed through manual microscopy, since this permits visual confirmation of all key events. However, manual microscopy is challenging and subjective1. Thus, automated techniques have been developed, including microscope slide scanning and flow cytometry, each with its own advantages and limitations. While slide-scanning methods allow key events to be visualized, slides must be created at optimal cell density, which can be difficult to achieve. Additionally, this technique often lacks cytoplasmic visualization, which can compromise the scoring of MONO and POLY cells7,8. While flow cytometry offers high-throughput data capture, the cells must be lysed, thus not permitting the use of the Cyt-B form of the assay. Additionally, as a non-imaging technique, conventional flow cytometry does not provide visual validation of key events9,10.
Therefore, imaging flow cytometry (IFC) has been investigated to perform the MN assay. The ImageStreamX Mk II combines the speed and statistical robustness of conventional flow cytometry with the high-resolution imaging capabilities of microscopy in a single system11. It has been shown that by using IFC, high-resolution imagery of all key events can be captured and automatically scored using feature-based12,13 or artificial intelligence (AI) techniques14,15. Performing the MN assay by IFC therefore allows many more cells to be scored automatically, and in less time, than is possible by microscopy.
This work deviates from a previously described image analysis workflow16 and discusses all steps required to develop and train a Random Forest (RF) and/or convolutional neural network (CNN) model using the Amnis AI software (henceforth referred to as "AI software"). All necessary steps are described, including populating ground truth data using AI-assisted tagging tools, interpretation of model training results, and application of the model to classify additional data, permitting calculation of genotoxicity and cytotoxicity15.
1. Data acquisition using imaging flow cytometry
NOTE: Refer to Rodrigues et al.16 with the following modifications, noting that the acquisition regions using IFC may need to be modified for optimal image capture:
2. Creating .daf files for all .rif files
3. Creating an experiment in the AI software
4. Populating the ground truth data using AI-assisted tagging tools
5. Assessing model accuracy
6. Classifying data using the model
7. Generating a report of the classification results
8. Determining MN frequency and cytotoxicity
Figure 1 shows the workflow for using the AI software to create a model for the MN assay. The user loads the desired .daf files into the AI software, then assigns objects to the ground truth model classes using the AI-assisted cluster (Figure 2) and predict (Figure 3) tagging algorithms. Once all ground truth model classes have been populated with sufficient objects, the model can be trained using the RF or CNN algorithms. Following training, the performance of the model can be assessed using tools including class distribution histograms, accuracy statistics, and an interactive confusion matrix (Figure 4). From the results screen in the AI software, the user can either return to the training portion of the workflow to enhance the ground truth data or, if sufficient accuracy has been achieved, the user can use the model to classify additional data.
Using both the cluster and predict algorithms, 190 segments with a total of 285,000 objects were assigned to the proper ground truth classes until all classes were populated with between 1,500 and 10,000 images. In total, 31,500 objects (only 10.5% of the initial objects loaded) were used in the training of this model. Precision (the percentage of objects predicted to belong to a class that truly belong to it), recall (the percentage of objects truly belonging to a class that the model correctly identifies), and F1 score (the harmonic mean of precision and recall) are available in the deep learning software package to quantify model accuracy. Here, these statistics ranged from 86.0% to 99.4%, indicating high model accuracy (Figure 4).
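The accuracy statistics reported by the software can be reproduced from a confusion matrix. The following is a minimal sketch, assuming rows hold the ground truth classes and columns hold the predicted classes; the function name `per_class_metrics`, the class labels, and all counts are hypothetical and are not taken from the AI software or from the data in this work.

```python
from typing import Dict, List


def per_class_metrics(confusion: List[List[int]],
                      labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Compute precision, recall, and F1 for each class.

    confusion[i][j] is the number of objects whose ground truth class is
    labels[i] and whose predicted class is labels[j] (assumed layout).
    """
    n = len(labels)
    metrics: Dict[str, Dict[str, float]] = {}
    for i, label in enumerate(labels):
        tp = confusion[i][i]                               # correctly classified
        fn = sum(confusion[i]) - tp                        # missed members of this class
        fp = sum(confusion[r][i] for r in range(n)) - tp   # wrongly assigned to this class
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Hypothetical three-class example (illustrative counts only).
labels = ["BNC", "BNC with MN", "Irregular"]
confusion = [
    [90, 5, 5],    # ground truth: BNC
    [10, 80, 10],  # ground truth: BNC with MN
    [0, 5, 95],    # ground truth: Irregular
]

results = per_class_metrics(confusion, labels)
for name, values in results.items():
    print(name, {k: round(100 * v, 1) for k, v in values.items()})
```

Off-diagonal counts in a given row lower recall for that class, while off-diagonal counts in a given column lower its precision, which is why inspecting both directions of the confusion matrix is informative.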
Using Cyt-B, background MN frequencies for all control samples were between 0.43% and 1.69%, comparing well to the literature17. Statistically significant increases in MN frequency, ranging from 2.09% to 9.50% for Mitomycin C (MMC) and from 2.99% to 7.98% for Etoposide, were observed when compared to solvent controls and compared well to manual microscopy scoring. When examining the negative control Mannitol, no significant increases in MN frequency were observed. Additionally, increasing cytotoxicity with the dose was observed for both Etoposide and MMC, with both microscopy and AI showing similar trends across the dose range. For Mannitol, no observable increase in cytotoxicity was seen (Figure 5).
When not using Cyt-B, background MN frequencies for all control samples were between 0.38% and 1.0%, consistent with results published in the literature17. Statistically significant increases in MN frequency, ranging from 2.55% to 7.89% for MMC and from 2.37% to 5.13% for Etoposide, were observed when compared to solvent controls and compared well to manual microscopy scoring. When examining the negative control Mannitol, no significant increases in MN frequency were observed. Further, increasing cytotoxicity with the dose was observed for both Etoposide and MMC, with both microscopy and AI showing similar trends across the dose range. For Mannitol, no observable increase in cytotoxicity was seen (Figure 5).
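Statistical significance of an increase in MN frequency relative to the solvent control can be checked with a one-sided Fisher's Exact Test on a 2 × 2 table of micronucleated versus non-micronucleated cells. The sketch below is one possible implementation using only the Python standard library; the function name `fisher_exact_greater` and the example counts are hypothetical and are not the values underlying Figure 5.

```python
from math import comb


def fisher_exact_greater(mn_treated: int, total_treated: int,
                         mn_control: int, total_control: int) -> float:
    """One-sided Fisher's Exact Test p-value.

    Returns the probability, under the null hypothesis of equal MN
    frequency, of observing at least `mn_treated` micronucleated cells in
    the treated culture given the fixed row and column totals
    (upper tail of the hypergeometric distribution).
    """
    n_all = total_treated + total_control   # all scored cells
    mn_all = mn_treated + mn_control        # all micronucleated cells
    denominator = comb(n_all, total_treated)
    upper = min(mn_all, total_treated)
    tail = sum(
        comb(mn_all, k) * comb(n_all - mn_all, total_treated - k)
        for k in range(mn_treated, upper + 1)
    )
    return tail / denominator


# Hypothetical example: 50 MN in 1,000 treated cells versus
# 10 MN in 1,000 solvent control cells.
p_value = fisher_exact_greater(50, 1000, 10, 1000)
print(f"one-sided p = {p_value:.3g}")
```

Exact integer arithmetic is used for the binomial coefficients, so the sketch remains accurate for the large cell counts typical of IFC data, although a statistics library would be faster for routine use.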
When scoring by microscopy, from each culture, 1,000 binucleated cells were scored to assess MN frequency and another 500 mononucleated, binucleated, or polynucleated cells were scored to determine cytotoxicity in the Cyt-B version of the assay. In the non-Cyt-B version of the assay, 1,000 mononucleated cells were scored to assess MN frequency. By IFC, an average of 7,733 binucleated cells, 6,493 mononucleated cells, and 2,649 polynucleated cells were scored per culture to determine cytotoxicity. MN frequency was determined from within the binucleated cell population for the Cyt-B version of the assay. For the non-Cyt-B version of the assay, an average of 27,866 mononucleated cells were assessed for the presence of MN (Figure 5).
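Once objects have been classified, MN frequency and cytotoxicity follow from simple arithmetic on the class counts. The sketch below assumes the Cyt-B version of the assay and assumes that cytotoxicity is expressed through the cytokinesis-block proliferation index (CBPI) as defined in OECD Test Guideline 487; the exact calculation used in this work is not restated here, so treat the formulas as that assumption. Function names and the treated-culture counts are hypothetical; the control counts reuse the per-culture averages quoted above purely for illustration.

```python
def mn_frequency(cells_with_mn: int, cells_without_mn: int) -> float:
    """Percentage of scored cells (e.g., BNCs) that contain MN."""
    total = cells_with_mn + cells_without_mn
    if total == 0:
        raise ValueError("no cells scored")
    return 100.0 * cells_with_mn / total


def cbpi(mono: int, bnc: int, poly: int) -> float:
    """Cytokinesis-block proliferation index (OECD TG 487 definition, assumed)."""
    total = mono + bnc + poly
    if total == 0:
        raise ValueError("no cells scored")
    return (mono + 2 * bnc + 3 * poly) / total


def cytostasis_percent(cbpi_treated: float, cbpi_control: float) -> float:
    """Cytotoxicity expressed as % cytostasis relative to the control culture."""
    return 100.0 - 100.0 * (cbpi_treated - 1.0) / (cbpi_control - 1.0)


# Control: average per-culture counts quoted in the text (illustrative use only).
control_cbpi = cbpi(mono=6493, bnc=7733, poly=2649)

# Treated: hypothetical counts for a cytotoxic dose.
treated_cbpi = cbpi(mono=9000, bnc=5000, poly=1000)

print(f"control CBPI  = {control_cbpi:.3f}")
print(f"treated CBPI  = {treated_cbpi:.3f}")
print(f"cytostasis    = {cytostasis_percent(treated_cbpi, control_cbpi):.1f}%")
print(f"MN frequency  = {mn_frequency(150, 4850):.2f}%")  # hypothetical BNC counts
```

For the non-Cyt-B version of the assay, cytotoxicity would instead be derived from pre- and post-exposure cell counts, which this sketch does not cover.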
Figure 1: AI software workflow. The user begins by selecting the .daf files to be loaded into the AI software. Once the data has been loaded, the user begins to assign objects to the ground truth model classes through the user interface. To aid in ground truth population, the cluster and predict algorithms can be used to identify imagery with similar morphology. Once sufficient objects have been added to each model class, the model can be trained. Following training, the user can assess the performance of the model using the tools provided, including an interactive confusion matrix. Finally, the user can either return to the training portion of the workflow to enhance the ground truth data or, if sufficient accuracy has been achieved, the user can step out of the training/tagging workflow loop and use the model to classify additional data.
Figure 2: Cluster algorithm. The cluster algorithm can be run at any time on a segment of 1,500 objects randomly selected from the input data. This algorithm groups similar objects within a segment together according to the morphology of both unclassified objects and objects that have been assigned to the ground truth model classes. Example imagery shows binucleated, mononucleated, and multinucleated cells, and cells with irregular morphology. Clusters containing mononucleated cells fall on one side of the object map, while clusters with multinucleated cells are on the opposite side of the object map. Binucleated cell clusters fall somewhere between mono- and multinucleated cell clusters. Finally, clusters with irregular morphology fall in a different area of the object map altogether. The user interface permits adding entire clusters, or select objects within clusters, to the ground truth model classes.
Figure 3: Predict algorithm. The predict algorithm requires a minimum of 25 objects in each ground truth model class and attempts to predict the most appropriate model class to assign unclassified objects within a segment. The predict algorithm is more robust than the cluster algorithm at identifying subtle morphologies in images (e.g., mononucleated cells with MN [yellow] versus mononucleated cells without MN [red]). Objects with these similarities are placed in close proximity on the object map; however, the user is easily able to inspect the images in each predicted class and assign objects to the appropriate model class. Objects for which the algorithm is unable to predict a class remain as 'unknown'. The predict algorithm permits users to rapidly populate the ground truth model classes, particularly in the case of events that are considered rare and challenging to find within the input data, such as micronucleated cells.
Figure 4: Confusion matrix with model results. The results screen of the AI software presents the user with three different tools to assess model accuracy. (A) The class distribution histograms permit the user to click on the bins of the histogram to assess the relationship between objects in the truth populations and objects that were predicted to belong to that model class. In general, the closer the percentage values between the truth and predicted populations are to one another for a given model class, the more accurate the model. (B) The accuracy statistics table allows the user to review three common machine learning metrics of model accuracy: precision, recall, and F1. In general, the closer these metrics are to 100%, the more accurate the model is at identifying events in the model classes. Finally, (C) the interactive confusion matrix provides an indication of where the model is misclassifying events. The on-axis entries (green) indicate objects from the ground truth data that were classified correctly during training. Off-axis entries (shaded orange) indicate objects from the ground truth data that were incorrectly classified. Various examples of misclassified objects are shown, including (i) a mononucleated cell classified as a mononucleated cell with MN, (ii) a binucleated cell classified as a binucleated cell with MN, (iii) a mononucleated cell classified as a cell having irregular morphology, (iv) a binucleated cell with MN classified as a cell having irregular morphology, and (v) a binucleated cell with an MN classified as a binucleated cell.
Figure 5: Genotoxicity and cytotoxicity results. Genotoxicity measured by the percentage of MN by microscopy (clear bars) and AI (dotted bars) following a 3 h exposure and 24 h recovery for Mannitol, Etoposide, and MMC using both the (A–C) Cyt-B and (D–F) non-Cyt-B methods. Statistically significant increases in MN frequency compared to controls are indicated by asterisks (*p < 0.001, Fisher's Exact Test). Error bars represent the standard deviation of the mean from three replicate cultures at each dose point except for MMC by microscopy, where only duplicate cultures were scored. This figure has been modified from Rodrigues et al.15.
The work presented here describes the use of deep learning algorithms to automate the scoring of the MN assay. Several recent publications have shown that intuitive, interactive tools allow the creation of deep learning models to analyze image data without the need for in-depth computational knowledge18,19. The protocol described in this work using a user interface-driven software package has been designed to work well with very large data files and permit the creation of deep learning models with ease. All necessary steps to create and train RF and CNN models in the AI software package are discussed, permitting highly accurate identification and quantification of all key events in both the Cyt-B and non-Cyt-B versions of the assay. Finally, the steps to use these deep learning models to classify additional data and evaluate chemical cytotoxicity and MN frequency are described.
The AI software used in this work has been created with a convenient user interface and constructed to work easily with large datasets generated from IFC systems. Training, evaluation, and enhancement of deep learning models follow a straightforward iterative approach (Figure 1), and application of the trained models to classify additional data can be accomplished in just a few steps. The software contains distinctive cluster (Figure 2) and predict (Figure 3) algorithms that permit rapid assignment of objects into appropriate ground truth model classes. The protocol in this paper demonstrates how a CNN model, constructed and trained using the AI software, robustly identifies all key events in the MN assay and yields results that compare well to traditional microscopy. Because the model is built through the user interface, no image analysis or computer coding experience is required. Furthermore, the interactive model results (Figure 4) permit the investigation of specific events that the model is misclassifying. The iterative process permits assigning these misclassified events to the appropriate model classes so that the model can be trained again to enhance accuracy.
The results presented here (Figure 5) show the evaluation of Mannitol, Etoposide, and MMC using microscopy and a CNN model created in the AI software. Using both versions of the MN assay, evaluated with a single AI model, increases in cytotoxicity are consistent with increasing doses for both MMC and Etoposide, while exposure to Mannitol yields no increase in cytotoxicity, as expected. For genotoxicity evaluation, significant (Fisher's Exact Test, one-sided) increases in MN frequency were demonstrated using MMC and Etoposide but not using Mannitol. Results for both microscopy and the AI model compared well across the dose ranges for each chemical tested.
In several previous publications, it has been shown that an IFC-based MN assay can be performed with straightforward and simple sample preparation steps along with an image-based analysis using masks (regions of interest that highlight pixels in an image) and features calculated using these masks to automatically score all key events12,13,16. This IFC-based assay takes advantage of the strengths of IFC, including high-throughput image capture, simplified sample processing with simple DNA dyes, and automated differentiation of cellular imagery with morphology that aligns with published MN assay scoring criteria. However, this workflow also included disadvantages, such as the complexity of feature-based analysis techniques that are often rigid and necessitate advanced knowledge of image analysis software packages12. The use of deep learning to analyze MN data acquired by IFC demonstrates that CNNs can be used to break away from the restrictions and difficulties of feature-based analyses, yielding results that are highly accurate and compare well to microscopy scoring14,15. While this AI-based approach is promising, further studies with an expanded selection of well-described chemicals should be performed to further test and validate the robustness of the technique. This work further demonstrates the advantages of IFC over more traditional methods, such as microscopy and conventional flow cytometry, to enhance the performance of assays with challenging morphologies and stringent scoring requirements.
The authors have nothing to disclose.
None.
Name | Company | Catalog Number | Comments |
15 mL centrifuge tube | Falcon | 352096 | |
Cleanser – Coulter Clenz | Beckman Coulter | 8546931 | Fill container with 200 mL of Cleanser. https://www.beckmancoulter.com/wsrportal/page/itemDetails?itemNumber=8546931#2/10//0/25/ 1/0/asc/2/8546931///0/1//0/ |
Colchicine | MilliporeSigma | 64-86-8 | |
Corning bottle-top vacuum filter | MilliporeSigma | CLS430769 | 0.22 µm filter, 500 mL bottle |
Cytochalasin B | MilliporeSigma | 14930-96-2 | 5 mg bottle |
Debubbler – 70% Isopropanol | MilliporeSigma | 1.3704 | Fill container with 200 mL of Debubbler. http://www.emdmillipore.com/US/en/product/2-Propanol-70%25-%28V%2FV%29-0.1-%C2%B5m-filtred,MDA_CHEM-137040?ReferrerURL=https%3A%2F%2Fwww.google.com%2F |
Dimethyl Sulfoxide (DMSO) | MilliporeSigma | 67-68-5 | |
Dulbecco's Phosphate Buffered Saline 1X | EMD Millipore | BSS-1006-B | PBS Ca++MG++ Free |
Fetal Bovine Serum | HyClone | SH30071.03 | |
Formaldehyde, 10%, methanol free, Ultra Pure | Polysciences, Inc. | 04018 | This is what is used for the 4% and 1% Formalin. CAUTION: Formalin/Formaldehyde toxic by inhalation and if swallowed. Irritating to the eyes, respiratory systems and skin. May cause sensitization by inhalation or skin contact. Risk of serious damage to eyes. Potential cancer hazard. http://www.polysciences.com/default/catalog-products/life-sciences/histology-microscopy/fixatives/formaldehydes/formaldehyde-10-methanol-free-pure/ |
Guava Muse Cell Analyzer | Luminex | 0500-3115 | A standard configuration Guava Muse Cell Analyzer was used. |
Hoechst 33342 | Thermo Fisher | H3570 | 10 mg/mL solution |
Mannitol | MilliporeSigma | 69-65-8 | |
MEM Non-Essential Amino Acids 100X | HyClone | SH30238.01 | |
MIFC – ImageStreamX Mark II | Luminex, a DiaSorin company | 100220 | A 2-camera ImageStreamX Mark II equipped with the 405 nm, 488 nm, and 642 nm lasers was used. |
MIFC analysis software – IDEAS | Luminex, a DiaSorin company | 100220 | "Image analysis software" The companion software to the MIFC (ImageStreamX MKII) |
MIFC software – INSPIRE | Luminex, a DiaSorin company | 100220 | "Image acquisition software" This is the software that runs the MIFC (ImageStreamX MKII) |
Amnis AI software | Luminex, a DiaSorin company | 100221 | "AI software" This is the software that permits the creation of artificial intelligence models to analyze data |
Mitomycin C | MilliporeSigma | 50-07-7 | |
NEAA Mixture 100x | Lonza BioWhittaker | 13-114E | |
Penicillin/Streptomycin/Glutamine solution 100X | Gibco | 15070063 | |
Potassium Chloride (KCl) | MilliporeSigma | P9541 | |
Rinse – Ultrapure water or deionized water | NA | NA | Use any ultrapure water or deionized water. Fill container with 900 mL of Rinse. |
RNase | MilliporeSigma | 9001-99-4 | |
RPMI-1640 Medium 1x | HyClone | SH30027.01 | |
Sheath – PBS | MilliporeSigma | BSS-1006-B | This is the same as Dulbecco's Phosphate Buffered Saline 1x Ca++MG++ free. Fill container with 900 mL of Sheath. |
Sterile water | HyClone | SH30529.01 | |
Sterilizer – 0.4%–0.7% Hypochlorite | VWR | JT9416-1 | This is essentially 10% Clorox bleach that can be made by diluting Clorox bleach with water. Fill container with 200 mL of Sterilizer. |
T25 flask | Falcon | 353109 | |
T75 flask | Falcon | 353136 | |
TK6 cells | MilliporeSigma | 95111735 | |