Here, we present a protocol to use a curvelet transform-based, open-source MATLAB software tool for quantifying fibrillar collagen organization in the extracellular matrix of both normal and diseased tissues. This tool can be applied to images with collagen fibers or other types of line-like structures.
Fibrillar collagens are prominent extracellular matrix (ECM) components, and their topology changes have been shown to be associated with the progression of a wide range of diseases including breast, ovarian, kidney, and pancreatic cancers. Freely available fiber quantification software tools are mainly focused on the calculation of fiber alignment or orientation, and they are subject to limitations such as the requirement of manual steps, inaccuracy in detection of the fiber edge in noisy background, or lack of localized feature characterization. The collagen fiber quantitation tool described in this protocol is characterized by using an optimal multiscale image representation enabled by curvelet transform (CT). This algorithmic approach allows for the removal of noise from fibrillar collagen images and the enhancement of fiber edges to provide location and orientation information directly from a fiber, rather than using the indirect pixel-wise or window-wise information obtained from other tools. This CT-based framework contains two separate, but linked, packages named “CT-FIRE” and “CurveAlign” that can quantify fiber organization on a global, region of interest (ROI), or individual fiber basis. This quantification framework has been developed for more than ten years and has now evolved into a comprehensive and user-driven collagen quantification platform. Using this platform, one can measure up to about thirty fiber features including individual fiber properties such as length, angle, width, and straightness, as well as bulk measurements such as density and alignment. Additionally, the user can measure fiber angle relative to manually or automatically segmented boundaries. This platform also provides several additional modules including ones for ROI analysis, automatic boundary creation, and post-processing. 
Using this platform does not require prior experience of programming or image processing, and it can handle large datasets including hundreds or thousands of images, enabling efficient quantification of collagen fiber organization for biological or biomedical applications.
Fibrillar collagens are prominent structural ECM components. Changes in their organization affect tissue function and are likely associated with the progression of many diseases, ranging from osteogenesis imperfecta1, cardiac dysfunction2, and wound healing3 to different types of cancer including breast4,5,6, ovarian7,8, kidney9, and pancreatic cancers10. Many established imaging modalities can be used to visualize fibrillar collagen, such as second harmonic generation microscopy11, stains or dyes in conjunction with bright field, fluorescence, or polarized light microscopy12, liquid crystal-based polarization microscopy (LC-PolScope)13, and electron microscopy14. As the importance of fibrillar collagen organization has become clearer, and the use of these methods has increased, the need for improved collagen fiber analysis approaches has also grown.
There have been many efforts to develop computational methods for automated measurement of fibrillar collagen. Freely available software tools are mainly focused on the calculation of fiber alignment or orientation by adopting either first derivative or structure tensor for pixels15,16, or Fourier transform-based spectrum analysis for image tiles17. All these tools are subject to limitations such as the requirement of manual steps, inaccuracy in detection of the fiber edge in noisy background, or lack of localized feature characterization.
Compared to other freely available, open-source software tools, the methods described in this protocol use CT, an optimal, multiscale, directional image representation method, to remove noise from fibrillar collagen images and to enhance or track fiber edges. Information about location and orientation is obtained directly from a fiber rather than inferred from indirect pixel-wise or window-wise information. This CT-based framework18,19,20,21 can quantify fiber organization on a global, ROI, or individual fiber basis, mainly via two separate, but linked, packages named “CT-FIRE”18,21 and “CurveAlign”19,21. In terms of implementation, CT-FIRE uses CT coefficients on multiple scales to reconstruct an image that enhances edges and reduces noise. Then, an individual fiber extraction algorithm is applied to the CT-reconstructed image to track fibers by finding their representative center points, extending fiber branches from the center points, and linking fiber branches to form a fiber network. In CurveAlign, CT coefficients on a user-specified scale are used to track local fiber orientation: the orientations and locations of curvelets are extracted and grouped to estimate the fiber orientation at the corresponding locations. The resulting quantification framework has been developed over more than ten years and has evolved greatly in functionality, user interface, and modularity. For instance, this tool can visualize local fiber orientation and allows the user to measure up to thirty fiber features, including individual fiber properties such as length, angle, width, and straightness, as well as bulk measurements such as density and alignment.
Additionally, the user can measure fiber angle relative to manually or automatically segmented boundaries, which, for example, plays an important role in image-based biomarker development in breast cancer22 and pancreatic cancer studies10. This platform provides several feature modules including ones for ROI analysis, automatic boundary creation, and post-processing. The ROI module can be used to annotate different shapes of ROI and conduct corresponding ROI analysis. As an application example, the automatic boundary creation module can be used to register hematoxylin and eosin (H&E) bright field images with second harmonic generation (SHG) images and generate the image mask of tumor boundaries from the registered H&E images. The post-processing module can help facilitate the processing and integration of output data files from individual images for possible statistical analysis.
This quantification platform does not require prior experience of programming or image processing and can handle large datasets including hundreds or thousands of images, enabling efficient quantification of collagen organization for biological or biomedical applications. It has been widely used in different research fields by many researchers all over the world, including ourselves. There are four main publications on CT-FIRE and CurveAlign18,19,20,21, of which the first three have been cited 272 times (as of 2020-05-04 according to Google Scholar). A review of the publications that cited this platform (CT-FIRE or CurveAlign) indicates that approximately 110 journal papers directly used it for their analysis, of which approximately 35 were collaborations with our group and the others (~75) were written by other groups. For instance, this platform was used for the following studies: breast cancer22,23,24, pancreatic cancer10,25, kidney cancer9,26, wound healing3,27,28,29,30, ovarian cancer7,8,31, uterosacral ligament32, hypophosphatemic dentin33, basal cell carcinoma34, hypoxic sarcoma35, cartilage tissue36, cardiac dysfunction37, neurons38, glioblastoma39, lymphatic contractions40, fibrous scaffolds41, gastric cancer42, microtubule43, and bladder fibrosis44. Figure 1 demonstrates the cancer imaging application of CurveAlign to find the tumor-associated collagen signatures of breast cancer19 from the SHG image. Figure 2 describes a typical schematic workflow of this platform. Although these tools have been reviewed technically18,19,21, and a regular protocol20 for alignment analysis with CurveAlign is also available, a visual protocol that demonstrates all the essential features could be useful. A visualized protocol, as presented here, will facilitate learning to use this platform and will more efficiently address concerns and questions that users might have.
NOTE: This protocol describes the use of CT-FIRE and CurveAlign for collagen quantification. These two tools have complementary, but different, main goals and are linked together to some extent. CT-FIRE can be launched from the CurveAlign interface to conduct most operations except for advanced post-processing and ROI analysis. For a full operation of CT-FIRE, it should be launched separately.
1. Image collection and image requirement
NOTE: The tool can process any image file with line-like structures readable by MATLAB regardless of the imaging modality used to collect it.
2. Software installation and system requirement
NOTE: Both standalone and source-code versions are freely available. The source-code version requires a full MATLAB installation including the Signal Processing, Image Processing, Statistics Analysis, and Parallel Computing toolboxes. To run the source-code version, all the necessary folders, including some from third-party sources, should be added to the MATLAB path. The standalone application (APP) is recommended for most users; it requires installation of the freely available MATLAB Compiler Runtime (MCR) of the specified version. The procedure for installing and launching the APP is described below.
3. Individual fiber extraction with CT-FIRE
NOTE: CT-FIRE uses CT to denoise the image, enhance the fiber edges, and then uses a fiber extraction algorithm to track individual fibers. Length, angle, width, and straightness are calculated for individual fibers.
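As a rough illustration of what these per-fiber properties mean, the following Python sketch computes length, straightness, and angle for a single fiber given as an ordered list of (x, y) points. The definitions used here (path length, end-to-end distance divided by path length, and orientation of the end-to-end vector) and the function name `fiber_properties` are assumptions made for illustration and are not taken from the CT-FIRE source code. Width is omitted because it depends on the image data, and the y-axis is assumed to point upward.

```python
import math


def fiber_properties(points):
    """Return length, straightness, and angle for one fiber.

    `points` is an ordered list of (x, y) coordinates along the fiber.
    All three definitions are illustrative assumptions, not CT-FIRE's code.
    """
    # Path length: sum of distances between consecutive points.
    length = sum(math.dist(p, q) for p, q in zip(points, points[1:]))

    # End-to-end (chord) distance between the first and last point.
    (x0, y0), (x1, y1) = points[0], points[-1]
    chord = math.hypot(x1 - x0, y1 - y0)

    # Straightness: 1.0 for a straight fiber, smaller for curvy fibers.
    straightness = chord / length if length > 0 else 0.0

    # Orientation of the end-to-end vector, folded into [0, 180) degrees
    # because a fiber has no head or tail.
    angle = math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180.0

    return {"length": length, "straightness": straightness, "angle": angle}


if __name__ == "__main__":
    # An L-shaped fiber: 3 pixels right, then 4 pixels up.
    print(fiber_properties([(0, 0), (3, 0), (3, 4)]))
```

Reversing the order of the points leaves all three values unchanged, which matches the idea that fiber orientation is axial rather than directional.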
4. Fiber analysis with CurveAlign
NOTE: CurveAlign was initially developed to automatically measure angles of fibers with respect to user-defined boundaries. The current version of CurveAlign can be used for bulk assessment of density- and alignment-based features in addition to the relative angle measurement by either loading the individual fiber information extracted by CT-FIRE or directly using the local orientation of the curvelets. CurveAlign calculates up to thirty features related to global or local features mainly including density and alignment as well as individual fiber properties when CT-FIRE is adopted as the fiber tracking method.
5. Estimated running time
These methods have been successfully applied in numerous studies. Some typical applications include: 1) Conklin et al.22 used CurveAlign to calculate tumor-associated collagen signatures, and found that collagen fibers were more frequently aligned perpendicularly to the duct perimeter in ductal carcinoma in situ (DCIS) lesions; 2) Drifka et al.10 used the CT-FIRE mode in CurveAlign to quantify the stromal collagen alignment for pancreatic ductal adenocarcinoma and normal/chronic pancreatitis tissues, and found that there was an increased alignment in cancer tissues compared to that in normal/chronic tissues; 3) Alkmin et al.7 used CurveAlign to quantify the angular distribution of F-actin fibers and overall collagen alignment from the SHG images of ovarian stromal collagen, and showed that matrix morphology plays an important role in driving cell motility and F-actin alignment; 4) LeBert et al.3 applied CT-FIRE to the SHG images of a zebrafish wound repair model and found an increase in thickness of collagen fibers after acute wounding; 5) Devine et al.45 used the CT-FIRE mode in CurveAlign for SHG images of vocal fold collagen from different animal models to measure individual fiber properties and overall alignment, and showed that porcine and canine vocal fold collagen had a higher alignment and straightness inferiorly; 6) Keikhosravi et al.13 used CurveAlign to quantify collagen alignment in histopathology samples imaged with LC-PolScope, and showed that LC-PolScope and SHG are comparable in terms of alignment and orientation measurement for some types of tissue.
Figure 1: Using CurveAlign to find tumor-associated collagen signatures from SHG images of a human breast cancer tissue microarray (TMA). (A) Overlay image of a TMA core with SHG image (yellow) overlaid on the corresponding H&E bright field image. (B) The region of interest of (A). (C) The bright field image of (B). (D) The SHG image of (B). (F) The mask associated with the bright field image (C). (E) The CurveAlign output overlay image showing the tumor boundaries (yellow) from (F), representative fiber locations, and orientation (green lines); the blue lines are used to associate fibers with their closest boundaries. The green arrows in (B) and (E) show the fibers parallel to the tumor boundary, while the red arrows there show the fibers perpendicular to the boundary. The scale bar in (A) equals 200 µm. Images in (B)–(F) are displayed in the same scale, and the representative scale bar in (D) equals 50 µm. Abbreviations: SHG = second harmonic generation; H&E = hematoxylin and eosin.
Figure 2: Schematic workflow of quantification of a fibrillar collagen image. (A) SHG image to be analyzed by CT-FIRE and/or CurveAlign. (B) Overlay image output by CT-FIRE. (C) Mask boundary of (A) is an optional CurveAlign input. (D) Overlay image output by CurveAlign. The color lines in (B) represent the extracted fibers. In (D), the green lines indicate the locations and orientations of fibers that are outside the boundaries (yellow lines) and are within the specified distance from their closest boundaries, the red lines are those of other fibers, and the blue lines are used to associate fibers with their closest boundaries. Images in (A)–(D) are displayed in the same scale, and the representative scale bar in (A) equals 200 µm.
Figure 3: CT-FIRE regular analysis. (A) Main GUI. (B) Output table showing the summary statistics. (C) and (F) show the histograms of angle and width, respectively. (E) Output image showing the extracted fibers (color lines) overlaid on the original SHG image (D). Abbreviations: CT = curvelet transform; GUI = graphical user interface; SHG = second harmonic generation.
Figure 4: CT-FIRE ROI management module. (A) Module GUI. (B) ROI post-analysis of four ROIs with different shapes showing the fibers within each ROI. (C) ROI histograms of different fiber properties. Abbreviations: CT = curvelet transform; GUI = graphical user interface; ROI = region of interest.
Figure 5: CT-FIRE advanced post-processing module. (A) Module GUI. (B) Measurements of selected three fibers. (C) Visualization of the selected three fibers in (B). (D) Summary statistics after applying a length threshold (>60 pixels). (E) Visualization of the fibers selected in (D) with length-based color bar. Abbreviations: CT = curvelet transform; GUI = graphical user interface.
Figure 6: CurveAlign regular analysis. (A) Main GUI. (B) Output table showing the summary statistics. (C) Output image showing the locations and orientation of representative fibers (green lines) and boundaries (yellow lines) overlaid on the original SHG image; the blue lines are used to associate fibers with their closest boundaries, and the red lines show the locations and orientations of fibers inside a boundary or far outside the boundary (beyond a user-specified distance, e.g., 250 pixels here). (D) Heatmap of the angles: red (>60 degrees), yellow (45, 60] degrees, green (10, 45] degrees. (E)–(F) show the angle distribution using histogram and compass plot, respectively. Abbreviations: GUI = graphical user interface; SHG = second harmonic generation.
Figure 7: CurveAlign ROI management module. (A) The module GUI. (B) Four annotated rectangular ROIs overlaid on the original image. (C) ROI post-analysis output table. (D) Angle histogram of each ROI. Abbreviations: ROI = region of interest; GUI = graphical user interface.
This protocol describes the use of CT-FIRE and CurveAlign for fibrillar collagen quantification and can be applied to any image with collagen fibers or other line-like or fiber-like elongated structures suitable for analysis by CT-FIRE or CurveAlign. For example, elastin or elastic fibers could be processed in a similar way on this platform. We have tested both tools on computationally generated synthetic fibers21. Depending on the application, users should choose the analysis mode that is most appropriate for their data. The CT fiber analysis mode can directly use curvelets in CT to represent fiber location and orientation, and it is sensitive to changes in local fiber structure. The CT-mode can be used to locate fibers and their orientation in complex conditions, e.g., where the noise level is high, the fiber is curvy, or the variation in fiber thickness is high. However, as the CT-mode only picks up the brightest parts of an image, it would miss some fibers with lower intensity when there is a large variation in image intensity.
Moreover, the CT-mode does not provide information on individual fibers. In contrast to the CT-mode, the CT-FIRE mode calculates individual fiber properties and can analyze all the fibers whose intensity is above a specified threshold. The challenges associated with the CT-FIRE mode include: 1) the accuracy of an intact fiber extraction may be reduced or compromised when there is large variation in the intensity along a fiber or the fiber thickness across an image; and 2) the current standard analysis is computationally demanding as mentioned in the protocol. More details about the advantages and limitations of these methods can be seen in our previous publications20,21.
To assess the accuracy of fiber tracking, the user can mainly rely on visual inspection of the overlay image, in which the extracted fibers or representative orientations are overlaid on the original image. In addition, for CT-FIRE, the user may use the advanced post-processing module to identify the properties of selected individual fibers and compare them to measurements made with other image analysis tools such as Fiji46. For CurveAlign, the user may compare the orientation or alignment results to those calculated by other tools such as OrientationJ16 and CytoSpectre17.
Among the features available for output by the platform, alignment-related features are most frequently used and are the most reliable and robust. To use individual fiber features, the user needs to confirm the extraction of representative fiber features. Of note, an intact fiber may be divided into several shorter segments in some circumstances, which the user should take into consideration when selecting the fiber analysis mode or conducting further statistical analysis. Even when the fiber length cannot be directly used as a comparable property, the orientation or width of fiber segments weighted by their lengths might still provide useful information. As far as SHG imaging is concerned, the numerical aperture (NA) of the objective lens can significantly affect the detection of the width and length of a fiber, but it has less impact on the orientation and alignment measurements. In our experience in SHG imaging, we usually need to use an objective lens with 40x magnification or higher and NA ≥ 1.0 to achieve a robust fiber thickness measurement.
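One possible reading of length weighting for fragmented fibers is sketched below in Python: each segment contributes to the mean width, or to the mean orientation, in proportion to its length, so that many short fragments do not dominate the result. The function names are hypothetical, and the use of doubled angles to average orientations (so that 0 and 180 degrees are treated as the same direction) is an assumption for illustration, not a description of how the platform computes its outputs.

```python
import math


def length_weighted_mean_width(lengths, widths):
    """Mean segment width, with each segment weighted by its length."""
    total = sum(lengths)
    return sum(l * w for l, w in zip(lengths, widths)) / total


def length_weighted_orientation(lengths, angles_deg):
    """Length-weighted mean orientation in [0, 180) degrees.

    Angles are doubled before averaging so that orientations that differ
    by 180 degrees reinforce rather than cancel (axial statistics).
    """
    c = sum(l * math.cos(math.radians(2.0 * a))
            for l, a in zip(lengths, angles_deg))
    s = sum(l * math.sin(math.radians(2.0 * a))
            for l, a in zip(lengths, angles_deg))
    return (math.degrees(math.atan2(s, c)) / 2.0) % 180.0


if __name__ == "__main__":
    seg_lengths = [10.0, 30.0]   # pixels
    seg_widths = [2.0, 4.0]      # pixels
    seg_angles = [10.0, 30.0]    # degrees
    print(length_weighted_mean_width(seg_lengths, seg_widths))
    print(length_weighted_orientation(seg_lengths, seg_angles))
```

With this weighting, a long segment split into several fragments contributes about the same total weight as the intact segment would, which is why such summaries can remain comparable when raw length cannot.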
“Alignment” can be interpreted in three different ways: 1) alignment with respect to the positive horizontal direction named “angle”, ranging from 0 to 180 degrees, where angles close to 0 have similar orientation to angles close to 180 degrees; 2) alignment with respect to a boundary named “relative angle”, ranging from 0 to 90 degrees, with 0 degrees indicating a fiber parallel to the boundary and 90 indicating a perpendicular fiber; and 3) alignment of fibers with respect to each other named “alignment coefficient”, ranging from 0 to 1, with 1 indicating perfectly aligned fibers and smaller values representing more randomly distributed fibers.
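The second and third interpretations can be made concrete with a short Python sketch. It assumes that the relative angle is the fiber angle minus the local boundary tangent angle, folded into 0 to 90 degrees, and that the alignment coefficient is the mean resultant vector length of the doubled fiber angles, a common choice for axial data. Both formulas and both function names are assumptions for illustration; the text above does not give the exact expressions used by CurveAlign.

```python
import math


def relative_angle(fiber_deg, boundary_deg):
    """Angle between a fiber and a boundary tangent, folded into [0, 90].

    0 means the fiber is parallel to the boundary, 90 perpendicular.
    """
    diff = abs(fiber_deg - boundary_deg) % 180.0
    return min(diff, 180.0 - diff)


def alignment_coefficient(angles_deg):
    """Mean resultant length of doubled angles, in [0, 1].

    1 means all fibers share one orientation; values near 0 mean the
    orientations are spread out. Doubling makes 0 and 180 equivalent.
    """
    doubled = [math.radians(2.0 * a) for a in angles_deg]
    c = sum(math.cos(t) for t in doubled) / len(doubled)
    s = sum(math.sin(t) for t in doubled) / len(doubled)
    return math.hypot(c, s)


if __name__ == "__main__":
    print(relative_angle(170.0, 10.0))          # fiber nearly parallel
    print(alignment_coefficient([1.0, 179.0]))  # nearly the same axis
    print(alignment_coefficient([0.0, 90.0]))   # perpendicular fibers
```

Fibers at 1 and 179 degrees give a coefficient close to 1 under this definition, consistent with the statement that angles near 0 and near 180 degrees represent similar orientations.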
Besides the fiber features calculated in this platform, some metrics based on texture analysis47,48,49 have also been proposed to quantify ECM patterns. Those texture-related features can provide an alternative or additional descriptor of the ECM in some applications. The challenges in developing this type of metric lie in the interpretation of the possible biological relevance, localized characterization, and the accuracy of tracing individual fibers.
To optimize the running parameters and perform troubleshooting, the user can refer to the manual, relevant publications20,21, and the FAQ sidebars on the GitHub Wiki pages of the curvelets repository: https://github.com/uw-loci/curvelets/wiki. For some buttons, a function hint appears when the user hovers the mouse pointer over the button, guiding the current or next operation. Follow the information shown on the GUI or in the command window when troubleshooting.
To process a large dataset, the user is encouraged to use parallel computing options, which enable the tool to process multiple images simultaneously. One option is using multiple CPU cores if available on the computer being utilized. Alternatively, a headless version of both APPs is provided and has been successfully compiled in the compilation node through the server held at the CONDOR-based50 Center for High Throughput Computing (CHTC) at the University of Wisconsin-Madison. The CHTC workflow for large scale fiber quantification has been developed, tested, and used successfully on real image sets consisting of thousands of images. The user could adapt the headless MATLAB functions of both CT-FIRE and CurveAlign to run quantification on other cloud computing systems including commercial services such as those offered by Amazon, Google, and Microsoft.
The ongoing and future development directions include: 1) incorporation of deep learning neural networks to extract or generate high-quality synthetic collagen fiber images and to improve the robustness and accuracy of the fiber tracing algorithm; 2) integration of all the modules into a comprehensive platform while optimizing the code and documentation following the best software engineering practices; 3) deployment of all the core features on a cloud computing platform; 4) enhancement of the fiber analysis workflow using the CHTC service; and 5) improvement of the functionality of the synthetic fiber generator.
The authors have nothing to disclose.
We thank the many contributors to and users of CT-FIRE and CurveAlign over the years, including Dr. Rob Nowak, Dr. Carolyn Pehlke, Dr. Jeremy Bredfeldt, Guneet Mehta, Andrew Leicht, Dr. Adib Keikhosravi, Dr. Matt Conklin, Dr. Jayne Squirrell, Dr. Paolo Provenzano, Dr. Brenda Ogle, Dr. Patricia Keely, Dr. Joseph Szulczewski, and Dr. Suzanne Ponik, and we acknowledge additional technical contributions from Swati Anand and Curtis Rueden. This work was supported by funding from Semiconductor Research Corporation, Morgridge Institute for Research, and NIH grants R01CA199996, R01CA181385 and U54CA210190 to K.W.E.
CT-FIRE | University of Wisconsin-Madison | N/A | open source software available from https://eliceirilab.org/software/ctfire/ |
CurveAlign | University of Wisconsin-Madison | N/A | open source software available from https://eliceirilab.org/software/curvealign/ |