We demonstrate methods for the detection of architectural distortion in prior mammograms. Oriented structures are analyzed using Gabor filters and phase portraits to detect sites of radiating tissue patterns. Each site is characterized and classified using measures to represent spiculating patterns. The methods should assist in the detection of breast cancer.
Cite this ArticleCopy Citation
Rangayyan, R. M., Banik, S., Desautels, J. E. Detection of Architectural Distortion in Prior Mammograms via Analysis of Oriented Patterns. J. Vis. Exp. (78), e50341, doi:10.3791/50341 (2013).
Translate text to:
We demonstrate methods for the detection of architectural distortion in prior mammograms of interval-cancer cases based on analysis of the orientation of breast tissue patterns in mammograms. We hypothesize that architectural distortion modifies the normal orientation of breast tissue patterns in mammographic images before the formation of masses or tumors. In the initial steps of our methods, the oriented structures in a given mammogram are analyzed using Gabor filters and phase portraits to detect node-like sites of radiating or intersecting tissue patterns. Each detected site is then characterized using the node value, fractal dimension, and a measure of angular dispersion specifically designed to represent spiculating patterns associated with architectural distortion.
Our methods were tested with a database of 106 prior mammograms of 56 interval-cancer cases and 52 mammograms of 13 normal cases using the features developed for the characterization of architectural distortion, pattern classification via quadratic discriminant analysis, and validation with the leave-one-patient out procedure. According to the results of free-response receiver operating characteristic analysis, our methods have demonstrated the capability to detect architectural distortion in prior mammograms, taken 15 months (on the average) before clinical diagnosis of breast cancer, with a sensitivity of 80% at about five false positives per patient.
Breast cancer is a major disease affecting women and is the second leading cause of cancer related death among women 1,2. In order to improve the chance of survival and the prognosis of the affected patients through effective treatment at early stages of breast cancer, the disease needs to be detected as early as possible. In retrospective analysis of cases of breast cancer, subtle signs of abnormalities have been observed on previously acquired screening mammograms 3,4. Architectural distortion is one such localized mammographic sign of possibly early stages of breast cancer that is difficult to detect 5,6. The associated patterns are vaguely described as distortion of the normal architecture of the breast with no definite mass visible. Architectural distortion could appear at the initial stages of the formation of a breast mass or tumor. We hypothesize that screening mammograms obtained prior to the detection of breast cancer could contain subtle signs of early stages of breast cancer, in particular, architectural distortion.
Figure 1a shows a prior mammographic image of a case of screen-detected cancer. The region of abnormality identified by a radiologist (J.E.L.D.) is outlined with a red rectangle. The prior mammogram was taken 24 months before the detection mammogram shown in Figure 1b. The prior mammogram had been declared to be free of signs of cancer at the original instance of screening. In retrospective analysis and in comparison with the detection mammogram, a suspicious region related to the site of cancer detected was labeled by the radiologist, and is outlined in red on the prior mammogram. The suspicious region contains signs of architectural distortion, including spicules.
Computer-aided diagnosis (CAD) techniques and systems offer the potential for achieving increased sensitivity in the detection of breast cancer 2,7-9. However, in comparison with the number of publications that exist in the literature on the detection of other signs of breast cancer, such as masses and calcifications, only a small number of studies have been reported on the detection of architectural distortion in the absence of a central mass 10-17. Commercially available CAD systems have been found to perform poorly in the detection of architectural distortion 18. Studies on the detection of architectural distortion in prior mammograms of screen-detected or interval-cancer cases 3,4,19-22 could help in developing strategies for the detection and treatment of breast diseases at their early stages, and lead to improvement in the prognosis for the patient 23.
Preparation of Images for the Experiment
Experiments were conducted with 158 mammographic images including 106 prior mammograms of 56 individuals diagnosed with breast cancer and 52 images of 13 normal individuals. Ethics approval for the study was obtained from the Conjoint Health Research Ethics Board, Office of Medical Bioethics, University of Calgary, and the Calgary Regional Health Authority. The images were obtained from Screen Test: Alberta Program for the Early Detection of Breast Cancer 21,24,25.
Mammograms acquired in the last scheduled visit to the screening program prior to the diagnosis of cancer outside the screening program were labeled as prior mammograms of interval-cancer cases. The corresponding diagnostic mammograms were not available. All but two of the 106 prior mammograms had been declared to be free of any sign of breast cancer at the time of their acquisition and analysis in the screening program; the individuals corresponding to the other two mammograms had been referred for biopsy. The time interval between the diagnosis of cancer and prior mammograms ranged from 1.5 months to 24.5 months, with an average of 15 months and standard deviation of 7 months. All of the prior mammograms of interval-cancer cases available in the database have been included in the present study, except six images in which no suspicious parts could be identified.
The screen-film mammograms were digitized at the spatial resolution of 50 μm and gray-scale resolution of 12 bits per pixel using the Lumiscan 85 laser scanner (Lumisys, Sunnyvale, CA). An expert radiologist specialized in mammography (J.E.L.D.) reviewed all of the 106 prior mammograms of interval-cancer cases and marked the suspected regions of architectural distortion with rectangular boxes based on the reports available on subsequent imaging or biopsy, or by detailed inspection of the mammograms. Of the 106 prior mammographic images in the dataset used in the present study, 38 images have visible architectural distortion, and the remaining 68 images contain questionable or no clearly evident architectural distortion. Each prior mammogram contains a single site of architectural distortion as identified by the rectangular box drawn by the radiologist. The average width, height, and area of the 106 suspicious parts of images marked by the radiologist are 56 mm, 39 mm, and 2,274 mm2, with standard deviation of 11.8 mm, 11.6 mm, and 1073.9 mm2, respectively.
1. Overview of Methodology
In our procedure, potential sites of architectural distortion in mammograms are automatically detected via analysis of oriented textural patterns with the application of a bank of Gabor filters 26 and modeling of phase portraits 11,27. The detected sites are then processed through the steps of extraction of features or measures to characterize architectural distortion, development of a trained classifier, and application of an algorithm for pattern recognition or classification. The procedure is summarized by the following steps 11,20,21:
- Segment the breast portion in the given mammographic image using adaptive thresholding and morphological opening.
- Apply a set of 180 Gabor filters with angles spaced evenly over the range -90° to +90° to obtain the Gabor magnitude image, M (i, j), and the Gabor angle image, θ (i, j), by selecting the response and angle of the filter with the highest response at each pixel, (i, j).
- Select curvilinear structures (CLSs) of interest, such as spicules and fibroglandular tissue, by distinguishing them from confounding structures, such as edges of the pectoral muscle, parenchymal tissue, breast boundary, and noise, by using the orientation field, the gradient field, the nonmaximal suppression (NMS) technique, and additional conditions 11.
- Filter the orientation field with a Gaussian filter with the standard deviation of 7 pixels and down-sample by a factor of four to reduce noise and further computational requirements 11,20.
- Apply linear phase-portrait modeling, with a sliding analysis window of size 10 x 10 pixels at 800 μm/pixel, with one pixel per step, to the filtered orientation field, with specific conditions to select phase-portrait maps related to particular types of node patterns 11,20.
- Cast a vote, if certain conditions are met, at the position given by the fixed point for each position of the analysis window to form the node map.
- Filter the node map with a Gaussian window of size 35 x 35 pixels, with the empirically determined standard deviation of 6 pixels (4.8 mm), to consolidate votes in close proximity to one another.
- Analyze the node map by rank-ordering the peaks in the node map.
- Cut regions of interest (ROIs), of size 128 x 128 pixels except at the edges, from the original image, with the center of each ROI located at the center of the related peak in the node map. At the edges of the image being processed, create ROIs to include as much of the image data as available in the specified window.
- Derive features or measures to characterize the spiculating patterns related to architectural distortion and separate them from normal tissue patterns that met some of the initial conditions.
- Develop a trained classifier to discriminate between the features of sites with architectural distortion and those of normal tissue patterns using a training set of ROIs classified by a radiologist.
- Apply the trained classifier to a set of test cases and verify the results with diagnosis provided by the radiologist and based on biopsy.
Steps 1-9 listed above are applied automatically to a given mammographic image. Selected steps of the procedure listed above are described and illustrated in the following sections.
2. Preprocessing of Mammographic Images
The preprocessing stage consists of the following steps:
- Filter the given mammographic image using a Gaussian filter, with a standard deviation of 2 pixels and size of 13 x 13 pixels at the resolution of 50 μm/pixel and 12 bits/pixel, and down-sample to 200 μm/pixel and 8 bits/pixel resolution.
- Reflect the image if it is of the right breast.
- Segment the breast region in the mammographic image using Otsu's adaptive thresholding method and morphological opening with a disk-shaped structuring element of radius 25 pixels (5 mm at 200 μm/pixel) 21,28,29.
- Detect the approximate breast boundary 10,21.
Figure 2A shows an original prior mammogram. Figure 2B of the same figure shows the result of approximate segmentation of the breast portion, which is used in the subsequent steps of processing and analysis.
3. Extraction of Oriented Patterns Using Gabor Filters
The real Gabor filter function oriented at -90° is specified in our work as 10,30:
where σx and σy are the standard deviation values in the x and y directions, and ƒo is the frequency of the modulating sinusoid. Filters at other angles are obtained by rotating this function using coordinate transformation as:
where (x', y') is the set of coordinates rotated by the angle α.
The parameters in Equation 1 for filtering mammograms are derived in our work by taking into account the average size of the breast tissue patterns to be detected, as follows 10:
- Let Τ be the full-width at half-maximum of the Gaussian term in Equation 1 along the x axis.
- Let Τ = 4 pixels, corresponding to a thickness of 0.8 mm at the pixel size of 200 μm.
- Calculate .
- Let the period of the cosine term be Τ; then, ƒo= 1/Τ.
- Let the value of σy be defined as σy = lσx, where l determines the elongation of the Gabor filter in the y direction, as compared to the width of the filter in the x direction. For analysis of mammograms at 200 μm/pixel, use l = 8.
A bank of 180 real Gabor filters evenly spaced over the range -90° to +90° is used in our methods for the detection of oriented patterns in mammograms 10, 21. For each given image, a Gabor magnitude image, M(i, j), and a Gabor angle image, θ(i, j), are obtained using the response and angle of the Gabor filter with the highest response at each pixel, (i, j).
The Gabor filter has a nonzero magnitude response at the origin of the frequency plane (zero frequency). Because low-frequency components are not related to the presence of architectural distortion, it is desirable to reduce the effect of the low-frequency components of the mammographic image in the orientation field magnitude. Therefore, the mammographic images are high-pass filtered prior to the extraction of the orientation field. This is achieved by computing the difference between the original image and a low-pass-filtered version of the same image. The low-pass filter used in this step is a Gaussian filter with the standard deviation equal to σy defined as above.
Although one could save the filtered image for each angle of interest, in the present work, the maximum response at each pixel over all of the filters (angles) used is saved in a single image, referred to as the Gabor magnitude response; the corresponding angle of the Gabor filter is saved at each pixel in another image, referred to as the Gabor angle response. Together, the two output images provide the orientation field of the given image.
Figure 3A shows a test image of a plant. Figure 3B shows the Fourier spectrum of the image, which depicts concentrations of energy at various angles. All parts of the image with the same orientation, regardless of their position and size, have their frequency components (spectral energy) located in an angular band or sector positioned at 90° with respect to their orientation in the image. The results of filtering the image with Gabor filters with Τ = 8 pixels and l = 8 are shown in Figures 3C and D. It is evident that the Gabor filters have extracted parts of the plant oriented at various angles with high magnitude response and that the angle response agrees with the orientation of the dominant feature present at the corresponding pixel. By using a bank of Gabor filters oriented at several angles over the range -90° to +90°, we have extracted all of the oriented components present in the image and their angles at each pixel. It is evident that the response of the Gabor filters is almost zero in smooth areas with the same intensity level and no structures with preferred orientation, such as parts of the pot and the wall.
Figure 4 shows the Gabor magnitude and angle responses obtained for the mammogram with architectural distortion shown in Figure 2B. It is evident that the Gabor filters have extracted oriented components with high responses as well as the corresponding angles. It is also seen that the response of the Gabor filters is low in smooth areas with almost constant density and no structures with preferred orientation. Upon close inspection, it may also be observed that the response of the Gabor filters depends upon the contrast of an oriented structure in relation to its background and not only on its density or brightness. These results are due to the bandpass nature of the Gabor filters.
4. Selection of Curvilinear Structures
Mammograms contain many CLSs corresponding to ducts, vessels, ligaments, parenchymal tissue, and edges of the pectoral muscle. Some abnormalities in mammograms could be characterized by the presence of certain types of CLS, such as spiculated masses 12,31,32 and architectural distortion 10,11,33, or by asymmetric structure of the oriented texture in the breast image 34. On the other hand, certain types of lesions, such as circumscribed masses, could be obscured by several CLSs superimposed on the lesions in the projected mammographic images; the appearance of such lesions could be altered and may lead to false-negative detection or misdiagnosis. Analysis of the CLSs present in mammograms could improve the performance of algorithms for the detection of spiculated masses and architectural distortion, as suggested by Zwiggelaar et al. 35. Therefore, the identification of CLSs is an important step in the detection of architectural distortion.
Although the Gabor filter bank used in the present work is sensitive to linear structures, such as spicules and fibers, it also detects other strong edges, such as edges of the pectoral muscle, edges of the parenchymal tissue, and vessel walls, as oriented structures. Strong edges around the fibroglandular disk 36 could be used in the detection of a particular form of architectural distortion 37 known as focal retraction. However, in the present work, it is important that only CLSs related to fibroglandular tissues are identified as oriented features.
The method for the selection of CLSs in the present work includes the following three steps:
- Segment the breast area in a given mammogram as described in Section 2.
- Detect core CLS pixels by applying the NMS technique 35,38 to the Gabor magnitude response image.
- Reject CLSs pixels at sites with a strong gradient 33.
The NMS algorithm identifies core CLS pixels by comparing each pixel in the magnitude response image with its neighbors along the direction that is perpendicular to the local orientation field angle; see Figure 5. If the pixel under investigation has a larger magnitude value than the corresponding neighbors, the pixel is a core CLS pixel. NMS is a common step in many edge detectors (such as the Canny edge detector 39). Zwiggelaar et al. 35 used NMS for the detection of CLS pixels in the same manner as described in this section.
The presence of a strong gradient could cause a ripple in the Gabor magnitude response, leading to an erroneous detection of a CLS. The core CLS pixels associated with the presence of strong gradients are rejected by the criteria proposed by Karssemeijer and te Brake 12 in the context of detection of spiculated lesions. The gradient of the mammographic image is obtained using the first derivative of a Gaussian with a standard deviation of five pixels (1 mm). For each core CLS pixel, the direction of the gradient is compared to the direction of the orientation field. If the difference between the direction of the orientation field and the direction perpendicular to the gradient is less than 30°, the corresponding core CLS pixel is discarded.
The CLSs within the fibroglandular disk typically possess reduced contrast as compared to the CLSs outside the fibroglandular disk. Consequently, the CLSs within the fibroglandular disk have smaller Gabor magnitude response values than the CLSs outside the disk. In order to assign the same weight to all of the CLS pixels independent of location, and to ensure the detection of the relevant CLSs with low contrast, such as spicules within the fibroglandular disk, the magnitude field M (i, j) is replaced for further processing by an image composed of only core CLS pixels, MCLS(i, j), defined as follows:
The image MCLS(i, j) conveys important information on the presence of CLSs. Figure 6 shows the results of CLS selection with a full mammogram and an ROI. Because the presence of architectural distortion is indicated by the geometrical arrangement of the associated CLSs rather than their density or intensity, the magnitude of the detected CLSs is of lower importance than the spatial layout of the oriented structures.
5. Detection and Labeling of Suspicious Sites via Analysis of Phase Portraits
Rao and Jain 40 developed a method for the analysis of oriented texture in images by associating the corresponding gradient orientation field with the appearance of phase portraits. A phase portrait of a system of two linear, first-order, differential equations shows the possible trajectories of the state variables 27.
Let p(t) and q(t), t R, represent two differentiable functions of time t, related as
Here, p•(t) and q•(t) are the first-order derivatives with respect to time, and F and G are functions of p and q 10. Given the initial conditions p(0) and q(0), the solution [p(t), q(t)], can be represented in the form of a parametric trajectory or streamline of a hypothetical particle in the (p, q) plane. The particle is placed at [p(0), q(0)] at time t = 0 and moves through the (p, q) plane with the velocity [p•(t) and q•(t)]. The (p, q) plane is known as the phase plane of the system. A phase portrait is a graph of the possible trajectories of a particle in the phase plane. A fixed point is a point in the phase plane where p•(t) = 0 and q•(t) = 0. A particle left at a fixed point remains stationary. For an affine system, we have
Here, A is a 2 x 2 matrix and b is a 2 x 1 column matrix. The center (p0 , q0) of the phase portrait is given by the fixed point as
If we associate the functions p(t) and q(t) with the x and y coordinates of the plane of the image being processed, the corresponding orientation field is
Here, Φ(x y) is the angle of the velocity vector [ p•(t), q•(t)] with respect to the x axis at (x, y) = [p(t), q(t)]. We associate Φ(x y) with the Gabor angle response θ (i, j), and define an error function to be minimized as
where [a, b] and [c, d] are the two rows of A. The last term provides for a higher penalty (cost) for deviation in configurations of the matrix A from those related to spiculated node patterns. The equation given above represents Φ(x y) on a discrete grid (i, j) instead of the continuous space (x, y). Estimates of A and b that minimize ε2(A, b) are obtained by the following procedure:
- Obtain initial estimates of A and b through the minimization of ε2(A, b) using the simulated annealing method 41.
- Obtain the optimal estimates by refining the initial estimates by using a nonlinear least-squares algorithm 42.
In the model described above, there are three possible types of phase portraits: node, saddle, and spiral. The type of phase portrait is determined by the eigenvalues of A 10,27,30,40. The orientation field of a textured image can be described by determining the type of the phase portrait that is most similar to its orientation field. Because spiral patterns are not of interest in the analysis of mammograms, we constrain the matrix A to be symmetric, resulting in only two types of phase portraits: node and saddle.
Because of the expected presence of a number of spicules at various angles that get superimposed in the projected mammographic image, we hypothesize that a site of architectural distortion will present node-like characteristics. However, normal tissues, ducts, vessels, and other oriented structures in the breast could also get projected and superimposed to form patterns that mimic the appearance of architectural distortion in a mammogram. Therefore, we analyze the node map for the detection of suspicious sites or potential sites of architectural distortion, and analyze the detected sites through further steps of feature extraction and pattern classification.
Because a mammogram could exhibit several patterns, we apply a sliding analysis window of size 10 x 10 pixels, at 800 μm/pixel, with one pixel per step. For each position of the window, a vote is cast in a map, referred to as the node map, at the position given by the corresponding fixed point, if all of the conditions applied are satisfied. Results related to the matrix A with its condition number greater than 3.0 are rejected to ignore patterns not expected to be associated with architectural distortion 11. Furthermore, an additional condition is imposed on the distance between a fixed point and the position of the corresponding analysis window: if the distance is less than three pixels (2.4 mm) or greater than 20 pixels (16 mm), the results for the current analysis window are rejected. The magnitude of the vote is set equal to the ratio of the measure of fit ε2 (A, b), defined in Equation 7, to the condition number of A, to emphasize the isotropy of the phase portrait. The node map is then analyzed to detect local maxima or peaks that are expected to indicate sites of architectural distortion. However, the procedure also results in the detection of a number of false-positive (FP) sites due to superimposed normal structures.
At each peak in the node map, we automatically extract an ROI, of size 128 x 128 pixels except at the edges of the images, from the mammographic image at 200 μm/pixel. We label the ROIs at the locations indicated by the peaks in the node map, in decreasing order of the values of the peaks, up to a maximum of 30 ROIs per mammogram.
When mammograms with known diagnoses are used to train our procedure, the automatically detected ROIs with their centers within the parts of architectural distortion identified by the radiologist are labeled as true-positive (TP) ROIs; the others are labeled as FP ROIs for use in the training procedure. When a mammogram is analyzed using the trained procedure, all of the ROIs detected as above are processed for classification without any labeling.
Figure 7 shows the node map and the ROIs detected for the mammogram shown in Figure 2B. The red rectangles indicate the suspicious area marked by the radiologist.
Figure 8 shows a number of TP and FP ROIs extracted automatically from several mammograms. Most of the TP ROIs have several spicules and oriented patterns spread over a wide range of angles. The FP ROIs, on the other hand, have a smaller number of normal tissue structures oriented over a narrower range of angles; regardless, due to their superposition in the projected mammographic image, they mimic the node-like characteristics of architectural distortion.
Our strategy is to detect suspicious regions with high efficiency or sensitivity at the initial stage (with correspondingly low false negatives), even if the accompanying number of FPs is large. The next step of analysis of the ROIs is designed to help reduce the FPs via efficient characterization and classification of the detected ROIs.
6. Characterization of Architectural Distortion
An automatically detected ROI including architectural distortion, centered at a peak in the related node map, is likely to possess several spicules scattered at various angles. We expect this characteristic to lead to a broad angular spread of energy in the image domain and spectral energy in the Fourier domain. In our previous works, we have shown that such an angular dispersion may be represented efficiently in the form of a rose diagram, which is an angular histogram 21,22. We normalize the rose diagram to have unit area and treat it as a probability density function (PDF). Then, we characterize the PDF of each ROI using entropy, which is a statistical measure of disorder or scatter.
The increased scatter of tissue patterns in regions with architectural distortion modifies the fractal nature of normal breast tissue. The commonly used models of fractals are based on multiscale nested patterns of self-similar patterns 43-46. Another model of fractal behavior is fractional Brownian motion (fBm) that is related to a spectrum in the frequency domain in which the power decreases in proportion to (1/f)^β, where f is the frequency and β is known as the spectral component 47,48. The fBm model leads to fractal images that are similar to random cloudy patterns; comparable patterns are often seen in mammograms. In order to apply this model to images, the two-dimensional (2D) Fourier spectrum of the image needs to be converted to a one-dimensional (1D) function.
We have developed an integrated method to characterize angular spread and to derive an estimate of the fractal dimension (FD) of an image by mapping the 2D Fourier spectrum of the image in rectangular coordinates, denoted by S(u,v), to a spectrum in polar coordinates, denoted by S(ƒ,Ν). The procedure is described by the following steps 21:
- Apply the von Hann window to each automatically extracted 128 x 128 ROI and pad the result with zeros to an array of size 256 x 256 pixels.
- Compute the 2D Fourier transform of the padded ROI and the magnitude of each resulting complex value to obtain an estimate of the power spectrum, S(u,v), of the ROI.
- Identify selected low-frequency and high-frequency portions of the spectrum for exclusion in the subsequent steps.
- Map the 2D power spectrum S(u,v) from the Cartesian (rectangular) coordinates (u,v) to the polar coordinates (ƒ,Ν) to obtain S(ƒ,Ν), by resampling and computing a weighted average of the four neighbors of each point for radial distance f ranging from zero to one-half of the sampling frequency, and over the range of angle Ν = [0, 179°].
- Transform the 2D spectrum S(ƒ,Ν) into a 1D function S(ƒ), by integrating as a function of the radial distance or frequency f from the zero-frequency point over the range in Ν = [0, 179°] in angle.
- Apply linear regression to a limited frequency range of the 1D spectrum S(ƒ) on a log-log scale, excluding points in selected low-frequency and high-frequency regions, and obtain the slope β of the fitted line, which is an estimate the spectral component in the fBm model.
- Compute the estimated value of FD as 15,49,50 FD = (8 - β) / 2.
- Transform the 2D spectrum S(ƒ,Ν) into a 1D function S(Ν), by integrating as a function of the angle Ν for the range [0, 179°], from the zero-frequency point over radial distance ƒ = [1, 128] pixels.
- Normalize S(Ν) to have unit sum and compute the entropy of the result as .
The geometric transformation described above leads to improved representation and visualization of the spectral characteristics of periodic or spiculated texture 9. Selected low- and high-frequency regions need to be excluded to remove the effects of the low-frequency components related to the overall appearance of the image and the large structures present in the image, as well as to prevent the effects of high-frequency noise. In the present work, the bands of frequencies to be excluded in the estimation of β and FD (i.e. the nonlinear portions) are selected based on experimentation using synthesized images with known FD, and also using a number of ROIs of mammograms. The range of ƒ used to fit the linear model corresponds to [6, 96] pixels or [0.117, 1.875] mm-1, where the range of [1, 128] pixels corresponds to discrete representation of the frequency range [0, 2.5] mm-1.
Figures 9 and 10 illustrate the various steps for fractal analysis and the estimation of the angular spread of power in the frequency domain for a TP ROI and an FP ROI, respectively. Figure 9D indicates the existence of multidirectional spiculating patterns for the TP ROI, whereas in Figure 10D, the spread of power is limited to a small number of angular bands for the FP ROI.
7. Pattern Classification and Validation
We now have three measures or features for each ROI detected automatically: [node value, HF, FD], as described in Section 6. We use these features, individually and collectively, to characterize the spiculating patterns related to architectural distortion and to differentiate the TP ROIs from the FP ROIs detected.
For the TP ROI shown in Figure 9, the feature vector composed by the three measures derived is [0.0299, 7.2224, 2.3037]. For the FP ROI shown in Figure 10, the corresponding feature vector is [0.0349, 6.9444, 2.5223]. As expected, the HF value is higher for the TP ROI than for the FP ROI, and the value of FD is lower. However, the node value is lower for the TP ROI than for the FP ROI, which is contrary to the expected differences, due to the presence of overlapped structures in the latter. In general, we may expect some of the features to follow the expected trends and assist in classification of the ROIs even if other features fail to demonstrate the expected behavior.
To evaluate the performance of the features, we use the area under the receiver operating characteristic (ROC) curve (AUC) 51,52 and free-response ROC (FROC) analysis 53-55. For ROC and FROC analysis with an individual feature, we do not use a trained classifier; instead, we apply a sliding threshold. The three individual features of node, FD, and HF provided AUC values of 0.61, 0.59, and 0.64, respectively, which indicate good potential but not adequate performance in pattern classification. The p values of the same features are 1.7638e-009, 1.8793e-004, and 2.2615e-013, which indicate statistically highly significant differences between their values for the sets of automatically detected TP and FP ROIs.
When large numbers of features are used to represent samples for classification, it is necessary to select an optimal subset of features so as to remove correlated features and reduce the complexity of the classifier 20-22; several procedures, such as stepwise logistic regression 56, may be used for this purpose. In the present work, because we are using only three features per ROI, we do not perform feature selection.
To perform validation of the trained classifier, we apply the pattern classification procedures with the leave-one-patient-out approach. We exclude all of the ROIs extracted from the mammograms of the patient to be tested from the training procedure of the classifier, and then apply the classifier so obtained to the test case. We then repeat the procedure for the entire dataset, one case or patient at a time.
For ROC analysis with the set of three features, we use a classifier that performs quadratic discriminant analysis with the Bayesian assumption 57. To generate FROC curves, we consider the TP ROI with the highest discriminant value in the two mammographic images available for the patient, except in six cases where only one image is available per case.
The three features, namely, node value, FD, and HF , provided AUC values of 0.61, 0.59, and 0.64, respectively, when each feature was used on its own. Combined use of the three features provided improved performance with AUC = 0.70. The FROC curve obtained with the combination of the three features is shown in Figure 11, which indicates a sensitivity of 80% at 5.6 FPs/patient and 89% at 7.5 FPs/patient. Use of only the node value provided a sensitivity of 80% at 8.1 FPs/patient and 89% at 13.8 FPs/patient.
The reduction of FPs in the final result is illustrated in Figure 12. For the sake of illustration, only six ROIs with the highest rankings are shown. The numbers outside the parentheses indicate the ranking based on the discriminant values obtained by the Bayesian classifier; the numbers within the parentheses correspond to the earlier ranking based on the node map. Comparing Figure 12 with the initial stage of detection of suspicious ROIs shown in Figure 7, it is clear that the features used to characterize architectural distortion have led to a substantial reduction of FPs, as compared with the initial stage of node analysis, while maintaining good sensitivity of detection. This is a case where three of the highly ranked ROIs have overlapped with the suspicious area of architectural distortion marked by the radiologist, and represents a case of successful detection by our procedure.
In a clinical application, the number of ROIs to be displayed in the final result should be determined depending upon the desired sensitivity and number of FPs that would be tolerated, as well as the preference of the radiologists.
Figure 1. (A) A prior mammogram of size 1,377 x 850 pixels at 200 μm/pixel resolution; (A) corresponding detection mammogram of size 1374 x 850 pixels at 200 μm/pixel resolution; (C) magnified region of architectural distortion in the image shown in part (A), of size 39.2 mm x 21.8 mm; (D) magnified region of architectural distortion in the image shown in part (B), of size of size 40.8 mm x 26.8 mm. The prior mammogram was taken 24 months before the detection mammogram. This is a case of screen-detected breast cancer. Click here to view larger figure.
Figure 2. (A) A prior mammogram of size 1377 x 850 pixels at 200 μm/pixel resolution; (B) corresponding image after preprocessing for approximate segmentation of the breast region. Click here to view larger figure.
Figure 3. (A) Test image of a plant with several oriented parts, of size 646 x 668 pixels; (B) Fourier magnitude spectrum of the image showing energy concentrated at several angles; (C) Gabor magnitude response; and (D) Gabor angle response. 180 Gabor filters were used over the range -90° to +90°, with Τ = 8 pixels and l = 8. Click here to view larger figure.
Figure 4. (A) Gabor magnitude and (B) angle responses for the mammogram with architectural distortion shown in Figure 2B, of size 1377 x 850 pixels at 200 μm per pixel. 180 Gabor filters were used over the range -90° to +90°, with Τ = 4 pixels and l = 8. The rectangle (red or green) shows the area of architectural distortion marked by the radiologist, of size 47.6 mm x 29.9 mm. (C), (D) Magnified views of the region of architectural distortion. Click here to view larger figure.
Figure 5. The NMS technique: the elongated rectangle (in gray) denotes the presence of a CLS. The squares denote pixels along a direction perpendicular to the orientation of the CLS. The central green square indicates a core CLS pixel.
Figure 6. NMS and CLS selection results overlaid on the full mammographic image in Figure 2A. (A) NMS results. (B) CLS selection results. The pixels marked in white correspond to CLS pixels that are retained for further analysis. (C) NMS results and (D) CLS selection results in magnified views for the ROI marked in Figure 4A. Click here to view larger figure.
Figure 7. (a) Node map and (b) ROIs detected for the mammogram shown in Figure 2B. The mammographic image is of size 1377 x 850 pixels at 200 μm per pixel. The size of the area of architectural distortion (red rectangle) marked by the radiologist is 47.6 mm x 29.9 mm. Each ROI is of size 128 x 128 pixels, except at the edges of the image. Click here to view larger figure.
Figure 8. Examples of (A)-(C) three TP ROIs and (D)-(F) three FP ROIs. Each ROI is of size 128 x 128 pixels. The corresponding node values are shown. Click here to view larger figure.
Figure 9. (A) A 128 x 128 pixel TP ROI with architectural distortion; pixel size = 200 μm. Node value = 0.0299. (B) The 2D Fourier log-power spectrum S(u,v) obtained after applying the von Hann window and zero padding the ROI to 256 x 256 pixels. (C) The power spectrum in the (f,Ν) space. The horizontal axis corresponds to angle Ν from 0° to 179° and the vertical axis corresponds to radial frequency from 0.02 mm-1 to 2.5 mm-1. The top-left corner pixel corresponds to frequency of 0.02 mm-1 and angle of 0°. A black frame has been applied to the spectrum. (D) Angular spread of power, S(Ν). Entropy HF = 7.2224. (E) The 1D power spectrum S(f) plotted on a log-log scale as a function of radial frequency f. The linear fit is also shown (red line), which resulted in FD = 2.3037 for the TP ROI. Click here to view larger figure.
Figure 10. (A) A 128 x 128-pixel FP ROI; pixel size = 200 μm. The ROI caused an FP node due to overlapping and/or intersecting normal structures. Node value = 0.0349. (B) The 2D Fourier log-power spectrum S(u,v) obtained after applying the von Hann window and zero padding the ROI to 256 x 256 pixels. (C) The power spectrum in the (f,Ν) space. The horizontal axis corresponds to angle Ν from 0° to 179° and the vertical axis corresponds to radial frequency from 0.02 mm-1 to 2.50 mm-1. The top-left corner pixel corresponds to frequency of 0.02 mm-1 and angle of 0°. A black frame has been applied to the spectrum. (D) Angular spread of power, S(Ν). Entropy HF = 6.9444. (E) The 1D power spectrum S(f) plotted on a log-log scale as a function of radial frequency f. The linear fit is also shown (red line), which resulted in FD = 2.5223 for the FP ROI. Click here to view larger figure.
Figure 11. FROC curve showing the detection performance of the proposed features.
Figure 12. Final labeling of ROIs for the original mammogram shown in Figure 2. The suspicious ROIs detected at an earlier stage of processing are shown in Figure 7. The rectangles outlined in green represent TP ROIs in the final stage of analysis; the remaining rectangles outlined in yellow represent FPs or false alarms. The rectangle in red indicates the area of architectural distortion marked by the radiologist for the present study; this information was not used in the leave-one-patient-out procedure applied to the present case and would not be available in a prospective application of the proposed methods.
We have presented a series of sophisticated techniques of digital image processing and pattern recognition, also known as machine learning and CAD, for the detection of architectural distortion in prior mammograms of interval-cancer cases. The methods are based on analysis of the oriented textural patterns present in mammographic images. Our methods, including several more features proposed in our related works, are capable of detecting early signs of breast cancer 15 months ahead of the time of clinical diagnosis, on the average, with a sensitivity of 80% at less than 4 FPs/patient 22,58.
In a potential clinical application, the ROIs labeled by our procedures should be viewed as prompts for careful inspection of the corresponding areas of the mammograms by the radiologist. The final decision regarding the presence or absence of breast cancer is to be made by the radiologist, who may request additional imaging procedures or clinical tests to verify or confirm suspicions raised by mammography and CAD.
Although our methods have provided exciting results in the present retrospective study, they are not yet ready for clinical use. The methods take about 6 min per image on a Dell Precision PWS 490 workstation with Quad Intel Xeon processors operating at 3.0 GHz, with 12 GB of RAM; computational requirements need to be reduced by optimal implementation of the computer code. The results are comparable to or slightly better than those reported in studies on architectural distortion with commercially available CAD systems 18,59,60, with the distinction that the present work is based on prior mammograms. The number of FPs needs to be reduced to about one per patient with an associated sensitivity of at least 80%.
Limitations exist in our work in terms of the types of architectural distortion detected by the models used. The methods need to be tested with larger datasets. The parameters used in the methods, which have been empirically determined in the present work, need to be optimized in relation to the characteristics of the mammograms in a given dataset to be analyzed. We expect our procedures to lead to better results with direct digital mammograms and breast tomosynthesis images than those obtained with scanned screen-film images as in the present work.
Our methods show promise in the detection of architectural distortion and breast cancer at early stages. Further work is required to achieve the detection of architectural distortion with high sensitivity and low FP rates.
The authors have nothing to disclose.
This work was supported by grants from the Collaborative Research and Training Experience Programme (CREATE) and a Discovery Grant from the Natural Sciences and Engineering Research Council (NSERC) of Canada.
- Cancer among women [Internet]. Centers for Disease Control and Prevention (CDC). Available from: http://www.cdc.gov/cancer/dcpc/data/women.htm (20011).
- Tang, J., Rangayyan, R. M., Xu, J., El-Naqa, I., Yang, Y. Computer-aided detection and diagnosis of breast cancer with mammography: Recent advances. IEEE Transactions on Information Technology in Biomedicine. 13, (2), 236-251 (2009).
- van Dijck, J. A. A. M., Verbeek, A. L. M., Hendriks, J. H. C. L., Holland, R. The current detectability of breast cancer in a mammographic screening program. Cancer. 72, (6), 1933-1938 (1993).
- Rangayyan, R. M., Prajna, S., Ayres, F. J., Desautels, J. E. L. Detection of architectural distortion in mammograms acquired prior to the detection of breast cancer using Gabor filters, phase portraits, fractal dimension, and texture analysis. International Journal of Computer Assisted Radiology and Surgery. 2, (6), 347-361 (2008).
- Homer, M. J. Mammographic Interpretation: A Practical Approach. 2nd, McGraw-Hill. New York, NY. (1997).
- Knutzen, A. M., Gisvold, J. J. Likelihood of malignant disease for various categories of mammographically detected, nonpalpable breast lesions. Mayo Clinic Proceedings. 68, 454-460 (1993).
- Rangayyan, R. M., Ayres, F. J., Desautels, J. E. L. A review of computer-aided diagnosis of breast cancer: Toward the detection of subtle signs. Journal of the Franklin Institute. 344, 312-348 (2007).
- Doi, K. Diagnostic imaging over the last 50 years: research and development in medical imaging science and technology. Physics in Medicine and Biology. 51, R5-R27 (2006).
- Rangayyan, R. M. Biomedical Image Analysis. CRC Press. Boca Raton, FL. (2005).
- Rangayyan, R. M., Ayres, F. J. Gabor filters and phase portraits for the detection of architectural distortion in mammograms. Medical and Biological Engineering and Computing. 44, 883-894 (2006).
- Ayres, F. J., Rangayyan, R. M. Reduction of false positives in the detection of architectural distortion in mammograms by using a geometrically constrained phase portrait model. International Journal of Computer Assisted Radiology and Surgery. 1, 361-369 (2007).
- Karssemeijer, N., te Brake, G. M. Detection of stellate distortions in mammograms. IEEE Transactions on Medical Imaging. 15, (5), 611-619 (1996).
- Guo, Q., Shao, J., Ruiz, V. F. Characterization and classification of tumor lesions using computerized fractal-based texture analysis and support vector machines in digital mammograms. International Journal of Computer Assisted Radiology and Surgery. 4, (1), 11-25 (2009).
- Sampat, M. P., Whitman, G. J., Markey, M. K., Bovik, A. C. Evidence based detection of spiculated masses and architectural distortion. Proceedings of SPIE Medical Imaging 2005: Image Processing. Fitzpatrick, J. M., Reinhardt, J. M. San Diego, CA, 5747, 26-37 (2005).
- Tourassi, G. D., Delong, D. M., Floyd Jr,, E, C. A study on the computerized fractal analysis of architectural distortion in screening mammograms. Physics in Medicine and Biology. 51, (5), 1299-1312 (2006).
- Nemoto, M., Honmura, S., Shimizu, A., Furukawa, D., Kobatake, H., Nawano, S. A pilot study of architectural distortion detection in mammograms based on characteristics of line shadows. International Journal of Computer Assisted Radiology and Surgery. 4, (1), 27-36 (2009).
- Matsubara, T., Hara, T., Fujita, H., Endo, T., Iwase, T. Automated detection method for mammographic spiculated architectural distortion based on surface analysis. Proceedings of the 22nd International Congress and Exhibition on Computer Assisted Radiology and Surgery (CARS2008), Barcelona, Spain, 3, (1), 176-177 (2008).
- Baker, J. A., Rosen, E. L., Lo, J. Y., Gimenez, E. I., Walsh, R., Soo, M. S. Computer-aided detection (CAD) in screening mammography: Sensitivity of commercial CAD systems for detecting architectural distortion. American Journal of Roentgenology. 181, 1083-1088 (2003).
- Sameti, M., Ward, R. K., Morgan-Parkes, J., Palcic, B. Image feature extraction in the last screening mammograms prior to detection of breast cancer. IEEE Journal of Selected Topics in Signal Processing. 3, (1), 46-52 (2009).
- Rangayyan, R. M., Banik, S., Desautels, J. E. L. Computer-aided detection of architectural distortion in prior mammograms of interval cancer. Journal of Digital Imaging. 23, (5), 611-631 (2010).
- Banik, S., Rangayyan, R. M., Desautels, J. E. L. Detection of architectural distortion in prior mammograms. IEEE Transactions on Medical Imaging. 30, (2), 279-294 (2011).
- Banik, S., Rangayyan, R. M., Desautels, J. E. L. Measures of angular spread and entropy for the detection of architectural distortion in prior mammograms. International Journal of Computer Assisted Radiology and Surgery. 8, 121-134 (2013).
- Broeders, M. J. M., Onland-Moret, N. C., Rijken, H. J. T. M., Hendriks, J. H. C. L., Verbeek, A. L. M., Holland, R. Use of previous screening mammograms to identify features indicating cases that would have a possible gain in prognosis following earlier detection. European Journal of Cancer. 39, 1770-1775 (2003).
- Alto, H., Rangayyan, R. M., Paranjape, R. B., Desautels, J. E. L., Bryant, H. An indexed atlas of digital mammograms for computer-aided diagnosis of breast cancer. Annales des Télécommunications. (5-6), 820-835 (2003).
- Screen Test and the Alberta Breast Cancer Screening Program[Internet]. Alberta Health Services. Available from: http://www.albertahealthservices.ca/services.asp?pid=service&rid=1002353 (2013).
- Gabor, D. Theory of communication. Journal of the Institute of Electrical Engineers. 93, 429-457 (1946).
- Rao, A. R. A Taxonomy for Texture Description and Identification. Springer-Verlag. New York, NY. (1990).
- Otsu, N. A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man, and Cybernetics. 9, (1), 62-66 (1979).
- Gonzalez, R. C., Woods, R. E. Digital Image Processing. 2nd, Prentice-Hall. Upper Saddle River, NJ. (2002).
- Ayres, F. J., Rangayyan, R. M. Design and performance analysis of oriented feature detectors. Journal of Electronic Imaging. 16, (2), (2007).
- Samulski, M., Karssemeijer, N. Optimizing case-based detection performance in a multiview CAD system for mammography. IEEE Transactions on Medical Imaging. 30, (4), 1001-1009 (2011).
- Muralidhar, G. S., Bovik, A. C., Giese, J. D., Sampat, M. P., Whitman, G. J., Haygood, T. M., Stephens, T. W., Markey, M. K. Snakules: a model-based active contour algorithm for the annotation of spicules on mammography. IEEE Transactions of Medical Imaging. 29, (10), 1768-1780 (2010).
- Ayres, F. J., Rangayyan, R. M. Detection of architectural distortion in mammograms via analysis of phase portraits and curvilinear structures. Proceedings of EMBEC'05: 3rd European Medical & Biological Engineering Conference. Hozman, J., Kneppo, P. Prague, Czech Republic, 11, 1768-1773 (2005).
- Ferrari, R. J., Rangayyan, R. M., Desautels, J. E. L., Frère, A. F. Analysis of asymmetry in mammograms via directional filtering with Gabor wavelets. IEEE Transactions on Medical Imaging. 20, (9), 953-964 (2001).
- Zwiggelaar, R., Astley, S. M., Boggis, C. R. M., Taylor, C. J. Linear structures in mammographic images: Detection and classification. IEEE Transactions on Medical Imaging. 23, (9), 1077-1086 (2004).
- Ferrari, R. J., Rangayyan, R. M., Borges, R. A., Frère, A. F. Segmentation of the fibro-glandular disc in mammograms using Gaussian mixture modeling. Medical and Biological Engineering and Computing. 42, 378-387 (2004).
- Ichikawa, T., Matsubara, T., Hara, T., Fujita, H., Endo, T., Iwase, T. Automated detection method for architectural distortion areas on mammograms based on morphological processing and surface analysis. Fitzpatrick, J. M., Sonka, M. Proceedings of SPIE Medical Imaging 2004: Image Processing, February 2004, San Diego, CA, SPIE. 920-923 (2004).
- Sonka, M., Hlavac, V., Boyle, R. Image Processing, Analysis and Machine Vision. 1st, Chapman & Hall. London, UK. (1993).
- Canny, J. A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence. 8, (6), 679-698 (1986).
- Rao, A. R., Jain, R. C. Computerized flow field analysis: Oriented texture fields. IEEE Transactions on Pattern Analysis and Machine Intelligence. 14, (7), 693-709 (1992).
- Kirkpatrick, S., Gelatt, C. D., Vecchi, M. P. Optimization by simulated annealing. Science. 220, (4598), 671-680 (1983).
- Gershenfeld, N. The Nature of Mathematical Modeling. Cambridge University Press. Cambridge, UK. (1999).
- Mandelbrot, B. B. The Fractal Geometry of Nature. Freeman. San Francisco, CA. (1983).
- Peitgen, H. -O., Jürgens, H., Saupe, D. Chaos and Fractals: New Frontiers of Science. second, Springer. New York, NY. (2004).
- Fortin, C., Kumaresan, R., Ohley, W. Fractal dimension in the analysis of medical images. IEEE Engineering in Medicine and Biology Magazine. 11, 65-71 (1992).
- Schepers, H. E., van Beek, J. H. G. M., Bassingthwaighte, J. B. Four methods to estimate the fractal dimension from self-affine signals. IEEE Engineering in Medicine and Biology Magazine. 11, 57-64 (1992).
- Bak, P., Tang, C., Wiesenfeld, K. Self-organized criticality: An explanation of 1/f noise. The American Physical Society. 59, 381-384 (1987).
- Billock, V. A., De Guzman, G. C., Kelso, J. A. S. Fractal time and 1/f spectra in dynamic images and human vision. Physica D: Nonlinear Phenomena. 148, 136-146 (2001).
- Anguiano, E., Pancorbo, M. A., Aguilar, M. Fractal characterization by frequency analysis: I. Surfaces. Journal of Microscopy. 172, 223-232 (1993).
- Aguilar, M., Anguiano, E., Pancorbo, M. A. Fractal characterization by frequency analysis: II. A new method. Journal of Microscopy. 172, 233-238 (1993).
- Metz, C. E. ROC methodology in radiologic imaging. Investigative Radiology. 21, 720-733 (1986).
- Kurt Rossmann Laboratories for Radiologic Image Research. ROC Software [Internet]. ROCKIT. Available from: http://www-radiology.uchicago.edu/krl/roc_soft6.htm (2012).
- Bornefalk, H., Hermansson, A. B. On the comparison of FROC curves in mammography CAD systems. Medical Physics. 32, (2), 412-417 (2005).
- Miller, H. The FROC curve: A representation of the observer's performance for the method of free response. Journal of the Acoustical Society of America. 46, 1473-1476 (1969).
- Chakraborty, D. P. Statistical power in observer-performance studies: Comparison of the receiver operating characteristic and free-response methods in tasks involving localization. Academic Radiology. 9, (2), 147-156 (2002).
- Ramsey, F. L., Schafer, D. W. The Statistical Sleuth: A Course in Methods of Data Analysis. Duxbury Press. Belmont, CA. (1997).
- Wiley-Interscience, New York, NY. 2nd edition (2001).
- Rangayyan, R. M., Banik, S., Chakraborty, J., Mukhopadhyay, S., Desautels, J. E. L. Measures of divergence of oriented patterns for the detection of architectural distortion in prior mammograms. International Journal of Computer Assisted Radiology and Surgery. (2013).
- Burhenne, L. J. W., Wood, S. A., D'Orsi, C. J., Feig, S. A., Kopans, D. B., O'Shaughnessy, K. F., Sickles, E. A., Tabar, L., Vyborny, C. J., Castellino, R. A. Potential contribution of computer-aided detection to the sensitivity of screening mammography. Radiology. 215, (2), 554-562 (2000).
- Birdwell, R. L., Ikeda, D. M., O'Shaughnessy, K. F., Sickles, E. A. Mammographic characteristics of 115 missed cancers later detected with screening mammography and the potential utility of computeraided detection. Radiology. 219, (1), 192-202 (2001).