A Novel k-Nearest Neighbor Method with Distribution Discrepancy and Differential Feature Importance for Rolling Bearing Fault Diagnosis

Zhenghui Li; Na Zhang; Ziming Wang

doi:10.3791/70568

Method Article

A Novel k-Nearest Neighbor Method with Distribution Discrepancy and Differential Feature Importance for Rolling Bearing Fault Diagnosis

DOI:

10.3791/70568

⸱

June 5th, 2026

Zhenghui Li¹ , Na Zhang¹ , Ziming Wang¹

¹Electrical Engineering, Zhengzhou Railway Vocational & Technical College

Summary

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

This study proposes a k-nearest neighbor method that integrates distribution discrepancy and differential feature importance for accurate rolling bearing fault diagnosis.

Abstract

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

Rolling bearings are among the most vulnerable components in various types of rotary machinery, and accurate fault detection and localization are essential. When a rolling bearing fails, the signal is non-stationary, and the energy distribution of the vibration signal varies depending on the fault location. In traditional k-nearest neighbor (KNN) fault diagnosis algorithms, Euclidean distance is primarily used to measure the distance between sample points, which is not effective at capturing similarity across different spatial distributions. Moreover, these algorithms assume equal feature importance, which does not reflect the actual characteristics of fault vibration signals. This study proposes a KNN-based rolling bearing fault diagnosis method that incorporates distribution discrepancy and differential feature importance. First, vibration signals are decomposed using three-level wavelet packet decomposition, and the energy of each node at the third level is used as the fault feature. Then, the mean impact value (MIV) algorithm is used to determine the relative importance of each feature, and the Earth mover’s distance (EMD) is applied to measure differences between spatial distributions. By integrating Euclidean distance with MIV and EMD and applying the KNN majority voting rule, fault diagnosis is performed. The experimental results indicate that this method achieves a diagnostic accuracy of 99.43%, representing a 5.97% improvement compared to traditional KNN methods. The proposed method demonstrates accurate and effective fault diagnosis performance on the rolling bearing datasets used in this study.

Introduction

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

With the progress of technology, rotary machinery is increasingly developing toward integration, large-scale systems, high speed, and intelligent operation¹. Among various types of rotary electrical machinery, rolling bearing components are the most vulnerable and commonly damaged parts. They offer advantages such as easy start-up, low friction, simple lubrication, and convenient replacement, and are widely used in precision instruments, aerospace, automobiles, machine tools, robots, and other fields. According to relevant statistical data on rotating machinery failures, abnormal vibration faults account for 70%, and 30% of these vibration faults are closely related to rolling bearing abnormalities². Therefore, accurate diagnosis of rolling bearing faults is a highly important and widely studied field.

The mainstream methods for diagnosing rolling bearing faults include general methods and machine learning methods. General diagnostic methods analyze and decompose signals in the time and frequency domains. Time-domain analysis methods describe the nature and characteristics of signals by observing waveform patterns, statistical features, and temporal relationships, thereby enabling fault diagnosis. Common indicators include mean value, root mean square, correlation coefficient, margin, effective value, and impulse factor³. However, these methods are susceptible to external noise, which reduces accuracy. In complex systems, they may not fully characterize system behavior and often need to be combined with other analytical approaches. Chen et al. preprocessed vibration time-domain signals to extract different dimensionless features and then built a training model using a decision-tree-based random forest algorithm⁴. The effectiveness of this method was validated using rolling bearing competition data and simulated marine bearing fault data.

Frequency-domain analysis methods transform signals into the frequency domain, enabling a better understanding of frequency components, spectral characteristics, and frequency distribution. These methods include the Fourier transform, spectral analysis, and power spectral density. Li et al. analyzed the frequency distribution of vibration signals in the generated envelope spectrum to diagnose bearing faults⁵. Wang et al. identified characteristic frequencies of vibration signals using different frequency-domain analysis methods and compared them with the equipment’s inherent characteristic frequencies to achieve mechanical fault identification in spindle systems⁶.

With the increasing complexity of equipment, the demand for signal analysis has become more diverse. Since nonlinear and non-stationary signals contain time-evolving frequency components, conventional analyses based on linearity and stationarity assumptions fail to fully reveal their transient behaviors and temporal correlations. In contrast, time-frequency analysis provides a joint representation of signal energy distribution across both time and frequency dimensions, enabling a more comprehensive interpretation. These methods allow observation of how signals change over time and how frequency components vary across different intervals, helping to capture dynamic signal characteristics⁷^,⁸. Several prominent techniques, such as discrete wavelet transform, ensemble empirical mode decomposition, and variational mode decomposition, are widely used for time-frequency analysis⁹. Combining continuous wavelet transform with a transfer learning-enhanced residual neural network, Diao et al. proposed a hybrid diagnostic framework¹⁰.

Traditional diagnostic approaches, which are largely manual and experience-driven, are prone to subjective bias and operator-dependent inconsistencies, leading to uncertain, non-uniform diagnoses. Even after signal processing, the extracted multidomain features often require further optimization to achieve accurate fault diagnosis. Machine learning methods, by contrast, classify bearing faults using mathematical models and automatically identify patterns in feature datasets, reducing reliance on human judgment. Consequently, many researchers have combined signal processing with machine learning to diagnose and classify bearing fault types. Commonly used methods include ensemble models such as random forests, kernel-based methods such as support vector machines, and single-layer feedforward networks such as extreme learning machines¹¹.

Notable efforts have been devoted to advancing machine learning techniques for fault diagnosis. Recently, more advanced methods based on graph neural networks have been developed. Zhang et al. proposed a multiscale channel attention-driven graph dynamic fusion learning method for robust fault diagnosis under noisy signals¹². You et al. developed a channel-adaptive generative reconstruction and fusion framework for few-shot fault diagnosis¹³. While these methods achieve state-of-the-art accuracy, they require substantial computational resources and large labeled datasets. For rolling bearing fault diagnosis, Guo et al. proposed a data-level fusion method with adaptive weighting¹⁴. This method processes multisource vibration signals using the k-nearest neighbor (KNN) algorithm to determine optimal weighting schemes. Fault-induced non-stationarity in vibration signals alters the spectral energy distribution, and these per-band energy variations serve as discriminative features for different fault states. However, traditional KNN algorithms rely on Euclidean distance for similarity measurement, which is insufficient for data with complex or diverse distributions. Additionally, they assume equal feature importance, which does not reflect the actual characteristics of fault vibration features¹⁵^,¹⁶^,¹⁷. Traditional distribution similarity measures are also heavily influenced by distribution overlap, limiting their ability to capture true discrepancies¹⁸^,¹⁹^,²⁰^,²¹^,²². In contrast, Earth mover’s distance (EMD) measures the minimum cost required to transform one distribution into another, effectively capturing distribution differences regardless of overlap or positional displacement. This property makes EMD particularly suitable for this study, as it enables robust similarity measurement under varying operating conditions where distribution shifts are common.

To address these limitations, this study develops a KNN-based fault diagnosis method for rolling bearings that incorporates both distribution differences and feature importance. The proposed method consists of four main steps. First, a three-level wavelet packet decomposition is applied to the vibration signals, and the energy values of all nodes at the third level are calculated to construct the fault feature set. Second, the mean impact value (MIV) algorithm is used to quantify the relative importance of each feature. Third, EMD is introduced to measure distribution differences between feature vectors, capturing underlying structural discrepancies. Finally, the conventional Euclidean distance in the KNN algorithm is enhanced by integrating MIV-based feature weights and EMD-based distribution metrics. This improved similarity measure, combined with majority voting, is used to classify faults and improve diagnostic accuracy.

Access restricted. Please log in or start a trial to view this content.

Protocol

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

This study did not involve human participants or animal subjects; therefore, ethical approval and informed consent were not required. The proposed method was implemented using MATLAB R2014b. The following toolboxes were used: Wavelet Toolbox for wavelet packet analysis (wpdec and wpcoef), Neural Network Toolbox for BP neural network implementation (feedforwardnet and train), Optimization Toolbox for solving the linear programming problem in EMD computation (linprog), and Statistics and Machine Learning Toolbox for KNN classification (fitcknn and predict). The EMD was implemented by solving the transportation problem using linear programming. All experiments were conducted on a Windows 10 PC with an Intel Core i7-10700 CPU (2.90 GHz) and 16 GB RAM. The proposed method was validated using the CWRU Bearing Dataset (https://engineering.case.edu/bearingdatacenter/download-data-file) available publicly. Faults were introduced into SKF6205 drive-end bearings using electro-discharge machining (EDM) at three diameters (0.007″, 0.014″, and 0.021″) and at three locations (inner race, outer race, and ball). Vibration signals were collected at a sampling frequency of 12 kHz under four motor loads (0–3 hp), corresponding to speeds of 1797–1730 rpm. Each signal sample consisted of 2048 data points, obtained using a sliding window segmentation approach with a step size of 598 points (70.8% overlap).

The proposed KNN-based rolling bearing fault diagnosis framework (Figure 1) consists of seven sequential stages, where the output of each stage serves as the input to the next. In Stage 1, energy features are extracted from rolling bearing vibration signals using three-level wavelet packet decomposition with the Daubechies 3 (db3) wavelet basis. The normalized energy values of the eight sub-bands at the third decomposition level are compiled into the feature set Q = (q1, q2, ..., qm), where m = 8 is the feature dimension. The normalization is performed using sum normalization (relative energy normalization) as defined in Equation 6⁶.

Vibration signal analysis diagram using MIV-BP neural networks, KNN-EMD, majority voting.
Figure 1: Implementation flowchart of the proposed algorithm. Flowchart illustrating the workflow of the proposed KNN–MIV–EMD method, including vibration signal input, wavelet packet energy feature extraction, MIV-based feature weighting, EMD-based similarity calculation, and final classification using majority voting. Please click here to view a larger version of this figure.

In Stage 2, the MIV of each feature is calculated using a BP neural network (single hidden layer with 10 neurons, maximum 2000 iterations, target error 1.0 × 10⁻⁵, perturbation step size δ = ±10% of the mean value of each feature), as defined in Equations 10–13²³. The network employs the hyperbolic tangent sigmoid function (tansig) as the activation function θ_s in the hidden layer and a linear function (purelin) in the output layer. The network is trained using the Levenberg–Marquardt algorithm (trainlm). The MIV value is then assigned as the relative importance weight of that feature.

In Stage 3, raw vibration signals are segmented into 200 samples using a sliding window of 2048 data points with a step size of 598 points (70.8% overlap). The window is moved sequentially from the beginning of the signal. The dataset (Q) is partitioned into a training set (Q_train) and a testing set (Q_test) using a 52/48 split ratio. For each fault condition, 104 samples are randomly selected as the training set, and the remaining 96 samples are used as the testing set. The partitioning is repeated 10 times using different random seeds, and the average performance metrics are reported to assess statistical robustness. This approach ensures that the results are not dependent on a specific random partition. Because the split is performed after segmentation and at the sample level, there is no data-point overlap between training and testing sets.

In Stage 4, the optimal number of nearest neighbors K is selected using five-fold cross-validation on the training set. The folds are generated randomly using a fixed random seed and are stratified by class to preserve class distribution. Candidate K values are searched within the range K range set notation equation, K ∈ [1,10], mathematical inequality concept. , specifically evaluating K = 1, 3, 5, 7, and 9. The value of K that achieves the highest average classification accuracy across the five folds is selected as the optimal value. In this study, the optimal K is determined to be 3.

In Stage 5, the similarity measure is enhanced by integrating feature importance weights (from MIV) and distribution differences (from EMD). Each sample is represented as an 8-dimensional normalized energy feature vector obtained using sum normalization (as defined in Equations 4–7⁶). The conventional KNN classifier uses Euclidean distance to measure similarity between samples; however, this approach is extended to better capture differences in feature distributions. EMD measures the distance between feature distributions and is particularly suitable for analyzing the energy distribution of bearing vibration signals obtained through wavelet packet decomposition. No additional normalization is applied prior to EMD computation. The ground distance used in EMD is the Euclidean distance between feature components. When measuring distances between multiple distributions, EMD is not affected by the positional differences of the distributions, enabling effective comparison of extracted energy features and improved classification when combined with the KNN decision rule.

Equations 1–3 are novel formulations proposed in this study. Equations 4–7 and 8–26 represent standard formulations in their respective fields. For a test sample (I) and a training sample (Q_train), the weighted Euclidean distance incorporating MIV is defined as Equation 1, where m = 8 is the feature dimension, w_i is the normalized MIV weight for the i-th feature, Q_{test, i} and Q_{train, i} are the i-th feature values of the test and training samples, respectively.

dMIV formula, symbolic representation, equation for test-train data evaluation, educational use. (1)

The EMD-based distribution distance is defined in Equation 2, where the feature distributions of the test (Htest) and training (Htrain) samples are used. EMD measures the minimum cost required to transform one distribution into another.

Mathematical equation for Earth Mover's Distance, EMD concept, formula representation. (2)

The final enhanced distance combining both components is defined as Equation 3, where λ is a balancing parameter that controls the contribution of the EMD-based distribution distance. In this study, λ is set to 0.5 based on empirical tuning to achieve optimal classification performance. The value λ = 0.5 was determined by grid search on the validation set over the range [0, 1] with a step size of 0.1, and the value that achieved the highest classification accuracy was selected as optimal. The optimal λ may be dataset-specific; for other datasets, we recommend re-tuning λ using cross-validation on the training data.

Equation describing statistical method for data analysis optimization, showing d_final calculation. (3)

In Stage 6, all training samples are sorted based on their enhanced distance to the test sample, which incorporates MIV-based feature importance weights and EMD-based distribution metrics. The top K = 3 samples are then selected as the nearest neighbors.

In Stage 7, the majority voting rule is applied among the K = 3 nearest neighbors to determine the final class label for each test sample.

Wavelet Packet Analysis and Energy Extraction
Wavelet packet decomposition is based on wavelet transform but is more refined than conventional wavelet decomposition. A distinctive feature of wavelet packet decomposition is its ability to perform a more balanced and complete time–frequency analysis by decomposing both low- and high-frequency components, unlike conventional wavelet decomposition, which refines only the low-frequency part⁵. In contrast to the fixed resolution characteristic of wavelet decomposition, this approach enables a more balanced representation, alleviating the typical compromise between time and frequency localization across the signal bandwidth.

In the multiresolution process, wavelet packet decomposition is regarded as the stepwise orthogonal decomposition of a function space⁶. The formula for wavelet packet decomposition is given in Equation 4⁶:

Wavelet transform equations, Σ-formula, mathematical concepts, diagram for educational use. (4)

In this formulation, the variables ( Equation dlj,2n, mathematical symbol for sequence indexing, educational use. , Mathematical equation, dj,2n+1, illustrates variable transformation. , and Equation for wavelet transform coefficient, showing subscripts and superscripts in symbolic form. ) correspond to the coefficients obtained from the wavelet packet decomposition, while the symbols (h_k-2l [low-pass] and g_k-2l [high-pass]) represent the filter coefficients central to the decomposition process.

In this study, the db3 wavelet is selected as the wavelet basis function due to its compact support and orthogonality, which are well suited for extracting transient features from vibration signals. A three-level wavelet packet decomposition is performed on the original vibration signals, resulting in 2³ =8 sub-bands at the third decomposition level.

Compared to the standard wavelet transform, the wavelet packet transform enables more granular signal decomposition. By decomposing the original signal to a specified scale, it isolates the frequency bands of interest and extracts their energy distribution as effective features. The wavelet packet transform decomposes a signal into sub-bands whose energy distribution characterizes the original signal’s frequency content, and this derived energy feature vector serves as a robust basis for signal classification.

The feature vector of a signal is defined as the normalized energy distribution across the 2^j frequency bands obtained from its j-layer wavelet packet decomposition, where the total signal energy is partitioned into these orthogonal sub-bands. The energy contained in the k-th frequency band of the j-th decomposition layer is represented by Equation 5⁶, and the normalized energy feature is obtained as shown in Equation 6⁶.

Mathematical concept, equation Ej,k=Σ|xj,k(n)|², analysis of signal energy distribution. (5)

$Equation showing probability calculation formula $pj,k$ for statistical analysis in research.$ (6)

Following this procedure, for each vibration signal sample, an energy feature vector is constructed as defined in Equation 7⁶:

Equation of force vector components; sequence of parameters p3,1 to p3,8 in analysis setup. (7)

KNN–MIV–EMD Implementation Diagnosis Procedure Execution
In the supervised KNN algorithm, the classification of a new instance is determined by the plurality class among its K most similar training samples, as measured by a predefined distance metric. The classification result is therefore dependent on the selection of K and the nature of the similarity calculation. This simple yet effective principle underpins its extensive application in diverse classification domains.

The workflow of the KNN algorithm is outlined as follows. First, the k-nearest neighbors are identified from the training samples by computing the Euclidean distances between the test sample and each training instance, as defined in Equation 8²⁴.

$Distance formula, $ d_{ij}=\|x(i)-x(j)\|_2 $, mathematical equation, vector space analysis.$ (8)

In this notation, the variables x(i) and x(j) correspond to a training sample and a test sample, respectively.

Next, the class probability distribution for the test sample is estimated based on its k-nearest neighbors. Here, k represents the number of nearest neighbors, and the number of these k neighbors that belong to a specific class a (a = 1,2,...,c) is used to compute the probability P(a) that the test sample belongs to that class a, as defined in Equation 9²⁴. where c denotes the total number of classes in the dataset.

Static equilibrium; formula P(a)=ka/k; equation; physics concept; balance analysis. (9)

Finally, fault diagnosis is performed by identifying the k-nearest neighbors using Equation 8, tallying the counts according to Equation 9, sorting these counts in descending order, and assigning the class with the highest count as the fault class of the test sample.

The MIV is based on the BP neural network structure and is used to reflect the importance weight of each variable with respect to the output. The BP neural network is a feedforward network with a typical three-layer topology, including input, hidden, and output layers. It propagates errors backward and iteratively adjusts the weights of neurons to achieve self-learning.

Let X denote the sample dataset with L groups, as defined in Equation 10²³ and Equation 11²³:

Let X be the sample dataset with L groups:

Static equilibrium formula ΣFx=0, diagram illustrating linear equation X=[x(1),x(2),...,x(L)]. (10)

State vector formula, x(k)=[x1(k),x2(k),...,xn(k)]^T, used in control systems analysis. (11)

Here, x(k) denotes the sampled data at time k, x_i is the i-th component of x(k), with k = 1, 2, …, L and i = 1, 2, …, n.

The working principle of the BP neural network is as follows. The input sample x(k) is first weighted by the connection weights ω^T and propagated to the hidden layer to generate the input data s_i of the hidden layer, as defined in Equation 12²³, where the activation function (θ) governs the transformation.

Neural network function: sj=θ[ωjTx(k)+bj] equation, computation model. (12)

The output of the network is then obtained as defined in Equation 13²³:

Neural network regression formula, ŷ(k)=∑βjθ[ωTx(k)+bj]; equation for predictive modeling. (13)

where ω_j = [ω_1j_,ω_2j,...,ω_nj] denotes the input weight vector, β denotes the output weight vector, and j = 1, 2, …, n. In this study, the single hidden layer is set to 10 neurons, the maximum number of iterations is set to 2000, and the minimum expected target error is set to 1.0 × 10⁻⁵.

When a small perturbation (Δω_ij) is applied to the weights between the input and hidden layers, it is propagated to the hidden layer output (S_j), resulting in a variation that ultimately leads to a change (ΔS_j) in the network output. The corresponding weights (ω_ij and ω_jk) are updated through BP, and the loss function is defined in Equation 14²³.

Static equilibrium equations, Σe², error estimation, scientific formula. (14)

To augment the fault feature set Time series data sequence equation, X=[x(1), x(2),..., x(L)], showing indexed numerical values. , small positive and negative perturbations are applied independently to each feature variable in the sample data, as defined in Equation 15²³ and Equation 16²³.

Multivariate time series equation, formula for sequential data analysis. (15)

Static equilibrium equations, ΣFx=0, vector formula, educational diagram, mathematical analysis. (16)

In this formulation, L and n represent the number of fault feature factors and sample groups, respectively. In this study, the perturbation step size is set to δ = ±10% of the mean value of each feature, which is a commonly used setting in MIV-based feature importance analysis. Accordingly, the neural network fitting outputs are obtained as defined in Equation 17²³ and Equation 18²³.

Mathematical equation for signal processing, featuring summation notation and index variables. (17)

Neural network node activation formula, Zj=f(Σωjx(i)+aj), used in computational models. (18)

If Multivariable function formula, statistical model representation, equation analysis. , as shown in Equation 19²³, the corresponding outputs represent the results of the perturbed sample sets.

Multivariate calculus formula; IVi = Ŷi,+ - Ŷi,-; function evaluation; symbolic representation. (19)

Here, static equilibrium diagram; ΣF=0; illustrating forces balance in physics experiment setup and Mathematical symbol Ŷi used in statistical model prediction equation diagram. , respectively represent output results of sample sets Static equilibrium equation, X subscript delta t, formula symbol. and Static equilibrium equation ΣFx=0, diagram showing balanced forces, mechanical analysis. . The impact degree of each fault feature variable on the fault type is expressed as defined in Equation 20²³.

Equation vector notation for input variables IV in mathematical analysis. (20)

By averaging the impact values over the number of observations, the mean impact value of each fault feature on the final output fault type is calculated as defined in Equation 21²³.

$Mathematical equation showing mean IV calculation using summation: $ MIV = \frac{1}{L} \sum_{i=1}^{L} IV_i $$ (21)

The EMD is a measure of similarity between two distributions. Let Set theory equation, G with elements and weights pairs, mathematical analysis formula. denote the source distribution and Mathematical set notation, formula illustration, equation for educational research analysis. denote the target distribution, where g_i and h_j are the positions (or feature vectors) of the i-th and j-th clusters in the source and target distributions, respectively. ω_gi is the probability mass (weight) at position g_j, satisfying Mathematical equation Σm i=1 ωgi=1; static equilibrium principle, balance condition. . ω_hj is the probability mass (weight) at position h_j, satisfying Equation of summation, symbol Σ, for weight normalization in mathematical analysis. . m and n are the numbers of clusters in the source and target distributions, respectively.

The EMD between G and H is defined as the minimum cost required to transform the source distribution into the target distribution, as given in Equation 22⁷:

Earth Mover's Distance formula; mathematical equation analysis; optimization method. (22)

Here, the optimal flow (f_ij) is subject to the constraints defined in Equations 23–26⁷:

$Linear programming inequality constraints formula, $ f_{ij} \geq 0, i=1,2,...,m; j=1,2,...,n $.$ (23)

Optimization constraint formula, Σfij≤ωgi, linear programming equation in mathematics. (24)

Linear inequality equation Σfij ≤ ωhj; j=1,2,...n; in algorithmic optimization context. (25)

Mathematical optimization equation Σ_i=1^m Σ_j=1^n f_ij formula in algorithm analysis. (26)

Here, f_ij is the flow (amount of mass transported) from the i-th cluster of the source distribution to the j-th cluster of the target distribution. Its dimension is m × n. d_ij is the ground distance between g_i and h_j, typically defined as the Euclidean distance: $Mathematical formula, calculating Euclidean distance $ d_{ij} = \| g_i - h_j \|_2 $.$ . Its dimension is also m × n. The first constraint ensures non-negative flows, the second and third constraints ensure that the total flow from each source cluster and into each target cluster does not exceed the available mass, and the fourth constraint ensures that the total flow equals the total mass, which is 1 for normalized distributions.

In practice, EMD is computed by solving a transportation problem using linear programming methods (e.g., the simplex algorithm) to determine the optimal flow (f_ij) that minimizes the total transportation cost. The resulting EMD value represents the minimum cost required to transform one distribution into another and serves as a robust similarity metric for comparing feature distributions in the proposed fault diagnosis method.

Access restricted. Please log in or start a trial to view this content.

Results

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

The experimental environment and platform are shown in Figure 2. From left to right, the platform consists of a fan-bearing assembly, an induction motor, and a drive unit. The central portion connects these components to a torque transducer/encoder via a coupling, and the rightmost section serves as the dynamometer. The control electronics are not depicted. EDM technology was used to simulate pitting faults on bearings, ranging from weak to severe conditions. Vibration data were collected us...

Access restricted. Please log in or start a trial to view this content.

Discussion

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

The present study proposes a KNN-based fault diagnosis method for rolling bearings that integrates distribution differences through EMD and feature importance through MIV. Experimental results on a rolling bearing dataset containing four health states (normal, inner race fault, rolling element fault, and outer race fault) demonstrate the effectiveness of the proposed approach. The proposed KNN–MIV–EMD method achieves the highest classification accuracy of 99.43% at K = 3, outperforming traditional KN...

Access restricted. Please log in or start a trial to view this content.

Disclosures

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

The authors declare no conflicts of interest.

Acknowledgements

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

This work was financially supported by the Key Scientific Research Projects of Colleges and Universities in Henan Province (25A580011) and the Scientific and Technological Research Project in Henan Province (262102210057).

Access restricted. Please log in or start a trial to view this content.

Materials

List of materials used in this article
Name	Company	Catalog Number	Comments
Acquisition System
Data Recorder (16-channel)			Anti-aliasing filter, 24-bit ADC; dynamic range >90 dB; Case Western Reserve University, Cleveland, OH, USA Fault Type and Size (inch): All fault types Load (HP): All Sampling Frequency: 12 kHz or 48 kHz Purpose in Study: High-quality data acquisition File Naming Convention: N/A
Data Subset
Data Selection for This Study			Drive-end data, 12 kHz sampling, 4 load conditions; CWRU Bearing Data Center (https://engineering.case.edu/bearingdatacenter/download-data-file) Fault Type and Size (inch): 7 conditions × 4 loads = 28 subsets Load (HP): All Sampling Frequency: 12 kHz Purpose in Study: Model training and testing File Naming Convention: Custom selection
Label Information
Fault Class Labels			One-hot encoding format Fault Type and Size (inch): [1,0,0,0,0,0,0] for Normal … Load (HP): All Sampling Frequency: N/A Purpose in Study: Supervised learning labels File Naming Convention: Label_vector.mat
Sensor
Accelerometer (Drive End)			ICP accelerometer; position: 12 o’clock; sensitivity ~500 mV/g; Drive end bearing housing Fault Type and Size (inch): All fault types Load (HP): All Sampling Frequency: 12 kHz or 48 kHz Purpose in Study: Primary vibration signal acquisition File Naming Convention: DE_time_series
Accelerometer (Fan End)			ICP accelerometer; sensitivity ~500 mV/g; Fan end bearing housing Fault Type and Size (inch): All fault types Load (HP): All Sampling Frequency: 12 kHz or 48 kHz Purpose in Study: Auxiliary/comparison signal File Naming Convention: FE_time_series
Test Bearing
Ball Bearing (Healthy)			Deep groove ball bearing (6205 type); SKF (commonly used in CWRU setup) Fault Type and Size (inch): Normal Load (HP): 0, 1, 2, 3 Sampling Frequency: 12 kHz or 48 kHz Purpose in Study: Baseline condition File Naming Convention: Normal_0.mat
Ball Bearing (Inner Race Fault)			Single-point fault via EDM Fault Type and Size (inch): Inner Race (IR) @ 0.007", 0.014", 0.021", 0.028" Load (HP): 0, 1, 2, 3 Sampling Frequency: 12 kHz or 48 kHz Purpose in Study: Inner race fault validation File Naming Convention: IR007_1.mat
Ball Bearing (Outer Race Fault)			Single-point fault via EDM (6 o’clock position) Fault Type and Size (inch): Outer Race (OR) @ 0.007", 0.014", 0.021", 0.028" Load (HP): 0, 1, 2, 3 Sampling Frequency: 12 kHz or 48 kHz Purpose in Study: Outer race fault validation File Naming Convention: OR021_2.mat
Ball Bearing (Ball Fault)			Single-point fault via EDM Fault Type and Size (inch): Ball (B) @ 0.007", 0.014", 0.021", 0.028" Load (HP): 0, 1, 2, 3 Sampling Frequency: 12 kHz or 48 kHz Purpose in Study: Ball element fault validation File Naming Convention: B014_3.mat
Test Rig
Machinery Fault Simulator			Motor-driven system with adjustable load; accelerometer mounted on bearing housing; Case Western Reserve University, Cleveland, OH, USA Fault Type and Size (inch): N/A Load (HP): 0, 1, 2, 3 Sampling Frequency: 12 kHz or 48 kHz Purpose in Study: Source of fault dataset File Naming Convention: N/A

Reprints and Permissions

Request permission to reuse the text or figures of this JoVE article

Request Permission

A Novel k-Nearest Neighbor Method with Distribution Discrepancy and Differential Feature Importance for Rolling Bearing Fault Diagnosis

In This Article

Summary

Abstract

Introduction

Protocol

Results

Discussion

Disclosures

Acknowledgements

Materials

Reprints and Permissions

Tags

Related Articles