Estimating soil organic carbon content with visible-near-infrared (vis-NIR) spectroscopy.
The choice of calibration method is one of the main factors influencing measurement accuracy in visible-near-infrared (Vis-NIR, 350-2500 nm) spectroscopy. Using both air-dried unground (DU) and air-dried ground (DG) soil samples, this study applied nine spectral preprocessing methods and their combinations to compare the commonly used partial least squares regression (PLSR) method with support vector machine regression (SVMR), a newer machine learning method. The aims were to find a robust method for estimating soil organic carbon (SOC) content and to explore an effective Vis-NIR spectral preprocessing strategy. In total, 100 heterogeneous soil samples collected from Southeast China were used as the dataset for model calibration and independent validation. The coefficient of determination (R²), root mean square error (RMSE), residual prediction deviation (RPD), and ratio of performance to interquartile range were used for model evaluation. The results show that both the PLSR and SVMR models were significantly improved by the absorbance transformation (LOG), standard normal variate with wavelet detrending (SW), first derivative (FD), and mean centering (MC) spectral preprocessing methods and their combinations. SVMR produced the optimal models for both DU and DG soil, with R², RMSE, and RPD values of 0.72, 2.48 g/kg, and 1.83 for DU soil and 0.86, 1.84 g/kg, and 2.60 for DG soil, respectively. Across all the PLSR and SVMR models, SVMR performed more stably than PLSR and also outperformed it, with a smaller mean RMSE of 0.69 g/kg for DU soil and 0.50 g/kg for DG soil. This study concludes that PLSR is an effective linear algorithm but might not be sufficient when dealing with a nonlinear relationship, and that SVMR is a more suitable nonlinear regression method for SOC estimation.
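The four evaluation statistics named above can be illustrated with a minimal Python sketch. It assumes the conventional definitions (R² as one minus the ratio of residual to total sum of squares, RPD as the standard deviation of the observed values divided by RMSE, and the ratio of performance to interquartile range, abbreviated here as RPIQ, as the interquartile range divided by RMSE). The study does not state its exact formulas, so the function name `evaluate`, the sample-SD choice, and the quartile method are assumptions for illustration only.

```python
import math
import statistics


def evaluate(observed, predicted):
    """Return R2, RMSE, RPD and RPIQ for paired observed/predicted values.

    Assumed conventions (not confirmed by the study):
    - R2 = 1 - SS_res / SS_tot
    - RPD = sample standard deviation of observed / RMSE
    - RPIQ = (Q3 - Q1) of observed / RMSE, quartiles by the "inclusive" method
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    n = len(observed)
    mean_obs = statistics.fmean(observed)

    # Residual and total sums of squares
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot

    sd = statistics.stdev(observed)  # sample SD (n - 1 denominator)
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")

    return {"r2": r2, "rmse": rmse, "rpd": sd / rmse, "rpiq": (q3 - q1) / rmse}


# Hypothetical SOC values in g/kg, made up for illustration only
metrics = evaluate([10, 12, 14, 16, 18], [11, 11, 15, 15, 19])
print(metrics)
```

Because RPD and RPIQ both scale RMSE by the spread of the observed values, they allow comparison across datasets with different SOC ranges, which a raw RMSE in g/kg does not.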
Effective SOC estimation was obtained based on the DG soil samples, but the accurate estimation of SOC with DU soil samples needs to be further explored. In addition, LOG, SW, FD, and MC are valuable spectral preprocessing methods for Vis-NIR optimization, and choosing two of them (except for FD + SW and LOG + FD) in a simple combination is a good way to get acceptable results.
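As a rough illustration of combining two preprocessing steps, the sketch below chains an absorbance transformation (LOG) with mean centering (MC), one of the pairings not excluded above. It is a simplified stand-in, not the study's implementation: the function names are hypothetical, the first derivative is a plain adjacent-band difference rather than a smoothed derivative, and SW (standard normal variate with wavelet detrending) is omitted because it needs a wavelet library.

```python
import math


def to_absorbance(reflectance):
    """LOG: convert reflectance R (0 < R <= 1) to absorbance log10(1/R)."""
    return [[math.log10(1.0 / r) for r in spectrum] for spectrum in reflectance]


def first_derivative(spectra):
    """FD (simplified): difference between adjacent bands of each spectrum."""
    return [[b - a for a, b in zip(s, s[1:])] for s in spectra]


def mean_center(spectra):
    """MC: subtract the per-band mean, computed across all samples."""
    n = len(spectra)
    band_means = [sum(band) / n for band in zip(*spectra)]
    return [[v - m for v, m in zip(s, band_means)] for s in spectra]


def chain(spectra, *steps):
    """Apply preprocessing steps in the given order."""
    for step in steps:
        spectra = step(spectra)
    return spectra


# Two hypothetical reflectance spectra with two bands each (made-up values)
reflectance = [[0.1, 0.01], [1.0, 0.1]]
processed = chain(reflectance, to_absorbance, mean_center)
print(processed)
```

In a real calibration workflow, the band means used for mean centering would be computed on the calibration set and reused for the validation set, so that no information from the validation samples leaks into the model.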