Rice Science, 2025, Vol. 32, Issue (6): 857-867. DOI: 10.1016/j.rsci.2025.08.007

• Research Papers •

Physical and Physicochemical Classification of Parboiled Rice Using VNIR-SWIR Spectroscopy and Machine Learning

Nairiane dos Santos Bilhalva1, Paulo Carteri Coradi2, Rosana Santos de Moraes1, Dthenifer Cordeiro Santana2, Larissa Ribeiro Teodoro2, Paulo Eduardo Teodoro2, Marisa Menezes Leal1

  1. Laboratory of Postharvest, Campus Cachoeira do Sul, Federal University of Santa Maria, Cachoeira do Sul, Rio Grande do Sul 96506-322, Brazil
  2. Campus de Chapadão do Sul, Federal University of Mato Grosso do Sul, Chapadão do Sul, Mato Grosso do Sul 79560-000, Brazil
  • Received: 2025-05-15; Accepted: 2025-08-13; Online: 2025-11-28; Published: 2025-12-04
  • Contact: Paulo Carteri Coradi (paulo.coradi@ufsm.br)

Abstract:

The classification of parboiled rice into types can be optimized with machine learning (ML) algorithms, which increase the speed and accuracy of data processing. The objectives of this study were: (i) to investigate the spectral behavior of different types of parboiled rice (Types 1-5 and Off-type); (ii) to identify the most effective ML algorithm for classifying parboiled rice types; (iii) to determine the best kernel configuration and preprocessing methods for spectral data; and (iv) to recommend a protocol for implementing this technique in the rice storage industry. Samples were selected based on the maximum defect limits tolerated for each type, according to the Technical Rice Regulation. Spectral data were acquired using a spectroradiometer in the range of 350-2500 nm and subsequently processed with different methods, including baseline correction, standard normal variate, multiplicative scatter correction, combinations of these techniques with Savitzky-Golay smoothing, and the Savitzky-Golay first derivative. The data were analyzed using six ML algorithms: Artificial Neural Network, Decision Tree, Logistic Regression, REPTree, Random Forest, and Support Vector Machine. Rice types were treated as output variables, while spectral features served as input variables. Logistic Regression and Support Vector Machine showed the best classification performance, with accuracy rates above 97%, F-scores around 0.98, and Kappa values exceeding 0.97. Spectral preprocessing did not yield substantial improvements and incurred high computational costs; therefore, using raw data was a viable and efficient alternative.
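As an illustration of two of the preprocessing methods named above, the following is a minimal sketch of standard normal variate (SNV) and multiplicative scatter correction (MSC) in plain Python. It is not the authors' pipeline: the function names, the five-band toy spectra, and the use of the mean spectrum as the MSC reference are assumptions made for this example only.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard normal variate: center one spectrum on its own mean and
    scale it by its own sample standard deviation.
    Assumes the spectrum is not flat (a flat spectrum has zero deviation)."""
    m = mean(spectrum)
    s = stdev(spectrum)
    return [(x - m) / s for x in spectrum]

def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum
    (the choice of reference is an assumption of this sketch)."""
    n_bands = len(spectra[0])
    ref = [mean(s[j] for s in spectra) for j in range(n_bands)]
    ref_mean = mean(ref)
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        # Least-squares fit of s = a + b * ref, then invert the fit.
        b = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / ref_ss
        a = s_mean - b * ref_mean
        corrected.append([(x - a) / b for x in s])
    return corrected

# Hypothetical toy data: the second spectrum is the first one with an
# additive offset and a multiplicative gain, mimicking a scatter effect.
base = [0.2, 0.4, 0.6, 0.5, 0.3]
scaled = [0.1 + 1.5 * x for x in base]

snv_base = snv(base)
snv_scaled = snv(scaled)
msc_pair = msc([base, scaled])
```

In this toy case both corrections map the two spectra onto the same curve, because they differ only by an offset and a gain. Real spectra over 350-2500 nm carry chemical differences as well, so the corrected curves would not coincide.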
For practical implementation in the rice storage industry, we recommend acquiring a VNIR-SWIR (visible and near-infrared to shortwave infrared) hyperspectral sensor (350-2500 nm) and developing a classification model based on the Support Vector Machine algorithm with a linear kernel trained on representative local samples. Additionally, we recommend implementing an automated real-time classification system, a representative sample collection protocol, and detailed reporting for inventory and logistics optimization.
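The abstract reports accuracy, F-score, and Kappa for the classifiers. The sketch below shows one way these three metrics could be computed from true and predicted rice types when a locally trained model is validated. The abstract does not say how the F-score is averaged across classes, so the macro average used here is an assumption, and the function name and the ten toy labels are hypothetical.

```python
from collections import Counter

def classification_metrics(y_true, y_pred):
    """Return (accuracy, macro-averaged F-score, Cohen's kappa)."""
    labels = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    pairs = Counter(zip(y_true, y_pred))   # confusion-matrix cells
    true_counts = Counter(y_true)          # row totals
    pred_counts = Counter(y_pred)          # column totals

    # Accuracy: share of samples on the confusion-matrix diagonal.
    accuracy = sum(pairs[(c, c)] for c in labels) / n

    # Per-class F-score, then an unweighted (macro) average.
    # Macro averaging is an assumption of this sketch.
    f_scores = []
    for c in labels:
        tp = pairs[(c, c)]
        precision = tp / pred_counts[c] if pred_counts[c] else 0.0
        recall = tp / true_counts[c] if true_counts[c] else 0.0
        denom = precision + recall
        f_scores.append(2 * precision * recall / denom if denom else 0.0)
    macro_f = sum(f_scores) / len(labels)

    # Cohen's kappa: agreement beyond what the class totals alone predict.
    expected = sum(true_counts[c] * pred_counts[c] for c in labels) / (n * n)
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 1.0
    return accuracy, macro_f, kappa

# Hypothetical toy labels: one Type 2 sample is misclassified as Type 1.
y_true = ["Type 1"] * 4 + ["Type 2"] * 4 + ["Off-type"] * 2
y_pred = ["Type 1"] * 4 + ["Type 1"] + ["Type 2"] * 3 + ["Off-type"] * 2

accuracy, macro_f, kappa = classification_metrics(y_true, y_pred)
```

Kappa is useful alongside accuracy here because the six rice classes need not be equally frequent in a storage lot, and Kappa discounts the agreement that class frequencies alone would produce.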

Key words: Oryza sativa L., artificial intelligence, supervised classification, support vector machine, logistic regression