Predicting Spoilage Intensity Level in Sausage Products Using Explainable Machine Learning and GAN-Based Data Augmentation

Bibliographic details
Published in: Food and Bioprocess Technology, Volume 18, Issue 11, pp. 9647-9674
Main authors: Ince, Volkan; Bader-El-Den, Mohamed; Esmeli, Ramazan; Maurya, Lalit; Sari, Omer Faruk
Format: Journal Article
Language: English
Publication details: New York, Springer US, 01.11.2025
Springer Nature B.V
ISSN:1935-5130, 1935-5149
Description
Summary: Spoilage in processed meat products, such as poultry and pork sausages, presents significant challenges for food safety, quality control, and waste reduction. This study presents a machine learning-based framework to classify spoilage intensity levels using sensory, physicochemical, and microbiological features. To overcome limitations caused by small datasets, we applied synthetic data augmentation using a tabular variational autoencoder (TVAE) to generate high-fidelity samples that enhance model generalization. Additionally, traditional oversampling techniques such as SMOTE and ADASYN were employed for comparative purposes and to further address class imbalance issues. Seven machine learning classifiers were evaluated: logistic regression, support vector machine, k-nearest neighbors, random forest, gradient boosting, voting classifier, and multilayer perceptron. The best classification performance was achieved when models were trained on GAN-based synthetic data and tested on real samples. For poultry sausage spoilage prediction, the gradient boosting classifier reached the highest accuracy of 97%. For pork sausages, random forest achieved the highest accuracy of 95%. These results confirm the effectiveness of data augmentation in improving predictive robustness. To ensure model transparency, we integrated the explainable AI techniques SHAP and LIME into the pipeline. These analyses revealed that sampling time, CO₂ concentration, pH, and microbial species such as Lactobacillus curvatus and Leuconostoc carnosum were among the most influential features in spoilage prediction. The combination of synthetic data generation and interpretable machine learning enables a reliable, scalable, and explainable approach to spoilage classification. This methodology has strong potential for enhancing quality control systems in the meat industry while reducing waste and improving safety along the food supply chain.
DOI:10.1007/s11947-025-03971-x
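
The summary mentions SMOTE as one of the traditional oversampling baselines used against class imbalance. As a rough illustration of the idea only, the sketch below creates synthetic minority-class rows by interpolating between a row and one of its nearest minority-class neighbours. It is a simplified, standard-library stand-in and not the authors' implementation; the function name `smote_like_oversample`, the parameter defaults, and the toy feature values are all hypothetical. In practice a maintained library implementation would normally be used instead.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between a minority-class
    row and one of its k nearest minority-class neighbours (SMOTE-style)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy rows: [pH, CO2 concentration]; the values are made up.
minority_rows = [[5.8, 22.0], [5.6, 25.0], [5.9, 21.0], [5.5, 27.0]]
new_rows = smote_like_oversample(minority_rows, n_new=6)
```

Because every synthetic row lies on a line segment between two real minority rows, each generated feature value stays within the range observed for that feature in the minority class. This is also why such interpolation cannot add information beyond what the original rows contain.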