Machine Learning-Based Forecasting of Agricultural Commodity Prices Using Ensemble Models

Keywords: Price Prediction, Random Forest, Linear Regression, XGBoost, Food Commodities

This study forecasts the prices of key food commodities, including garlic, shallots, cayenne pepper, and red chili, in Kota Singkawang using three machine learning models: Linear Regression, Random Forest, and XGBoost. The dataset, sourced from BPS Kota Singkawang and covering the 2016–2023 period, was preprocessed to address missing values and outliers, followed by correlation-based feature selection. Model training used grid search and cross-validation to support robust performance evaluation. XGBoost consistently outperformed the other models, achieving the highest R² values (up to 0.82) and the lowest MAPE (5–10%), which indicates that it can capture complex nonlinear relationships and account for external factors such as inflation and seasonality. Random Forest ranked second in predictive accuracy, particularly for garlic, while Linear Regression was less effective for volatile commodities. Features such as rainfall intensity and national holidays were found to significantly influence price movements. The novelty of this research lies in its localized approach to price forecasting, which combines ensemble models with macroeconomic and climatic variables. The results offer local policymakers a practical tool for anticipating price volatility and designing evidence-based interventions to enhance food security and price stability at the regional level.
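
For readers less familiar with the two evaluation metrics reported above, the following minimal sketch shows how R² and MAPE are conventionally computed. It is an illustration only, not the study's code: the function names `r_squared` and `mape` are hypothetical, and the price values are invented for the example, not taken from the BPS Kota Singkawang dataset.

```python
from statistics import mean


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero (reasonable for prices).
    """
    return 100.0 * mean(abs((a - p) / a) for a, p in zip(actual, predicted))


# Hypothetical monthly prices (currency units per kg), invented for illustration.
actual = [30000, 32000, 35000, 40000, 38000]
predicted = [31000, 31000, 36000, 38000, 39000]

r2 = r_squared(actual, predicted)
err = mape(actual, predicted)
print(f"R^2 = {r2:.3f}, MAPE = {err:.2f}%")
```

A higher R² means the model explains more of the variance in observed prices, while a lower MAPE means smaller relative errors, which is why the abstract reports the best model as having the highest R² and the lowest MAPE.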