Gradient-Boosted Machine Learning Models for Tree Volume Estimation Using Forest Health Indicators
Keywords:
Forest health indices, machine learning, random forest, SHAP, sustainable forestry, tree volume prediction, XGBoost
Abstract
Background and Objective: Accurate estimation of tree volume is essential for evaluating forest productivity, biomass accumulation, and carbon storage. This study aimed to develop a scalable and interpretable machine-learning framework for predicting tree volume using integrated forest health indicators.
Materials and Methods: A multi-index forest health dataset incorporating canopy, soil, and ecological variables was used to train and evaluate predictive models. Three machine-learning algorithms (Linear Regression, Random Forest, and Extreme Gradient Boosting, or XGBoost) were implemented and assessed using a 70/15/15 split into training, validation, and test sets. Model interpretability was examined using SHapley Additive exPlanations (SHAP) to identify the most influential predictors.
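As an illustration of the 70/15/15 partition described above, the following minimal sketch shuffles a set of records and cuts it into training, validation, and test subsets. The function name `split_70_15_15`, the seed, and the 200-row stand-in dataset are assumptions made for this example; the study does not report how the split was implemented, and a library routine such as scikit-learn's `train_test_split` could be used instead.

```python
import random


def split_70_15_15(rows, seed=42):
    """Shuffle rows and split them into train/validation/test parts (70/15/15)."""
    shuffled = list(rows)                    # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)    # seeded shuffle for reproducibility
    n = len(shuffled)
    n_train = n * 70 // 100                  # integer arithmetic avoids float rounding
    n_val = n * 15 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]        # the remainder goes to the test set
    return train, val, test


# Hypothetical stand-in for the dataset: 200 row indices.
train, val, test = split_70_15_15(range(200))
print(len(train), len(val), len(test))
```

In a workflow of this kind, the validation subset would typically guide hyperparameter choices, while the test subset is held back for the final performance estimate.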
Results: Among the evaluated models, XGBoost achieved the best predictive performance on the independent test dataset, with a root mean square error (RMSE) of 2.143, a mean absolute error (MAE) of 1.602, and a coefficient of determination (R²) of 0.947. SHAP analysis indicated that canopy width, crown density, and soil fertility were the most influential predictors of tree volume.
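For readers less familiar with the three reported metrics, the sketch below computes RMSE, MAE, and R² from their standard definitions. The helper names and the toy values in `y_true` and `y_pred` are illustrative assumptions and are not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy observed and predicted volumes (hypothetical, not from the study).
y_true = [10.0, 12.0, 14.0, 16.0]
y_pred = [11.0, 12.0, 13.0, 16.0]

print(rmse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
```

RMSE and MAE are expressed in the units of the target variable, whereas R² is unitless; RMSE penalizes large errors more heavily than MAE, which is why it is never smaller than MAE for the same predictions.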
Conclusion: The findings highlight the effectiveness of gradient-boosted machine-learning models for accurate and interpretable tree volume prediction. The proposed approach provides a robust, data-driven framework with strong potential for large-scale forest monitoring, carbon accounting, and sustainable forest resource management.
License
Copyright (c) 2025 Melca M. Abogado, Jose C. Agoylo Jr., Rolly S. Acaso, Jimson A. Olaybar, Jorton A. Tagud, Alex C. Bacalla

This work is licensed under a Creative Commons Attribution 4.0 International License.
