Effects of numbers of observations and predictors for various model types on the performance of forest inventory with airborne laser scanning
2021
Cosenza, Diogo Nepomuceno | Packalen, Petteri | Maltamo, Matti | Varvia, Petri | Räty, Janne | Soares, Paula | Tomé, Margarida | Strunk, Jacob | Korhonen, Lauri
Nonparametric models are popular in the area-based approach (ABA) using airborne laser scanning. It is unclear, however, how many predictors and how many observations different modeling approaches need to provide accurate predictions without overfitting. This work aims to determine these limits for various approaches: ordinary least squares regression (OLS), generalized additive models (GAM), least absolute shrinkage and selection operator (LASSO), random forest (RF), support vector machine (SVM), and Gaussian process regression (GPR). We modeled timber volume (m³ ha⁻¹) using ABA with 2–39 predictors and 20–500 training plots. OLS, GAM, LASSO, and SVM overfitted as the number of predictors approached the number of training plots. They required ≥15 plots per predictor to provide accurate predictions (RMSE ≤30%). GAM required ≥250 plots regardless of the number of predictors. The number of predictors hardly affected RF and GPR, but they required ≥200 and ≥250 training plots, respectively, to ensure accurate predictions. RF did not overfit under any circumstances, whereas GPR overfitted even with 500 training plots. Overall, increasing the number of predictors up to 39 did not necessarily result in overfitting and, in most models, it improved accuracy as long as the training dataset was sufficiently large (≥250 plots).
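The thresholds reported above can be collected into a small lookup, which may help when checking whether a planned field campaign has enough training plots for a given model type. The following is a minimal sketch and not part of the study: the function names `min_training_plots` and `is_sufficient` are hypothetical, the numbers are transcribed from the abstract only, and GAM is treated here as governed by its fixed 250-plot floor, which is one reading of "regardless of the number of predictors".

```python
# Rule-of-thumb thresholds transcribed from the abstract (RMSE <= 30%).
# They were derived for 2-39 predictors and 20-500 training plots;
# applying them outside that range is an extrapolation.

# Models reported as needing >= 15 training plots per predictor.
PLOTS_PER_PREDICTOR = {"OLS": 15, "LASSO": 15, "SVM": 15}

# Models reported with a fixed minimum number of training plots.
# Assumption: GAM is listed here because its 250-plot requirement is
# stated as holding regardless of the number of predictors.
MIN_PLOTS = {"GAM": 250, "RF": 200, "GPR": 250}


def min_training_plots(model: str, n_predictors: int) -> int:
    """Return the smallest training-set size the abstract reports as adequate."""
    if n_predictors < 1:
        raise ValueError("n_predictors must be at least 1")
    key = model.upper()
    if key in PLOTS_PER_PREDICTOR:
        return PLOTS_PER_PREDICTOR[key] * n_predictors
    if key in MIN_PLOTS:
        return MIN_PLOTS[key]
    raise ValueError(f"unknown model type: {model}")


def is_sufficient(model: str, n_plots: int, n_predictors: int) -> bool:
    """Check a planned design against the reported accuracy threshold."""
    return n_plots >= min_training_plots(model, n_predictors)


if __name__ == "__main__":
    for name in ("OLS", "GAM", "LASSO", "RF", "SVM", "GPR"):
        print(name, min_training_plots(name, 10))
```

For example, `is_sufficient("OLS", 100, 10)` is false under these rules because ten predictors call for 150 plots, while `is_sufficient("RF", 200, 39)` is true. Meeting a threshold speaks only to prediction accuracy: the abstract notes that GPR still overfitted with 500 training plots.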
Bibliographic information
This bibliographic record has been provided by Canadian Science Publishing