ASROT: A Novel Resampling Algorithm to Balance Training Datasets for Classification of Minor Crops in High-Elevation Regions
2025
Wei Li | Jie Zhu | Tongjie Li | Zhiyuan Ma | Timothy A. Warner | Hengbiao Zheng | Chongya Jiang | Tao Cheng | Yongchao Tian | Yan Zhu | Weixing Cao | Xia Yao
Accurately mapping crop distribution is important for environmental and food security applications. The success of machine learning algorithms (MLs) applied to crop mapping depends partly on the acquisition of sufficient training samples. However, since minor crops typically cover only small areas within agricultural landscapes, opportunities for collecting training data for those classes are often constrained. This problem is particularly acute in high-elevation regions, where fields tend to be small and heterogeneous in shape. The result is often an imbalanced training dataset, in which the proportions of samples for each class differ greatly. To address this issue, a novel resampling algorithm, the adaptive synthetic and repeat oversampling technique (ASROT), was proposed by coupling two existing algorithms: adaptive synthetic sampling (ADASYN) and density-based spatial clustering of applications with noise (DBSCAN). We then applied the proposed ASROT approach and compared it with six commonly used alternative algorithms, using 13 imbalanced datasets generated from GF-6 images of a high-elevation region. The imbalanced training datasets, as well as the balanced versions produced by ASROT and the comparison algorithms, were used with two classifiers (random forest (RF) and a stacking classifier) to map crop types. The results showed a negative correlation between overall accuracy (OA) and the imbalance degree of the datasets, indicating that imbalance does affect the calibration of crop classification models. The balanced datasets produced higher crop classification accuracy than the original imbalanced datasets for both the RF and stacking classifiers. Both the OA and the classification accuracy of almost all crop classes increased. Most notably, the accuracy for minor crops (e.g., highland barley and broad beans) increased by approximately 30%.
Overall, the proposed ASROT algorithm provides an effective method for balancing training datasets, simultaneously improving classification accuracy of both major and minor crops in high-elevation regions.
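To make the notion of balancing a training dataset concrete, the following minimal sketch measures class imbalance and then equalizes class sizes by repeating minority samples. It is not the ASROT algorithm: the abstract does not give ASROT's steps, so the ADASYN synthesis and DBSCAN clustering are omitted here. The function names, the imbalance measure (largest class count divided by smallest) and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Largest class count divided by smallest class count.

    Assumed metric; the abstract does not define its "imbalance degree".
    """
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def repeat_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until
    every class has as many samples as the largest class.

    Plain repeat oversampling only; not the authors' ASROT method.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # All original samples belonging to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Hypothetical toy data: two features per sample, three crop classes.
samples = [(0.1 * i, 0.2 * i) for i in range(12)]
labels = ["major_crop"] * 8 + ["highland_barley"] * 3 + ["broad_beans"] * 1

balanced_samples, balanced_labels = repeat_oversample(samples, labels)
print(imbalance_ratio(labels), imbalance_ratio(balanced_labels))
```

Repeating samples adds no new information about the minor classes, which is one reason synthetic approaches such as ADASYN are commonly used instead of, or alongside, plain repetition.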
This bibliographic record was provided by the Multidisciplinary Digital Publishing Institute