Abstract:
In landslide susceptibility assessment, the way sample imbalance is handled can introduce significant uncertainty into the evaluation results. To address this issue, this study focused on the Changdu area of eastern Tibet and built landslide susceptibility evaluation models from a dataset with imbalanced landslide and non-landslide samples. Three treatment schemes were applied to the data: no treatment, downsampling, and SMOTE oversampling. In each case, the susceptibility model was constructed with logistic regression. Based on the ROC curve, accuracy, precision, recall, missed detection rate, and other evaluation indicators, a comprehensive evaluation index, the F1′ score, was used to verify the classification accuracy of the model. The results show that balancing the data (by downsampling or oversampling) greatly improves landslide susceptibility modeling compared with using the unprocessed data: the composite F1′ score increased by 53.17%. Between the two balancing schemes, oversampling raised the composite F1′ score by 16.30% relative to downsampling, indicating that oversampling is the more effective of the two for handling imbalanced data. This study can provide a basis for preprocessing datasets before landslide and geological disaster prediction, and theoretical and technical support for further improving regional disaster prevention and mitigation.
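The two balancing schemes and the basic indicators named above can be illustrated with a minimal sketch. It is not the study's implementation: the toy data, the function names (`downsample`, `smote_like`, `classification_metrics`), and the confusion-matrix counts are hypothetical, the oversampler is a simplified SMOTE-style interpolation rather than a library SMOTE, no logistic regression is fitted, and the standard F1 score is computed because the definition of the composite F1′ index is not given in this abstract.

```python
import random


def downsample(majority, minority, seed=0):
    """Randomly keep as many majority samples as there are minority samples."""
    rng = random.Random(seed)
    return rng.sample(majority, len(minority))


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by interpolating between a
    randomly chosen sample and one of its k nearest minority neighbours.
    Simplified SMOTE-style sketch; needs at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = minority[:i] + minority[i + 1:]
        # Sort the remaining minority samples by squared distance to base.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


def classification_metrics(tp, fp, fn, tn):
    """Standard indicators from confusion-matrix counts (landslide = positive)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "missed_detection_rate": fn / (tp + fn) if tp + fn else 0.0,
        "f1": f1,  # standard F1, not the paper's composite F1' index
    }


# Hypothetical toy data: 90 non-landslide and 10 landslide points, two features each.
rng = random.Random(42)
non_landslide = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(90)]
landslide = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(10)]

# Scheme 2: shrink the majority class to the minority size (10 vs 10).
balanced_down = downsample(non_landslide, landslide)

# Scheme 3: grow the minority class to the majority size (90 vs 90).
synthetic = smote_like(landslide, len(non_landslide) - len(landslide))
balanced_up = landslide + synthetic

# Hypothetical confusion-matrix counts, used only to show the formulas.
scores = classification_metrics(tp=8, fp=2, fn=2, tn=88)
```

Downsampling discards most of the majority class, whereas the interpolation step keeps every original sample and adds synthetic ones, which is one plausible reason for the two schemes to score differently; the sketch itself does not reproduce the reported percentages.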