Prediction of gully erosion susceptibility mapping using xgboosting machine learning algorithm

Research Article
Md Hasanuzzaman and Pravat Kumar Shit*
Gully erosion; Geo-environmental factors; Machine learning algorithm; Receiver Operating Characteristic (ROC); Sita Nala small watershed.

Gully erosion poses a significant threat to the environment, putting agriculture, wildlife habitats, human safety, infrastructure, and soil health at risk. Accurately mapping areas vulnerable to gully erosion requires a suitable machine learning model, given the varied environmental factors that influence gully formation. In this study, we used the extreme gradient boosting (XGB) machine learning algorithm to produce a gully erosion susceptibility map (GESM) for the small watershed of the Sita Nala, a right-bank tributary of the Subarnarekha River in West Bengal, India. The analysis used twenty-four geo-environmental variables and a dataset of 200 sample points, split equally between gully and non-gully locations. To screen the variables, we applied Variance Inflation Factor (VIF) tests for multicollinearity and the Information Gain Ratio (IGR) for factor importance. The results indicated that drainage density (0.77), elevation (0.74), geomorphology (0.72), Land Use/Land Cover (LULC) (0.72), and the Normalized Difference Vegetation Index (NDVI) (0.68) are the most influential factors for gully erosion susceptibility. Using a quantile classification, we divided the GESM into three classes: no gully erosion, moderate gully susceptibility, and high gully susceptibility. Approximately 13.49% of the basin area was identified as being dominated by gully erosion, highlighting the need for targeted management strategies in these regions. We evaluated the XGB model's performance on both training and testing data using several statistical measures: Root Mean Square Error (RMSE), Kappa index, Mean Absolute Error (MAE), Accuracy (ACC), Receiver Operating Characteristic (ROC), and R². Results were satisfactory on both datasets, and the XGB model achieved an ROC value of 84.2%.
Overall, the study shows that machine learning can accurately identify areas prone to gully erosion, providing valuable insights for policymakers implementing sustainable management practices.
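
The quantile classification and the evaluation measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, the class labels, and the toy scores are all assumptions made for illustration, and the scores simply stand in for the output of a fitted XGB model. The ROC value is computed here as the area under the curve, using the pairwise-ranking definition.

```python
import math
import statistics


def quantile_classes(scores, labels=("none", "moderate", "high")):
    """Assign each score to one of len(labels) quantile (equal-count) classes."""
    # statistics.quantiles returns the n-1 cut points that split the data into n groups.
    cuts = statistics.quantiles(scores, n=len(labels), method="inclusive")
    # The class index is the number of cut points the score exceeds.
    return [labels[sum(s > c for c in cuts)] for s in scores]


def evaluate(y_true, y_prob, threshold=0.5):
    """Return ACC, Kappa, RMSE, MAE and ROC AUC for binary labels and predicted scores."""
    n = len(y_true)
    y_pred = [1 if p >= threshold else 0 for p in y_prob]  # threshold is an assumption

    acc = sum(t == p for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_prob)) / n

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_true = sum(y_true) / n
    p_pred = sum(y_pred) / n
    expected = p_true * p_pred + (1 - p_true) * (1 - p_pred)
    kappa = (acc - expected) / (1 - expected) if expected < 1 else 0.0

    # AUC: probability that a random gully point outscores a random
    # non-gully point, with ties counted as one half.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    auc = wins / (len(pos) * len(neg))

    return {"acc": acc, "kappa": kappa, "rmse": rmse, "mae": mae, "auc": auc}


# Hypothetical toy data: 1 = gully, 0 = non-gully; scores mimic model probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

metrics = evaluate(y_true, y_prob)
classes = quantile_classes(y_prob)
print(metrics)
print(classes)
```

In a real workflow the scores would be the predicted probabilities for every raster cell, and the quantile cut points would be computed over the whole watershed rather than over the sample points.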