Landslides remain one of the most devastating natural hazards in mountainous and hilly regions across the globe. With steep topography, variable climate conditions, and increasing human activity such as deforestation and infrastructure expansion, the need for accurate landslide susceptibility mapping (LSM) has never been more pressing.
A recent study of the Spiti Valley in Himachal Pradesh, India, takes a major step forward in how we assess landslide risk by combining advanced machine learning (ML) models with smarter sampling strategies for non-landslide areas. Traditionally, non-landslide ("safe") zones in LSM models have been selected with little rigor, which often reduces prediction accuracy. This new research introduces refined methods, such as slope-based sampling, that better represent ground conditions and result in more reliable hazard assessments.
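To make the idea of slope-based sampling concrete, here is a minimal sketch in Python. It assumes that non-landslide points are drawn only from gently sloping cells that contain no mapped landslide. The 5-degree threshold, the cell layout, and the function name `sample_non_landslide` are illustrative assumptions, not details taken from the study.

```python
import random

def sample_non_landslide(cells, n, max_slope_deg=5.0, seed=0):
    """Pick n non-landslide cells with slope below max_slope_deg.

    cells: list of dicts with keys "id", "slope_deg", "is_landslide".
    max_slope_deg is a hypothetical threshold chosen for illustration.
    """
    candidates = [
        c for c in cells
        if not c["is_landslide"] and c["slope_deg"] < max_slope_deg
    ]
    if len(candidates) < n:
        raise ValueError("not enough low-slope cells to sample from")
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(candidates, n)

# Toy grid: (slope in degrees, landslide present?)
toy = [(2.0, False), (35.0, True), (4.5, False), (28.0, False),
       (1.0, False), (40.0, True), (3.0, False)]
cells = [{"id": i, "slope_deg": s, "is_landslide": l}
         for i, (s, l) in enumerate(toy)]

picked = sample_non_landslide(cells, 3)
print([c["id"] for c in picked])
```

By contrast, a purely random strategy would draw from every cell without a landslide, including the steep cell with a 28-degree slope, which may be unstable even though no landslide has been recorded there.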
Three machine learning models—Extreme Gradient Boosting (XGBoost), Random Forest (RF), and K-Nearest Neighbors (KNN)—were tested using two distinct non-landslide data strategies. The enhanced sampling approach showed significant gains across all models, with improved accuracy, reduced false alarms, and more balanced hazard maps. In particular, XGBoost demonstrated outstanding performance, showing how well-tuned algorithms can support disaster mitigation efforts.
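For readers who want to see how accuracy and false alarms are typically measured, the sketch below computes both from predicted and observed labels. It assumes the common definitions (accuracy as the share of correct predictions, false-alarm rate as the share of stable sites wrongly flagged as landslides). The study may define its metrics differently, and the labels here are made-up toy data.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, 1 = landslide, 0 = stable."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def accuracy(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)

def false_alarm_rate(y_true, y_pred):
    """Share of truly stable sites that the model flags as landslides."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return fp / (fp + tn)

# Toy labels, not results from the study
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]

acc = accuracy(y_true, y_pred)
far = false_alarm_rate(y_true, y_pred)
print(acc, far)
```

Running the same two functions on the predictions of each model under each sampling strategy gives a like-for-like comparison.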
Key environmental variables such as elevation, slope, vegetation, rainfall, and proximity to water networks played a vital role in the models' accuracy. Beyond performance metrics, the researchers also used the Landslide Density Index (LDI) to evaluate how well the models captured real-world patterns, further reinforcing the value of better-chosen input data.
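As an illustration of how a density index of this kind can be computed, the sketch below uses one common formulation: for each susceptibility class, the share of landslide pixels falling in that class divided by the share of the total area that class occupies. This formula is an assumption, since the study's exact definition is not given here, and the pixel counts are invented for the example.

```python
def landslide_density_index(class_pixels, landslide_pixels):
    """Per-class ratio of landslide share to area share.

    class_pixels: total pixel count per susceptibility class.
    landslide_pixels: landslide pixel count per susceptibility class.
    """
    total_area = sum(class_pixels.values())
    total_landslides = sum(landslide_pixels.values())
    if total_area == 0 or total_landslides == 0:
        raise ValueError("need at least one area pixel and one landslide pixel")
    ldi = {}
    for cls, area in class_pixels.items():
        if area == 0:
            continue  # a class with no area has no defined density
        landslide_share = landslide_pixels.get(cls, 0) / total_landslides
        area_share = area / total_area
        ldi[cls] = landslide_share / area_share
    return ldi

# Invented pixel counts for three susceptibility classes
class_pixels = {"low": 500, "moderate": 300, "high": 200}
landslide_pixels = {"low": 5, "moderate": 15, "high": 80}

ldi = landslide_density_index(class_pixels, landslide_pixels)
print(ldi)
```

Under this definition, a useful susceptibility map shows values that rise steadily from the low class to the high class, meaning that landslides concentrate where the model says they should.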
This study sets a new benchmark in landslide risk prediction. It emphasizes how combining machine learning with better sampling methods can lead to clearer, more actionable insights for engineers, planners, and policymakers working in hazard-prone regions worldwide.
Read the full research study here.