人民长江 ›› 2023, Vol. 54 ›› Issue (4): 241-246. doi: 10.16232/j.cnki.1001-4179.2023.04.035

基于XGBoost的水库大坝基础设施潜在风险评估预测

丁炜;金有杰;张日;俞蕊;   

  • 发布日期:2023-05-08

Evaluation and prediction of potential risks of reservoir dam infrastructures based on XGBoost

DING Wei, JIN Youjie, ZHANG Ri, YU Rui

  • Published: 2023-05-08

摘要: 水库大坝基础设施潜在风险评估是水库大坝风险评估体系的重要组成部分,然而水库大坝基础设施数据量大、数据特征多、还存在数据缺失的情况。XGBoost算法作为一种基于决策树的集成算法,在应对大规模含有缺失情况的、具有混合类型的特征数据方面具有独特优势。为了更快速准确地对水库大坝基础设施潜在风险进行评估,提出了一种基于XGBoost的水库大坝设施潜在风险评估预测方法。首先将水库大坝统计数据进行预处理,并用该数据对XGBoost模型进行训练,然后通过GridSearch和Cross-validation计算模型最优参数,最后根据准确率、召回率等精度指标对模型进行评价。预测结果表明:XGBoost在测试集上的准确率达91.26%,相比于其他4种常规机器模型(随机森林、人工神经网络、最邻近算法、支持向量机)高出2.12%,5.59%,19.31%,38.65%,满足工程实际的要求。

关键词: 水库大坝;风险评估预测;准确率;召回率;XGBoost;

Abstract: Evaluating the potential risk of reservoir dam infrastructures is an important part of the reservoir dam risk evaluation system. However, the monitoring data of reservoir dam infrastructures are large in volume, have many features, and often contain missing values. As a decision-tree-based ensemble algorithm, XGBoost has distinct advantages in handling large-scale, mixed-type feature data with missing values. To evaluate the potential risk of reservoir dam infrastructures more quickly and accurately, this paper proposes a potential risk assessment and prediction method based on XGBoost. First, the reservoir dam monitoring data were preprocessed and used to train the XGBoost model. Then, the optimal model parameters were determined by GridSearch and cross-validation. Finally, the model was evaluated using performance metrics such as accuracy and recall. The prediction results showed that the accuracy of XGBoost on the test set reached 91.26%, which was 2.12%, 5.59%, 19.31% and 38.65% higher than that of four other conventional machine learning models (random forest, artificial neural network, nearest neighbor algorithm and support vector machine, respectively). The proposed model meets the requirements of engineering practice.

Key words: reservoir dam; risk evaluation and prediction; accuracy; recall; XGBoost
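
The tuning and evaluation steps named in the abstract (grid search, cross-validation, accuracy and recall) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. The abstract does not give the parameter grid, the number of folds, or the data, so the helper names (`grid_search_cv`, `k_fold_indices`, `fit_knn`, `predict_knn`), the grid, the fold count and the ten data points are all hypothetical. A toy one-feature nearest-neighbour classifier stands in for XGBoost so that the example runs with the standard library only.

```python
from itertools import product
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """True positives divided by all actual positives (0.0 if there are none)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if tp + fn else 0.0


def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) for k contiguous, near-equal folds."""
    start = 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        val = list(range(start, start + size))
        train = [j for j in range(n) if j < start or j >= start + size]
        yield train, val
        start += size


def grid_search_cv(fit, predict, X, y, param_grid, k=5):
    """Return (best_params, best_mean_accuracy) over all grid combinations."""
    names = sorted(param_grid)
    best_params, best_score = None, -1.0
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_scores = []
        for train, val in k_fold_indices(len(X), k):
            model = fit([X[i] for i in train], [y[i] for i in train], **params)
            preds = predict(model, [X[i] for i in val])
            fold_scores.append(accuracy([y[i] for i in val], preds))
        score = mean(fold_scores)
        if score > best_score:  # ties keep the earlier combination
            best_params, best_score = params, score
    return best_params, best_score


# Toy stand-in model (NOT XGBoost): 1-D k-nearest-neighbour majority vote.
def fit_knn(X, y, n_neighbors):
    return {"X": list(X), "y": list(y), "k": n_neighbors}


def predict_knn(model, X):
    preds = []
    for x in X:
        pairs = sorted(zip(model["X"], model["y"]), key=lambda p: abs(p[0] - x))
        nearest = pairs[: model["k"]]
        votes = sum(label for _, label in nearest)
        preds.append(1 if 2 * votes > len(nearest) else 0)
    return preds


# Made-up data for illustration only: one feature, binary risk label.
X = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.15, 0.85, 0.25, 0.75]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
best_params, best_score = grid_search_cv(
    fit_knn, predict_knn, X, y, {"n_neighbors": [1, 3]}, k=5
)
```

In practice the `fit` and `predict` callables would wrap the actual model being tuned, and a held-out test set, separate from the cross-validation folds, would be scored with `accuracy` and `recall` after the best parameters are chosen.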