人民长江 (Yangtze River), 2023, Vol. 54, Issue 3: 117-124. doi: 10.16232/j.cnki.1001-4179.2023.03.018

基于聚类算法优化样本的地质灾害易发性评价

刘铁铭, 郭有金, 刘艳领

  • Publication date: 2023-03-28    Release date: 2023-04-03

Evaluation of geological disaster susceptibility based on sample optimization by clustering algorithm

LIU Tieming, GUO Youjin, LIU Yanling

  • Online: 2023-03-28    Published: 2023-04-03

Abstract: The reliability of sample data is one of the key factors affecting the results of regional geological hazard susceptibility evaluation. Taking 236 geological hazards in Baihe County, Shaanxi Province as the research objects, a hierarchical clustering algorithm and a dynamic K-means clustering algorithm were each used to cluster the geological hazard sample data and obtain the sample purity. The analysis showed that the sample purities obtained by the hierarchical clustering algorithm and the K-means clustering algorithm were 91.53% and 92.80%, respectively. Combining the results of the two algorithms, 20 noise points were removed and 216 valid sample points were retained, giving a final sample purity of 91.53%. The data before and after sample purification were then used to build two information value (IV) models, the pre-IV and post-IV models, for regional geological hazard susceptibility evaluation. In the susceptibility zoning maps produced by the pre-IV and post-IV models, 149 and 167 geological hazards fell within the very high and high susceptibility zones, accounting for 63.13% and 70.77% of all hazards, with hazard densities of 0.508/km² and 0.584/km², respectively. Compared with the initial samples, the number of hazards in the very high and high susceptibility zones increased by 18 after sample purification and the hazard density increased by 0.076/km², so the hazards are more concentrated in these zones and the prediction is more accurate. The results can provide a theoretical and scientific basis for purifying initial sample data in geological hazard susceptibility evaluation.

Key words: geological hazard; susceptibility evaluation; evaluation indicators; sample purity; clustering algorithm; information value model
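
The two quantities named in the abstract, sample purity and information value, can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not confirm. First, the abstract does not define sample purity; the reported final value of 91.53% is consistent with 216 retained points out of 236, so purity is assumed here to be the share of samples kept after noise removal. Second, the information value is written in its common form, the natural log of the ratio between the share of hazards in a factor class and the share of the study area in that class; whether the paper uses exactly this form is an assumption. The function names and the factor-class numbers are hypothetical.

```python
import math


def sample_purity(n_total: int, n_noise: int) -> float:
    """Assumed definition: percentage of samples kept after removing noise points."""
    return 100.0 * (n_total - n_noise) / n_total


def information_value(hazards_in_class: int, hazards_total: int,
                      area_in_class: float, area_total: float) -> float:
    """Common form of the information value of one factor class (assumed here):
    ln( (hazards in class / all hazards) / (class area / total area) ).
    Positive values mean hazards are over-represented in the class."""
    hazard_share = hazards_in_class / hazards_total
    area_share = area_in_class / area_total
    return math.log(hazard_share / area_share)


# Counts taken from the abstract: 236 hazards, 20 noise points removed.
purity = sample_purity(236, 20)
print(f"purity = {purity:.2f}%")

# Hypothetical factor class: 60 of the 216 valid hazards on 150 km2 of a 1000 km2 area.
iv = information_value(60, 216, 150.0, 1000.0)
print(f"information value = {iv:.3f}")
```

In a full evaluation the information values of all factor classes covering a map cell would be summed and the totals binned into susceptibility zones; that step is omitted here because the abstract gives no detail on the evaluation indicators or zone thresholds.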