Accurately mining abnormal data in large databases is a key technology for fault diagnosis and detection in database systems. Abnormal data is complex and diverse, which makes it difficult for traditional methods to identify accurately and effectively. To improve anomaly mining performance, this paper proposes an abnormal data mining algorithm for large databases based on an improved fuzzy genetic algorithm. An information feature model of the abnormal data in a large database is constructed. The training samples are updated and smoothed during genetic iteration, and the cluster centers are updated so as to reduce the value of the squared-error function. The power spectral density function of the abnormal data is computed and used as the feature, and feature selection is then performed on it. The abnormal data stream information is gathered at the fuzzy clustering centers of a multi-layer space, and the training set is associated with its class labels to obtain the attribute-set classification and information gain of the abnormal data, which improves mining performance. Simulation results show that the algorithm achieves high abnormal data detection and mining performance, that its mining and identification ability is superior to that of the traditional model, and that it has good application value.
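
The abstract does not give the update equations, so the following is only a minimal sketch of the squared-error-driven cluster-center update it mentions, written as plain fuzzy c-means rather than the authors' improved fuzzy genetic algorithm. The function names, the fuzzifier `m = 2`, the toy points, and the initial centers are all illustrative assumptions and are not taken from the paper.

```python
import math

def memberships(points, centers, m=2.0):
    """Fuzzy c-means membership u[i][j] of point i in cluster j."""
    u = []
    for p in points:
        d = [math.dist(p, c) for c in centers]
        if any(x == 0.0 for x in d):
            # Point sits exactly on a center: give it full membership there.
            hit = [1.0 if x == 0.0 else 0.0 for x in d]
            u.append([h / sum(hit) for h in hit])
            continue
        inv = [x ** (-2.0 / (m - 1.0)) for x in d]
        total = sum(inv)
        u.append([v / total for v in inv])
    return u

def update_centers(points, u, m=2.0):
    """Move each center to the membership-weighted mean of all points."""
    dim = len(points[0])
    centers = []
    for j in range(len(u[0])):
        w = [row[j] ** m for row in u]
        total = sum(w)
        centers.append(tuple(
            sum(w[i] * points[i][k] for i in range(len(points))) / total
            for k in range(dim)))
    return centers

def objective(points, centers, u, m=2.0):
    """Weighted squared-error function that the updates try to reduce."""
    return sum(u[i][j] ** m * math.dist(points[i], centers[j]) ** 2
               for i in range(len(points)) for j in range(len(centers)))

def fuzzy_c_means(points, centers, m=2.0, iters=50, tol=1e-9):
    """Alternate membership and center updates until the error stops falling."""
    history = []
    for _ in range(iters):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
        history.append(objective(points, centers, u, m))
        if len(history) > 1 and abs(history[-2] - history[-1]) < tol:
            break
    return centers, u, history

# Toy data (assumed): two tight groups plus one point far from both.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
          (2.5, 9.0)]
centers, u, history = fuzzy_c_means(points, [(0.0, 1.0), (4.0, 4.0)])
```

Each pass recomputes the memberships and then the centers, so the recorded squared-error values in `history` should not increase from one iteration to the next. A point whose largest membership stays well below 1, such as the last toy point, could be treated as a candidate anomaly, though the threshold for that decision is a separate design choice that the abstract does not specify.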