To address the problem of identifying samples of unknown network protocols that appear in the training set, this paper proposes an identification method based on a semi-supervised clustering ensemble. The method exploits the correlation between flows to extend the set of labeled samples, raising the proportion of labeled samples. Ensemble learning is then introduced to assist semi-supervised clustering of the extended training set, which identifies the unknown-protocol samples. Finally, the resulting mixed set of unknown-protocol samples is divided into finer-grained classes. Simulation experiments on real network datasets show that, even when the proportion of labeled samples is small, the method effectively identifies unknown-protocol data, achieves fine-grained classification, and improves the stability of the clustering results.
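The first two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how flow correlation is defined, which base clusterer is used, or how clusters are judged unknown. The sketch therefore assumes that flows sharing the same (destination IP, destination port, transport protocol) triple carry the same protocol, that several clusterings of the same samples are already available as lists of cluster ids, and that a cluster containing no labeled sample is treated as unknown. The function names `extend_labels`, `unknown_clusters`, and `ensemble_unknown` are hypothetical, and the final fine-grained classification step is left out.

```python
from collections import Counter, defaultdict


def extend_labels(flows, labels):
    """Copy known labels to unlabeled flows with the same correlation key.

    Assumption: each flow is a (dst_ip, dst_port, proto) tuple, and flows
    with identical tuples belong to the same protocol.
    """
    votes = defaultdict(Counter)
    for flow, label in zip(flows, labels):
        if label is not None:
            votes[flow][label] += 1
    extended = []
    for flow, label in zip(flows, labels):
        if label is None and votes[flow]:
            # Take the most frequent label seen for this key.
            label = votes[flow].most_common(1)[0][0]
        extended.append(label)
    return extended


def unknown_clusters(cluster_ids, labels):
    """Return the ids of clusters that contain no labeled sample (assumed rule)."""
    total = Counter(cluster_ids)
    labeled = Counter(c for c, l in zip(cluster_ids, labels) if l is not None)
    return {c for c in total if labeled[c] == 0}


def ensemble_unknown(partitions, labels):
    """Majority vote over several clusterings of the same samples.

    A sample is flagged unknown when more than half of the partitions
    place it in a cluster without any labeled sample.
    """
    votes = [0] * len(labels)
    for cluster_ids in partitions:
        unknown = unknown_clusters(cluster_ids, labels)
        for i, c in enumerate(cluster_ids):
            if c in unknown:
                votes[i] += 1
    return [2 * v > len(partitions) for v in votes]


# Toy data: five flows, two of them labeled.
flows = [
    ("10.0.0.1", 80, "tcp"),
    ("10.0.0.1", 80, "tcp"),
    ("10.0.0.2", 53, "udp"),
    ("10.0.0.3", 6881, "tcp"),
    ("10.0.0.3", 6881, "tcp"),
]
labels = ["http", None, "dns", None, None]
extended = extend_labels(flows, labels)

# Three hypothetical base clusterings of the same five samples.
partitions = [
    [0, 0, 1, 2, 2],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 2],
]
flags = ensemble_unknown(partitions, extended)
print(extended)
print(flags)
```

Voting across partitions is what gives the ensemble its stabilising effect in this sketch: in the third clustering the fourth flow lands in a cluster that also holds a labeled sample, yet the other two clusterings still outvote it.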