Multi-relational classification is an active area of research and application in data mining. Existing multi-relational naive Bayesian classification algorithms take into account every table connected to the target table, including tables whose semantic relationship to it is very weak. To address this, this paper proposes a new classification algorithm, Graph-NB. It prunes tables to optimize the semantic relationship graph, and thereby eliminates, to some extent, the influence of irrelevant tables on classification. The algorithm implements two traversal strategies, depth-first and breadth-first. Experimental results show that optimizing the semantic relationship graph improves both classification accuracy and running efficiency; compared with other algorithms, Graph-NB has a shorter running time and higher classification accuracy.
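The idea of traversing the semantic relationship graph from the target table while pruning weakly related tables can be illustrated with a minimal sketch. The abstract does not state the pruning criterion, so the per-table `relevance` score, the `threshold`, the function name `traverse_and_prune`, and the table names below are all hypothetical; this is an illustration of graph traversal with pruning under those assumptions, not the Graph-NB implementation.

```python
from collections import deque


def traverse_and_prune(graph, target, relevance, threshold, strategy="bfs"):
    """Return the tables kept, in visit order, starting from the target table.

    graph:     dict mapping a table name to the tables it joins to.
    relevance: dict mapping a table name to a score (hypothetical criterion,
               not taken from the paper).
    Tables scoring below `threshold` are pruned, and nothing is reached
    through a pruned table. The target table is always kept.
    """
    if strategy not in ("bfs", "dfs"):
        raise ValueError("strategy must be 'bfs' or 'dfs'")

    frontier = deque([target])
    seen = set()
    kept = []
    while frontier:
        # BFS takes the oldest pending table, DFS the most recent one.
        table = frontier.popleft() if strategy == "bfs" else frontier.pop()
        if table in seen:
            continue
        seen.add(table)
        kept.append(table)
        for neighbor in graph.get(table, []):
            # Prune weakly related tables before they enter the frontier.
            if neighbor not in seen and relevance.get(neighbor, 0.0) >= threshold:
                frontier.append(neighbor)
    return kept


# Hypothetical schema: "loan" is the target table.
graph = {
    "loan": ["account", "client"],
    "account": ["loan", "order", "district"],
    "client": ["loan", "district"],
    "order": ["account"],
    "district": ["account", "client"],
}
# Hypothetical relevance scores; "client" falls below the threshold.
relevance = {"account": 0.9, "client": 0.2, "order": 0.6, "district": 0.5}

bfs_tables = traverse_and_prune(graph, "loan", relevance, 0.4, "bfs")
dfs_tables = traverse_and_prune(graph, "loan", relevance, 0.4, "dfs")
print(bfs_tables)
print(dfs_tables)
```

In this sketch both strategies keep the same set of tables and differ only in visit order. A naive Bayesian classifier would then be built over the kept tables only, which is how pruning could reduce both noise and running time.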