Association Rule Mining Based on Concept Lattices

Source: Central China Normal University
In the information age, databases accumulate huge volumes of data. To extract useful information from this "sea of data", knowledge discovery in databases (KDD) has emerged as one of the most active research fields, and association rule mining is among the most studied and most popular KDD tasks.

The chief task of association rule mining is to find the frequent itemsets. Algorithms for finding frequent itemsets fall into three groups:

1. Levelwise algorithms. Apriori is the most typical; others of this kind include Mannila, Pardon, DIG, and so on. The main idea is to prune the lattice of sub-itemsets: the search starts from the itemsets of size 1, passes over the database level by level, and stops when the largest frequent itemsets have been found. This is the most popular way to find frequent itemsets. However, its running time can grow by orders of magnitude, so its efficiency and results are often unsatisfactory.

2. Algorithms that find the frequent itemsets by finding the maximal (largest) frequent itemsets, for example Pincer-Search, MaxClique, and MaxMiner. These algorithms save running time and space. However, because of weaknesses in their theoretical basis, information is lost when association rules are generated from their result.

3. Algorithms that find the frequent itemsets by discovering the frequent closed itemsets, i.e., algorithms based on formal concept analysis (FCA) and concept lattices. The main idea is to find the frequent closed itemsets first and then derive all frequent itemsets from them. Because the problem of finding frequent itemsets is transformed into the problem of finding frequent closed itemsets, these algorithms reduce both space and time cost.
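To make the contrast between the first and third groups concrete, here is a minimal Python sketch. It is not the thesis's implementation: the function names (`apriori`, `closure`, `closed_frequent`, `derived_support`) and the toy database are assumptions chosen for illustration, and the closed itemsets are obtained naively by closing every frequent itemset, which shows the definition but none of the efficiency of a real concept-lattice algorithm.

```python
from itertools import combinations

def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def apriori(transactions, min_count):
    """Levelwise search: size-k candidates come only from frequent (k-1)-itemsets."""
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    frequent = {}
    while level:
        # One pass over the database per level to count the candidates.
        counted = {c: support(c, transactions) for c in level}
        survivors = {c: n for c, n in counted.items() if n >= min_count}
        frequent.update(survivors)
        # Join step: unite pairs of survivors that differ in exactly one item.
        candidates = {a | b for a, b in combinations(survivors, 2)
                      if len(a | b) == len(a) + 1}
        # Prune step: keep a candidate only if all its one-smaller subsets survived.
        level = [c for c in candidates
                 if all(c - {i} in survivors for i in c)]
    return frequent

def closure(itemset, transactions):
    """Intersection of all transactions containing `itemset` (the FCA closure)."""
    covering = [t for t in transactions if itemset <= t]
    return frozenset.intersection(*covering) if covering else frozenset(itemset)

def closed_frequent(transactions, min_count):
    """Closed frequent itemsets, found naively by closing each frequent itemset."""
    freq = apriori(transactions, min_count)
    return {closure(x, transactions): n for x, n in freq.items()}

def derived_support(itemset, closed):
    """Support of a frequent itemset: the largest support among its closed supersets."""
    return max(n for c, n in closed.items() if itemset <= c)

# Hypothetical toy database; each transaction is a frozenset of item labels.
db = [frozenset("abc"), frozenset("abc"), frozenset("ab"), frozenset("c")]
frequent = apriori(db, min_count=2)
closed = closed_frequent(db, min_count=2)
print(len(frequent), len(closed))
```

On this toy database the closed frequent itemsets are fewer than the frequent itemsets, yet `derived_support` recovers the support of every frequent itemset from them, which is the sense in which the third group loses no information.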
The gain is greatest on dense, highly correlated databases: there the number of frequent closed itemsets is far smaller than the number of frequent itemsets, so algorithms based on closed itemsets outperform Apriori-style algorithms while still finding the association rules without any loss of information. For these reasons, association rule mining based on concept lattices is a very efficient method when the target database is dense and highly correlated.

This paper first analyzes the time and space complexity of several concept-lattice-based algorithms for mining frequent closed itemsets, and then deduces the main factors that determine the efficiency of this kind of algorithm. When the database's degree of correlation is relatively low, association rule mining with concept lattices performs no better than Apriori-style algorithms, and sometimes worse. The paper therefore develops an algorithm, named RelationDesider, that judges the database's degree of correlation and selects a mining algorithm accordingly. It obtains the degree of correlation in one pass over the database before association rule mining begins, and decides which kind of algorithm should be used. Finally, the paper introduces association rule extraction based on concept lattices and discusses how it differs from the common method of extracting association rules.
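The abstract does not say how RelationDesider measures the degree of correlation, so the following Python sketch only illustrates the general shape of a "one pass, then choose" step. Everything specific in it is an assumption: the table density (fraction of filled cells) is used as a rough stand-in for correlation, and the names `density` and `choose_miner` and the threshold of 0.5 are hypothetical, not taken from the thesis.

```python
def density(transactions):
    """Fraction of filled cells in the transaction-by-item table, from one pass.

    Assumption: density is only a stand-in for the 'correlation degree';
    the thesis's actual measure is not given in the abstract.
    """
    items = set()
    filled = 0
    rows = 0
    for t in transactions:
        distinct = set(t)
        items.update(distinct)
        filled += len(distinct)
        rows += 1
    if rows == 0 or not items:
        return 0.0
    return filled / (rows * len(items))

def choose_miner(transactions, threshold=0.5):
    """Pick a mining family; the threshold is a hypothetical tuning parameter."""
    if density(transactions) >= threshold:
        return "concept-lattice"
    return "apriori"

# Hypothetical toy databases.
dense_db = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}]
sparse_db = [{"a"}, {"b"}, {"c"}, {"d"}]
print(choose_miner(dense_db), choose_miner(sparse_db))
```

In practice the threshold would have to be calibrated against measured running times of both families on representative data.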