Unsupervised Feature Selection for LDA

Source: China Communications
As a generative model, Latent Dirichlet Allocation (LDA) focuses on how to generate data and does not optimize the discrimination capability of its topics. This paper aims to improve that discrimination capability through unsupervised feature selection. Theoretical analysis shows that the discrimination capability of a topic is limited by the discrimination capability of its representative words. The discrimination capability of a word is approximated by the information gain of the word for topics, which is used to distinguish general words from special words in LDA topics. Therefore, we add a constraint to the LDA objective function so that general words occur only in general topics rather than in special topics. A heuristic algorithm is then presented to obtain the solution. Experiments show that this method not only improves the information gain of topics but also makes the topics easier for humans to understand.
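The abstract does not give the exact formula used for the information gain of a word for topics. As an illustration only, the sketch below uses the classic feature-selection definition: the reduction in topic entropy from knowing whether a token is the word w or not. The names `phi` (topic-word probabilities p(w|z)), `prior` (topic probabilities p(z)), and `information_gain` are hypothetical and not taken from the paper, and the toy numbers are made up.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def information_gain(phi, prior, w):
    """Information gain about the topic Z from the event 'token is word w'.

    phi[z][w] is an assumed p(w | z); prior[z] is an assumed p(z).
    IG(w) = H(Z) - [p(w) H(Z | w) + p(not w) H(Z | not w)]
    """
    k = len(prior)
    p_w = sum(prior[z] * phi[z][w] for z in range(k))
    p_not = 1.0 - p_w
    # Posterior over topics given the token is w (Bayes' rule).
    post_w = [prior[z] * phi[z][w] / p_w for z in range(k)] if p_w > 0 else list(prior)
    # Posterior over topics given the token is any other word.
    post_not = (
        [prior[z] * (1.0 - phi[z][w]) / p_not for z in range(k)]
        if p_not > 0
        else list(prior)
    )
    return entropy(prior) - (p_w * entropy(post_w) + p_not * entropy(post_not))


# Toy model: 2 topics, 3 words. Word 0 is equally likely in both topics
# (a "general" word); words 1 and 2 each occur in one topic only ("special").
phi = [
    [0.5, 0.5, 0.0],
    [0.5, 0.0, 0.5],
]
prior = [0.5, 0.5]
gains = [information_gain(phi, prior, w) for w in range(3)]
```

In this toy setting the word shared equally by both topics has zero gain, while the two topic-specific words have equal, positive gain. A threshold on such a score is one plausible way to separate general words from special words; the threshold and the way the paper folds this into the LDA objective are not specified in the abstract.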