[Purpose/Significance] Weibo, an emerging social media platform, has attracted wide attention from Internet users. Weibo data contains a large amount of user information, user behavior, and user-generated content. Automatically identifying book titles in Weibo posts helps analyze users' reading interests, collect their evaluations of books, and mine book-related knowledge. [Method/Process] Based on the characteristics of Weibo data, this paper proposes a representation learning method built on a deep neural network. It uses continuous vector representations of the context surrounding each candidate book title to automatically identify book titles in Weibo content. [Result/Conclusion] Experimental results show that the proposed method significantly outperforms traditional supervised machine learning methods based on feature engineering and reaches a precision of 91.92%.
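To make the idea of a context-based continuous representation more concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not describe how candidates are extracted or how the network is structured, so everything here is an assumption: candidates are taken to be spans enclosed in 《》 marks, the context is a fixed character window, and `toy_embedding` is a hypothetical stand-in that produces deterministic pseudo-random vectors where a trained model would supply learned embeddings. The sample post is illustrative only.

```python
import re
import random

# Assumption: candidate titles are the spans enclosed in 《》 marks.
BOOK_MARKS = re.compile(r"《([^《》]+)》")

def find_candidates(text):
    """Return (title, start, end) for every span enclosed in 《》 marks."""
    return [(m.group(1), m.start(), m.end()) for m in BOOK_MARKS.finditer(text)]

def context_window(text, start, end, size=5):
    """Collect up to `size` characters on each side of the candidate span."""
    left = text[max(0, start - size):start]
    right = text[end:end + size]
    return list(left) + list(right)

def toy_embedding(token, dim=8):
    """Hypothetical stand-in for a learned embedding: a vector seeded by the token."""
    rng = random.Random(token)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

def context_vector(tokens, dim=8):
    """Average the token vectors into one continuous context representation."""
    if not tokens:
        return [0.0] * dim
    vectors = [toy_embedding(t, dim) for t in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]

# Illustrative post, not taken from the paper's data set.
post = "最近在读《三体》,情节太精彩了"
for title, start, end in find_candidates(post):
    tokens = context_window(post, start, end)
    print(title, tokens, len(context_vector(tokens)))
```

In a full system, the averaged vector would be replaced by representations learned by the neural network and passed to a classifier that decides whether the candidate is a book title, since 《》 marks also enclose films, songs, and other works.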