This paper addresses product named entities in the business-information domain. It studies the structural features of the components of a product named entity and the relationships among them, and builds a three-layer semi-supervised learning framework. The method combines rules, dictionaries, and statistical methods to build a hidden conditional random field model, which makes fuller use of the hidden states of the data obtained by bootstrapping. Experiments in the digital camera domain show that the method needs only a small amount of manually labeled data to recognize product named entities in web pages and other text reasonably well.