【Objective】 This paper analyzes how retrieval effectiveness varies across queries of different specificity, in order to inform improvements in search engine performance and the user's search experience. 【Method】 Based on TREC Web Track queries, we manually construct an annotated set of query specificity labels. Using three retrieval models (a language model with Dirichlet smoothing, a language model with linear interpolation smoothing, and BM25) and commonly used information retrieval evaluation metrics as the benchmark, we examine how strong versus weak query specificity affects retrieval effectiveness at different levels. 【Result】 The difference in retrieval effectiveness between strong-specificity and weak-specificity queries is largest among the top few results, where strong-specificity queries perform clearly better than weak-specificity queries. 【Limitations】 The experiments were conducted only on the TREC dataset; the findings need further validation on other datasets. 【Conclusion】 With respect to query specificity, search engines should focus on the accuracy of the top few results and use this as the starting point for improving retrieval models.
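The three retrieval models named in the Method can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the toy documents, and the parameter values (`mu=2000`, `lam=0.5`, `k1=1.2`, `b=0.75`) are assumptions chosen for illustration. The abstract does not say which settings or which BM25 IDF variant were used, and `precision_at_k` stands in for the unnamed "commonly used" metrics as one example of a measure that looks only at the top few results.

```python
import math
from collections import Counter


def dirichlet_score(query, doc, coll_tf, coll_len, mu=2000.0):
    """Query log-likelihood under a language model with Dirichlet smoothing."""
    tf, dl = Counter(doc), len(doc)
    score = 0.0
    for term in query:
        p_coll = coll_tf.get(term, 0) / coll_len
        if p_coll == 0:
            continue  # assumption: ignore terms absent from the collection
        score += math.log((tf[term] + mu * p_coll) / (dl + mu))
    return score


def jm_score(query, doc, coll_tf, coll_len, lam=0.5):
    """Query log-likelihood with linear interpolation (Jelinek-Mercer) smoothing."""
    tf, dl = Counter(doc), len(doc)
    score = 0.0
    for term in query:
        p_coll = coll_tf.get(term, 0) / coll_len
        if p_coll == 0:
            continue  # assumption: ignore terms absent from the collection
        p_doc = tf[term] / dl if dl else 0.0
        score += math.log((1 - lam) * p_doc + lam * p_coll)
    return score


def bm25_score(query, doc, df, n_docs, avgdl, k1=1.2, b=0.75):
    """BM25 with one common non-negative IDF variant (an assumption)."""
    tf, dl = Counter(doc), len(doc)
    score = 0.0
    for term in query:
        if tf[term] == 0 or term not in df:
            continue
        idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
        norm = tf[term] + k1 * (1 - b + b * dl / avgdl)
        score += idf * tf[term] * (k1 + 1) / norm
    return score


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked documents that are relevant."""
    return sum(1 for d in ranked_ids[:k] if d in relevant_ids) / k


# Toy collection (hypothetical data, for illustration only).
docs = {
    "d1": ["trec", "web", "track", "query"],
    "d2": ["search", "engine", "ranking", "model"],
}
coll_tf = Counter(t for d in docs.values() for t in d)
coll_len = sum(coll_tf.values())
df = Counter(t for d in docs.values() for t in set(d))
avgdl = coll_len / len(docs)

query = ["web", "query"]
ranking = sorted(
    docs,
    key=lambda d: bm25_score(query, docs[d], df, len(docs), avgdl),
    reverse=True,
)
```

Comparing `precision_at_k` at small and large `k` for queries labeled strong versus weak specificity is one way to see the kind of top-of-ranking difference the Result describes.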