Learning to rank is a research area at the intersection of machine learning and information retrieval. It uses machine learning methods to tune parameters automatically, combine multiple ranking features, and avoid overfitting, yielding a new ranking model for ordering retrieved documents. Among learning-to-rank methods, Listwise approaches achieve relatively good ranking performance, but the existing algorithms of this type still have notable shortcomings: because training is based on all permutations of the list, the time complexity is too high, and their loss functions do not make full use of ranking position information, which is extremely important. To address this, this paper proposes a new learning algorithm that introduces a position-information loss factor, constructs a new loss function, and adopts a more efficient training method. Experimental results on the LETOR 4.0 dataset show that the ranking performance of the new algorithm improves noticeably.
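To make the two criticisms concrete, the following is a minimal sketch of a listwise loss that avoids enumerating permutations and weights each document by its position. It is not the loss function proposed in the paper, which the abstract does not specify. It assumes a ListNet-style top-one cross-entropy, and the position factor 1/log2(1 + rank), borrowed from the NDCG discount, is a hypothetical stand-in for the paper's position-information loss factor. All function names are illustrative.

```python
import math


def top_one_probs(scores):
    """Softmax over a score list: top-one probabilities in the ListNet sense."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def position_weights(labels):
    """Hypothetical position factor (an assumption, not from the paper).

    Each document gets 1 / log2(1 + rank), where rank is its 1-based
    position in the ideal ordering (relevance labels sorted descending).
    """
    order = sorted(range(len(labels)), key=lambda i: -labels[i])
    weights = [0.0] * len(labels)
    for rank, i in enumerate(order, start=1):
        weights[i] = 1.0 / math.log2(1 + rank)
    return weights


def position_weighted_listwise_loss(scores, labels):
    """Top-one cross-entropy with each term scaled by a position weight.

    Costs O(n log n) per query, instead of summing over all n! permutations.
    """
    p_true = top_one_probs(labels)
    p_pred = top_one_probs(scores)
    w = position_weights(labels)
    return -sum(wi * pt * math.log(pp) for wi, pt, pp in zip(w, p_true, p_pred))


if __name__ == "__main__":
    labels = [2, 1, 0]  # relevance labels for three documents of one query
    print(position_weighted_listwise_loss([3.0, 1.0, 0.0], labels))  # agrees with labels
    print(position_weighted_listwise_loss([0.0, 1.0, 3.0], labels))  # reversed ordering
```

Under these assumptions, a score vector that orders the documents the same way as the labels yields a smaller loss than one that reverses them, and errors on top-ranked documents are penalized more heavily than errors near the bottom of the list.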