In recent years, with the rapid development of artificial intelligence, machine learning has become a major focus of academic research. In the traditional supervised learning setting, the learning algorithm is given an externally labeled set of instances as its training set and induces a model from it. In real-world applications, however, labeling instances is expensive, tedious, or extremely difficult [1]. Because supervised methods all depend on labeled samples, it is common for the algorithm and processing pipeline to be fully implemented while labeled samples are still missing, so the data must be annotated by hand. This raises the question of how to reduce the amount of manual labeling while maintaining classification accuracy. One approach to this problem is active learning. Its core idea is that, by requesting labels for only a small number of instances, the algorithm can reach accuracy comparable to that obtained by training on a much larger labeled set.