In active learning, neighborhood entropy (Neighborhood Entropy) is adopted as the criterion for selecting samples. The sample with the largest entropy is the one whose class label is most uncertain under the nearest-neighbor classification rule. Labeling such high-uncertainty samples allows high classification performance to be reached with as few labeled samples as possible. This paper proposes an active learning algorithm based on neighborhood entropy. The algorithm first computes the class entropy of the neighbors of each unlabeled sample, and then selects the sample with the largest entropy for labeling. Experiments show that selecting samples by neighborhood entropy yields higher classification performance than selection based on maximal distance (Maximal Distance) and random selection.
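The selection step described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say how the neighborhood is defined, so the sketch assumes the neighborhood is the k nearest labeled samples under Euclidean distance and that the entropy is the Shannon entropy of their class labels. The function names `neighborhood_entropy` and `select_query` and the parameter `k` are hypothetical.

```python
import numpy as np


def neighborhood_entropy(x, X_labeled, y_labeled, k=3):
    """Shannon entropy (in bits) of the class labels among the k labeled
    samples nearest to x. Assumption: Euclidean distance, labeled neighbors."""
    dists = np.linalg.norm(X_labeled - x, axis=1)
    nearest = np.argsort(dists)[:k]
    # Class frequencies inside the neighborhood
    _, counts = np.unique(y_labeled[nearest], return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def select_query(X_unlabeled, X_labeled, y_labeled, k=3):
    """Return the index of the unlabeled sample with the largest
    neighborhood entropy, together with all entropy values."""
    entropies = np.array(
        [neighborhood_entropy(x, X_labeled, y_labeled, k) for x in X_unlabeled]
    )
    return int(np.argmax(entropies)), entropies


if __name__ == "__main__":
    # Toy data: two well-separated classes and three unlabeled candidates
    X_labeled = np.array(
        [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [10.0, 0.0], [10.0, 1.0], [9.0, 0.0]]
    )
    y_labeled = np.array([0, 0, 0, 1, 1, 1])
    X_unlabeled = np.array([[0.2, 0.3], [5.2, 0.2], [9.8, 0.3]])

    idx, entropies = select_query(X_unlabeled, X_labeled, y_labeled, k=3)
    print("entropies:", entropies)
    print("query index:", idx)
```

In a full active learning loop, the selected sample would be labeled by an oracle, moved from the unlabeled pool to the labeled set, and the entropies recomputed before the next query. The choice of `k` and the handling of ties are left open here because the abstract does not specify them.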