Transfer active learning by querying committee

Source: Journal of Zhejiang University (English Edition), Series C: Computers & Electronics
In real applications of inductive learning for classification, labeled instances are often scarce, and having an oracle label them is often expensive and time-consuming. Active learning on a single task aims to select only informative unlabeled instances for querying, improving classification accuracy while reducing the querying cost. However, an inevitable problem in active learning is that the informativeness measures used to select queries are commonly based on initial hypotheses sampled from only a few labeled instances. In such circumstances, the initial hypotheses are unreliable and may deviate from the true distribution underlying the target task. Consequently, the informativeness measures may select irrelevant instances. A promising way to compensate for this problem is to borrow useful knowledge from other sources with abundant labeled information, which is called transfer learning. However, a significant challenge in transfer learning is how to measure the similarity between the source and the target tasks. One needs to be aware of differing distributions or label assignments in unrelated source tasks; otherwise, they will degrade performance during transfer. Also, how to design an effective strategy that avoids selecting irrelevant samples to query is still an open question. To tackle these issues, we propose a hybrid algorithm for active learning with the help of transfer learning, adopting a divergence measure to alleviate the negative transfer caused by distribution differences. To avoid querying irrelevant instances, we also present an adaptive strategy that can eliminate unnecessary instances in the input space and unnecessary models in the model space. Extensive experiments on both synthetic and real data sets show that the proposed algorithm queries fewer instances while achieving higher accuracy, and that it converges faster than state-of-the-art methods.
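The abstract does not give the algorithm itself, so the following is only a minimal sketch of the generic query-by-committee idea named in the title, not the authors' method. It assumes vote entropy as the disagreement measure; the names `vote_entropy` and `select_query` and the toy threshold classifiers are hypothetical, and the divergence-based weighting of source tasks and the adaptive pruning of instances and models are not shown.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy of the label votes cast by the committee on one instance."""
    counts = Counter(votes)
    total = len(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def select_query(committee, unlabeled):
    """Return the index of the unlabeled instance with the highest disagreement."""
    scores = [vote_entropy([member(x) for member in committee]) for x in unlabeled]
    return max(range(len(unlabeled)), key=scores.__getitem__)


# Hypothetical committee: three 1-D threshold classifiers (assumed for illustration).
committee = [lambda x, t=t: int(x > t) for t in (0.3, 0.5, 0.7)]
unlabeled = [0.1, 0.4, 0.9]

# Only the instance at 0.4 splits the committee, so it is chosen for labeling.
query_index = select_query(committee, unlabeled)
```

In a transfer setting one could imagine some committee members being trained on source tasks, which is where a divergence measure between source and target would matter; that part is left out here because the abstract does not specify it.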