Denoising Distant Supervision for Relation Extraction with Entropy Weight Method

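Source: The 18th China National Conference on Computational Linguistics and the 2019 Annual Academic Conference of the Chinese Information Processing Society of China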
  Distant supervision for relation extraction is widely used to construct training sets by aligning text with the triples of a knowledge base, which is an efficient way to reduce human effort. However, this method inevitably suffers from the wrong labeling problem, which introduces substantial noise that severely hurts the performance of relation extraction. To tackle this problem, in this paper we propose a denoising model based on the Entropy Weight Method (EWM) to filter the noise and select the most relevant sentences. First, in a pretraining stage, we develop a sentence-level relation-aware attention mechanism to distinguish the most relevant sentences, increasing the attention weights for those critical sentences. Second, we filter the noisy sentences by calculating the entropy weight from the above attention matrix, and then we employ intra-bag and inter-bag attentions to aggregate the selected sentence representations. Experiments on the NYT dataset show that our method significantly reduces the number of noisy instances and achieves state-of-the-art performance.
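The abstract does not spell out how the entropy weight is computed from the attention matrix, so the following is only a minimal sketch of the textbook Entropy Weight Method applied to a hypothetical attention matrix. The matrix layout (sentences in rows, relations in columns), the function names `entropy_weights` and `select_sentences`, and the top-`keep` filtering rule are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def entropy_weights(attn, eps=1e-12):
    """Textbook Entropy Weight Method over the columns of a non-negative matrix.

    attn: (n_sentences, n_relations) attention scores. This layout is an
    assumption; at least two rows are required.
    Returns one weight per column; the weights sum to 1.
    """
    attn = np.asarray(attn, dtype=float)
    n = attn.shape[0]
    if n < 2:
        raise ValueError("need at least two sentences to compute entropy weights")
    # Share of each sentence within every column.
    p = attn / (attn.sum(axis=0, keepdims=True) + eps)
    # Shannon entropy per column, normalised by log(n) so it lies in [0, 1].
    entropy = -(p * np.log(p + eps)).sum(axis=0) / np.log(n)
    # Low entropy means the column separates the sentences sharply.
    divergence = np.clip(1.0 - entropy, 0.0, None)
    total = divergence.sum()
    if total < 1e-9:
        # Every column is (almost) uniform: fall back to equal weights.
        return np.full(attn.shape[1], 1.0 / attn.shape[1])
    return divergence / total


def select_sentences(attn, keep=2):
    """Hypothetical filter: keep the sentences with the highest weighted score."""
    attn = np.asarray(attn, dtype=float)
    scores = attn @ entropy_weights(attn)
    order = np.argsort(-scores)
    return sorted(order[:keep].tolist()), scores


# Toy bag of three sentences scored against two relations (made-up numbers).
toy_attention = [[0.90, 0.30],
                 [0.05, 0.30],
                 [0.05, 0.40]]
weights = entropy_weights(toy_attention)
kept, scores = select_sentences(toy_attention, keep=2)
print("column weights:", weights)
print("kept sentence indices:", kept)
```

In this toy bag the first column is sharply peaked and the second is nearly uniform, so almost all of the weight goes to the first column. In a full pipeline, the kept sentences would then be passed to the intra-bag and inter-bag attention layers mentioned in the abstract; that step is not sketched here.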