Scientific Keyphrase Extraction: Extracting Candidates with Semi-supervised Data Augmentation

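Source: The 17th China National Conference on Computational Linguistics and the 6th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data (CCL 2018)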
Excerpt from the paper
  Keyphrase extraction can provide effective ways of organizing scientific documents. For this task, neural methods usually suffer from unstable performance due to data scarcity. In this paper, we adopt a two-step pipeline consisting of candidate extraction and keyphrase ranking, where candidate extraction is key to overall performance. In the candidate extraction step, to overcome the low recall of traditional rule-based methods, we propose a novel semi-supervised data augmentation method, in which a neural tagging model and a discriminative classifier boost each other and yield more confident phrases as candidates. With more reasonable candidates, keyphrases are identified with improved recall. Experiments on SemEval 2017 Task 10 show that our model achieves competitive results.
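The mutual-boosting idea in the candidate extraction step can be illustrated with a toy sketch. It is not the paper's method: the neural tagging model and the discriminative classifier are replaced here by simple word-overlap stand-ins, and every name (`tagger_propose`, `classifier_score`, `augment`), the stopword list, the threshold, and the round limit are assumptions made for illustration only. The sketch shows just the loop structure: one component proposes candidate phrases, the other keeps the confident ones, and the accepted phrases feed the next round.

```python
from typing import Iterable, Iterator, List, Set

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "of", "a", "an", "for", "and", "in", "on", "we", "is", "to", "with"}


def ngrams(tokens: List[str], min_len: int = 2, max_len: int = 3) -> Iterator[str]:
    """Yield contiguous n-grams of min_len..max_len tokens that contain no stopword."""
    for i in range(len(tokens)):
        for n in range(min_len, max_len + 1):
            gram = tokens[i:i + n]
            if len(gram) == n and not any(t in STOPWORDS for t in gram):
                yield " ".join(gram)


def vocabulary(phrases: Iterable[str]) -> Set[str]:
    """Collect the individual words used in a set of phrases."""
    return {w for p in phrases for w in p.split()}


def tagger_propose(sentence: str, known: Set[str]) -> Set[str]:
    """Stand-in for the tagging model: propose n-grams sharing a word with known phrases."""
    vocab = vocabulary(known)
    tokens = sentence.lower().split()
    return {g for g in ngrams(tokens) if any(w in vocab for w in g.split())}


def classifier_score(phrase: str, known: Set[str]) -> float:
    """Stand-in for the classifier: fraction of the phrase's words seen in known phrases."""
    vocab = vocabulary(known)
    words = phrase.split()
    return sum(w in vocab for w in words) / len(words)


def augment(seed: Set[str], unlabeled: List[str],
            threshold: float = 0.6, rounds: int = 3) -> Set[str]:
    """Grow the phrase set: propose, keep confident candidates, repeat."""
    known = set(seed)
    for _ in range(rounds):
        accepted = {
            cand
            for sentence in unlabeled
            for cand in tagger_propose(sentence, known)
            if cand not in known and classifier_score(cand, known) >= threshold
        }
        if not accepted:  # nothing confident left to add
            break
        known |= accepted
    return known


if __name__ == "__main__":
    phrases = augment({"neural network"},
                      ["we train a recurrent neural network for tagging"])
    print(sorted(phrases))
```

In this toy setting the phrase set can only grow, so a stricter threshold or fewer rounds trades recall for precision; how the paper balances this is not stated in the abstract.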
Other literature
Extracting term translation pairs is of great help for Chinese historical classics translation since term translation is the most time-consuming and challenging part in the translation of historical
Nowadays, research on stylistic features (SF) mainly focuses on two aspects: lexical elements and syntactic structures. The lexical elements act as the content of a sentence and the syntactic structures c
Dialogue intent detection and semantic slot filling are two critical tasks in natural language understanding (NLU) for task-oriented dialog systems. In this paper, we present an attention-based encoder-dec
In recent years, mining opinions from customer reviews has been widely explored. Aspect-level sentiment analysis is a fine-grained subtask, which aims to detect the sentiment polarity towards a particul
Network Representation Learning (NRL) can learn a latent space representation of each vertex in a topology network structure to reflect linked information. Recently, NRL algorithms have been applied to
Network representation learning (NRL) aims at building a low-dimensional vector for each vertex in a network, which is also increasingly recognized as an important aspect for network analysis. Some curren
This paper studies the methods to improve end-to-end neural coreference resolution. First, we introduce a coreference cluster modification algorithm, which can help modify the coreference cluster to rule
Type information is very important in knowledge bases, but some large knowledge bases lack type information due to the incompleteness of knowledge bases. In this paper, we propose to use a well-de
It is common to fine-tune pre-trained word embeddings in text categorization. However, we find that fine-tuning does not guarantee improvement across text categorization datasets, while it could introduce c
Named entity recognition (NER) in Chinese electronic medical records (EMRs) has become an important task of clinical natural language processing (NLP). However, limited studies have been performed on the cli