Scientific Keyphrase Extraction: Extracting Candidates with Semi-supervised Data Augmentation

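Source: 17th China National Conference on Computational Linguistics and 6th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data (CCL 2018) | Citations: 0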

【Authors】: Qianying Liu, Daisuke Kawahara, Sujian Li

【Affiliation】: School of Mathematical Science, Peking University

【Publication date】: 2018, issue 9

【Keywords】: keyphrase extraction; neural networks; semi-supervised learning

Paper excerpt (abstract)

Keyphrase extraction provides an effective way of organizing scientific documents. For this task, neural-based methods usually suffer from unstable performance due to data scarcity. In this paper, we adopt a two-step pipeline consisting of candidate extraction and keyphrase ranking, in which candidate extraction is key to overall performance. In the candidate extraction step, to overcome the low recall of traditional rule-based methods, we propose a novel semi-supervised data augmentation method in which a neural tagging model and a discriminative classifier boost each other and yield more confident phrases as candidates. With more reasonable candidates, keyphrases are identified with improved recall. Experiments on SemEval 2017 Task 10 show that our model achieves competitive results.
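The augmentation idea in the abstract can be illustrated with a small self-training loop. This is only a sketch under stated assumptions, not the authors' models: `propose` is a toy stand-in for the neural tagging model, `confidence` is a toy stand-in for the discriminative classifier, and the stopword list, threshold, and round count are all hypothetical choices made for illustration.

```python
from typing import Iterable, List, Set

# Hypothetical stopword list, used only to keep toy candidates clean.
STOPWORDS = {"a", "an", "the", "is", "we", "of", "in", "to", "and"}


def ngrams(tokens: List[str], min_n: int = 2, max_n: int = 3) -> Iterable[List[str]]:
    """Yield all contiguous token spans of length min_n..max_n."""
    for n in range(min_n, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tokens[i:i + n]


def vocabulary(phrases: Set[str]) -> Set[str]:
    """Collect the words that occur in the currently known phrases."""
    return {w for p in phrases for w in p.split()}


def propose(sentence: str, known: Set[str]) -> Set[str]:
    """Toy stand-in for the tagging model (assumption): propose n-grams
    that share a word with a known phrase and do not start or end
    with a stopword."""
    vocab = vocabulary(known)
    out = set()
    for gram in ngrams(sentence.lower().split()):
        if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
            continue
        if any(w in vocab for w in gram):
            out.add(" ".join(gram))
    return out


def confidence(phrase: str, known: Set[str]) -> float:
    """Toy stand-in for the classifier (assumption): the fraction of the
    phrase's words already seen in known phrases."""
    vocab = vocabulary(known)
    words = phrase.split()
    return sum(w in vocab for w in words) / len(words)


def augment(seed: Set[str], unlabeled: List[str],
            threshold: float = 0.6, rounds: int = 2) -> Set[str]:
    """Grow the phrase set: each round, keep proposals the scorer is
    confident about and feed them back into the next round."""
    known = set(seed)
    for _ in range(rounds):
        accepted = {
            c
            for s in unlabeled
            for c in propose(s, known)
            if c not in known and confidence(c, known) >= threshold
        }
        if not accepted:
            break
        known |= accepted
    return known
```

Calling `augment` with a few seed keyphrases and some unlabeled sentences returns the seed set plus the phrases accepted along the way. Because accepted phrases enlarge the vocabulary that both stand-ins rely on, later rounds can accept phrases that were rejected earlier, which is the sense in which the two components help each other in this toy setting.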

Other papers

Term Translation Extraction from Historical Classics Using Modern Chinese Explanation

Extracting term translation pairs is of great help for Chinese historical classics translation since term translation is the most time-consuming and challenging part in the translation of historical

Conference

BiLSTM-CRF; Co-occurrence frequency; Transliteration features; Term translation ext

Syntax Enhanced Research Method of Stylistic Features

Nowadays, research on stylistic features (SF) mainly focuses on two aspects: lexical elements and syntactic structures. The lexical elements act as the content of a sentence and the syntactic structures c

Conference

Style; Lexical and syntactic features; Feature dimension reduction

Attention-Based CNN-BLSTM Networks for Joint Intent Detection and Slot Filling

Dialogue intent detection and semantic slot filling are two critical tasks in natural language understanding (NLU) for task-oriented dialog systems. In this paper, we present an attention-based encoder-dec

Conference

Natural Language Understanding; Slot Filling; Intent detection; Attention Model

A Joint Model for Sentiment Classification and Opinion Words Extraction

In recent years, mining opinions from customer reviews has been widely explored. Aspect-level sentiment analysis is a fine-grained subtask, which aims to detect the sentiment polarity towards a particul

Conference

aspect-level sentiment analysis; opinion words extraction; neural network; attentio

Linked Document Classification by Network Representation Learning

Network Representation Learning (NRL) can learn a latent space representation of each vertex in a topology network structure to reflect linked information. Recently, NRL algorithms have been applied to

Conference

Document Classification; NRL; Flexible Random Walk Strategy

Network Representation Learning based on Community and Text Features

Network representation learning (NRL) aims at building a low-dimensional vector for each vertex in a network, which is also increasingly recognized as an important aspect for network analysis. Some curren

Conference

Network Representation Learning; Community and Text Features; Inductive Matrix Com

A Study on Improving End-to-End Neural Coreference Resolution

This paper studies methods to improve end-to-end neural coreference resolution. First, we introduce a coreference cluster modification algorithm, which can help modify the coreference cluster to rule

Conference

Coreference resolution; End-to-end; Neural network

Type Hierarchy Enhanced Heterogeneous Network Embedding for Fine-Grained Entity Typing in Knowledge

Type information is very important in knowledge bases, but some large knowledge bases lack type information due to the incompleteness of knowledge bases. In this paper, we propose to use a well-de

Conference

Entity Typing; Knowledge Base Completion; Heterogeneous Network Embedding

A Word Embedding Transfer Model for Robust Text Categorization

It is common to fine-tune pre-trained word embeddings in text categorization. However, we find that fine-tuning does not guarantee improvement across text categorization datasets, while it could introduce c

Conference

Word Embedding; Text Categorization; Transfer Learning

Medical Knowledge Attention Enhanced Neural Model for Named Entity Recognition in Chinese EMR

Named entity recognition (NER) in Chinese electronic medical records (EMRs) has become an important task of clinical natural language processing (NLP). However, limited studies have been performed on the cli

Conference

Chinese Electronic Medical Record; Named Entity Recognition; Deep Learning; Knowled
