【Affiliation】
:
State Key Laboratory of Software Development Environment, Beihang University, China
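【Source】
:
The 18th China National Conference on Computational Linguistics and the 2019 Annual Conference of the Chinese Information Processing Society of China (第十八届中国计算语言学大会暨中国中文信息学会2019学术年会)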
【Abstract】
:
In recent years, machine reading comprehension has become an increasingly popular research topic. Promising results have been obtained when the machine reading comprehension task has only two inputs, context and query. In this paper, we propose a capsule-network-based model for the Chinese opinion machine reading comprehension task, which has three inputs: context, query, and alternatives. First, we use a bi-directional LSTM to encode the three inputs. Second, we model the complex interactions between context and query with a multiway attention layer. In addition to the attention mechanism used in BiDAF, two other attention functions are designed to match the relationship between inputs. Finally, we present a capsule network layer to route to the right alternative. Specifically, we use two strategies to improve the dynamic routing process so that it filters out noisy capsules, which may contain useless information such as stop words. Our single model achieves competitive results compared to the baseline methods on a Chinese dataset and obtains a significant improvement of 2.45% in accuracy.
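To make the routing step concrete, the following is a minimal sketch of standard dynamic routing (routing-by-agreement) between capsules, written in plain Python. It is not the authors' implementation: the abstract does not describe the two strategies used to filter noisy capsules, so they are not reproduced here. The names `u_hat`, `squash`, and `dynamic_routing`, the number of iterations, and the random toy inputs are all assumptions made for illustration. The sketch assumes one output capsule per alternative and picks the alternative whose capsule has the greatest length.

```python
import math
import random


def squash(s):
    """Scale vector s so its length lies in [0, 1) while keeping its direction."""
    sq_norm = sum(x * x for x in s)
    if sq_norm == 0.0:
        return [0.0] * len(s)
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm)
    return [scale * x for x in s]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, num_iters=3):
    """Standard routing-by-agreement (illustrative, not the paper's variant).

    u_hat[i][j] is the prediction vector that input capsule i makes for
    output capsule j. Returns (v, c): the output capsules and the coupling
    coefficients used in the last iteration.
    """
    num_in, num_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * num_out for _ in range(num_in)]  # routing logits
    for _ in range(num_iters):
        # Each input capsule distributes its vote over the output capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(num_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(num_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Raise the logit where a prediction agrees with the output capsule.
        for i in range(num_in):
            for j in range(num_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c


# Toy input (assumed sizes): 6 input capsules, 3 alternatives, 4 dimensions.
random.seed(0)
u_hat = [[[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)]
         for _ in range(6)]
v, c = dynamic_routing(u_hat)
scores = [math.sqrt(sum(x * x for x in v_j)) for v_j in v]
predicted = scores.index(max(scores))  # index of the chosen alternative
```

Under this scheme, an input capsule whose predictions disagree with every output capsule ends up with small, evenly spread contributions, which gives some intuition for why a modified routing procedure could be used to suppress uninformative capsules such as those for stop words.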