【Institution】: Beijing Information Science and Technology University
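【Source】: The 18th China National Conference on Computational Linguistics and the 2019 Annual Academic Conference of the Chinese Information Processing Society of China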
【Abstract】: Uyghur is an agglutinative language with many morphemes. Processing Uyghur requires segmenting words into morphemes, a task called morphological segmentation. Previous work treats morphological segmentation as a tagging task and classifies each character into one of four classes, {b, m, e, s}. However, these labels are not independent of each other, which makes the models prone to overfitting. We propose a new method for the segmentation task. Instead of using these labels, we model only the segmentation points. The model used in our method is more robust and easier to train than previous methods. Applied to Uyghur morphological segmentation, our model achieves high accuracy and higher recall and F1 score than previous models.
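The contrast between the two labeling schemes can be made concrete with a small sketch. The code below is only an illustration of the label encodings, not the authors' model: the function names are hypothetical, and the Latin-script example word "kitablar" (split as "kitab" + "lar") is an assumed example chosen for readability. It shows that {b, m, e, s} tags constrain one another (for example, "b" cannot follow "b"), whereas any binary vector of segmentation points decodes to a valid segmentation.

```python
from itertools import product
from typing import List


def bmes_tags(morphemes: List[str]) -> List[str]:
    """Encode a segmented word as one b/m/e/s tag per character."""
    tags: List[str] = []
    for morpheme in morphemes:
        if len(morpheme) == 1:
            tags.append("s")  # single-character morpheme
        else:
            tags.extend(["b"] + ["m"] * (len(morpheme) - 2) + ["e"])
    return tags


def is_valid_bmes(tags: List[str]) -> bool:
    """Check the transition constraints that tie b/m/e/s tags together."""
    allowed = {"b": {"m", "e"}, "m": {"m", "e"}, "e": {"b", "s"}, "s": {"b", "s"}}
    if not tags or tags[0] not in ("b", "s") or tags[-1] not in ("e", "s"):
        return False
    return all(nxt in allowed[cur] for cur, nxt in zip(tags, tags[1:]))


def boundary_labels(morphemes: List[str]) -> List[int]:
    """Encode a segmented word as one 0/1 label per gap between characters.

    A 1 means a morpheme boundary falls in that gap. The end of the word
    is always a boundary, so it is not labeled.
    """
    labels: List[int] = []
    for morpheme in morphemes:
        labels.extend([0] * (len(morpheme) - 1) + [1])
    return labels[:-1]


def segment(word: str, labels: List[int]) -> List[str]:
    """Decode gap labels back into morphemes."""
    pieces: List[str] = []
    start = 0
    for position, is_boundary in enumerate(labels, start=1):
        if is_boundary:
            pieces.append(word[start:position])
            start = position
    pieces.append(word[start:])
    return pieces


if __name__ == "__main__":
    word, morphemes = "kitablar", ["kitab", "lar"]  # assumed example
    print(bmes_tags(morphemes))
    print(boundary_labels(morphemes))
    print(segment(word, boundary_labels(morphemes)))
```

Under this encoding a word of n characters needs n - 1 independent binary decisions, and no decoding step has to repair inconsistent tag sequences, which is one plausible reading of why segmentation points are easier to model than interdependent tags.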