THUUyMorph: A Uyghur Morphological Analysis Corpus

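Source: The 16th China National Conference on Computational Linguistics and the 5th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data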
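This paper introduces a Uyghur morphological analysis corpus and describes how it was built. Texts were collected from the web across several domains: news, science and technology, novels, essays, everyday expressions, and others. The corpus was built in four steps: defining segmentation rules (with and without phonetic changes), manual segmentation, error analysis, and proofreading. It contains about 500,000 word tokens and is annotated at two levels, word and sentence. The work serves as a reference for building related Uyghur corpora and provides a useful resource for research on Uyghur natural language processing.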
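To make the two kinds of segmentation rules concrete, here is a minimal Python sketch of how one word-level entry might be represented. The abstract does not describe the corpus file format, so the class `SegmentedWord`, its field names, and the idea of storing both segmentations side by side are all assumptions made for illustration. The example word uses the common Uyghur vowel-weakening pattern, in Latin transliteration, where "bala" (child) plus the plural suffix "lar" surfaces as "balilar".

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SegmentedWord:
    """Hypothetical word-level record; not the corpus's actual format."""

    surface: str                        # the word as it appears in text
    surface_segments: Tuple[str, ...]   # morphemes as written (phonetic change kept)
    restored_segments: Tuple[str, ...]  # morphemes with the phonetic change undone

    def is_consistent(self) -> bool:
        # The surface segments must concatenate back to the word, and both
        # segmentations must split the word into the same number of morphemes.
        return (
            "".join(self.surface_segments) == self.surface
            and len(self.surface_segments) == len(self.restored_segments)
        )

    def has_phonetic_change(self) -> bool:
        # True when at least one morpheme differs between the two views.
        return self.surface_segments != self.restored_segments


# Illustrative entry: stem vowel "a" weakens to "i" before the plural suffix.
example = SegmentedWord(
    surface="balilar",
    surface_segments=("bali", "lar"),
    restored_segments=("bala", "lar"),
)
```

A check like `is_consistent` is the kind of automatic test that could support the error-analysis and proofreading steps, although the abstract does not say how those steps were carried out.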