The Mongolian Speech Corpus is an important component and a principal sub-corpus of the Mongolian corpus construction project. Annotating and processing the raw speech data is a key task in building it. This paper presents a preliminary implementation scheme for annotating the raw data in the Mongolian Speech Corpus. The annotation is carried out mainly by computer, using speech analysis software to label speech segments acoustically. It also marks paralinguistic and non-linguistic phenomena that occur in the speech, as well as words from other languages. The resulting acoustically annotated segment database not only fills a gap in the field but will also supply a large body of speech data and acoustic feature information for basic research such as Mongolian phonetics and Mongolian dialectology. It is also of practical value for Mongolian speech engineering research, including speech synthesis, speech recognition, and human-machine dialogue.