The prosodic features of Chinese show a hierarchical structure in the way they respond to contextual parameters. To capture this, this paper describes an artificial neural network with special weighting factors and an output-optimization function, and uses it to build the prosody model of a Chinese TTS system. Extensive tests show that the topology of this network reflects the prosodic characteristics of Chinese better than a conventional artificial neural network model. It improves the convergence speed and computational accuracy of the model itself, and thereby the quality of the prosody model as a whole. The paper also normalizes the fundamental frequency (F0) contours of Chinese syllables and analyzes in some detail the role that the syllable pitch normalization parameters, SPiS, play in F0 adjustment and how they are applied. The SPiS parameters reflect the tonal characteristics of Chinese, and they simplify both the construction of the network model and the control of Chinese prosody.