Statistical speech synthesis uses hidden Markov models (HMMs) as statistical models of acoustic features. This paper proposes a quantization method for high-ratio compression of HMMs that is based on distance in the acoustic model space. An iterative optimization of the vector quantization codebook reduces the acoustic distance between the compressed vocal tract spectrum model and the original model, so that speech synthesized from the quantized model more closely matches speech synthesized from the unquantized model. Subjective and objective tests show that when the vocal tract spectrum model is compressed to about 0.06 of its original size, about 90% of the evaluation scores still indicate no noticeable degradation in the quality of the synthesized speech.
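The idea of iteratively refining a vector quantization codebook under a model-space distance can be illustrated with a minimal sketch. The abstract does not specify the distance measure or the update rule, so everything below is an assumption made for illustration: each HMM state output is taken to be a diagonal-covariance Gaussian, the distance is the Kullback-Leibler divergence KL(p || q) from an original Gaussian p to a codeword q, and the codeword update is the moment-matching centroid that minimizes the summed divergence within a cluster. The names `kl_diag_gauss` and `train_codebook` are hypothetical and are not taken from the paper.

```python
import numpy as np


def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """KL(p || q) between diagonal Gaussians, broadcast over leading axes."""
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0,
        axis=-1,
    )


def train_codebook(mu, var, codebook_size, n_iter=20, seed=0):
    """Lloyd-style codebook training over Gaussians (illustrative only).

    mu, var: arrays of shape (n_models, dim) holding the means and the
    diagonal variances of the original Gaussians.
    Returns codeword means, codeword variances, the codeword index assigned
    to each original Gaussian, and the mean KL distortion.
    """
    rng = np.random.default_rng(seed)
    n = mu.shape[0]
    # Assumption: initialise the codebook from randomly chosen originals.
    init = rng.choice(n, size=codebook_size, replace=False)
    cb_mu, cb_var = mu[init].copy(), var[init].copy()

    def distances():
        # Pairwise KL(original || codeword), shape (n_models, codebook_size).
        return kl_diag_gauss(
            mu[:, None, :], var[:, None, :], cb_mu[None], cb_var[None]
        )

    for _ in range(n_iter):
        # Assignment step: nearest codeword under the KL distance.
        idx = distances().argmin(axis=1)
        # Update step: moment matching minimises sum_i KL(p_i || q).
        for k in range(codebook_size):
            members = idx == k
            if not members.any():
                continue  # leave an empty codeword unchanged
            m = mu[members].mean(axis=0)
            second_moment = (var[members] + mu[members] ** 2).mean(axis=0)
            cb_mu[k] = m
            cb_var[k] = second_moment - m ** 2

    d = distances()
    idx = d.argmin(axis=1)
    return cb_mu, cb_var, idx, d[np.arange(n), idx].mean()
```

In a sketch like this, the compressed model stores only the codebook plus one index per original Gaussian, so the achievable size depends on the codebook size and on how the indices are encoded. The specific figures reported in the abstract come from the paper's own method and are not reproduced by this example.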