Temporal robustness is one of the key issues limiting the practical deployment of speaker recognition systems. To improve it, this paper proposes a speaker model based on subband vector quantization (SBVQ) and an artificial neural network (ANN). The effective frequency range of the speech is divided into several subbands, and a vector quantization codebook (the SBVQ codebook) is trained on each subband. A back-propagation neural network (BPNN) is then fitted to the quantization errors of the training data on each subband, which yields the speaker model: the SBVQ codebooks together with the BPNN weight matrix and the verification threshold. The model reflects the different influence that different frequency bands have on speaker recognition performance, and it can confine the effect of factors such as the time interval to individual subbands, thereby improving the temporal robustness of the model. Experiments show that the proposed SBVQ+BPNN speaker model has relatively good temporal robustness.
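The subband stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the feature type, the number of subbands, the codebook size, or the codebook training method, so the equal-width split, the plain k-means training, and every function name below are assumptions made for illustration. The BPNN stage is left out; the per-subband error vector returned by `subband_errors` is the kind of input such a network would be fitted to.

```python
import numpy as np


def split_subbands(features, n_subbands):
    """Split (frames, bins) features into groups of adjacent bins.

    Assumption: near-equal-width subbands; the abstract does not give the split.
    """
    return np.array_split(features, n_subbands, axis=1)


def train_codebook(vectors, codebook_size, n_iter=20, seed=0):
    """Train a VQ codebook with plain k-means (Lloyd iterations).

    Assumption: k-means stands in for the unspecified VQ training method.
    """
    rng = np.random.default_rng(seed)
    start = rng.choice(len(vectors), size=codebook_size, replace=False)
    codebook = vectors[start].astype(float)
    for _ in range(n_iter):
        # Squared distance from every frame to every codeword.
        dists = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = dists.argmin(axis=1)
        for k in range(codebook_size):
            members = vectors[nearest == k]
            if len(members) > 0:  # leave an empty cell's codeword unchanged
                codebook[k] = members.mean(axis=0)
    return codebook


def quantization_error(vectors, codebook):
    """Mean squared distance from each frame to its nearest codeword."""
    dists = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return float(dists.min(axis=1).mean())


def train_sbvq(features, n_subbands, codebook_size):
    """Train one codebook per subband; returns a list of codebooks."""
    return [train_codebook(band, codebook_size)
            for band in split_subbands(features, n_subbands)]


def subband_errors(features, codebooks):
    """Per-subband quantization errors, one value per subband."""
    bands = split_subbands(features, len(codebooks))
    return np.array([quantization_error(band, cb)
                     for band, cb in zip(bands, codebooks)])
```

Keeping one error value per subband, rather than a single pooled distortion, is what would let a later stage weight the bands differently, so that a band that drifts over time need not dominate the decision.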