Tone integration is an important task in Mandarin speech recognition. In second-pass (rescoring) decoding, integrating a tone model with discriminatively trained weighting factors has proven effective, and model combination with context-dependent score weighting has also been applied. A drawback of context-dependent model combination is that it introduces a large number of training parameters, which makes weight training prone to overfitting. To address this problem, we propose clustering the context-dependent weight parameters with an acoustic decision tree, in which the question at each tree node is selected from the question set so as to minimize the expected error rate on the training data. We further propose question-set pruning to speed up tree construction. Experiments on Mandarin continuous speech recognition show that, compared with manually selected context-dependent weight parameters, the proposed method markedly lowers the error rate while greatly reducing the number of parameters.
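The core mechanism, greedy decision-tree clustering where each node's question is the one that most reduces a training-error criterion, can be illustrated with a minimal, hypothetical Python sketch. All names here are illustrative, and a simple majority-weight mismatch count stands in for the paper's actual objective (expected recognition error rate under discriminatively trained weights); question-set pruning is likewise approximated by discarding questions that yield degenerate splits.

```python
from collections import Counter

def leaf_errors(samples):
    """Toy error proxy: samples sharing a leaf are tied to one weight,
    so count the samples whose preferred weight differs from the
    leaf's majority weight. samples: list of (context, preferred_weight)."""
    if not samples:
        return 0
    counts = Counter(w for _, w in samples)
    return len(samples) - max(counts.values())

def best_question(samples, questions, min_split=2):
    """Choose the question whose split most reduces the error proxy.
    Questions producing splits smaller than min_split are pruned early,
    a crude stand-in for the paper's question-set pruning."""
    best_q, best_cost = None, leaf_errors(samples)
    for q in questions:
        yes = [s for s in samples if q(s[0])]
        no = [s for s in samples if not q(s[0])]
        if len(yes) < min_split or len(no) < min_split:
            continue  # pruned: degenerate split, not worth evaluating
        cost = leaf_errors(yes) + leaf_errors(no)
        if cost < best_cost:  # require strict improvement to split
            best_q, best_cost = q, cost
    return best_q

def grow_tree(samples, questions, depth=0, max_depth=3):
    """Greedily grow the clustering tree; each leaf ties one weight."""
    q = best_question(samples, questions) if depth < max_depth else None
    if q is None:
        return {"leaf": True, "n": len(samples)}
    yes = [s for s in samples if q(s[0])]
    no = [s for s in samples if not q(s[0])]
    return {"leaf": False, "question": q.__name__,
            "yes": grow_tree(yes, questions, depth + 1, max_depth),
            "no": grow_tree(no, questions, depth + 1, max_depth)}
```

For example, if contexts containing tone 1 prefer one weight and all other contexts prefer another, a question testing for tone 1 minimizes the split cost and is selected at the root, after which both children are pure and become leaves.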