Automatic speaker clustering and recognition methods based on Bayesian or fully Bayesian criteria mainly work by repeatedly computing similarity measures over all utterance segments and then merging the segments that are most similar, which yields the speaker clusters. With this approach, the more utterance segments there are, the longer the merging computation takes and the worse the real-time performance of the system becomes. In addition, each speaker is modeled with a Gaussian mixture model (GMM), and a GMM becomes less reliable when the utterances are short, which ultimately degrades speaker clustering accuracy. To address these problems, this paper proposes a fast, high-accuracy speaker clustering method that applies non-negative matrix factorization to a matrix of i-vector speaker similarities.
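The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea of clustering segments by factorizing an i-vector similarity matrix. Everything specific here is an assumption rather than the authors' method: the i-vectors are taken as already extracted (one row per segment), similarity is cosine similarity clipped at zero to keep the matrix non-negative, the factorization uses the standard multiplicative update rules for the Frobenius objective, the number of speakers is assumed known, and each segment is assigned to the component with the largest weight. The function names are hypothetical.

```python
import numpy as np

def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between rows, clipped at 0 (assumed choice)
    so that the resulting matrix is non-negative."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.maximum(norms, 1e-12)
    return np.maximum(unit @ unit.T, 0.0)

def nmf(S, n_components, n_iter=500, seed=0):
    """Approximate S by W @ H with W, H >= 0 using multiplicative updates."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = S.shape
    W = rng.random((n_rows, n_components)) + 1e-3
    H = rng.random((n_components, n_cols)) + 1e-3
    eps = 1e-12  # avoids division by zero
    for _ in range(n_iter):
        H *= (W.T @ S) / (W.T @ W @ H + eps)
        W *= (S @ H.T) / (W @ (H @ H.T) + eps)
    return W, H

def cluster_segments(vectors, n_speakers):
    """Assign each segment (row of `vectors`) to one of `n_speakers` clusters."""
    S = cosine_similarity_matrix(vectors)
    W, H = nmf(S, n_speakers)
    # Move the column scale of W into H so the weights are comparable
    # across components before taking the argmax.
    scale = np.linalg.norm(W, axis=0)
    return (H * scale[:, None]).argmax(axis=0)
```

Unlike the iterative merging described above, this sketch computes the similarity matrix once and factorizes it in a single pass, which is the property the abstract associates with faster clustering. The cluster indices it returns are arbitrary labels, so only the grouping is meaningful.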